Post on 29-Dec-2015

A Transcoding Proxy for HTML Web Pages: Web Page Sampling and Conversion Evaluation

Andrew Stone

CS525m

Worcester Polytechnic Institute

Overview

• Proxy Goal and Scope

• Related Work

• Project scope

• Testing Methodology

• Demo

• Conclusions

• Future Work

Proxy Goal

• Reduce data traffic
  – Get content displayed faster
  – Save bandwidth (and money)
  – Reduce power consumption

• Change content to suit device
  – Browser properties

Related Work

• HTML to WML Transcoding Proxy
  – http://zoo.cs.yale.edu/classes/cs490/00-01b/dugas.robert.rfd8/rfd8cs490.pdf

• iMobile EE
  – http://portal.acm.org/citation.cfm?id=778492&coll=portal&dl=ACM&CFID=71256236&CFTOKEN=91425173

• RSVP Browser
  – http://portal.acm.org/citation.cfm?id=591429&coll=portal&dl=ACM&CFID=71256236&CFTOKEN=91425173

• Navigating a Mobile XHTML App
  – http://portal.acm.org/citation.cfm?id=642669&coll=portal&dl=ACM&CFID=71256236&CFTOKEN=91425173

• http://www.skweezer.net

Project Scope

• Create component to transcode web pages using HTML Tidy and XML Stylesheets

• Measure web page size reduction

• Evaluate web page readability on PC with IE and Firefox and on Windows Mobile 5 Pocket IE

[Diagram: proxy data flow. Labels: Issue Get Request, Proxy, Get Request, Internet, Return Content, HTML Tidy, xHTML, XSLT Transform, Transformed Content]
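The transform stage of this flow can be illustrated with a small sketch. It is not the project's implementation: the slides name HTML Tidy and an XSLT stylesheet, whereas this sketch substitutes a plain filter built on Python's standard-library `html.parser`, purely to show the idea of taking fetched markup and re-emitting a smaller version. The names `Reducer`, `transcode`, `DROPPED`, `VOID` and `KEEP_ATTRS`, and the choice of what to drop, are assumptions made for illustration.

```python
from html import escape
from html.parser import HTMLParser

# Assumptions for this sketch (not taken from the slides):
DROPPED = {"script", "style"}                        # elements removed with their content
VOID = {"br", "hr", "img", "input", "meta", "link"}  # emitted self-closed, XHTML style
KEEP_ATTRS = {"href", "src", "alt"}                  # every other attribute is discarded


class Reducer(HTMLParser):
    """Toy stand-in for the HTML Tidy + XSLT stages of the proxy."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []        # pieces of the reduced page
        self.skip_depth = 0  # > 0 while inside a dropped element

    def handle_starttag(self, tag, attrs):
        if tag in DROPPED:
            self.skip_depth += 1
            return
        if self.skip_depth:
            return
        kept = "".join(
            ' {}="{}"'.format(name, escape(value or "", quote=True))
            for name, value in attrs
            if name in KEEP_ATTRS
        )
        closer = "/>" if tag in VOID else ">"
        self.out.append("<" + tag + kept + closer)

    def handle_endtag(self, tag):
        if tag in DROPPED:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0 and tag not in VOID:
            self.out.append("</" + tag + ">")

    def handle_data(self, data):
        if self.skip_depth == 0:
            # Re-escape text so the output stays well-formed.
            self.out.append(escape(data, quote=False))


def transcode(html_text):
    """Return a reduced copy of html_text (comments and doctype are dropped too)."""
    parser = Reducer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


page = (
    '<html><head><style>p{color:red}</style><script>var x=1;</script></head>'
    '<body><!-- ad --><p class="c" style="x">Hi &amp; bye<br>'
    '<a href="/a" onclick="f()">link</a></p></body></html>'
)
small = transcode(page)
print(len(page), "->", len(small), "bytes")
print(small)
```

A real proxy would run a stage like this between fetching the page and returning it to the device, and could swap in a different transform per browser.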

Web Page Reduction

• Data Set: 5852 pages from 403 domains
  – From Paul Timmins and Sean McCormick’s “Characteristics of Today’s Mobile Web Content”

• HTML Tidy produced 2730 transformed pages
  – 2417 successful XSL Transformations from 266 domains

• Before
  – Average Page Size including images: 46.9 KB
  – Average Page Size excluding images: 23.3 KB

• After
  – Average Page Size including images: 43.0 KB
  – Average Page Size excluding images: 19.4 KB
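The figures above can be turned into rates and percentages with a few lines of arithmetic. The sketch below uses only the numbers on this slide; the function and variable names are illustrative, and it assumes the before and after averages were computed over comparable sets of pages, which the slide does not state.

```python
# Counts and averages copied from the slide.
pages_total, pages_tidied, pages_transformed = 5852, 2730, 2417
domains_total, domains_transformed = 403, 266
before = {"including images": 46.9, "excluding images": 23.3}  # KB
after = {"including images": 43.0, "excluding images": 19.4}   # KB


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def reduction(before_kb, after_kb):
    """Return (KB saved, saving as a percentage of the original size)."""
    saved = before_kb - after_kb
    return saved, percent(saved, before_kb)


tidy_rate = percent(pages_tidied, pages_total)           # pages HTML Tidy handled
xslt_rate = percent(pages_transformed, pages_tidied)     # tidied pages that transformed
overall_rate = percent(pages_transformed, pages_total)   # end-to-end success
domain_rate = percent(domains_transformed, domains_total)

print(f"Tidy: {tidy_rate:.1f}%  XSLT: {xslt_rate:.1f}%  "
      f"overall: {overall_rate:.1f}%  domains: {domain_rate:.1f}%")

for label in before:
    saved, pct = reduction(before[label], after[label])
    print(f"{label}: saved {saved:.1f} KB ({pct:.1f}%)")

# Average image payload = size including images minus size excluding images.
images_before = round(before["including images"] - before["excluding images"], 1)
images_after = round(after["including images"] - after["excluding images"], 1)
print(f"image payload: {images_before} KB before, {images_after} KB after")
```

On these numbers the markup shrinks by 3.9 KB (about 16.7% of the HTML, but only about 8.3% of the full page), while the average image payload stays at 23.6 KB because the transformation does not touch images.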

Web Page Layout Demo

Conclusions

• Real gains are in image manipulation

• ~50% of web pages have non-standard HTML that HTML Tidy could not convert (only 2730 of 5852 pages were transformed)

• Another HTML fixing tool should be tested

• Image compression should be evaluated