Subproject 4: HTML-WML Transcoding System
Jia-Shung Wang
Computer Science Department, National Tsing Hua University
March 27, 2001
Outline
• Motivation and Issues
• Examples of Transcoding
• System Overview and Translation Flow
• Some HTML to WML Conversion Strategies
Information Appliances
• Different design constraints based on intended use, which enhances ease of use
– Desktop PC
– Mobile PC
– Desktop “Smart” Phone
– Mobile Telephone
– Personal Digital Assistant
– Set-top Box
– Digital VCR
– …
• Implications:
– Shift from computer design to consumer design
– Heterogeneous “standards,” hybrid networking
– Interactive networking, access on demand, QoS
Motivation
Rapidly growing diversity of wireless communication devices
Rapid growth in the number of HTML web pages available on the Internet
Mobile devices with WML browsers need a way to access the existing HTML or WML pages on the Internet.
Issues
Device-enabled service for WML mobile devices with different types of screen
Bandwidth-driven transmission for rapid response and fast delivery speed
The usage of browsing behavior
The resizing of images/icons
The compression of the resulting WML data
Demos of Transcoding
• Contents from
– enYES 鉅亨網
– USAtoday
– CS, NTHU
– NTHU
– VOD
Discussions
enYES provides two versions, regular HTML and WAP, to serve PC users and mobile device users separately.
USAtoday also provides a simplified version of its content for Palm users.
NTHU and CS-NTHU homepages: if we keep the original figures to preserve the link information, the page layout becomes odd (viewed with the HTML browser Browse-It).
VOD homepage (one-column text): no significant difference after transcoding.
Usage of Browsing Behavior
Automatic translation is complicated by the diversity of content posted on HTML pages.
A universal conversion strategy that translates every HTML page into sequences of WML decks effectively is unlikely to exist.
However, it seems a good idea to first classify the HTML page to be translated according to categories of browsing behavior.
Usage of Browsing Behavior (cont’d)
After classification, we can tell what the client requires. A corresponding conversion can then extract the required content step by step and translate it into small WML documents of predictable size.
We believe that, after classification, adequate conversions exist for some kinds of web pages.
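The classification step could be sketched as follows. This is a minimal illustration only: the three categories, the `classify_page` name, and the 40-characters-per-link threshold are assumptions made for the example, not categories defined in these slides.

```python
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts a few structural features of an HTML page."""

    def __init__(self):
        super().__init__()
        self.tables = 0
        self.links = 0
        self.text_chars = 0

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables += 1
        elif tag == "a":
            self.links += 1

    def handle_data(self, data):
        self.text_chars += len(data.strip())


def classify_page(html):
    """Return a hypothetical page category used to pick a conversion."""
    counter = TagCounter()
    counter.feed(html)
    counter.close()
    if counter.tables > 0:
        return "table-layout"
    # Hypothetical threshold: fewer than 40 text characters per link.
    if counter.links > 0 and counter.text_chars / counter.links < 40:
        return "link-index"
    return "plain-text"
```

A page classified as "table-layout" would then be routed to a table-oriented conversion, while a "plain-text" page could be converted directly.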
Related Works: Transcoding Proxy of IBM alphaWorks
Its goal is to manage different versions of content with different fidelities and modalities in order to adapt delivery to different client devices.
Related Works: Intel Quick Web Technology
• New software capability that helps Internet providers and digital distribution companies increase the delivery speed of Web pages containing photos, drawings and other graphics.
• It uses two key techniques: compression and caching.
Related Works: Spyglass Prism
• Spyglass Prism dynamically adapts Web content to match various non-PC devices.
• It functions as a proxy server, caches the converted content, and dynamically converts standard HTML to WML.
Related Works: Proxy Architecture for Efficient Web Browsing over Cellular Networks
• Decreases the access time of WWW browsing in narrow-band wireless environments.
• It adopts persistent connections and pipelining in a proxy architecture to improve HTTP processing between the client and the proxy server.
Comparisons between HTML and WML
• Both make use of tags and attributes.
• Similar character set, syntax and data types.
• Two special elements of WML structure
– Deck and Card
• Different design goals
– HTML: to publish hypertext on the World Wide Web
– WML: for devices with narrow network bandwidth, small displays, limited memory and fewer computational resources.
Examples of HTML and WML
WML
<wml>
  <!-- In WML the <wml> element is itself the deck; there is no <deck> element. -->
  <card>
    <p>
      <do type="accept">
        <go href="#card2"/>
      </do>
      This is the first card...
    </p>
  </card>
  <card id="card2">
    <p>This is the second card.</p>
  </card>
</wml>

HTML
<html>
  <head>
    <title>Example page.</title>
  </head>
  <body>
    <h1>This is a headline.</h1>
    <p>This is a paragraph.</p>
  </body>
</html>
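To make the relationship between the two formats concrete, here is a minimal sketch that turns a page like the HTML example into a one-card WML deck. It uses Python's standard `html.parser` and `xml.etree.ElementTree`; the `TextBlocks` and `html_to_wml` names, the `card1` id, and the choice to render headings as bold text are assumptions of this sketch, not the translator described in these slides.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TextBlocks(HTMLParser):
    """Collects the page title and the text of h1-h6 and p elements."""

    BLOCK_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6", "p"}

    def __init__(self):
        super().__init__()
        self.title = ""
        self.blocks = []      # list of (tag, text) pairs
        self._current = None  # tag currently being collected
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" or tag in self.BLOCK_TAGS:
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self._current:
            return
        text = " ".join("".join(self._buffer).split())
        if tag == "title":
            self.title = text
        elif text:
            self.blocks.append((tag, text))
        self._current = None


def html_to_wml(html):
    """Build a single-card WML deck from the collected text blocks."""
    parser = TextBlocks()
    parser.feed(html)
    parser.close()
    wml = ET.Element("wml")
    card = ET.SubElement(wml, "card", {"id": "card1", "title": parser.title})
    for tag, text in parser.blocks:
        p = ET.SubElement(card, "p")
        if tag.startswith("h"):
            ET.SubElement(p, "b").text = text  # headings become bold text
        else:
            p.text = text
    return ET.tostring(wml, encoding="unicode")
```

Calling `html_to_wml` on the example page yields a deck whose single card carries the page title, the headline in bold, and the paragraph.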
System Overview
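[Figure: system overview. A Client (WML browser, etc.) communicates with the Translation Server (HTML Parser, HTML-WML Translator, WML Generator), which retrieves HTML and WML documents from the Web Server (multimedia content, CGI scripts, etc.). Link labels: HTTP, WAP, WML.]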
Features
• An HTML-WML Translator on the Translation Server
• Both HTTP and WAP requests are accepted.
• Java Servlet API compatible
• Server- and platform-independent
Translation Server: Components and Flow
[Figure: request and response flow through the Translation Server. Components shown: Network Protocol, Proxy, HTML Parser, Document Analyzer, Filter, Decks & Cards, Link Builder, WML Generator.]
Components
• Gateway
– Accepts requests from clients
– Returns appropriate responses
• Proxy Servlet
– Gets the requested remote documents
– Determines whether to pass them through or convert them
– Caches the converted results
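The Proxy Servlet's pass-or-convert decision and its cache can be sketched as below. This is an illustration in Python rather than the Java Servlet code of the project; the `TranscodingProxy` class and the injected `fetch` and `convert` callables are hypothetical names, and the sketch assumes the decision is made on the response content type.

```python
class TranscodingProxy:
    """Sketch of the pass-or-convert decision with a cache of converted results."""

    WML_TYPE = "text/vnd.wap.wml"

    def __init__(self, fetch, convert):
        self._fetch = fetch      # callable: url -> (content_type, body)
        self._convert = convert  # callable: html text -> wml text
        self._cache = {}         # url -> converted WML

    def handle(self, url):
        if url in self._cache:
            return self.WML_TYPE, self._cache[url]
        content_type, body = self._fetch(url)
        if content_type.startswith(self.WML_TYPE):
            return content_type, body  # already WML: pass through
        if content_type.startswith("text/html"):
            wml = self._convert(body)
            self._cache[url] = wml     # remember the converted result
            return self.WML_TYPE, wml
        return content_type, body      # other types pass unchanged in this sketch
```

Injecting `fetch` and `convert` keeps the decision logic testable without a network connection; a real deployment would also need cache expiry, which is omitted here.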
Components (cont’d)
• HTML Parser
– Parses the HTML document into a parse tree
• Document Analyzer
– Analyzes the parse tree
• Filter
– Filters out any objects that are unnecessary or not supported by the client device
– Image/icon resizing
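As a rough illustration of the parse-then-filter steps, the sketch below builds a simple parse tree and removes subtrees the client is assumed not to support. It uses Python's `html.parser` as a stand-in for the project's parser; the `Node` class, the `UNSUPPORTED` tag set, and the assumption of well-nested input are all choices made for this example.

```python
from html.parser import HTMLParser


class Node:
    def __init__(self, tag, attrs=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.children = []  # Node instances or text strings


class TreeBuilder(HTMLParser):
    """Builds a simple parse tree; assumes well-nested input."""

    VOID = {"br", "img", "hr", "input", "meta", "link"}

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self._stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        self._stack[-1].children.append(node)
        if tag not in self.VOID:
            self._stack.append(node)

    def handle_endtag(self, tag):
        if len(self._stack) > 1 and self._stack[-1].tag == tag:
            self._stack.pop()

    def handle_data(self, data):
        if data.strip():
            self._stack[-1].children.append(data.strip())


# Hypothetical list of elements a WML client cannot render.
UNSUPPORTED = {"script", "style", "applet", "object", "embed"}


def parse(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def filter_tree(node):
    """Remove subtrees whose tag is in UNSUPPORTED, in place."""
    kept = []
    for child in node.children:
        if isinstance(child, Node):
            if child.tag in UNSUPPORTED:
                continue
            filter_tree(child)
        kept.append(child)
    node.children = kept
    return node
```

Image and icon resizing is left out of this sketch; it would be another pass over the `img` nodes of the same tree.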
Components (cont’d)
• Content Divider
– Splits a document into multiple small documents
• Link Maker
– Inserts extra links so that the small documents can reach one another
• WML Generator
– Produces well-formed WML documents and returns them to the Proxy Servlet
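The divide, link, and generate steps might look like the following sketch. The greedy character budget in `divide`, the `part1.wml`-style file names, and the "Back"/"Next" link labels are assumptions for illustration; the slides do not specify how the split points or link targets are chosen.

```python
import xml.etree.ElementTree as ET


def divide(paragraphs, max_chars):
    """Greedily group paragraphs into chunks of at most max_chars characters.

    A single paragraph longer than max_chars still gets a chunk of its own.
    """
    groups, current, size = [], [], 0
    for text in paragraphs:
        if current and size + len(text) > max_chars:
            groups.append(current)
            current, size = [], 0
        current.append(text)
        size += len(text)
    if current:
        groups.append(current)
    return groups


def build_decks(groups, name="part"):
    """Return {file name: WML text}, with links joining neighbouring decks."""
    decks = {}
    for i, texts in enumerate(groups):
        wml = ET.Element("wml")
        card = ET.SubElement(wml, "card", {"id": "main"})
        for text in texts:
            ET.SubElement(card, "p").text = text
        links = []
        if i > 0:
            links.append((f"{name}{i}.wml", "Back"))
        if i < len(groups) - 1:
            links.append((f"{name}{i + 2}.wml", "Next"))
        if links:
            nav = ET.SubElement(card, "p")
            for href, label in links:
                ET.SubElement(nav, "a", {"href": href}).text = label
        decks[f"{name}{i + 1}.wml"] = ET.tostring(wml, encoding="unicode")
    return decks
```

Building each deck with `ElementTree` rather than string concatenation keeps the output well-formed even when the text contains characters such as `<` or `&`.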
HTML to WML Conversion Tools
• Semi-automatic:
– Used for rich HTML documents
– The conversion form is designated manually with the help of analysis and editing tools.
– The resulting forms are distributed to the gateway servers.
• Automatic:
– Used for simple documents, such as News and BBS, …
HTML to WML Conversion Strategies
• Strategy I: Tables to Lists
– Simply remove all layout elements such as tables
– Arrange all the content into a single column with a fixed width
• Strategy II: One Table One Deck
– Extract each table to form a deck
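Strategies I and II can be contrasted on a toy document model in which each table is just a list of content items. This is a sketch under that assumption; in particular, cutting the flattened column into decks of `per_deck` items is a simplification introduced here, and both function names are hypothetical.

```python
def tables_to_lists(tables, per_deck):
    """Strategy I: drop the table layout, keep content in document order,
    then cut the single column into decks of at most per_deck items."""
    flat = [item for table in tables for item in table]
    return [flat[i:i + per_deck] for i in range(0, len(flat), per_deck)]


def one_table_one_deck(tables):
    """Strategy II: each non-empty table becomes its own deck."""
    return [list(table) for table in tables if table]


# Toy document: three tables holding two, three, and one content items.
document = [["a1", "a2"], ["b1", "b2", "b3"], ["c1"]]
lists_decks = tables_to_lists(document, per_deck=4)
table_decks = one_table_one_deck(document)
```

With Strategy I a deck boundary can fall in the middle of a table, whereas Strategy II keeps the content of each table together at the cost of uneven deck sizes.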
HTML to WML Conversion Strategies (cont’d)
• Strategy III: Preview First
a. One Table One Deck
b. Collect the first card of every deck as preview cards
c. Arrange these preview cards to form a preview deck, which is transmitted first; every preview card has a link to its corresponding deck
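Steps a to c can be sketched on the same kind of toy model, where each table is a list of cards. The `preview_first` name and the use of a deck index as the link target are assumptions of this sketch.

```python
def preview_first(tables):
    """Strategy III: one deck per table plus a preview deck sent first.

    Each deck is a list of cards. Each preview entry pairs the first card
    of a deck with the index of the deck it links to.
    """
    decks = [list(table) for table in tables if table]           # step a
    preview = [(deck[0], index) for index, deck in enumerate(decks)]  # steps b, c
    return preview, decks


preview, decks = preview_first([["a1", "a2"], ["b1"]])
```

The client receives `preview` first and fetches a full deck only when the user follows the corresponding link.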
Original Document
[Figure: example document. A <document> containing three <table> elements and four sections: section 1 (contents 1_1, 1_2), section 2 (contents 2_1 to 2_5), section 3 (contents 3_1 to 3_7), and section 4 (content 4_1).]
Tables to Lists
[Figure: Tables to Lists applied to the example document. The contents 1_1 to 4_1 are laid out in one column and divided among three decks.]
One Table One Deck
[Figure: One Table One Deck applied to the example document. The contents 1_1 to 4_1 are divided among five decks.]
Preview First
[Figure: Preview First applied to the example document. The contents 1_1 to 4_1 are divided among the preview deck and the remaining decks.]
Strategy Evaluation
• Assuming we have S sections in a document and the document is translated to N WML cards.
• Every deck contains at most C cards.
• Assuming that the contents within the same table are similar.
Evaluation of Searching After Translation
                            Tables to Lists   One Table One Deck   Preview First
User Friendly               Worst             Best                 Good
Average Deck Access Time    N/2               S/2                  S/2C
Performance Evaluation
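                 HTML Pages                                   WML Decks   Reduction
                 Source (bytes)          Images (bytes)       (bytes)     Without Images   With Images
                 Headers     Text
Experiment #1    24,359      9,471       176,361              7,440       22.0%            3.5%
Experiment #2    17,937      6,137       126,740              11,232      46.7%            7.4%
Experiment #3    21,203      8,325       280,727              16,891      57.2%            5.4%
Experiment #4    9,568       20,363      17,966               12,062      40.3%            25.2%

Each Reduction figure equals the WML deck size divided by the HTML page size (headers plus text, without or with the image bytes).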
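The percentages in this table can be reproduced from the byte counts. The sketch below assumes each row reads, in order, header bytes, text bytes, image bytes, and WML deck bytes; the assignment of the first two values to "Headers" and "Text" is itself an assumption, although it does not affect the ratios. The `size_ratios` name is hypothetical.

```python
# (headers, text, images, wml) in bytes, per experiment.
ROWS = {
    "Experiment #1": (24359, 9471, 176361, 7440),
    "Experiment #2": (17937, 6137, 126740, 11232),
    "Experiment #3": (21203, 8325, 280727, 16891),
    "Experiment #4": (9568, 20363, 17966, 12062),
}


def size_ratios(headers, text, images, wml):
    """Return WML size as a fraction of the HTML size, without and with images."""
    source = headers + text
    return wml / source, wml / (source + images)


for name, row in ROWS.items():
    without_images, with_images = size_ratios(*row)
    print(f"{name}: {without_images:.1%} without images, {with_images:.1%} with images")
```

For Experiment #1, for instance, the source is 24,359 + 9,471 = 33,830 bytes, so the 7,440-byte WML output is about 22.0% of the source and about 3.5% of the source plus images.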
Performance Evaluation (Experiment #1: What’s WAP)
[Figure: deck trees after translation for the pages What’s WAP and WAP Forum. Deck labels shown: Preview, Deck 1, Deck 2, Deck 3, Deck 3.1, Deck 3.2.]
Performance Evaluation (Experiment #2: NTHU Web Page)
[Figure: deck trees after translation for the pages NTHU, Current Status, History, and About NTHU. Deck labels shown: Preview, Deck 1, Deck 2.1, Deck 2.2, Deck 3.1, Deck 3.2.]
Performance Evaluation (Experiment #3, NTHU CS Web Page)
[Figure: deck trees after translation for the pages NTHU CS and Faculty. Deck labels shown: Preview, Deck 1, Deck 3.1 to Deck 3.6.]
Performance Evaluation (Experiment #4, IETF Web Page)
[Figure: deck trees after translation for the pages IETF, Internet-Drafts, Internet-Drafts Index, and DNSOP. Deck labels shown: Preview, Deck 1, Deck 2.1 to Deck 2.5.]
Implementation
Goals: portability, reusability, and crash protection.
Translation server: runs in a Java environment with Java Servlet, Java HTML Tidy, and XML Parser for Java.
Servlet-enabled servers: Avenida Web Server and Nokia WAP Server
Microsoft Windows NT Workstation 4.0 with Service Pack 5
Summary
• Design an HTML to WML transcoding system with
1. Analyzing and filtering HTML contents
2. Image/icon resizing
3. WML browsing mode design and WML conversion tool
4. Compression and decompression modules of the WML data
5. WML transmission control
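As an illustration of item 4, a compression and decompression pair for WML text could look like the sketch below. It uses general-purpose zlib compression from the Python standard library as a stand-in; the slides do not say which compression scheme the project uses, and the function names are hypothetical.

```python
import zlib


def compress_wml(wml_text):
    """Compress WML text to bytes (zlib, highest compression level)."""
    return zlib.compress(wml_text.encode("utf-8"), 9)


def decompress_wml(payload):
    """Restore the original WML text from compressed bytes."""
    return zlib.decompress(payload).decode("utf-8")


# A repetitive sample deck; markup-heavy text of this kind compresses well.
sample = "<wml>" + '<card id="c"><p>This is a card.</p></card>' * 20 + "</wml>"
packed = compress_wml(sample)
```

The round trip is lossless, so the client-side module recovers exactly the deck the server generated.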