how webkit works
DESCRIPTION
How WebKit WorksTRANSCRIPT
How WebKit WorksAdam Barth (abarth)October 30, 2012
What is WebKit?
WebKit is a rendering engine for web content
WebKit is not a browser, a science project, or the solution to every problem
HTML
JavaScript
CSS
WebKitRendering of a
web page
WebCore(HTML, CSS, DOM, etc, etc)
Major Components
WTF(Data structures, Threading primitives)
Platform(Network, Storage, Graphics)
JavaScriptCore(JavaScript Virtual Machine)
Bindings(JavaScript API, Objective-C API)
WebKit and WebKit2(Embedding API)This talk
Life of Web Page
Network
Loader
HTML Parser
DOM Script
Render Tree
CSS
Graphics Context
Page
Main Frame
Document
Pages, Frames, and Documents
Frame Frame
Frame
Document Document
Document
Frame
Document
Lifecycle of a Frame
Uninitialized
Initial Document
Provisional
Ready to Commit
Committed
Checking Policy
● Committed is the quiescent state
How the Loader Works (Idealized)
MemoryCacheCachedResourceLoader
CachedResource
ResourceRequest
ResourceLoader
ResourceHandle
CachedResourceRequest
The Loader is actually very messy and complicated, but we have a long-term project to clean up its nuttiness
Platform-specific code
Tokenizer
TreeBuilder
How the HTML Parser Works
Bytes
Characters
Tokens
Nodes
DOM
<body>Hello, <span>world!</span></body>
StartTag: body Hello, StartTag: span world! EndTag: span
body Hello, span world!
body
Hello, span
world!
3C 62 6F 64 79 3E 48 65 6C 6C 6F 2C 20 3C 73 70 61 6E 3E 77 6F 72 6C 64 21 3C 2F 73 70 61 6E 3E 3C 2F 62 6F 64 79 3E
Preload Scanning for Fun and Profit
Mary had a little lamb Tokenizer TreeBuilder
document.write("<textarea>");
Script execution can change the input streamPreload scanner tokenizes ahead● When parser is blocked on external scripts● Starts resource loads earlier
XSSAuditor
Tokenizer TreeBuilder
HTTP Request
HTTP Response
XSSAuditor
XSSAuditor examines token streamLooks for scripts that were also in the request● Assumes those scripts were reflected XSS● Blocks them
DOM + CSS → Render Tree
body
Hello, span
world!
html
head
title
Greeting img
#footer { position: fixed; bottom: 0; left: 0 }body > span { font-weight: bold; }
RenderBlock
RenderInline
RenderText
RenderImage
RenderText bold
Layout
RenderBlock
fixed
Anonymous RenderObjects
div
Hello, div
world!
RenderBlock
RenderBlock
RenderText
RenderBlock
RenderText
Anonymous
● Not every RenderObject has a DOM Node● Every RenderBlock either:
○ Has all inline children○ Has no inline children
LayerTree
RenderBlock
RenderInline
RenderText
RenderImage
RenderText bold
RenderBlock
fixed
RenderLayer
RenderLayer
● Sparse representation of RenderTree● Enables accelerated compositing, scrolling
Yet Another Tree: LineBoxTree
<div>An old silent pond...A frog jumps into the pond,splash! <b>Silence again.</b></div>
InlineTextBox
InlineTextBox InlineTextBox
RootInlineBox
RootInlineBox
RootInlineBox
InlineTextBoxRenderBlock
RenderText
RenderInline
RenderText
bold
● One RootInlineBox per line of text● List of inline flow and inline text boxes
Conclusion
● WebCore's main processing pipeline:○ Loader and Parser○ CSS, DOM, and Script○ RenderTree, LayerTree, and InlineBoxes
● Other major subsystems○ Accessibility, Editing, Events, CSS, Web Inspector○ Plugins, SVG, MathML, XSLT...
● Other components○ WebKit, Bindings, Platform, JavaScriptCore, WTF○ ... 1.5 MLOC of C++
● Learn more:○ http://www.webkit.org/coding/technical-articles.html