Web Scraping using Diazo!
Web Scraping
@alvaro_aguirre
Saturday, November 5, 2011
In search of our cosmic origins...
Data Scraping vs Web Scraping
<html>
<head></head>
<body>
.....
</body>
</html>
Data Scraping
Web Scraping
Deliverance
XDV
Diazo
Diazo
<replace css:content="h1" css:theme="#main" />
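To make the effect of the `<replace>` rule above concrete, here is a minimal sketch in plain Python using only the standard library. It is not how Diazo works internally (Diazo compiles rules to XSLT); the `THEME` and `CONTENT` documents and the `replace_rule` helper are hypothetical names made up for illustration, and the case where nothing in the content matches is not meant to mirror Diazo's behaviour.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical theme and content documents, used only for this illustration.
THEME = (
    '<html><body><div id="main">placeholder</div>'
    '<p id="footer">theme footer</p></body></html>'
)
CONTENT = "<html><body><h1>Page title</h1><p>body text</p></body></html>"


def replace_rule(theme_xml, content_xml, content_tag, theme_id):
    """Swap the theme element with the given id for matching content elements."""
    theme = ET.fromstring(theme_xml)
    content = ET.fromstring(content_xml)

    # Elements selected from the content (the css:content="h1" side).
    replacements = content.findall(f".//{content_tag}")

    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in theme.iter() for child in parent}

    # Elements selected from the theme (the css:theme="#main" side).
    for target in theme.findall(f".//*[@id='{theme_id}']"):
        parent = parents[target]
        index = list(parent).index(target)
        parent.remove(target)
        for offset, node in enumerate(replacements):
            parent.insert(index + offset, copy.deepcopy(node))

    return ET.tostring(theme, encoding="unicode")


themed = replace_rule(THEME, CONTENT, content_tag="h1", theme_id="main")
print(themed)
```

The theme's `#main` placeholder is gone and the content's `h1` sits in its place, while the rest of the theme is left untouched.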
<drop css:content="h1" />
<drop css:theme="breadcrumbs" />
<replace css:theme="#header" css:content="#header-element" if-content="" />
<drop css:theme="#info-box" if-path="/news"/>
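The `if-path` condition above is matched against whole path segments. The sketch below approximates that matching in plain Python, assuming the semantics described in the Diazo documentation: a leading slash anchors the match at the start of the path, a trailing slash anchors it at the end, both together require an exact match, and several alternatives may be separated by whitespace. The `path_matches` helper is a hypothetical name, not part of Diazo.

```python
def path_matches(condition: str, path: str) -> bool:
    """Approximate an if-path test by comparing whole path segments."""
    segments = [s for s in path.split("/") if s]

    # Whitespace separates alternative conditions; any one may match.
    for alternative in condition.split():
        wanted = [s for s in alternative.split("/") if s]
        n = len(wanted)
        at_start = alternative.startswith("/")
        at_end = alternative.endswith("/")

        if n == 0:
            # A bare "/" is treated here as matching only the root path.
            matched = not segments
        elif n > len(segments):
            matched = False
        elif at_start and at_end:
            matched = segments == wanted
        elif at_start:
            matched = segments[:n] == wanted
        elif at_end:
            matched = segments[-n:] == wanted
        else:
            # No anchors: the segments may appear anywhere in the path.
            matched = any(
                segments[i:i + n] == wanted
                for i in range(len(segments) - n + 1)
            )

        if matched:
            return True
    return False


print(path_matches("/news", "/news/page1.html"))
print(path_matches("/news", "/newspapers"))
```

Under these assumptions the rule would drop `#info-box` on `/news` and everything below it, but not on `/newspapers`, because only complete segments are compared.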
<theme/> <notheme/> <replace/> <before/> <after/> <drop/> <strip/> <merge/> <copy/>
<replace css:theme="#details">
    <dl id="details">
        <xsl:for-each css:select="table#details > tr">
            <dt><xsl:copy-of select="td[1]/text()" /></dt>
            <dd><xsl:copy-of select="td[2]/node()" /></dd>
        </xsl:for-each>
    </dl>
</replace>
<table id="details">
    <tr> <td>One</td> <td>1</td> </tr>
    <tr> <td>Two</td> <td>2</td> </tr>
</table>
<dl id="details">
    <dt>One</dt> <dd>1</dd>
    <dt>Two</dt> <dd>2</dd>
</dl>
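The table-to-definition-list rule above does its work in inline XSLT. As a rough equivalent for readers more at home in Python, the following sketch performs the same restructuring with the standard library's `xml.etree.ElementTree`. It is an illustration of the transformation only, not of Diazo itself; the `table_to_dl` helper is a hypothetical name, and it copies only the leading text of the first cell, which is a simplification of `td[1]/text()`.

```python
import xml.etree.ElementTree as ET

# The content table from the slide.
CONTENT = """<table id="details">
  <tr><td>One</td><td>1</td></tr>
  <tr><td>Two</td><td>2</td></tr>
</table>"""


def table_to_dl(table_xml: str) -> str:
    """Turn each two-cell table row into a <dt>/<dd> pair."""
    table = ET.fromstring(table_xml)
    dl = ET.Element("dl", {"id": "details"})

    for row in table.findall("tr"):
        cells = row.findall("td")
        if len(cells) < 2:
            continue  # skip rows that do not have a term and a value

        dt = ET.SubElement(dl, "dt")
        dt.text = cells[0].text  # text of the first cell

        dd = ET.SubElement(dl, "dd")
        dd.text = cells[1].text  # text of the second cell...
        dd.extend(list(cells[1]))  # ...plus any child elements it contains

    return ET.tostring(dl, encoding="unicode")


result = table_to_dl(CONTENT)
print(result)
```

The Diazo version has the advantage that the restructuring lives in the rules file next to the other theming rules, so no application code needs to change.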
Tools
External Content
• development of web & mobile interfaces
• legacy app integration
• prototypes
• low coupling
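from lxml import etree
from diazo.compiler import compile_theme

# URL prefix applied to relative links (CSS, images) found in the theme
absolute_prefix = "/static"

rules = "rules.xml"
theme = "theme.html"

# Compile the rules and theme into an XSLT document
compiled_theme = compile_theme(rules, theme, absolute_prefix=absolute_prefix)

# Apply the compiled theme to the scraped content
# (some_content is a placeholder for the page to be themed)
transform = etree.XSLT(compiled_theme)
content = etree.parse(some_content)
transformed = transform(content)

output = etree.tostring(transformed)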
github/aaguirre
diazo.org
plone.org
Thank you!