GRDDL in a nutshell v1
DESCRIPTION
Takahashi style tutorial on GRDDL (version 1)
TRANSCRIPT
Fabien Gandon, INRIA
GRDDL in a nutshell
I want my data back from my web pages.
many data in my web page
I want your data back from your web pages.
many data in many web pages
open your data to anyone who might use it
W3C ©
deep web
in particular…
<ccxml version="1.0" xmlns="http://www.w3.org/2002/09/ccxml">
  <eventprocessor>
    <transition event="connection.alerting" name="evt">
      <log expr="'The number called is' + evt.connection.remote + '.'"/>
      <if cond="evt.connection.remote == 'tel:+18315551234'">
        <log expr="'Go away! we do not want to answer the phone.'"/>
        <reject/>
        <else/>
        <log expr="'We like you! We are going to answer the call.'"/>
        <accept/>
      </if>
    </transition>
    <transition event="connection.connected">
      <log expr="'Call was answered,Time to disconnect it.'"/>
      <disconnect/>
    </transition>
    <transition event="connection.disconnected">
      <log expr="'Call has been disconnected. Ending CCXML Session.'"/>
      <exit/>
    </transition>
  </eventprocessor>
</ccxml>
<mroot>
  <mrow>
    <mn>1</mn>
    <mo>-</mo>
    <mfrac>
      <mi>x</mi>
      <mn>2</mn>
    </mfrac>
  </mrow>
  <mn>3</mn>
</mroot>
<users>
  <person login="fgandon" uid="19536">
    <home>/net/user/fg</home>
    <pref>/sys/19536.inf</pref>
    <access_level>8</access_level>
  </person>
  <person login="fgandon" uid="19536">
    <home>/net/user/fg</home>
    <pref>/sys/19536.inf</pref>
    <access_level>8</access_level>
  </person>
</users>
<p:pipeline name="fig2" xmlns:p="http://example.org/PipelineNamespace">
  <p:input port="doc" sequence="no"/>
  <p:output port="out" step="xform" source="result"/>
  <p:choose name="vcheck" step="fig2" source="doc">
    <p:when test="/*[@version &lt; 2.0]">
      <p:output name="valid" step="val1" source="result"/>
      <p:step type="p:validate" name="val1">
        <p:input port="document" step="fig2" source="doc"/>
        <p:input port="schema" href="v1schema.xsd"/>
      </p:step>
    </p:when>
    <p:otherwise>
      <p:output name="valid" step="val2" source="result"/>
      <p:step type="p:validate" name="val2">
        <p:input port="document" step="fig2" source="doc"/>
        <p:input port="schema" href="v2schema.xsd"/>
      </p:step>
    </p:otherwise>
  </p:choose>
  <p:step type="p:xslt" name="xform">
    <p:input port="document" step="vcheck" source="valid"/>
    <p:input port="stylesheet" href="stylesheet.xsl"/>
  </p:step>
</p:pipeline>
many dialects of XML are in use
<HTML>
  <HEAD>
    <TITLE>title</TITLE>
    <LINK REL=STYLESHEET TYPE="text/css" HREF="http://style.com/cool" TITLE="Cool">
  </HEAD>
  <BODY>
    <H1>Headline is blue</H1>
    <P STYLE="color: green">While the paragraph is green.
  </BODY>
</HTML>
many ways to weave data with the web resources
several initiatives to make data embedding explicit
GRDDL = an easy way to open your data
GRDDL = an easy way to extract RDF from XML data
http://www.flickr.com/photos/cho45/1402634073/
imagine…
Stephan wishes to buy a guitar,
he visits a site offering reviews,
he uses GRDDL to aggregate reviews & profiles
GRDDL at work
GRDDL step 1: declare that a document is a source
GRDDL step 2: link it to one or more extractors
GRDDL step 3: let GRDDL agents extract RDF from the document
generic profile: declare that an XHTML document is a source
<head profile="http://www.w3.org/2003/g/data-view">
<title>The man who mistook his wife for a hat</title>
<link rel="transformation"
href="http://www.w3.org/2000/06/dc-extract/dc-extract.xsl" />
<meta name="DC.Subject" content="clinical tales" />
…
</head>
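A GRDDL agent first has to notice the profile declaration. The sketch below is one way to test for it in Python, using only the standard library. The function name `declares_grddl_profile` and the `SAMPLE` document are illustrative assumptions, not part of the talk, and the check assumes well-formed XHTML in the XHTML namespace.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
GRDDL_PROFILE = "http://www.w3.org/2003/g/data-view"


def declares_grddl_profile(xhtml_text: str) -> bool:
    """Return True if <head profile="..."> lists the GRDDL data-view profile.

    Hypothetical helper: assumes the input parses as well-formed XML.
    """
    root = ET.fromstring(xhtml_text)
    head = root.find(f"{{{XHTML_NS}}}head")
    if head is None:
        return False
    # The profile attribute holds a whitespace-separated list of URIs.
    return GRDDL_PROFILE in head.get("profile", "").split()


# Illustrative document modelled on the slide (body added to make it complete).
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://www.w3.org/2003/g/data-view">
    <title>The man who mistook his wife for a hat</title>
  </head>
  <body/>
</html>"""

print(declares_grddl_profile(SAMPLE))
```

Splitting the attribute on whitespace matters because a page may declare several profiles at once.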
link an XHTML document to a transformation
<head profile="http://www.w3.org/2003/g/data-view">
<title>The man who mistook his wife for a hat</title>
<link rel="transformation"
href="http://www.w3.org/2000/06/dc-extract/dc-extract.xsl" />
<meta name="DC.Subject" content="clinical tales" />
…
</head>
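Once the profile is present, the links with rel="transformation" name the extractors. A minimal sketch of that lookup follows, assuming well-formed XHTML. The helper `find_transformations` and the base URI `http://example.org/review.html` are made up for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XHTML_NS = "http://www.w3.org/1999/xhtml"
GRDDL_PROFILE = "http://www.w3.org/2003/g/data-view"


def find_transformations(xhtml_text: str, base_uri: str) -> list:
    """List the transformation URIs linked from a GRDDL-enabled XHTML page."""
    root = ET.fromstring(xhtml_text)
    head = root.find(f"{{{XHTML_NS}}}head")
    # Without the GRDDL profile, rel="transformation" carries no GRDDL meaning.
    if head is None or GRDDL_PROFILE not in head.get("profile", "").split():
        return []
    link_tags = (f"{{{XHTML_NS}}}link", f"{{{XHTML_NS}}}a")
    found = []
    for el in root.iter():
        if el.tag not in link_tags:
            continue
        # rel is a whitespace-separated list of tokens.
        if "transformation" in el.get("rel", "").split() and el.get("href"):
            found.append(urljoin(base_uri, el.get("href")))
    return found


# Illustrative document modelled on the slide.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://www.w3.org/2003/g/data-view">
    <title>The man who mistook his wife for a hat</title>
    <link rel="transformation"
          href="http://www.w3.org/2000/06/dc-extract/dc-extract.xsl" />
    <meta name="DC.Subject" content="clinical tales" />
  </head>
  <body/>
</html>"""

print(find_transformations(PAGE, "http://example.org/review.html"))
```

Resolving each href against the page URI lets relative links to local stylesheets work as well.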
what is extracted by a standard GRDDL agent
<head profile="http://www.w3.org/2003/g/data-view">
<title>The man who mistook his wife for a hat</title>
<link rel="transformation"
href="http://www.w3.org/2000/06/dc-extract/dc-extract.xsl" />
<meta name="DC.Subject" content="clinical tales" />
…
</head>
# dc:title "The man who mistook his wife for a hat"
# dc:subject "clinical tales"
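To make the result concrete, here is a hand-written Python stand-in that produces the same two statements from the sample head. It is not dc-extract.xsl and not how a real agent works: a real agent would fetch and run the linked XSLT. The function `mimic_dc_extract` and its rule of lower-casing the part after "DC." are simplifying assumptions.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def mimic_dc_extract(xhtml_text: str) -> list:
    """Return (property, value) pairs about the document itself.

    Hypothetical stand-in for the XSLT: reads <title> and DC.* <meta> tags.
    """
    root = ET.fromstring(xhtml_text)
    head = root.find(f"{{{XHTML_NS}}}head")
    if head is None:
        return []
    pairs = []
    title = head.find(f"{{{XHTML_NS}}}title")
    if title is not None and title.text:
        pairs.append(("dc:title", title.text.strip()))
    for meta in head.findall(f"{{{XHTML_NS}}}meta"):
        name = meta.get("name", "")
        content = meta.get("content")
        if name.startswith("DC.") and content is not None:
            # Assumed mapping: DC.Subject becomes dc:subject, and so on.
            pairs.append(("dc:" + name[3:].lower(), content))
    return pairs


# Illustrative document modelled on the slide.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://www.w3.org/2003/g/data-view">
    <title>The man who mistook his wife for a hat</title>
    <meta name="DC.Subject" content="clinical tales" />
  </head>
  <body/>
</html>"""

for prop, value in mimic_dc_extract(PAGE):
    # "#" stands for the document, following the slide's notation.
    print(f'# {prop} "{value}"')
```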
custom profile: declare a source and a transformation at once
no transformation, just reference the custom profile
<head profile="http://purl.org/NET/erdf/profile">
<title>Fabien’s agenda</title>
(…)
</head>
profile document = GRDDL source giving the transformation
<html xmlns="http://www.w3.org/1999/xhtml">
<head profile="http://www.w3.org/2003/g/data-view">
<link rel="transformation" href="http://www.w3.org/2003/g/glean-profile" />
</head>
<body>
<p><a rel="profileTransformation"
href="http://purl.org/NET/erdf/extract-rdf">GRDDL transform</a>
</p>
</body>
</html>
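With a custom profile the agent takes a detour: it reads the profile document and looks there for the transformation. The sketch below shortcuts that step by scanning the profile document for rel="profileTransformation" links directly. A real agent would instead run the glean-profile transformation and read the resulting RDF. The helper name `profile_transformations` is an assumption, and the document is passed as a string so that no network access is needed.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XHTML_NS = "http://www.w3.org/1999/xhtml"


def profile_transformations(profile_doc_text: str, profile_uri: str) -> list:
    """List the transformations a profile document assigns to its users."""
    root = ET.fromstring(profile_doc_text)
    link_tags = (f"{{{XHTML_NS}}}a", f"{{{XHTML_NS}}}link")
    return [
        urljoin(profile_uri, el.get("href"))
        for el in root.iter()
        if el.tag in link_tags
        and "profileTransformation" in el.get("rel", "").split()
        and el.get("href")
    ]


# The profile document from the slide, as it might be served at the profile URI.
PROFILE_DOC = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://www.w3.org/2003/g/data-view">
    <link rel="transformation" href="http://www.w3.org/2003/g/glean-profile" />
  </head>
  <body>
    <p><a rel="profileTransformation"
          href="http://purl.org/NET/erdf/extract-rdf">GRDDL transform</a></p>
  </body>
</html>"""

print(profile_transformations(PROFILE_DOC, "http://purl.org/NET/erdf/profile"))
```

Any page that references the profile then inherits this transformation without linking to it itself.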
declare a source and a transformation for an XML document
generic profile: declare that an XML document is a source
<book xmlns="http://example.org/book/"
xmlns:grddl="http://www.w3.org/2003/g/data-view#"
grddl:transformation="http://example.org/book/getAuthor.xsl" >
<title>The man who mistook his wife for a hat</title>
…
</book>
link an XML document to a transformation
<book xmlns="http://example.org/book/"
xmlns:grddl="http://www.w3.org/2003/g/data-view#"
grddl:transformation="http://example.org/book/getAuthor.xsl" >
<title>The man who mistook his wife for a hat</title>
…
</book>
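For plain XML the agent only needs the root element. A minimal sketch of reading the grddl:transformation attribute follows. The helper `xml_transformations` and the base URI `http://example.org/book/hat.xml` are hypothetical, and the sample leaves out the content elided on the slide.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

GRDDL_NS = "http://www.w3.org/2003/g/data-view#"


def xml_transformations(xml_text: str, base_uri: str) -> list:
    """List the transformation URIs declared on the root element."""
    root = ET.fromstring(xml_text)
    # The attribute value is a whitespace-separated list of URI references.
    value = root.get(f"{{{GRDDL_NS}}}transformation", "")
    return [urljoin(base_uri, ref) for ref in value.split()]


# Illustrative document modelled on the slide.
BOOK = """<book xmlns="http://example.org/book/"
      xmlns:grddl="http://www.w3.org/2003/g/data-view#"
      grddl:transformation="http://example.org/book/getAuthor.xsl">
  <title>The man who mistook his wife for a hat</title>
</book>"""

print(xml_transformations(BOOK, "http://example.org/book/hat.xml"))
```

The lookup goes through the namespace URI, so it does not depend on the document using the prefix grddl.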
take away message
Don't bury your data in some HTML page or XML document
when you publish a document that contains data…
do reference GRDDL profiles and/or transformations