Tetherless World Constellation
Open Government Data
Jim Hendler
Tetherless World Professor of Computer and Cognitive Science
Assistant Dean of Information Technology and Web Science
Rensselaer Polytechnic Institute
http://www.cs.rpi.edu/~hendler
@jahendler (twitter)
Current state (academic)
• Lots of data is being opened
• But much of it is opaque and contains (sometimes) significant errors
• Smart mark-up (including annotation) is needed
• But also needed are information and visual presentation capabilities to really put people in the loop
• Technical approaches are helping but curation (by people and computers) is sorely needed
Linked Data + Semantics
• "Linked Data" approach finds its use cases in Web Applications (at Web scales)
  – A lot of data, a little semantics
  – Finding anything in the mess can be a win!
• Example
  – Declare simple inferable relationships and apply, at scale, to large, heterogeneous data collections
    • e.g. use InverseFunctional triangulation to find the entities that can be inferred to be the same
  – These are "heuristics"; not every answer must be right (qua Google)
  – But remember time = money!
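A minimal sketch of the InverseFunctional idea, assuming triples are plain (subject, property, value) tuples: if two subjects share a value for a property declared inverse-functional, they are inferred to be the same entity, and a union-find merges them transitively. The property names, the identifiers, and the function `same_entity_groups` are illustrative assumptions, not taken from any particular dataset or tool.

```python
from collections import defaultdict

# Assumption: these properties are treated as inverse-functional (illustrative).
INVERSE_FUNCTIONAL = {"foaf:mbox", "foaf:homepage"}


def same_entity_groups(triples, ifps=INVERSE_FUNCTIONAL):
    """Group subjects that share a value for any inverse-functional property."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups cheap
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_subject = {}  # (property, value) -> first subject seen with that pair
    for s, p, o in triples:
        if p not in ifps:
            continue  # ordinary properties say nothing about identity
        find(s)
        key = (p, o)
        if key in first_subject:
            union(first_subject[key], s)
        else:
            first_subject[key] = s

    groups = defaultdict(set)
    for s in parent:
        groups[find(s)].add(s)
    # Only groups with more than one member are inferred "same entity" sets.
    return [g for g in groups.values() if len(g) > 1]


# Hypothetical triples from three collections (made-up identifiers).
triples = [
    ("a:1", "foaf:mbox", "mailto:x@example.org"),
    ("b:7", "foaf:mbox", "mailto:x@example.org"),
    ("b:7", "foaf:homepage", "http://example.org/x"),
    ("c:3", "foaf:homepage", "http://example.org/x"),
    ("d:9", "foaf:name", "X"),
]
print(same_entity_groups(triples))
```

The pass is a single scan over the triples, which is what makes it practical at scale; being a heuristic, a wrongly shared value (e.g. a group mailbox) will merge entities that should stay distinct.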
Fits Web Architecture
[Diagram labels: RDF Triple Store; Dynamic Content Engine; HTTP; HTML; RDF; Web App (w SPARQL); RDF Triple Store; …]
• ~2006: Web app developers discover the Semantic Web
2008 examples include sites from "regular" Web players such as Dow Jones, Reuters and Yahoo!
What’s promising
• Linked open data (data-gov.tw.rpi.edu, data.gov.uk)
• Open (access) commons and data publishing (and citation)
• Markup languages and semantics and tools to enable transparency
• Web 2.0 to put people in the loop and use and contribute to annotations
• Lower barriers to internet visualization, e.g. Google graphics
Moving data.gov to linked data (UK)
• Built around linked data with top-down push from “Number 10”
Moving data.gov to linked data (US)
• Third parties (like RPI) translate the govt data into Sem Web forms and link to sources
• Plans for a semantic.data.gov in OGD implementation plans, but unfunded
Adding some Web magic
Web Analytics
Social Data Networks
External Links
Visualization can help identify data errors
Correlates fires, acres burned, and agency budgets
Visualization can help identify data errors
Were there really no fires in 1985?
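The kind of gap a chart makes obvious can also be flagged programmatically. A minimal sketch, assuming the data is a mapping from year to a reported count; the function name `suspicious_years` and the sample numbers are made up for illustration and are not the actual fire statistics.

```python
def suspicious_years(counts_by_year):
    """Return years that are missing or report zero within the covered range."""
    years = sorted(counts_by_year)
    flagged = []
    for year in range(years[0], years[-1] + 1):
        value = counts_by_year.get(year)
        if value is None or value == 0:
            flagged.append(year)  # a gap or a zero deserves a human look
    return flagged


# Made-up sample: 1985 is absent and 1987 reports zero.
sample = {1983: 120, 1984: 95, 1986: 130, 1987: 0, 1988: 101}
print(suspicious_years(sample))
```

A flagged year is only a prompt for curation: the zero may be a real value, a missing record, or a conversion error.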
And many other interesting issues
• Trust
  – Government data is controversial, and potentially biased
    • How do we confirm or dispute?
• Combination
  – When we combine data we need to keep the provenance of information (see trust)
    • How can we show and use?
• Scaling
  – Data-gov Wiki has already converted 5,448,693,510 triples
• Versioning and updating
• Archiving
• Searching
• …
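One way to keep provenance when combining data is to record, for every statement, which sources asserted it; disagreements between sources then become visible rather than silently overwritten. The sketch below assumes statements are (subject, property, value) tuples; the source names, identifiers, values, and both function names are hypothetical.

```python
from collections import defaultdict


def combine_with_provenance(datasets):
    """Merge statements from several sources, remembering who said what."""
    merged = defaultdict(set)  # statement -> set of source names
    for source, statements in datasets.items():
        for statement in statements:
            merged[statement].add(source)
    return merged


def disputed(merged):
    """Find (subject, property) pairs for which sources give different values."""
    values = defaultdict(dict)  # (subject, property) -> {value: sources}
    for (s, p, o), sources in merged.items():
        values[(s, p)][o] = sources
    return {key: v for key, v in values.items() if len(v) > 1}


# Made-up example: two hypothetical sources disagree about one year.
datasets = {
    "source_a": [("wildfires/1985", "fire_count", 0),
                 ("wildfires/1986", "fire_count", 130)],
    "source_b": [("wildfires/1985", "fire_count", 212),
                 ("wildfires/1986", "fire_count", 130)],
}
merged = combine_with_provenance(datasets)
for key, by_value in disputed(merged).items():
    print(key, by_value)
```

Statements confirmed by several sources carry more than one name in their source set, while disputed ones can be shown to the user with each value attributed to its origin.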
Summary
• The Open Govt data is a great playground
  – Government data released as RDF (UK)
  – Government data converted to RDF (US)
  – Government data that can be found in many forms and used or converted (WWW)
• Great showcase for the web nature of the Semantic Web
  – Mashups
• But many challenges remain
  – Scaling, Trust, Provenance, Archiving, Curation, …