EPA OEI Linked Data Process presentation - 2012
Publishing EPA Data as
Linked Data
A brief by Michael Pendleton
EPA Office of Environmental Information
“We’re moving from managing documents
to managing discrete pieces of open data
and content which can be tagged, shared,
secured, mashed up and presented in the
way that is most useful for the consumer
of that information.”
-- Report on Digital Government: Building a 21st Century Platform to
Better Serve the American People
What is driving us?
Goal: Make Open Data, Content, and Web APIs the New Default
Slide Credit: David G. Smith, Aug 16, 2011 presentation, U.S. Environmental Protection Agency
Linked Data: What’s It All About?
• Speak the Language of the Web
• Just as you surf web pages, linked data lets you surf data.
• SOAP was about making the web try to work like applications; REST was about making applications work like the web.
• Linked Data is about making your DATA work like the web.
RDF is a lingua franca for data exchange
Slide Credit: David G. Smith, U.S. Environmental Protection Agency
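As a rough illustration of RDF as an exchange format, the sketch below expresses two facts about a facility as (subject, predicate, object) triples and writes them out in N-Triples, one of the RDF syntaxes. The `example.org` URIs and the facility name are hypothetical placeholders, not real EPA identifiers; only the `rdf:type` and `rdfs:label` property URIs are the standard W3C ones.

```python
# Minimal sketch: express facts as (subject, predicate, object) triples
# and serialize them as N-Triples, one of the RDF family of syntaxes.
# All example.org URIs and the facility name are hypothetical placeholders.

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def escape_literal(text):
    """Escape backslashes, quotes and line breaks inside a literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def to_ntriples(triples):
    """Serialize triples; each object is ('uri', value) or ('literal', value)."""
    lines = []
    for subject, predicate, (kind, value) in triples:
        if kind == "uri":
            obj = "<%s>" % value
        else:
            obj = '"%s"' % escape_literal(value)
        lines.append("<%s> <%s> %s ." % (subject, predicate, obj))
    return "\n".join(lines)


facility = "http://example.org/facility/123"  # hypothetical identifier
triples = [
    (facility, RDF_TYPE, ("uri", "http://example.org/ns#Facility")),
    (facility, RDFS_LABEL, ("literal", 'Example "Riverside" Plant')),
]

print(to_ntriples(triples))
```

Because every statement has the same three-part shape, any consumer that understands triples can read the output without knowing the publishing system's internal schema.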
Linked Data Basics
•Tim Berners-Lee: 5-Star model for publishing data
• http://www.w3.org/DesignIssues/LinkedData.html
•Linked Data is about publishing and consuming data using international data standards
•Based on a 20-year-old idea (the Web)
•A system of linked information systems
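The 5-Star model referenced above is cumulative: each star assumes the previous ones. The sketch below is one way to encode that, using the five criteria from the Berners-Lee design note (open licence on the Web, machine-readable, non-proprietary format, URIs and RDF standards, links to other data). The `Dataset` class and its field names are illustrative assumptions, not part of any EPA or W3C tool.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of a published dataset."""
    open_license: bool            # 1 star: on the Web under an open licence
    machine_readable: bool        # 2 stars: structured, machine-readable data
    non_proprietary_format: bool  # 3 stars: open format such as CSV
    uses_uris_and_rdf: bool       # 4 stars: URIs identify things, RDF standards
    links_to_other_data: bool     # 5 stars: links out to other people's data


def star_rating(ds):
    """Count consecutive criteria met, stopping at the first one missed."""
    criteria = [
        ds.open_license,
        ds.machine_readable,
        ds.non_proprietary_format,
        ds.uses_uris_and_rdf,
        ds.links_to_other_data,
    ]
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars


spreadsheet = Dataset(True, True, False, False, False)   # e.g. a proprietary spreadsheet
csv_file = Dataset(True, True, True, False, False)       # e.g. a CSV download
linked_rdf = Dataset(True, True, True, True, True)       # e.g. linked RDF

print(star_rating(spreadsheet), star_rating(csv_file), star_rating(linked_rdf))
```

Under this reading, a CSV file tops out at three stars, which is why moving from CSV to Linked Data is the step that earns the fourth and fifth.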
Global requirements
•Comprehensively link legislation & regulations for more effective government
•Explain context, source, version & publication date with the data itself
•We need global standards for metadata
The mission of the Government Linked
Data (GLD) Working Group is to provide
standards and other information which
help governments around the world
publish their data as effective and usable
Linked Data using Semantic Web
technologies.
Best Practices
Vocabulary Guidance
Community Building
US EPA publishes lots of CSV files ...
And now, Linked Open Data ...
• A proof-of-concept launched in 2011 with 5 Star Linked Data
• Publication of 1.3M facilities (FRS) and the substances (SRS) regulated by the EPA
• TRI program links to 25 years of data on major polluters
• Additional pilots in 2012 incorporating EPA and anonymized electronic medical records (EMR) data from Sentara Healthcare
• 5 Star Linked Open Data to be hosted & accessible on an EPA production Web site in summer 2012
• Empower users to create their own views of data to satisfy different applications
• Build a community around the data in which users help each other to curate and connect as needed
• Skip the supermodel - Leave data in the multiple “best of breed” systems; wrap and expose on the Web of Data
Increase re-use by publishing Linked Data
There is a Process
Diagram steps: Identify, Model, Name, Describe, Convert, Publish, Maintain
• Identify a dataset others are likely to want to re-use
•Modeling
•Onsite modeling session (half day)
• Linked Data modeling supported by experts
• Validate the model with data owners/stewards
• Publish data on the Web (opendata.epa.gov) per Best Practices
• Produce automated scripts to maintain current data
• Announce Linked Open Data sets *
• Review usage reports to support relevance & user feedback
7 steps to publishing Linked Data
* Pending EPA Systems Security Plan approval
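The Name and Convert steps can be sketched in a few lines: mint a URI for each record, then turn each remaining CSV column into a triple. This is a simplified illustration using only the Python standard library, not the EPA's actual conversion pipeline. The base URI, the vocabulary namespace, the column names and the sample rows are all hypothetical.

```python
import csv
import io
from urllib.parse import quote

BASE = "http://example.org/id/facility/"  # hypothetical URI base for records
VOCAB = "http://example.org/ns#"          # hypothetical vocabulary namespace

# Hypothetical sample data standing in for a real CSV export.
SAMPLE_CSV = """facility_id,name,state
F001,Example Plant One,VA
F002,Example Plant Two,MD
"""


def row_to_ntriples(row):
    """Name the record with a URI, then emit one triple per non-empty column."""
    subject = "<%s%s>" % (BASE, quote(row["facility_id"], safe=""))
    lines = []
    for column, value in row.items():
        if column == "facility_id" or not value:
            continue
        literal = (value.replace("\\", "\\\\")
                        .replace('"', '\\"')
                        .replace("\n", "\\n"))
        lines.append('%s <%s%s> "%s" .' % (subject, VOCAB, column, literal))
    return lines


def convert(csv_text):
    """Convert CSV text into a list of N-Triples lines."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = []
    for row in reader:
        out.extend(row_to_ntriples(row))
    return out


for line in convert(SAMPLE_CSV):
    print(line)
```

A script like this can be re-run whenever the source system produces a fresh export, which is one simple way to approach the Maintain step.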
Open Data Platforms
• We’re using Callimachus, a Web platform for data-driven applications based on Linked Data principles.
• It is hosted on Amazon EC2 and we have 24x7x365 data & application support.
• There are other data platforms; we selected this one because it is fully W3C standards compliant, with no vendor “lock in”
• It’s Open Source (Apache 2.0)
•Linked Data promotes goals of transparency & economic development during times of fiscal austerity
•Publish in reusable format (RDF family of standards)
•Use OPEN vs proprietary in data formats
•Define a URI Policy and Strategy
•Use the best practices and vocabularies that exist -- don’t reinvent the wheel
Recommendations
Publishing Linked Data will require continual nurturing but the rewards are worth it
Resources
• VisibleGovernment.ca Website http://visiblegovernment.ca
• Hack, Mash and Peer: Crowdsourcing Government Transparency, Jerry Brito, George Mason University, http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1023485
• Blog on UK Environment Agency Water Quality, see http://data.southampton.ac.uk/datasets.html
• Southampton Open Data Service, see http://data.southampton.ac.uk/datasets.html
• Blog post on Clean Energy data from Reegle, see http://blog.semantic-web.at/2012/04/13/reegle-info-linked-open-energy-data-cloud/
• Blog post on Publishing Linked Open Data in Tight Economic Times, 30-Jan-2012, http://3roundstones.com/2012/01/30/publishing-linked-open-data-makes-good-sense-in-tight-economic-times/
• Blog post on HealthData.gov from US Health & Human Services, 4-June-2012, http://www.healthdata.gov/blog/welcome-new-healthdatagov
• Blog post on US HHS Domain Challenge 1: Metadata, 2-June-2012, http://www.healthdata.gov/blog/domain-challenge-1-metadata
Coming soon ...
• Best Practices for Publishing Linked Data (Editor’s Draft 20-Apr-2012), see https://dvcs.w3.org/hg/gld/raw-file/default/bp/index.html
• Linked Data Cookbook, see http://www.w3.org/2011/gld/wiki/Linked_Data_Cookbook
• Linked Data Directory, see http://dir.w3.org
• Attend the 2012 International Open Government Data Conference co-sponsored by data.gov & The World Bank 10-12 July 2012, Washington DC, see http://www.data.gov/communities/conference
This work is Copyright © 2011-2012 3 Round Stones Inc.
It is licensed under the Creative Commons Attribution 3.0 Unported License.
Full details at: http://creativecommons.org/licenses/by/3.0/
You are free:
to Share — to copy, distribute and transmit the work
to Remix — to adapt the work
Under the following conditions:
Attribution. You must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work).
Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under the same or similar license to this one.
Credits
Jennifer Bell, VisibleGovernment.ca (CC-BY-SA)
http://www.slideshare.net/jenniferbell
1-5 Star Linked Data image
http://lab.linkeddata.deri.ie/2010/star-scheme-by-example/
LOD Cloud Diagrams
Richard Cyganiak, Anja Jentzsch (CC-BY-SA)
http://lod-cloud.net/
Book covers © their respective owners and used under Fair Use for educational purposes
© 2012 Bernadette Hyland, released under a CC-BY-SA license