20141030 linda workshop echallenges2014 - state of the art in open data infrastructure
State of the Art in Open Data Infrastructures for Public Sector Information
ENGAGE & LinDA Project
Dr. Spiros Mouzakitis, ENGAGE Project Manager
National Technical University of Athens
Decision Support Systems Laboratory
eChallenges 2014
Open Data Landscape
• National Statistical Offices / Eurostat
• Data Gov Initiatives
o European
o Country
o City
• Ministries / Public bodies / Organizations
• World Data Banks / Indicators
• Research data repositories
• Data Marketplaces
• Data aggregators / Catalogs
• Linked Open Data Cloud
EM-DAT
Open Data Types
• Catalog - File based
o Non-processable format (PDF, images, Flash)
o Processable format (Excel)
o Processable and open format (CSV)
o Standard / Tool-specific format (SPSS, SDMX, GML)
• Relational Database
o Data sources are already processed and streamlined, allowing advanced services on them (visualizations, aggregations, etc.)
• Linked Data (RDF, Turtle)
• APIs
• Static / Archived information / Snapshots
• Real-time information
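The file-based tiers above can be illustrated with a short sketch. The mapping from file extensions to tiers is an assumption made for illustration (the slide names only formats, not extensions), and the names `TIERS` and `classify` are hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from file extension to the tiers named on the slide.
# Which extensions belong to which format is an assumption of this sketch.
TIERS = {
    "non-processable": {".pdf", ".png", ".jpg", ".gif", ".swf"},
    "processable": {".xls", ".xlsx"},
    "processable-open": {".csv"},
    "standard-or-tool-specific": {".sav", ".gml"},
}


def classify(filename: str) -> str:
    """Return the tier for a file name, or 'unknown' if the extension is unmapped."""
    ext = PurePosixPath(filename).suffix.lower()
    for tier, extensions in TIERS.items():
        if ext in extensions:
            return tier
    return "unknown"


if __name__ == "__main__":
    for name in ["budget.pdf", "budget.xlsx", "budget.csv", "survey.sav"]:
        print(name, "->", classify(name))
```

A catalog could use a check of this kind to report how many of its resources are actually machine-processable, rather than only how many are published.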
Data provision structure
Data creation
• Harvest from others
• Backoffice / Service operations
• Survey
• Internal Research / Experiments
Standards
• W3C - DCAT
• EC - DCAT Application Profile
• Linked Open Data Vocabularies
• Schema.org
• SDMX
• DDI
• INSPIRE
• ……..
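As an illustration of what a DCAT-style catalog record might look like, the sketch below builds a minimal dataset description as JSON-LD using only the Python standard library. The property names come from the DCAT and Dublin Core Terms vocabularies; the function name, the example title, and the example.org URL are hypothetical placeholders, and the media type is written as a plain string for simplicity.

```python
import json

# Namespace IRIs for DCAT and Dublin Core Terms.
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}


def describe_dataset(title: str, description: str,
                     download_url: str, media_type: str) -> dict:
    """Build a minimal DCAT dataset description as a JSON-LD dictionary."""
    return {
        "@context": CONTEXT,
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dct:description": description,
        "dcat:distribution": [
            {
                "@type": "dcat:Distribution",
                "dcat:downloadURL": {"@id": download_url},
                # Simplification: media type given as a plain string.
                "dcat:mediaType": media_type,
            }
        ],
    }


if __name__ == "__main__":
    record = describe_dataset(
        title="Example population statistics",  # hypothetical dataset
        description="Illustrative record only.",
        download_url="https://example.org/data/population.csv",  # placeholder
        media_type="text/csv",
    )
    print(json.dumps(record, indent=2))
```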
Domain-specific examples
Data catalogue
Vocabularies
Variations of CKAN metadata, DC, UK eGovernment Metadata Schema,
Obstacles to utilizing data in everyday business
• Number of datasets vs Quality
• Re-usability
o A directory of offices / phones has no re-usability / research interest
• Aggregated vs Microdata
• Vital context around published data
o Methodology
o Time span
o Internal / External conditions
o Sample
• Linked Data has huge potential but needs
o Commercial focus
o Maintenance
o Trust
Open Data Platforms
• CKAN
• DKAN
• Junar
• Socrata
• NESSTAR
• DataPublic
• OpenColibri / ENGAGE
Under development
• Catalog system
• API
• Data Store API
• Python-based framework (Pylons)
• Faceted Search (Solr)
• Visualizations
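Assuming the features above describe CKAN, the catalog API, the Data Store API, and faceted search can be sketched as request URLs against CKAN's Action API. The portal address `ckan.example.org`, the function names, and the resource ID used in the example are placeholders, and the sketch only builds URLs; it sends no requests.

```python
import json
from urllib.parse import urlencode

# Placeholder portal address; replace with a real CKAN site to try it.
BASE_URL = "https://ckan.example.org"


def package_search_url(query: str, facet_fields: list[str], rows: int = 10) -> str:
    """Build a catalog search URL that also asks for facet counts."""
    params = {
        "q": query,
        "rows": rows,
        # facet.field is passed as a JSON-encoded list of field names.
        "facet.field": json.dumps(facet_fields),
    }
    return f"{BASE_URL}/api/3/action/package_search?{urlencode(params)}"


def datastore_search_url(resource_id: str, limit: int = 5) -> str:
    """Build a Data Store URL that reads rows from one tabular resource."""
    params = {"resource_id": resource_id, "limit": limit}
    return f"{BASE_URL}/api/3/action/datastore_search?{urlencode(params)}"


if __name__ == "__main__":
    print(package_search_url("health", ["tags", "organization"]))
    print(datastore_search_url("hypothetical-resource-id"))
```

The first URL searches dataset metadata and returns facet counts for filtering; the second reads the stored rows of a single resource.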
Features
Data.gov.uk, publicdata.eu and many other data.gov initiatives including data.gov (US)
• Focus on easing the publication workflow
• Visualizations: tables, charts, and maps. Support for real-time data
• Analytics
Features
• Find, browse, visualize and analyse data online
• Social sciences
• Surveys
• DDI
Features
• Publisher: freeware with no support; Server: commercial
Norwegian Social Science Data Services