Homer: a case study of federation among open data portals

Nives Alciato - CSI Piemonte nives.alciato@csi.it

Post on 10-Mar-2018


Page 1: Homer: a case study of federation among open data portals · PDF fileHomer: a case study of federation among open data portals Nives Alciato - CSI Piemonte nives.alciato@csi.it •

Homer: a case study of federation among open data portals

Nives Alciato - CSI Piemonte nives.alciato@csi.it

The initiative of Piedmont Region

• Regional law on Open Data
• Guidelines for reuse
• Adoption of a standard licence model
• Creation of a working group
• Diffusion to other Public Administrations
• Reuse at national level
• European Projects
• Metadata catalogues
• Data uploading platform
• A portal as an access point for data and information

Legal framework

Regional Law n. 24 dated 23/12/2011
• First regional law in Italy on Open Data

Basic principle:
• Data belong to people

Cornerstones of reusability of data:
• Diffusion without restriction and in open and standard digital formats
• Use of standard legal tools: Creative Commons licences
• Re-use and re-distribution of data is free of charge

Organizational framework

Regional level
An initiative with ANCI Piemonte (association of municipalities): dati.piemonte.it is the infrastructure for the whole regional territory (120 Municipalities and other bodies like ARPA Piemonte and Unioncamere)

National level
Re-use of the platform and joint project with Emilia-Romagna Region and Milano Municipality

European level
• HOMER project, to transfer methodological / technical standards and increase circulation and re-use of public data
• OPENDAI project, to improve a new architectural model to increase digital services and business opportunities

Technological framework: a permanent beta

[Diagram: operational databases of PAs feed DATA, which is reached through Portal and Search]

New platform: from Open Data to Data Services and a Federated Search Engine

• Harmonize policies and licenses for the re-use of data
• Federation of Open Data Portals
• Open data silos PA
• Cloud architecture
• Open data Services

HOMER is the acronym of
Harmonising Open data in the MEditerranean through better access and Reuse of public sector information

www.homerproject.eu

It is a project within the MED Programme financed by the EU Commission

Implementation starting date: 01/04/2012
Implementation end date: 31/03/2015

Who are Homer's Partners?
13 Partners as territorial government and 6 Partners as technological support

Country, then Partner: Mission

Spain
• AGAPA - Agencia de Gestión Agraria y Pesquera de Andalucía: Territorial Gov.
• SARGA - Sociedad Aragonesa de Gestión Agroambiental: Territorial Gov.
• FUNDITEC - Foundation for Development, Innovation and Technology: Technical Support

France
• Région Provence-Alpes-Côte d'Azur: Territorial Gov.
• Région Corse: Territorial Gov.
• AVITEM - Agency for sustainable Mediterranean cities and territories: Technical Support
• FING - Fondation Internet Nouvelle Génération: Technical Support

Italy
• Piedmont Region: Project Leader
• Sardinia, Emilia-Romagna and Veneto Regions: Territorial Gov.
• CSI Piemonte: Technical Support

Slovenia
• Geodetic Institute: Territorial Gov.

Montenegro
• Mediterranean University of Montenegro: Territorial Gov.

Greece
• GFOSS - The Greek Free Open Source Software Society: Technical Support

Crete
• Decentralized Administration of Crete: Territorial Gov.
• University of Crete: Technical Support

Cyprus
• Sewerage Board of Limassol - Amathus: Territorial Gov.

Malta
• Local Council Ass. of Malta Gozo: Territorial Gov.

HOMER's objectives
• a federation of Open Data portals among partners, sharing common datasets related to MED strategic domains (agriculture, culture, energy, environment, tourism),
• ensuring long-term sustainability and exploiting a huge number of harmonized and federated datasets, enhancing the e-participation and digital market opportunities of the MED citizens

CSI Piemonte's responsibilities in HOMER
• it is the developer of a Federation of Open Data Portals among partners, providing ICT and legal support, and
• it is the promoter of the reuse of the technological solutions underlying the portal, developed in the context of the project

What do we mean by federation of open data portals?

"Federation" means the virtual system composed of software able to collect and retrieve the metadata of published data in the 5 categories (agriculture, culture, energy, environment, tourism), exposed and searched by the Partners' Open Data Portals

Look at this symbol: it represents the metadata catalogue

Design, methodology, and approach

• Memorandum of Understanding
• Definition of a common metadata structure for federation
• Use of EuroVoc
• The cross-lingual search
• The federated search multi-language engine
• The indexing scenario
• The searching scenario

Legal framework - Memorandum of Understanding

Partners joined by signing a Memorandum of Understanding that defines the technological, organizational and legal boundaries as a common understanding for everybody, with reference to Directive 2013/37/EU

It states that all technological components of the solution for the Federation (Index, Semantic Search Engine, Translator) are provided and managed under the conditions and the coordination of CSI Piemonte, which releases them on the basis of an open source philosophy

Data framework – the metadata structure

Each Open Data Portal shares a set of common metadata fields: this structure builds the Federated Index

Fields: title, description, url, metadata source, package_id, topics, language, tags, geographic bounding box, refresh date, creation date, spatial scale resolution, license id, owner

Sources in the schema: Inspire, DCAT, CKAN, Dublin Core

By intersecting the Protocols and Directives in the schema, the minimum common set of fields was identified to define a metadata structure for federation, independent of the type of dataset (geographical or alphanumerical)
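
To make the common structure concrete, here is a minimal Python sketch of one record of the federated index. The attribute names are adapted from the field list on the slide; their types, the choice of which fields are mandatory (`package_id`, `title`, `url`) and the helper names are assumptions made for illustration, not part of the Homer specification.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional, Tuple


@dataclass
class FederatedRecord:
    """One metadata card in the federated index (names adapted from the slide)."""
    package_id: str
    title: str
    url: str
    description: str = ""
    metadata_source: str = ""
    topics: List[str] = field(default_factory=list)
    language: str = ""  # ISO 639-1 code, e.g. "it"
    tags: List[str] = field(default_factory=list)
    bounding_box: Optional[Tuple[float, float, float, float]] = None
    refresh_date: Optional[str] = None
    creation_date: Optional[str] = None
    spatial_scale_resolution: Optional[str] = None
    license_id: Optional[str] = None
    owner: Optional[str] = None


# Assumption: these three fields are treated as mandatory in this sketch.
REQUIRED = ("package_id", "title", "url")


def missing_fields(raw: dict) -> list:
    """Return the mandatory fields that are absent or empty in a raw card."""
    return [name for name in REQUIRED if not raw.get(name)]


def to_record(raw: dict) -> FederatedRecord:
    """Build a record from a portal's raw card, ignoring unknown keys."""
    missing = missing_fields(raw)
    if missing:
        raise ValueError(f"missing mandatory fields: {missing}")
    known = {f.name for f in fields(FederatedRecord)}
    return FederatedRecord(**{k: v for k, v in raw.items() if k in known})
```

Dropping unknown keys lets each portal keep its own extra metadata while the index stores only the shared minimum.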

Data framework – the use of EuroVoc (1)

Homer now speaks 7 languages (Spanish, French, Italian, Slovenian, Serbian-Montenegrin, Greek and English) with 4 different alphabets, so we must share a dictionary to communicate

In the metadata structure, the language field holds the ISO 639-1 code that identifies the language

Data framework – the use of EuroVoc (2)

EuroVoc is a multilingual, multidisciplinary thesaurus of the EU, conformant to W3C recommendations. In it, a specific concept of the 5 categories involved has the same classification and meaning across domains and languages

In the metadata structure:
• language: ISO 639-1 code to identify the language
• topics: Homer's categories = EuroVoc domains
• tags: each ODP inserts tags in the metadata cards in its own language, without the burden of translation

The same concept is identified in all languages:
WATER νερό VODA вода AGUA EAU ACQUA
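
The idea that one concept is recognised whatever the language of the tag can be sketched in Python as follows. The concept identifier `concept:water` and the table layout are hypothetical (a real deployment would load labels from EuroVoc), and the assignment of each word on the slide to a language code is an assumption.

```python
# Hypothetical concept table; the labels are the words shown on the slide.
LABELS_BY_CONCEPT = {
    "concept:water": {
        "en": "water", "el": "νερό", "sl": "voda", "sr": "вода",
        "es": "agua", "fr": "eau", "it": "acqua",
    },
}


def build_lookup(table):
    """Map every label, case-insensitively, to its concept identifier."""
    lookup = {}
    for concept, labels in table.items():
        for label in labels.values():
            lookup[label.casefold()] = concept
    return lookup


LOOKUP = build_lookup(LABELS_BY_CONCEPT)


def concept_for(tag):
    """Return the concept a tag belongs to, or None if it is unknown."""
    return LOOKUP.get(tag.strip().casefold())


def equivalents(tag):
    """Return the labels of the tag's concept in all languages."""
    concept = concept_for(tag)
    if concept is None:
        return {tag}  # unknown tags are kept as they are
    return set(LABELS_BY_CONCEPT[concept].values())
```

With such a table, a portal can tag a dataset "ACQUA" and another "Eau", and both resolve to the same concept without either portal translating anything.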

Data framework – The cross-lingual search

The semantic multi-language search engine needs a specific common structure to index and retrieve the metadata of all the metadata catalogues of the Homer Partners' Open Data Portals. The search engine is like a librarian who finds books only if the request form is filled out in a specific way

Technological framework: the federated search multi-language engine

The technological solution for indexing and searching among all the federated open data portals has 4 components:

1. Fed-Index Homer: the federated index file component containing the complete list of metadata
2. Fed-Translator: the component that translates every tag of the datasets via EuroVoc
3. Fed-Searcher: the centralized semantic search engine component
4. Fed-Loader API: the loader that calls the API or web services exposed by each Open Data Portal to create the federated index

Based on the open source project Apache Solr. Released as open source on SourceForge

Technological framework: the indexing scenario (1)

The indexing process requires that each federated portal exposes the metadata cards of its data through 2 types of url:

1. url1, which returns the list of the data ids: Package List
2. url2, which returns the attributes of a single dataset: Package Dataset

It is a stand-alone scheduled process, which could run nightly
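
The two-step pattern (first the list of ids, then one call per dataset) can be sketched in Python as below. This is not the Fed-Loader code: `harvest`, `list_ids` and `get_dataset` are hypothetical names, and the portal is simulated with an in-memory dictionary so that the example does not depend on any real endpoint.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def harvest(
    list_ids: Callable[[], Iterable[str]],
    get_dataset: Callable[[str], Dict],
) -> Tuple[List[Dict], List[str]]:
    """Collect the metadata card of every dataset a portal lists.

    list_ids plays the role of url1 (Package List) and get_dataset the
    role of url2 (Package Dataset). Ids that cannot be fetched are
    reported instead of aborting the whole run.
    """
    records, failed = [], []
    for dataset_id in list_ids():
        try:
            records.append(get_dataset(dataset_id))
        except Exception:
            failed.append(dataset_id)
    return records, failed


# A fake portal standing in for the two urls.
FAKE_PORTAL = {
    "1083": {"package_id": "1083", "title": "Example dataset"},
    "1084": {"package_id": "1084", "title": "Another dataset"},
}


def fake_list_ids():
    # "9999" is listed but missing, to exercise the failure path.
    return list(FAKE_PORTAL) + ["9999"]


def fake_get_dataset(dataset_id):
    return FAKE_PORTAL[dataset_id]


records, failed = harvest(fake_list_ids, fake_get_dataset)
```

Passing the two calls in as functions keeps the loop identical for every portal type; only the functions that talk to the portal change.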

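Technological framework: the indexing scenario (2)

[Diagram: a scheduled process collects the metadata of the Open Data Portals, tagged in each portal's own language (Eau, Voda, Agua, Water), into the Search Engine]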

Technological framework: the indexing scenario (3)

3 ways supported to expose the metadata:

CKAN-compliant API:
• Package List > url1, which returns a json file (file1) with the list of the data ids
• Package Dataset > url2, which returns a json file (file2) with the attributes of a single dataset

Web services compliant with dati.piemonte.it:
• Package List > url1, which returns an xml file (file1) with the list of the data ids
  http://www.dati.piemonte.it/index.php?option=com_rd&view=pceli_list2&format=xml&layout=xml
• Package Dataset > url2, which returns an xml file (file2) with the attributes of a single dataset
  http://www.dati.piemonte.it/index.php?option=com_rd&view=pceli_item2&format=xml&layout=xml&itemid=1083

Catalogue Service for the Web (CSW) compliant API:
• Package List > url1, which returns a csw file (file1) with the list of the data ids
• Package Dataset > url2, which returns a csw file (file2) with the attributes of a single dataset

Technological framework: the indexing scenario (4)

3 ways supported to expose the metadata:
• CKAN-compliant API
• Web services compliant with dati.piemonte.it
• CSW-compliant API

Technological framework: the searching scenario

Actors: User, Open Data Portal (ODP), Search Engine (SE)

1. The User searches in the language of the portal
2. The ODP calls the SE, adding the language
3. The ODP uses EuroVoc and searches the index in all languages
4. The SE returns a list of results
5. The User chooses a dataset and goes to the corresponding portal
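
The search flow can be approximated with a toy in-memory index, sketched in Python below. The synonym group reuses the water example from the EuroVoc slide; the record layout, the example URLs and the choice to list results in the user's language first are assumptions for illustration, not documented behaviour of the Homer engine.

```python
# One group of equivalent labels, standing in for a EuroVoc concept.
SYNONYM_GROUPS = [
    {"water", "νερό", "voda", "вода", "agua", "eau", "acqua"},
]

# Toy federated index; the portals and URLs are placeholders.
INDEX = [
    {"title": "Qualità dell'acqua", "tags": ["Acqua"], "language": "it",
     "url": "http://portal-a.example/1"},
    {"title": "Qualité de l'eau", "tags": ["eau"], "language": "fr",
     "url": "http://portal-b.example/7"},
    {"title": "Energy use", "tags": ["energy"], "language": "en",
     "url": "http://portal-c.example/3"},
]


def expand(term):
    """Return the term together with its equivalents in other languages."""
    key = term.strip().casefold()
    for group in SYNONYM_GROUPS:
        if key in group:
            return group
    return {key}


def search(query, lang, index=INDEX):
    """Search all languages; results in the user's language come first."""
    terms = expand(query)
    hits = [doc for doc in index
            if terms & {tag.casefold() for tag in doc["tags"]}]
    # sorted() is stable, so the original order is kept within each group.
    return sorted(hits, key=lambda doc: doc["language"] != lang)
```

Each hit carries the `url` of the portal that owns the dataset, which corresponds to step 5: the user leaves the result list and lands on the originating portal.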

Results and ongoing activities

The Federation in terms of:
• shared knowledge, experiences and relationships among the Partners
• opening hundreds of public datasets, enhancing digital heritage and transparency and promoting open data culture across the Mediterranean
• looking for new stakeholders, as it is possible to configure new categories and new languages

Thank you!

Nives Alciato – CSI Piemonte nives.alciato@csi.it

www.dati.piemonte.it
www.homerproject.eu

Step 4 - technical requirements
API Web Services like 'www.dati.piemonte.it'

An open data portal like dati.piemonte.it exposes 2 urls:

1. url1, which returns an xml file (file1) with the list of the data ids: Package List
http://www.dati.piemonte.it/index.php?option=com_rd&view=pceli_list2&format=xml&layout=xml

<urlOggetti totale="434" baseUrl="http://www.dati.piemonte.it/index.php?option=com_rd&view=pceli_item2&format=xml&layout=xml&itemid=" data="">
  <urlOggetto>1083</urlOggetto>
  ...

2. url2, which returns an xml file (file2) with the attributes of a single dataset: Package Dataset
http://www.dati.piemonte.it/index.php?option=com_rd&view=pceli_item2&format=xml&layout=xml&itemid=1083

<package>
  <package_id>1083</package_id>
  <url>http://www.dati.piem..</url>
  <title>DWUMA DW Utenti ..</title>
  <description> Base dati decisionale ... </description>
  ...
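
A loader could turn these two XML answers into dataset urls and field dictionaries roughly as in the following Python sketch. The sample documents are assumptions modelled on the fragments above: the closing tags, the `&amp;` escaping needed for well-formed XML, the second id `1084`, the shortened title and the function names are not taken from the slide.

```python
import xml.etree.ElementTree as ET

# Assumed well-formed version of the Package List answer (url1).
SAMPLE_LIST = """<urlOggetti totale="2" baseUrl="http://www.dati.piemonte.it/index.php?option=com_rd&amp;view=pceli_item2&amp;format=xml&amp;layout=xml&amp;itemid=" data="">
  <urlOggetto>1083</urlOggetto>
  <urlOggetto>1084</urlOggetto>
</urlOggetti>"""

# Assumed well-formed version of the Package Dataset answer (url2).
SAMPLE_ITEM = """<package>
  <package_id>1083</package_id>
  <title>DWUMA</title>
</package>"""


def dataset_urls(xml_text):
    """Build one Package Dataset url per id listed in the Package List."""
    root = ET.fromstring(xml_text)
    base = root.get("baseUrl", "")
    return [base + node.text.strip()
            for node in root.iter("urlOggetto") if node.text]


def package_fields(xml_text):
    """Return the child elements of <package> as a flat dictionary."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}
```

The parser unescapes `&amp;` when reading the attribute, so the urls it returns use plain `&` and can be requested directly.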

Step 4 - technical requirements
API set interface like CKAN

A CKAN-compliant API exposes 2 urls:

1. url1, which returns a json file (file1) with the list of the data ids: Package List
http://data.gov.uk/api/rest/package

[ "human-resources-datasets", "veterinary-residues-data", ... ]

2. url2, which returns a json file (file2) with the attributes of a single dataset: Package Dataset
http://data.gov.uk/api/rest/package/human-resources-datasets

{ license_title: "", maintainer: null, maintainer_email: null, id: "00029d8d-1be7-4435-9ef8", metadata_created: "2013-08-30", relationships: [ ], ...
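
For the CKAN case, the same two steps reduce to plain JSON handling, as in this Python sketch. It works on strings rather than live requests; the function names and the selection of fields are assumptions, and the sample payloads use quoted keys as valid JSON requires (the fragment above shows them unquoted).

```python
import json
from urllib.parse import quote

# Base url taken from the example above.
BASE = "http://data.gov.uk/api/rest/package"


def package_urls(list_json, base=BASE):
    """Turn the Package List answer into one url per dataset."""
    names = json.loads(list_json)
    return [f"{base}/{quote(name, safe='')}" for name in names]


def pick_fields(dataset_json,
                wanted=("id", "license_title", "metadata_created")):
    """Keep only the attributes of interest from a Package Dataset answer."""
    data = json.loads(dataset_json)
    return {key: data.get(key) for key in wanted}


urls = package_urls('["human-resources-datasets", "veterinary-residues-data"]')
card = pick_fields(
    '{"id": "00029d8d-1be7-4435-9ef8", "maintainer": null, '
    '"metadata_created": "2013-08-30"}'
)
```

Missing attributes come back as `None` rather than raising, which suits catalogues where portals fill in metadata unevenly.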

Step 4 - technical requirements
Catalogue Services for the Web (CSW)

A Geoportal exposes metadata with 2 methods of the CSW protocol:

1. url1, which returns a csw file (file1) with the list of the data ids: Package List
http://webgis.arpa.piemonte.it/geoportalserver_arpa/csw?REQUEST=GetRecords

<csw:GetRecordsResponse>
  <csw:SearchStatus timestamp="201
  <csw:SearchResults ...
    <gmd:MD_Metadata>
      <gmd:fileIdentifier>
        <gco:CharacterString> ARLPA_TO_16.08.01-D_2011-11-03-9:58

2. url2, which returns a csw file (file2) with the attributes of a single dataset: Package Dataset
http://webgis.arpa.piemonte.it/geoportalserver_arpa/csw?request=GetRecordById&service=CSW&version=2.0.2&id=ARLPA_TO_16.08.01-D_2011-11-03-9:58

<csw:GetRecordByIdResponse>
  <gmd:MD_Metadata xsi:schemaLocat
    <gmd:fileIdentifier>
      <gco:CharacterString> ARLPA_TO_16.08.01-D_2011-11-03-9:58 </gco:CharacterString>
    </gmd:fileIdentifier>
    <gmd:language> ...
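
For CSW, the ids sit inside namespaced ISO metadata elements, so the loader has to resolve the `csw`, `gmd` and `gco` prefixes. The Python sketch below shows one way to do that; the sample response is an assumed, well-formed reduction of the truncated fragment above, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace URIs of CSW 2.0.2 and the ISO 19139 metadata encoding.
NS = {
    "csw": "http://www.opengis.net/cat/csw/2.0.2",
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Assumed minimal GetRecords answer, reduced to the identifier.
SAMPLE = """<csw:GetRecordsResponse
    xmlns:csw="http://www.opengis.net/cat/csw/2.0.2"
    xmlns:gmd="http://www.isotc211.org/2005/gmd"
    xmlns:gco="http://www.isotc211.org/2005/gco">
  <csw:SearchResults>
    <gmd:MD_Metadata>
      <gmd:fileIdentifier>
        <gco:CharacterString> ARLPA_TO_16.08.01-D_2011-11-03-9:58 </gco:CharacterString>
      </gmd:fileIdentifier>
    </gmd:MD_Metadata>
  </csw:SearchResults>
</csw:GetRecordsResponse>"""


def file_identifiers(xml_text):
    """Extract every gmd:fileIdentifier value from a CSW response."""
    root = ET.fromstring(xml_text)
    path = ".//gmd:MD_Metadata/gmd:fileIdentifier/gco:CharacterString"
    return [node.text.strip() for node in root.findall(path, NS) if node.text]


def record_url(endpoint, record_id):
    """Build the GetRecordById url (url2) for one identifier."""
    params = {"request": "GetRecordById", "service": "CSW",
              "version": "2.0.2", "id": record_id}
    return endpoint + "?" + urlencode(params)
```

Note that `urlencode` percent-encodes the colon in the identifier, unlike the url shown on the slide, where it appears unescaped.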