Open Educational Data - Datasets and APIs (Athens Green Hackathon 2012)

Linked Data for Education – Datasets & APIs. Stefan Dietze, Green Hackathon, 14 December, Athens, Greece.

Upload: stefan-dietze

Post on 27-Jan-2015


DESCRIPTION

A summary of datasets and APIs in the field of linked data in education. Presented at the Athens Green Hackathon, 14 December 2012, Athens, Greece.

TRANSCRIPT

Page 1: Open Educational Data - Datasets and APIs (Athens Green Hackathon 2012)

Linked Data for Education – Datasets & APIs

Stefan Dietze

- Green Hackathon, 14 December, Athens, Greece -

Page 2: Open Educational Data - Datasets and APIs (Athens Green Hackathon 2012)

TEL data vs Linked Open Data

Linked Data for Education

Relevant knowledge and data

Publications: ACM, PubMed, DBLP (L3S), OpenLibrary

(Cross-)domain knowledge & resources: BioPortal, historic artefacts in Europeana, Geonames, DBpedia, Freebase, …

Media resource metadata: BBC, Flickr, …

Explicit educational data

University Linked Data: e.g. The Open University, UK (http://data.open.ac.uk), University of Southampton, …

OER Linked Data: mEducator Linked ER (http://ckan.net/package/meducator), Open Learn LD

Schemas: LRMI (http://www.lrmi.net/), mEducator OER schema (http://purl.org/meducator/ns)

Linked Open Data

Vision: well connected graph of open Web data

W3C standards (RDF, SPARQL) to expose data, URIs to interlink datasets

=> vast cloud of interconnected datasets

Crossing all sorts of domains

32 billion triples (September 2011)

http://linkededucation.org; http://linkeduniversities.org

Page 3: Open Educational Data - Datasets and APIs (Athens Green Hackathon 2012)

Data/services integration & retrieval/search APIs

Early work: educational service integration

SmartLink: Linked Data registry of (educational) datasets / stores and their APIs

Discovery and lifting of educational data out of heterogeneous repositories

Transformation of heterogeneous data formats (XML, JSON, ...) and schemas (e.g. IEEE LOM, Dublin Core) into RDF (prerequisite for LOD compliance)
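The lifting step described above can be sketched as follows. This is a minimal illustration, not the SmartLink implementation: it assumes a flat, Dublin Core-style JSON record and writes N-Triples by hand. The record, the subject URI and the function names are hypothetical.

```python
import json

# Dublin Core element namespace (a widely used vocabulary).
DC = "http://purl.org/dc/elements/1.1/"


def escape_literal(value):
    """Escape a string for use inside an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def lift_record(subject_uri, record):
    """Turn flat key/value metadata into N-Triples lines using dc: predicates."""
    triples = []
    for key, value in sorted(record.items()):
        # Multi-valued fields produce one triple per value.
        values = value if isinstance(value, list) else [value]
        for item in values:
            triples.append(
                f'<{subject_uri}> <{DC}{key}> "{escape_literal(str(item))}" .'
            )
    return triples


# Hypothetical record as it might come out of a JSON-based repository API.
raw = '{"title": "Laws of gravity", "creator": ["A. Author", "B. Author"], "language": "en"}'
# Hypothetical subject URI, used only for this illustration.
triples = lift_record("http://example.org/resource/1", json.loads(raw))
print("\n".join(triples))
```

Real repositories (e.g. IEEE LOM in XML) have nested structures, so an actual mapping would need one rule per source schema rather than this one-to-one key mapping.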

http://ckan.net/package/smartlink & http://purl.org/smartlink

Stefan Dietze 3Green Hackathon 2012

Page 4: Open Educational Data - Datasets and APIs (Athens Green Hackathon 2012)

Data/services integration & retrieval/search APIs

Early work: educational data integration

Linked Educational Resources

http://linkededucation.org/meducator


Page 5: Open Educational Data - Datasets and APIs (Athens Green Hackathon 2012)

Dereferenceable resource URIs
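Dereferencing a resource URI usually means an HTTP GET with content negotiation. The sketch below only prepares such a request; it uses the example resource URI quoted elsewhere in these slides. Which media types the server actually supports is an assumption here, and the helper names are hypothetical.

```python
import urllib.request

# Example resource URI quoted elsewhere in these slides.
RESOURCE_URI = (
    "http://data.linkededucation.org/resource/led/"
    "92C8A5E7-7B4D-12A6-F4F2-76A6A8DC7C0A"
)


def build_rdf_request(uri):
    """Prepare a GET request that asks for RDF rather than HTML."""
    # Assumption: the server honours these media types; this is not confirmed.
    accept = "application/rdf+xml, text/turtle;q=0.9, */*;q=0.1"
    return urllib.request.Request(uri, headers={"Accept": accept})


def dereference(uri, timeout=10.0):
    """Fetch the description of a resource (needs network access)."""
    with urllib.request.urlopen(build_rdf_request(uri), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


request = build_rdf_request(RESOURCE_URI)
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```

Calling dereference(RESOURCE_URI) would perform the actual fetch; it is left out of the example run because the service may no longer be online.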

Stefan Dietze 5tele-TASK Symposium 2012

Page 6: Open Educational Data - Datasets and APIs (Athens Green Hackathon 2012)

Data so far: SmartLink/mEducator in LOD cloud

SmartLink: http://ckan.net/package/smartlink

More than 2000 triples so far

More than 300 links to iServe

APIs (=> see wiki) used by several applications

mEducator: http://ckan.net/package/meducator

More than 35000 triples so far

More than 1000 links to DBpedia & BioPortal ontologies

APIs (=> see wiki) used by 4 applications


Page 7: Open Educational Data - Datasets and APIs (Athens Green Hackathon 2012)

TEL data vs Linked Open Data

Challenges

Still limited take-up (applications usually focused on small set of datasets)

Key issues

Scalability and robustness (distributed data access & retrieval, Big Data integration)

Data quality (heterogeneous providers, lack of trust)

Legal and licensing issues

Lack of benchmarks and evaluation


Page 8: Open Educational Data - Datasets and APIs (Athens Green Hackathon 2012)

“LinkedUp” Support Action

Linking Web Data for Education Project – Open Challenge in Web-scale Data Integration

EC Support Action, kickstarted in November 2012 => http://linkedup-project.eu

Goals

Push forward adoption of Web data/Linked Data in educational context

Drive technological advancement of Web data integration technologies

Approach

Open data competition (initial calls expected early 2013) incl. technical, legal and financial support

Open data curation!

Partners

+ network of associated institutions (e.g. BBC, Commonwealth of Learning, Talis UK, …)


Page 9: Open Educational Data - Datasets and APIs (Athens Green Hackathon 2012)

LinkedUp data curation

Linked Education Cloud & Linked Education Graph

Educational data gathering (community approach): Linked Education cloud

“LinkedUp/Linked Education cloud” as subset of LOD cloud

CKAN – “The DataHub” (ckan.net, most important data registry) for data collection (analogous to the Linked Open Data approach)

Dedicated group (“linked-education”) for cataloging educational datasets

Educational data integration & infrastructure: Linked Education graph

Linked Education cloud => Linked Education graph & dataset

Integration of (selected) datasets into coherent (RDF) dataset

Infrastructure, unified (SPARQL) endpoint & APIs => http://linkededucation.org

Educational Data


Page 10: Open Educational Data - Datasets and APIs (Athens Green Hackathon 2012)

Linked Education graph & dataset(s)


Page 11: Open Educational Data - Datasets and APIs (Athens Green Hackathon 2012)

Linked Education graph & dataset(s)

[Figure: schema alignment across datasets, e.g. <dc:title> vs <akt:has-title>?; labels: OER, Publication, VideoLecture, LinkedUniversities educational videos]

Schema: http://data.linkededucation.org/ns/linked-education.rdf

Data: http://data.linkededucation.org/.... (details at the end)

6 million distinct (but linked) resources

97 million RDF triples

21.6 GB of data
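The schema alignment question raised on this slide (<dc:title> vs <akt:has-title>) can be illustrated with a small predicate mapping. This is only a sketch: the prefixed names dc:title, akt:has-title and led:title appear in the slides, but treating the first two as equivalent to led:title is an assumption, and the ex: triples are made up.

```python
# Mapping from source predicates to one unified predicate.
# Assumption for this sketch: both title predicates map to led:title.
ALIGNMENT = {
    "dc:title": "led:title",
    "akt:has-title": "led:title",
}


def align(triples):
    """Rewrite predicates of (subject, predicate, object) triples via ALIGNMENT."""
    # Predicates without a mapping are passed through unchanged.
    return [(s, ALIGNMENT.get(p, p), o) for s, p, o in triples]


# Hypothetical input triples from two differently modelled datasets.
source = [
    ("ex:oer-1", "dc:title", "Linear equations"),
    ("ex:paper-7", "akt:has-title", "Solving linear equations"),
    ("ex:paper-7", "ex:year", "2011"),
]
aligned = align(source)
for triple in aligned:
    print(triple)
```

Once both datasets expose the same predicate, a single query over that predicate can reach resources from either source.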

Page 12: Open Educational Data - Datasets and APIs (Athens Green Hackathon 2012)


Linked Education graph & dataset(s)

Page 13: Open Educational Data - Datasets and APIs (Athens Green Hackathon 2012)

Entity enrichment => disambiguation & correlation

Via DBpedia/Freebase

<led:Resource-OpenLearn-2139393292>…<led:title>…laws of gravity…</led:title>…</led:Resource-OpenLearn-2139393292>

<led:Resource-BBC-519215>…<led:title>…gravitating…</led:title>…</led:Resource-BBC-519215>
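How enrichment lets differently worded titles correlate can be shown with a toy lookup. This is not the method used for the Linked Education dataset: the hand-written surface-form table stands in for a real entity-linking step against DBpedia/Freebase, and the titles are only the fragments visible on the slide.

```python
import re
from collections import defaultdict

# Enrichment URI as it appears in the example query in these slides.
GRAVITATION = "http://data.linkededucation.org/ontology/Enrichment/Gravitation"

# Toy surface-form table, written by hand for this illustration only.
SURFACE_FORMS = {
    "gravity": GRAVITATION,
    "gravitation": GRAVITATION,
    "gravitational": GRAVITATION,
    "gravitating": GRAVITATION,
}


def enrich(title):
    """Return the concept URIs whose surface forms occur in the title."""
    words = re.findall(r"[a-z]+", title.lower())
    return {SURFACE_FORMS[w] for w in words if w in SURFACE_FORMS}


# Title fragments as visible on the slide (the full titles are elided there).
titles = {
    "led:Resource-OpenLearn-2139393292": "laws of gravity",
    "led:Resource-BBC-519215": "gravitating",
}

# Group resources by shared concept: this is the "correlation" step.
by_concept = defaultdict(set)
for resource, title in titles.items():
    for concept in enrich(title):
        by_concept[concept].add(resource)

print(dict(by_concept))
```

Both resources end up under the same concept although their titles share no word, which is the effect the slide describes.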

Page 14: Open Educational Data - Datasets and APIs (Athens Green Hackathon 2012)


Linked Education graph & dataset(s)

Page 15: Open Educational Data - Datasets and APIs (Athens Green Hackathon 2012)

Linked Education graph & dataset(s)

Enabling cross-dataset queries

Example resource: http://data.linkededucation.org/resource/led/92C8A5E7-7B4D-12A6-F4F2-76A6A8DC7C0A

Example query (schema alignment & categorisation):

SELECT ?resource ?title
WHERE {
  ?resource led:title ?title .
  FILTER regex(?title, "linear equations", "i")
}

returns 1102 resources from different datasets, including 659 DBLP items, 397 ACM publications and 10 LinkedUniversities educational videos

Example query (disambiguation & correlation):

SELECT DISTINCT ?entity
WHERE {
  ?entity led:hasEnrichmentContext ?dbp_context .
  ?dbp_context rdf:type led:EnrichmentContext .
  ?dbp_context led:hasEnrichment <http://data.linkededucation.org/ontology/Enrichment/Gravitation>
}

returns 5 resources (LinkedUniversities, mEducator, BBC) enriched with the DBpedia concept Gravitation, even though their descriptions refer to "gravity", "gravitational" or "laws of gravity".
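The title search above can be generalised to other keywords with a small query builder. A sketch under assumptions: the slides use the led: prefix without spelling out its namespace, so the namespace is a parameter here and the value passed in is a placeholder, not the real one.

```python
def title_search_query(keyword, led_namespace):
    """Build the case-insensitive title search from the slide for any keyword."""
    # Escape backslashes and double quotes so the keyword stays inside the
    # SPARQL string literal. The keyword is still interpreted as a regex.
    literal = keyword.replace("\\", "\\\\").replace('"', '\\"')
    return (
        f"PREFIX led: <{led_namespace}>\n"
        "SELECT ?resource ?title WHERE {\n"
        "  ?resource led:title ?title .\n"
        f'  FILTER regex(?title, "{literal}", "i")\n'
        "}"
    )


# Placeholder namespace, hypothetical: replace with the one from the schema.
LED_NS = "http://example.org/led#"
query = title_search_query("linear equations", LED_NS)
print(query)
```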

Page 16: Open Educational Data - Datasets and APIs (Athens Green Hackathon 2012)

How to access the data (1/2)

Registries and federated access to data

CKAN – The DataHub

THE public registry for open Web datasets (almost 5000 distinct datasets)

CKAN: http://thedatahub.org; LOD group: http://datahub.io/group/lodcloud

Linked Education dataset

Over 21 GB / 6 million educationally relevant resources

SPARQL endpoint: http://data.linkededucation.org/openrdf-sesame/repositories/linked-learning[-selection]?query

Schema: http://data.linkededucation.org/ns/linked-education.rdf

Example resource: http://data.linkededucation.org/resource/led/92C8A5E7-7B4D-12A6-F4F2-76A6A8DC7C0A

SmartLink

SmartLink dataset: registry of educationally relevant APIs => http://ckan.net/package/smartlink, http://purl.org/smartlink

SPARQL: http://smartlink.open.ac.uk/smartlink/sparql (dedicated APIs for search & retrieval available)
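Sending a query to one of these endpoints amounts to an HTTP GET with the query passed URL-encoded in the "query" parameter. The sketch below prepares such a request for the Linked Education endpoint. It reads "[-selection]" in the endpoint URL as an optional suffix, which is an assumption, and the function name is hypothetical.

```python
import urllib.parse
import urllib.request

# Endpoint from the slide; "[-selection]" is read here as an optional suffix
# (an assumption), so the plain repository name is used.
ENDPOINT = (
    "http://data.linkededucation.org/openrdf-sesame/"
    "repositories/linked-learning"
)


def build_sparql_request(endpoint, query):
    """Prepare a SPARQL-protocol GET request with the query URL-encoded."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    # Ask for the standard SPARQL JSON results format.
    headers = {"Accept": "application/sparql-results+json"}
    return urllib.request.Request(url, headers=headers)


request = build_sparql_request(ENDPOINT, "SELECT * WHERE { ?s ?p ?o } LIMIT 5")
print(request.full_url)
```

Passing the request to urllib.request.urlopen would execute it; that step is omitted because the endpoint may no longer be available.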


Page 17: Open Educational Data - Datasets and APIs (Athens Green Hackathon 2012)

How to access the data (2/2)

Some individual datasets

ACM Learning Analytics and Knowledge (LAK) Dataset

Corpus of extracted metadata and full-text from ACM LAK conference series papers and related publications (expanding)

Dataset & schema description: http://www.solaresearch.org/resources/lak-dataset/

LAK Challenge: win fame, an iPad, cash rewards!

SPARQL endpoint: http://data.linkededucation.org/openrdf-sesame/repositories/lak-conference?query=%5BQUERY%5D

mEducator Linked Educational Resources

Over 600 OER (36,000 triples) from different providers

mEducator dataset: http://ckan.net/package/meducator

SPARQL: http://meducator.open.ac.uk/resourcesrestapi/rest/meducator/sparql

Schema: http://purl.org/meducator/ns

Dedicated search & retrieval APIs available (see http://linkededucation.org/meducator/)
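In the LAK endpoint URL, %5BQUERY%5D is the URL-encoded form of the placeholder [QUERY]. The sketch below fills that placeholder and flattens a response in the W3C SPARQL JSON results layout. The sample response is made up for illustration, and whether the endpoint returns JSON by default is not confirmed by the slides.

```python
import json
import urllib.parse

# Endpoint template as given on the slide.
LAK_TEMPLATE = (
    "http://data.linkededucation.org/openrdf-sesame/repositories/"
    "lak-conference?query=%5BQUERY%5D"
)
# URL-encoded form of the literal text "[QUERY]".
PLACEHOLDER = urllib.parse.quote("[QUERY]")


def fill_template(template, query):
    """Replace the encoded [QUERY] placeholder with a URL-encoded query."""
    return template.replace(PLACEHOLDER, urllib.parse.quote(query, safe=""))


def rows(results_json):
    """Flatten a SPARQL JSON results document into plain dicts."""
    doc = json.loads(results_json)
    return [
        {name: cell["value"] for name, cell in binding.items()}
        for binding in doc["results"]["bindings"]
    ]


url = fill_template(LAK_TEMPLATE, "SELECT ?s WHERE { ?s ?p ?o } LIMIT 1")

# Made-up response in the SPARQL JSON results layout, for illustration only.
sample = (
    '{"head": {"vars": ["s"]}, "results": {"bindings": '
    '[{"s": {"type": "uri", "value": "http://example.org/paper/1"}}]}}'
)
print(url)
print(rows(sample))
```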

Page 18: Open Educational Data - Datasets and APIs (Athens Green Hackathon 2012)

Conclusions and Outlook

Summary, ongoing work & outlook

Wide range of relevant data sources & APIs available

Early cataloging (http://linkededucation.org, http://linkeduniversities.org) and integration/federation (SmartLink, mEducator Linked Educational Resources)

LinkedUp (http://www.linkedup-project.eu): data curation, assessment and exploitation

Data cataloging: http://datahub.io/en/group/linked-education for collection of “educationally relevant” datasets, categorisation and tagging

Data integration & infrastructure: unified endpoints and APIs at http://data.linkededucation.org

Getting involved

Submit your own data or tools: LinkedUp Challenge, LAK Challenge, LinkedUp Call for Data

Participate as LinkedUp evaluation panelist, use case or data contributor & benefit from access to large network of organisations in Linked Data and TEL


Page 19: Open Educational Data - Datasets and APIs (Athens Green Hackathon 2012)

Contact & links

http://purl.org/dietze / [email protected]

http://linkededucation.org

http://linkedup-project.eu

Thank you!


Credits

Davide Taibi (CNR ITD, Italy)

Harry Yu & Dong Liu (The Open University, UK)

Besnik Fetahu (L3S, Germany)

mEducator and LinkedUp teams