ENRICH > LINK > SEARCH The lean approach for advanced search applications over linked data Michiel Hildebrand Semantics Conference Vienna 2015

2

Do you see value in open data?

3

Do you think that open data could improve

the access to your own data?

4

Have you integrated open data with your own data?

5

Have you created an application on top of your

integrated data?

6

The billion $ Open Data example

7

Cultural Heritage: advanced access through (Open) Data

multi-lingual

location-based

recommendation

personalization

advanced ranking

analytics

http://www.getty.edu/research/tools/vocabularies/aat/

8

multi-lingual

location-based

recommendation

personalization

advanced ranking

analytics

Cultural Heritage: advanced access through (Open) Data

http://www.vistory.nl/

9

Cultural Heritage: advanced access through (Open) Data

multi-lingual

location-based

recommendation

personalization

advanced ranking

analytics

query logs

content-based

10

Cultural Heritage: advanced access through (Open) Data

multi-lingual

location-based

recommendation

personalization

advanced ranking

analytics

http://manovich.net/

11

Historic newsreels and photographs

12

Demo: Linked Open Images

13

http://link.spinque.com/openbeelden

Can we build this in a day?

14

Factory metaphor

PUSH: make to stock

PULL: make to order

Output and efficiency oriented

exact needs of user secondary

User needs oriented

production costly

15

How can we reduce the time

and cost?

Data factory

PUSH: make to stock

PULL: make to order

16

How good is the data for

your application?

The lean approach

17

Your data Integrate Access Deploy

API

Enrich

Open Data Node platform

http://opendatanode.org/

Methodology for publishing Open Data

http://www.comsode.eu/index.php/deliverables/

Moving from one-off to sustainable data publishing

18

http://unifiedviews.eu/

Key requirements for integration step

Sustainable

Quality control

19

Your data Integrate Access Deploy

API

Enrich

Integrating historic newsreels with photographs

GTAA thesaurus (SKOS)

NIOD subject terms (SKOS)

20

preferred label

antisemitisme

spionage

amnestie

...

preferred label

antisemitisme

spionage

amnestie

...

NIOD subject terms GTAA thesaurus

preferred label = preferred label
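
The simplest link strategy on this slide, equal preferred labels, can be sketched as follows. This is a minimal illustration, not the tooling used in the talk: the labels come from the slide, while the URIs, the extra "parades" entry and the function name `match_pref_labels` are assumptions made for the example.

```python
# Hypothetical sample data: concept URI -> preferred label.
# Labels are taken from the slide; the URIs are invented for illustration.
niod = {
    "niod:1": "antisemitisme",
    "niod:2": "spionage",
    "niod:3": "amnestie",
}
gtaa = {
    "gtaa:10": "antisemitisme",
    "gtaa:11": "spionage",
    "gtaa:12": "amnestie",
    "gtaa:13": "parades",  # no counterpart in the source sample
}


def match_pref_labels(source, target):
    """Link concepts whose preferred labels are equal, ignoring case."""
    # Index the target vocabulary by normalized label.
    index = {}
    for uri, label in target.items():
        index.setdefault(label.casefold(), []).append(uri)

    links = []
    for uri, label in source.items():
        for target_uri in index.get(label.casefold(), []):
            links.append((uri, target_uri))
    return links


links = match_pref_labels(niod, gtaa)
for source_uri, target_uri in links:
    print(source_uri, "->", target_uri)
```

Each resulting pair is a candidate link between the two vocabularies that can be exported or reviewed.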

21

preferred label alternative label

politieagenten agenten

militaire parades parades

optochten parades

preferred label

agenten

parades

NIOD subject terms GTAA thesaurus

Introduces ambiguity

preferred label = alternative label
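
A sketch of why matching a preferred label against alternative labels introduces ambiguity. The labels are from the slide; the URIs, the neutral names `with_alt_labels` and `pref_only`, and the function `match_pref_to_alt` are assumptions for illustration, since the transcript does not say which vocabulary holds which table.

```python
# Vocabulary that has preferred and alternative labels (URIs are hypothetical).
with_alt_labels = [
    {"uri": "a:1", "pref": "politieagenten", "alt": ["agenten"]},
    {"uri": "a:2", "pref": "militaire parades", "alt": ["parades"]},
    {"uri": "a:3", "pref": "optochten", "alt": ["parades"]},
]

# Vocabulary that only has preferred labels (URIs are hypothetical).
pref_only = {"b:1": "agenten", "b:2": "parades"}


def match_pref_to_alt(pref_only, with_alt_labels):
    """For each preferred label, collect every concept that lists it as an alternative label."""
    candidates = {}
    for uri, label in pref_only.items():
        wanted = label.casefold()
        candidates[uri] = [
            concept["uri"]
            for concept in with_alt_labels
            if wanted in {alt.casefold() for alt in concept["alt"]}
        ]
    return candidates


candidates = match_pref_to_alt(pref_only, with_alt_labels)

# A label with more than one candidate is ambiguous and needs a further selection step.
ambiguous = {uri: hits for uri, hits in candidates.items() if len(hits) > 1}
print(ambiguous)
```

Here "parades" is an alternative label of both "militaire parades" and "optochten", so the match alone cannot decide between them.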

22

preferred label

dodenherdenking

hamsteren

NIOD subject terms GTAA thesaurus

Introduces errors

preferred label

dodenherdenkingen

hamsters

singular label = plural label (stemming)
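
The effect of stemming can be illustrated with a deliberately crude suffix stripper. This is a sketch under stated assumptions: `naive_stem` is a hypothetical helper, not a real Dutch stemmer and not the method used in the talk; the label pairs are from the slide.

```python
def naive_stem(label):
    """Strip a plural-like suffix ("en" or "s"). Deliberately crude; not a real stemmer."""
    word = label.casefold()
    for suffix in ("en", "s"):
        # Only strip when a reasonably long stem remains.
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[: -len(suffix)]
    return word


# Label pairs from the slide.
pairs = [
    ("dodenherdenking", "dodenherdenkingen"),  # singular vs plural of the same term
    ("hamsteren", "hamsters"),                 # hoarding vs the animals
]

stem_matches = [(a, b) for a, b in pairs if naive_stem(a) == naive_stem(b)]
for a, b in stem_matches:
    print(a, "=", b, "(stem:", naive_stem(a) + ")")
```

Both pairs match on their stem. The first link is the intended one; the second is an error, because "hamsteren" (hoarding) and "hamsters" (the animals) are unrelated concepts.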

23

preferred label

dieren

graven

NIOD subject terms

GTAA thesaurus

filter sources

preferred label concept scheme

dieren subject terms

dieren geographical names

graven subject terms

grave geographical names

subject ≠ location (noise)
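
Filtering the sources by concept scheme, as on this slide, might look like the sketch below. Labels and scheme names are from the slide; the data layout and the function `match_labels` are assumptions for illustration.

```python
# Concepts of the vocabulary that spans several concept schemes (from the slide).
concepts = [
    {"label": "dieren", "scheme": "subject terms"},
    {"label": "dieren", "scheme": "geographical names"},
    {"label": "graven", "scheme": "subject terms"},
    {"label": "grave", "scheme": "geographical names"},
]

# Subject terms from the other vocabulary (from the slide).
subject_labels = ["dieren", "graven"]


def match_labels(labels, concepts, scheme=None):
    """Match labels exactly, optionally restricted to a single concept scheme."""
    hits = []
    for label in labels:
        for concept in concepts:
            if scheme is not None and concept["scheme"] != scheme:
                continue  # filter the source before matching
            if concept["label"].casefold() == label.casefold():
                hits.append((label, concept["scheme"]))
    return hits


unfiltered = match_labels(subject_labels, concepts)
filtered = match_labels(subject_labels, concepts, scheme="subject terms")
print("unfiltered:", unfiltered)
print("filtered:  ", filtered)
```

Without the filter, the subject "dieren" also links to the geographical name with the same label; restricting the match to the subject terms scheme removes that noise.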

24

Other alignment techniques

fuzzy string matching

join matches on multiple attributes

similarity in the hierarchy (skos:broader)

select best candidate (most generic/specific term)

....
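
As one example of these techniques, fuzzy string matching can be sketched with Python's standard `difflib` module. The threshold of 0.9, the candidate list and the function name `fuzzy_match` are assumptions chosen for illustration; a suitable cutoff would have to be found by trial and error on the actual data.

```python
import difflib

# Target labels taken from earlier slides.
target_labels = ["dodenherdenkingen", "hamsters", "parades"]


def fuzzy_match(label, candidates, cutoff=0.9):
    """Return candidates whose similarity ratio to the label is at least the cutoff."""
    return difflib.get_close_matches(label, candidates, n=3, cutoff=cutoff)


fuzzy_links = {
    label: fuzzy_match(label, target_labels)
    for label in ["dodenherdenking", "hamsteren"]
}
print(fuzzy_links)
```

A similarity threshold only compares characters, not meaning, so the results still need to be evaluated on subsets of the data.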

25

Key requirements integration step checked

Quality control
• Model link strategy out of (simple) building blocks
• Iterative process (trial and error)
• Exploration of the source data
• Direct access to the results
• Evaluate the subsets

Sustainable
• Export links and link strategy
• Provenance of the process is explicit in the strategy
• Rerun after update of datasets

27

Dutch National Strategy Digital Heritage

28

CultuurLINK a free service for the cultural heritage domain

29

http://cultuurlink.beeldengeluid.nl/

Rijksmuseum Amsterdam integrated multilingual vocabularies

http://www.rijksmuseum.nl/nl/collectie/BK-NM-1010

http://www.getty.edu/research/tools/vocabularies/aat/

30

Key requirements for access step

31

Your data Integrate Access Deploy

API

Enrich

Model complex access (search)

Combine graph queries and ranking

Already three types of search in a simple app

32

keyword search location-based search recommendation

multilingual

location-based

recommendation

personalization

ranking

analytics

Probabilistic Graph Database

Building blocks (SPINQL)

Search by Strategy

Advanced search applications with Spinque

33

Demo Spinque Search

34

Key requirements access step checked

Model complex search problems
• Search strategy out of (simple) building blocks
• No programming required

Combine graph queries and ranking
• Integrated triple store and search index
• Probabilistic graph database
• Building blocks for graph queries
• Building blocks for search and ranking
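
The idea of combining ranking with graph queries can be illustrated with a toy sketch. This is not SPINQL and not Spinque's API: the triples, the URIs and the functions `rank_concepts`, `traverse` and `search` are all hypothetical, and the token-overlap score is a stand-in for a real ranking model.

```python
# Hypothetical graph: (subject, predicate, object) triples.
triples = [
    ("video:1", "subject", "concept:parades"),
    ("video:2", "subject", "concept:spionage"),
    ("video:3", "subject", "concept:militaire-parades"),
    ("photo:1", "subject", "concept:parades"),
]

# Hypothetical labels of the concepts.
labels = {
    "concept:parades": "parades",
    "concept:spionage": "spionage",
    "concept:militaire-parades": "militaire parades",
}


def rank_concepts(query, labels):
    """Ranking block: score each concept by the share of its label tokens found in the query."""
    query_tokens = set(query.casefold().split())
    scores = {}
    for uri, label in labels.items():
        tokens = set(label.casefold().split())
        overlap = len(query_tokens & tokens) / len(tokens)
        if overlap > 0:
            scores[uri] = overlap
    return scores


def traverse(scores, triples, predicate):
    """Graph block: follow edges from scored objects back to subjects, carrying the score along."""
    result = {}
    for subject, pred, obj in triples:
        if pred == predicate and obj in scores:
            result[subject] = max(result.get(subject, 0.0), scores[obj])
    return result


def search(query):
    """A search strategy composed of the two building blocks above."""
    concept_scores = rank_concepts(query, labels)
    item_scores = traverse(concept_scores, triples, "subject")
    # Highest score first; ties broken by identifier for a stable order.
    return sorted(item_scores.items(), key=lambda item: (-item[1], item[0]))


results = search("parades")
for uri, score in results:
    print(uri, score)
```

The point of the sketch is the composition: a ranked set of concepts is passed through a graph step, so the final result list is both filtered by the graph structure and ordered by score.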

35

Your data Enrich Link strategy

API

Deploy

Search strategy

36

The lean approach

Breakout

What kind of functionality would you like to provide to your users?

1. What kind of data do you want to make accessible in a richer way?

2. What additional (open) data can you use for this enriched access?

3. What type of (search) functionality is required?

37

Other applications: Restaurant inspections

38

Other applications: Community platform

39