ENRICH > LINK > SEARCH
The lean approach for advanced search applications over linked data
Michiel Hildebrand
Semantics Conference Vienna 2015
2
Do you see value in open data?
3
Do you think that open data could improve access to your own data?
4
Have you integrated open data with your own data?
5
Have you created an application on top of your integrated data?
6
The billion-dollar Open Data example
7
Cultural Heritage: advanced access through (Open) Data
multi-lingual
location-based
recommendation
personalization
advanced ranking
analytics
http://www.getty.edu/research/tools/vocabularies/aat/
8
Cultural Heritage: advanced access through (Open) Data
multi-lingual
location-based
recommendation
personalization
advanced ranking
analytics
http://www.vistory.nl/
9
Cultural Heritage: advanced access through (Open) Data
multi-lingual
location-based
recommendation
personalization
advanced ranking
analytics
query logs
content-based
10
Cultural Heritage: advanced access through (Open) Data
multi-lingual
location-based
recommendation
personalization
advanced ranking
analytics
http://manovich.net/
11
Historic newsreels and photographs
12
Demo: Linked Open Images
13
http://link.spinque.com/openbeelden
Can we build this in a day?
14
Factory metaphor
PUSH: make to stock
• Output and efficiency oriented
• Exact needs of the user are secondary
PULL: make to order
• User needs oriented
• Production is costly
15
How can we reduce the time and cost?
Data factory
PUSH: make to stock
PULL: make to order
16
How good is the data for your application?
The lean approach
17
Your data Integrate Access Deploy
API
Enrich
Open Data Node platform
http://opendatanode.org/
Methodology for publishing Open Data
http://www.comsode.eu/index.php/deliverables/
Moving from one-off to sustainable data publishing
18
http://unifiedviews.eu/
Key requirements for integration step
Sustainable
Quality control
19
Your data Integrate Access Deploy
API
Enrich
Integrating historic newsreels with photographs
GTAA thesaurus (SKOS)
NIOD subject terms (SKOS)
20
preferred label = preferred label
NIOD subject terms    GTAA thesaurus
preferred label       preferred label
antisemitisme         antisemitisme
spionage              spionage
amnestie              amnestie
...                   ...
21
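The simplest link rule above, preferred label equals preferred label, can be sketched in a few lines. This is a minimal illustration, not the Spinque LINK implementation: the concept identifiers (`niod:1`, `gtaa:10`, ...), the capitalised variant, and the `normalize` helper are assumptions made up for the example.

```python
def normalize(label: str) -> str:
    """Lowercase and collapse whitespace so trivial variants still match."""
    return " ".join(label.casefold().split())


def match_pref_labels(source: dict, target: dict) -> list:
    """Return (source_id, target_id) pairs whose preferred labels are equal."""
    index = {}
    for concept_id, label in target.items():
        index.setdefault(normalize(label), []).append(concept_id)
    links = []
    for concept_id, label in source.items():
        for target_id in index.get(normalize(label), []):
            links.append((concept_id, target_id))
    return links


# Hypothetical identifiers; the labels are taken from the slides.
niod = {"niod:1": "antisemitisme", "niod:2": "spionage", "niod:3": "amnestie"}
gtaa = {"gtaa:10": "Antisemitisme", "gtaa:11": "spionage", "gtaa:12": "hamsters"}

links = match_pref_labels(niod, gtaa)
print(links)
```

Concepts without an identical label on the other side, such as "amnestie" here, stay unlinked and are candidates for the looser rules that follow.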
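preferred label = alternative label
NIOD subject terms
GTAA thesaurus
preferred label                        alternative label
politieagenten (police officers)       agenten (officers)
militaire parades (military parades)   parades
optochten (processions)                parades
preferred label
agenten
parades
Introduces ambiguity: "parades" matches the alternative label of both "militaire parades" and "optochten"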
22
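Matching preferred labels against alternative labels can be sketched as follows, including a check for the ambiguity it introduces. The slides do not make clear which vocabulary holds which label set, so the two sides are named neutrally here; the identifiers (`a:1`, `b:2`, ...) are hypothetical.

```python
from collections import defaultdict

# Vocabulary with preferred and alternative labels (labels from the slides).
with_alt = {
    "a:1": {"pref": "politieagenten", "alt": ["agenten"]},
    "a:2": {"pref": "militaire parades", "alt": ["parades"]},
    "a:3": {"pref": "optochten", "alt": ["parades"]},
}
# Vocabulary with preferred labels only.
pref_only = {"b:1": "agenten", "b:2": "parades"}


def match_pref_to_alt(pref_only: dict, with_alt: dict) -> dict:
    """Map each concept to all concepts that carry its label as alternative label."""
    alt_index = defaultdict(list)
    for concept_id, concept in with_alt.items():
        for alt in concept["alt"]:
            alt_index[alt.casefold()].append(concept_id)
    return {
        concept_id: list(alt_index.get(label.casefold(), []))
        for concept_id, label in pref_only.items()
    }


candidates = match_pref_to_alt(pref_only, with_alt)
# More than one candidate means the link cannot be decided on the label alone.
ambiguous = {cid: targets for cid, targets in candidates.items() if len(targets) > 1}
print(candidates)
print(ambiguous)
```

Keeping the ambiguous links in a separate subset makes it possible to evaluate or resolve them in a later step instead of silently accepting both.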
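singular label = plural label (stemming)
NIOD subject terms                           GTAA thesaurus
preferred label                              preferred label
dodenherdenking (remembrance of the dead)    dodenherdenkingen
hamsteren (hoarding)                         hamsters (the animals)
Introduces errors: stemming conflates "hamsteren" with "hamsters"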
23
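The effect of stemming can be reproduced with a toy example. The `crude_stem` function below is a deliberately naive suffix stripper written for this sketch, not a real Dutch stemmer, and assigning the singular forms to NIOD and the plural forms to GTAA is an assumption based on the slide layout.

```python
def crude_stem(label: str) -> str:
    """Toy stemmer: strip a trailing 'en' or 's'. Not a real Dutch stemmer."""
    word = label.casefold()
    for suffix in ("en", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


niod_labels = ["dodenherdenking", "hamsteren"]
gtaa_labels = ["dodenherdenkingen", "hamsters"]

links = [
    (n, g)
    for n in niod_labels
    for g in gtaa_labels
    if crude_stem(n) == crude_stem(g)
]
print(links)
```

The first link is the intended singular/plural match. The second is an error: both labels reduce to "hamster", although one means hoarding and the other the animals.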
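subject ≠ location (noise)
NIOD subject terms
preferred label
dieren
graven
GTAA thesaurus
preferred label    concept scheme
dieren             subject terms
dieren             geographical names
graven             subject terms
grave              geographical names
filter sources: match only against the subject terms concept scheme, because "dieren" and "grave" also occur as geographical names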
24
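Filtering the target vocabulary by concept scheme before matching can be sketched like this. The record layout (`label`, `scheme`) and the `allowed_schemes` parameter are assumptions for illustration; the labels and scheme names come from the slide.

```python
gtaa = [
    {"label": "dieren", "scheme": "subject terms"},
    {"label": "dieren", "scheme": "geographical names"},
    {"label": "graven", "scheme": "subject terms"},
    {"label": "grave", "scheme": "geographical names"},
]
niod_subjects = ["dieren", "graven"]


def match(labels, concepts, allowed_schemes=None):
    """Exact label match, optionally restricted to a set of concept schemes."""
    links = []
    for label in labels:
        for concept in concepts:
            if allowed_schemes is not None and concept["scheme"] not in allowed_schemes:
                continue
            if concept["label"].casefold() == label.casefold():
                links.append((label, concept["label"], concept["scheme"]))
    return links


unfiltered = match(niod_subjects, gtaa)
filtered = match(niod_subjects, gtaa, allowed_schemes={"subject terms"})
print(unfiltered)
print(filtered)
```

Without the filter, the subject "dieren" is also linked to a geographical name. With a looser rule such as stemming, "graven" could additionally be linked to "grave", which the filter prevents as well.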
Other alignment techniques
fuzzy string matching
join matches on multiple attributes
similarity in the hierarchy (skos:broader)
select best candidate (most generic/specific term)
....
25
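Of the techniques listed above, fuzzy string matching is the easiest to try out. The sketch below uses `difflib.get_close_matches` from the Python standard library; the spelling variants in `target_labels` and the cutoff of 0.85 are made up for the example and would need tuning on real data.

```python
import difflib

source_labels = ["antisemitisme", "spionage", "amnestie", "hamsteren"]
# Hypothetical spelling variants on the target side.
target_labels = ["anti-semitisme", "spionnage", "amnestie"]


def best_fuzzy_match(label, candidates, cutoff=0.85):
    """Return the most similar candidate above the cutoff, or None."""
    matches = difflib.get_close_matches(label, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None


result = {label: best_fuzzy_match(label, target_labels) for label in source_labels}
print(result)
```

A high cutoff keeps the matcher conservative: small spelling differences are bridged, while a label without a counterpart ("hamsteren") stays unlinked rather than being attached to the nearest string.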
Demo Spinque LINK
26
http://cultuurlink.beeldengeluid.nl/app/#/tutorial/tutorial_niod_start
Key requirements integration step checked
Quality control
• Model link strategy out of (simple) building blocks
• Iterative process (trial and error)
• Exploration of the source data
• Direct access to the results
• Evaluate the subsets
Sustainable
• Export links and link strategy
• Provenance of the process is explicit in the strategy
• Rerun after update of datasets
27
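The idea of a link strategy assembled from simple building blocks, with explicit provenance and the option to rerun, can be sketched as below. This is not how Spinque LINK or CultuurLINK is implemented; the class `LinkStrategy`, the two blocks, and the identifiers are hypothetical names chosen for the illustration.

```python
import json
from dataclasses import dataclass, field


def exact_label(links, source, target):
    """Building block: add a link for every pair of equal (case-insensitive) labels."""
    found = [
        (s, t)
        for s, s_label in source.items()
        for t, t_label in target.items()
        if s_label.casefold() == t_label.casefold()
    ]
    return links + found


def drop_ambiguous(links, source, target):
    """Building block: keep only source concepts linked to exactly one target."""
    counts = {}
    for s, _ in links:
        counts[s] = counts.get(s, 0) + 1
    return [(s, t) for s, t in links if counts[s] == 1]


@dataclass
class LinkStrategy:
    steps: list = field(default_factory=list)

    def add(self, block):
        self.steps.append(block)
        return self

    def run(self, source, target):
        """Apply the blocks in order and record how many links each step leaves."""
        links, trace = [], []
        for block in self.steps:
            links = block(links, source, target)
            trace.append({"step": block.__name__, "links": len(links)})
        return links, trace

    def export(self):
        """Serialise the strategy itself, so the process can be documented and rerun."""
        return json.dumps([block.__name__ for block in self.steps])


strategy = LinkStrategy().add(exact_label).add(drop_ambiguous)
source = {"s1": "parades", "s2": "agenten"}
target = {"t1": "parades", "t2": "Parades", "t3": "agenten"}
links, trace = strategy.run(source, target)

# After the target dataset is updated, the same strategy is simply run again.
updated_target = {"t1": "parades", "t3": "agenten"}
rerun_links, _ = strategy.run(source, updated_target)
print(links, trace, rerun_links, strategy.export())
```

The trace gives direct access to the intermediate results of every step, which supports the trial-and-error process, and the exported strategy doubles as the provenance record of the links.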
Dutch National Strategy Digital Heritage
28
CultuurLINK a free service for the cultural heritage domain
29
http://cultuurlink.beeldengeluid.nl/
Rijksmuseum Amsterdam integrated multilingual vocabularies
http://www.rijksmuseum.nl/nl/collectie/BK-NM-1010
http://www.getty.edu/research/tools/vocabularies/aat/
30
Key requirements for access step
31
Your data Integrate Access Deploy
API
Enrich
Model complex access (search)
Combine graph queries and ranking
Already three types of search in a simple app
32
keyword search
location-based search
recommendation
multilingual
location-based
recommendation
personalization
ranking
analytics
Probabilistic Graph Database
Building blocks (SPINQL)
Search by Strategy
Advanced search applications with Spinque
33
Demo Spinque Search
34
Key requirements access step checked
Model complex search problems
• Search strategy out of (simple) building blocks
• No programming required
Combine graph queries and ranking
• Integrated triple store and search index
• Probabilistic graph database
• Building blocks for graph queries
• Building blocks for search and ranking
35
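Combining a graph query with ranking can be illustrated with a toy example in which every building block returns items with a score and scores are multiplied when blocks are combined. This is only a sketch of the general idea under those assumptions; it is not SPINQL and not the Spinque engine, and the triples, titles, and function names are invented.

```python
# Hypothetical data: which image depicts which concept, plus a title per image.
triples = [
    ("img:1", "depicts", "concept:parades"),
    ("img:2", "depicts", "concept:parades"),
    ("img:3", "depicts", "concept:agenten"),
    ("img:4", "depicts", "concept:parades"),
]
titles = {
    "img:1": "militaire parade in Amsterdam",
    "img:2": "optocht",
    "img:3": "parade van agenten",
    "img:4": "parade",
}


def graph_select(triples, predicate, obj):
    """Graph block: subjects that have the given predicate and object, score 1.0."""
    return {s: 1.0 for s, p, o in triples if p == predicate and o == obj}


def keyword_score(docs, query):
    """Ranking block: fraction of words in the text that match a query term."""
    terms = query.casefold().split()
    scores = {}
    for doc_id, text in docs.items():
        words = text.casefold().split()
        hits = sum(words.count(term) for term in terms)
        if hits:
            scores[doc_id] = hits / len(words)
    return scores


def combine(a, b):
    """Keep items present in both results and multiply their scores."""
    return {key: a[key] * b[key] for key in a.keys() & b.keys()}


selected = graph_select(triples, "depicts", "concept:parades")
scored = keyword_score(titles, "parade")
ranked = sorted(combine(selected, scored).items(), key=lambda kv: kv[1], reverse=True)
print(ranked)
```

The graph block decides which items are eligible and the ranking block decides their order, so "img:3" is dropped despite matching the keyword and "img:2" is dropped despite matching the graph pattern.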
Your data Enrich Link strategy
API
Deploy
Search strategy
36
The lean approach
Breakout
What kind of functionality would you like to provide to your users?
1. What kind of data do you want to make accessible in a richer way?
2. What additional (open) data can you use for this enriched access?
3. What type of (search) functionality is required?
37
Other applications: Restaurant inspections
38
Other applications: Community platform
39