data-mining the semantic web @tcd

Post on 28-Jul-2015


Data-mining the Semantic Web and spatially visualising the results

DAH workshop

Trinity College Dublin, 27 May 2015

@flynam @bilusaurus

Workshop overview

• Morning session: Data-mining
– Open Data
– Linked Data
– Linked Open Data implementation
– Semantic Web and ontologies
– Hands-on practical exercises


Workshop overview

• Afternoon session: Data visualisation
– Data visualisation concepts introduction
– Web maps and geo-tagging
– Hands-on practical
– Interpretations
– Hermeneutic circle


But first, a very quick survey

• Your occupation
– UG student
– PG student
– Professional academic
– Non-academic


A quick survey

• Your age group
– Under 16
– 16-24
– 25-34
– 35-44
– 45-54
– 55 and over


A quick survey

• How familiar are you with Open Access?
– 1 – Not familiar at all
– 2
– 3
– 4
– 5 – Very familiar


A quick survey

• How familiar are you with Open Data?
– 1 – Not familiar at all
– 2
– 3
– 4
– 5 – Very familiar


A quick survey

• How familiar are you with Linked Data?
– 1 – Not familiar at all
– 2
– 3
– 4
– 5 – Very familiar


A quick survey

• How familiar are you with the Semantic Web?
– 1 – Not familiar at all
– 2
– 3
– 4
– 5 – Very familiar


A quick survey

• Have you ever published Open Data?
– Yes
– No


A quick survey

• Have you ever consumed Linked Open Data services?
– Yes
– No


A quick survey

• Please fill in your…
– Name
– Email address

Don’t worry – I’m not going to pass them on to anyone


From the horse’s mouth

(source: www.ted.com/talks/tim_berners_lee_on_the_next_web)


Terminology

Open Access

Open Data

Big Data

The web of data

The Semantic Web

Linked Data

data mining


Asking questions of digital datasets

Terminology


Open Access

Terminology


Design by Julie Beck for the Harvard University Neuroinformatics dept (source: www.juliebcreative.com/portfolio/open-data-logo/)


Linked Data

Terminology

The linkages between the major Linked Data datasets (source: lod-cloud.net)


Big Data

Terminology

Wordle of terms associated with Big Data activity (source: sfdata.startupweekend.org)


5 Stars of Open Data

put your data online under an open license

make it structured (e.g. as an Excel file)

use non-proprietary formats (e.g. XML and not Excel)

use URIs to identify resources

link your data to external datasets


The RDF Triple


A Triple Example

‘…the boy’s name is Tom…’

subject

predicate

object


Triple Linking

‘…Tom is short for Thomas…’

subject

predicate

object
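A minimal sketch of the two example statements as subject, predicate, object tuples, showing how the object of one triple ("Tom") becomes the subject of the next. The identifiers `ex:boy`, `ex:name` and `ex:shortFor` are hypothetical names made up for illustration, not terms from a real vocabulary. In real RDF the subject would be an IRI rather than a plain string; plain strings just keep the sketch short.

```python
# Each statement is one (subject, predicate, object) tuple.
# All identifiers here are hypothetical, invented for this illustration.
triples = [
    ("ex:boy", "ex:name", "Tom"),        # '...the boy's name is Tom...'
    ("Tom", "ex:shortFor", "Thomas"),    # '...Tom is short for Thomas...'
]


def objects_of(subject, predicate, graph):
    """Return every object stated for the given subject and predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def follow(subject, predicates, graph):
    """Walk a chain of predicates, starting from one subject."""
    current = [subject]
    for predicate in predicates:
        current = [o for node in current for o in objects_of(node, predicate, graph)]
    return current


# Linking: the object of the first triple is the subject of the second.
full_names = follow("ex:boy", ["ex:name", "ex:shortFor"], triples)
print(full_names)
```

Following the chain from the boy through both predicates reaches "Thomas", which is the sense in which triples link into a graph.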


Graph data


Serialising RDF

• Turtle

• JSON

• RDF/XML

• N-Triples


RDF Turtle

@base <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rel: <http://www.perceive.net/schemas/relationship/> .

<green-goblin>
    rel:enemyOf <spiderman> ;
    a foaf:Person ;    # in the context of the Marvel universe
    foaf:name "Green Goblin" .

<spiderman>
    rel:enemyOf <green-goblin> ;
    a foaf:Person ;
    foaf:name "Spiderman", "Человек-паук"@ru .



As N-Triples

<http://example.org/green-goblin> <http://www.perceive.net/schemas/relationship/enemyOf> <http://example.org/spiderman> .
<http://example.org/green-goblin> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .
<http://example.org/green-goblin> <http://xmlns.com/foaf/0.1/name> "Green Goblin" .
<http://example.org/spiderman> <http://www.perceive.net/schemas/relationship/enemyOf> <http://example.org/green-goblin> .
<http://example.org/spiderman> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .
<http://example.org/spiderman> <http://xmlns.com/foaf/0.1/name> "Spiderman" .
<http://example.org/spiderman> <http://xmlns.com/foaf/0.1/name> "\u0427\u0435\u043B\u043E\u0432\u0435\u043A-\u043F\u0430\u0443\u043A"@ru .
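The step from Turtle to N-Triples is mostly a matter of expanding every abbreviation into a full IRI. The following is a rough sketch of that expansion, not a Turtle parser: the function names `expand` and `to_ntriples` are made up here, the input is already split into terms by hand, and relative IRIs are resolved by simple concatenation with the base, which only works for the simple names used on the slide.

```python
# Prefix table and base copied from the Turtle slide.
BASE = "http://example.org/"
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rel": "http://www.perceive.net/schemas/relationship/",
}


def expand(term):
    """Turn one Turtle-style term into its N-Triples form (simplified)."""
    if term.startswith('"'):
        return term                      # literals pass through unchanged
    if term == "a":
        term = "rdf:type"                # Turtle shorthand for rdf:type
    if term.startswith("<") and term.endswith(">"):
        # Assumes the bracketed IRI is relative, as on the slide.
        return "<" + BASE + term[1:-1] + ">"
    prefix, local = term.split(":", 1)
    return "<" + PREFIXES[prefix] + local + ">"


def to_ntriples(triples):
    """Write one fully expanded triple per line, each ending in ' .'."""
    return "\n".join(" ".join(expand(t) for t in triple) + " ." for triple in triples)


graph = [
    ("<green-goblin>", "rel:enemyOf", "<spiderman>"),
    ("<green-goblin>", "a", "foaf:Person"),
    ("<green-goblin>", "foaf:name", '"Green Goblin"'),
]
print(to_ntriples(graph))
```

For real data a library parser would be the safer choice; the point here is only that N-Triples carries no prefixes, no base and no shorthand.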


As JSON

{
  "http://example.org/green-goblin": {
    "http://www.perceive.net/schemas/relationship/enemyOf": [
      {"type": "uri", "value": "http://example.org/spiderman"}
    ],
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type": [
      {"type": "uri", "value": "http://xmlns.com/foaf/0.1/Person"}
    ],
    "http://xmlns.com/foaf/0.1/name": [
      {"type": "literal", "value": "Green Goblin"}
    ]
  },
  "http://example.org/spiderman": {
    "http://www.perceive.net/schemas/relationship/enemyOf": [
      {"type": "uri", "value": "http://example.org/green-goblin"}
    ],
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type": [
      {"type": "uri", "value": "http://xmlns.com/foaf/0.1/Person"}
    ],
    "http://xmlns.com/foaf/0.1/name": [
      {"type": "literal", "value": "Spiderman"},
      {"type": "literal", "value": "\u0427\u0435\u043b\u043e\u0432\u0435\u043a-\u043f\u0430\u0443\u043a", "lang": "ru"}
    ]
  }
}
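The JSON form nests subject, then predicate, then a list of objects, so it can be read back into flat triples with the standard `json` module alone. This is a small sketch using a cut-down copy of the slide's data (one subject only); the helper name `flatten` is made up for illustration.

```python
import json

# A cut-down copy of the JSON slide: one subject, two predicates.
text = """
{
  "http://example.org/spiderman": {
    "http://www.perceive.net/schemas/relationship/enemyOf": [
      {"type": "uri", "value": "http://example.org/green-goblin"}
    ],
    "http://xmlns.com/foaf/0.1/name": [
      {"type": "literal", "value": "Spiderman"},
      {"type": "literal", "value": "Человек-паук", "lang": "ru"}
    ]
  }
}
"""


def flatten(doc):
    """Yield (subject, predicate, value, lang) for every object in the document."""
    for subject, predicates in doc.items():
        for predicate, objects in predicates.items():
            for obj in objects:
                # "lang" is only present on language-tagged literals.
                yield subject, predicate, obj["value"], obj.get("lang")


rows = list(flatten(json.loads(text)))
for row in rows:
    print(row)
```

Each printed row corresponds to one line of the N-Triples version, which is a quick way to check that two serialisations describe the same graph.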


As RDF/XML

<?xml version="1.0" encoding="utf-8" ?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/"
         xmlns:ns0="http://www.perceive.net/schemas/relationship/">

  <foaf:Person rdf:about="http://example.org/green-goblin">
    <ns0:enemyOf>
      <foaf:Person rdf:about="http://example.org/spiderman">
        <ns0:enemyOf rdf:resource="http://example.org/green-goblin"/>
        <foaf:name>Spiderman</foaf:name>
        <foaf:name xml:lang="ru">Человек-паук</foaf:name>
      </foaf:Person>
    </ns0:enemyOf>

    <foaf:name>Green Goblin</foaf:name>
  </foaf:Person>

</rdf:RDF>


Visualised as a Graph


Triplestores and Infrastructure

A server farm (source: www.cirrusinsight.com)


Practical: Making RDF

http://www.franklynam.com/blog.aspx?id=85

Q: Create RDF representations of yourself and your relationships


The Semantic Web and Ontologies

The stages of the Web (source: urenio.org)


Ontological Classes and Properties


The British Museum data mapping onto the CIDOC CRM (source: confluence.ontotext.com/display/ResearchSpace/BM+Mapping)


The CIDOC CRM basic entity types and their relationships (source: www.cidoc-crm.org/)


Vocabularies


Graph data


Minna Sundberg (source: www.sssscomic.com/comic.php?page=196)


Querying using SPARQL

SELECT *
WHERE {
  ?s ?p ?o
}
LIMIT 10
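A query like this is normally sent to an endpoint over HTTP, with the query text in a `query` parameter. The sketch below only prepares such a request with the Python standard library and does not send it, since endpoint availability cannot be assumed. The endpoint URL is the linkedarc.net one used in the practicals; the function name `build_request` is made up for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request

QUERY = """SELECT *
WHERE {
  ?s ?p ?o
}
LIMIT 10"""


def build_request(endpoint, query):
    """Prepare (but do not send) a SPARQL Protocol GET request."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask the endpoint for the standard SPARQL JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_request("http://linkedarc.net/sparql", QUERY)
print(request.full_url)
# To run the query for real: urllib.request.urlopen(request).read()
```

Keeping the request construction separate from the network call makes it easy to inspect the encoded URL before pointing it at a live endpoint.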


More complex SPARQL

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX letters1916: <http://letters1916.linkedarc.net/ontology/>
PREFIX letters1916data: <http://letters1916.linkedarc.net/data/>
PREFIX schema: <http://schema.org/>

SELECT DISTINCT ?letter ?letterName ?recipientPostalAddressName ?recipientLongitude ?recipientLatitude
WHERE {

  ?letter rdf:type letters1916:Letter ;
    schema:name ?letterName ;
    letters1916:recipientLocation ?recipientPostalAddress .

  ?recipientPostalAddress schema:addressRegion ?recipientPostalAddressRegion ;
    schema:name ?recipientPostalAddressName .
  FILTER regex(?recipientPostalAddressRegion, 'Galway', 'i')

  ?recipientPlace schema:address ?recipientPostalAddress ;
    schema:geo ?recipientGeoCoordinates .

  ?recipientGeoCoordinates schema:longitude ?recipientLongitude ;
    schema:latitude ?recipientLatitude

}



Practical: Universities on DBpedia

http://www.franklynam.com/blog.aspx?id=86

Q: Get a list of all of the universities that DBpedia knows about


SKOS

@prefix dct: <http://purl.org/dc/terms/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix cc: <http://creativecommons.org/ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://linkedarc.net/vocabs/vessel-jar> a skos:Concept ;
    cc:license <http://creativecommons.org/licenses/by/3.0> ;
    cc:attributionURL <http://linkedarc.net> ;
    cc:attributionName "linkedarc.net" ;
    skos:inScheme <http://linkedarc.net/vocabs> ;
    skos:prefLabel "Jar" ;
    skos:scopeNote "A jar concept. Pottery. This isn’t a great scope note." ;
    dct:publisher <http://linkedarc.net> ;
    dct:identifier <http://linkedarc.net/vocabs/vessel-jar> ;
    dct:issued "2015-02-23"^^xsd:date ;
    skos:exactMatch <http://purl.org/heritagedata/schemes/mda_obj/concepts/97609> .


SPARQL + FILTER

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT * WHERE {
  ?s rdfs:label ?label .
  FILTER langMatches(lang(?label), "en")
}


SPARQL + FILTER

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT * WHERE {
  ?s rdfs:label ?label .
  FILTER langMatches(lang(?label), "en") .
  FILTER regex(?label, "bell", "i")
}
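To see what the two filters do, the same tests can be applied to a handful of in-memory labels. This is only an emulation in Python, not how a triplestore evaluates the query. The label data is hypothetical, and `lang_matches` is a made-up helper that covers the simple case of a plain language range such as "en" (it leaves out the "*" wildcard that SPARQL's langMatches also accepts).

```python
import re

# Hypothetical (label, language tag) pairs standing in for rdfs:label values.
labels = [
    ("Bell", "en"),
    ("church bell", "en-GB"),
    ("Glocke", "de"),
    ("jar", "en"),
]


def lang_matches(tag, lang_range):
    """Match a language tag against a simple range, ignoring case."""
    tag, lang_range = tag.lower(), lang_range.lower()
    return tag == lang_range or tag.startswith(lang_range + "-")


# First filter: English labels only. Second filter: case-insensitive "bell".
hits = [
    label
    for label, tag in labels
    if lang_matches(tag, "en") and re.search("bell", label, re.IGNORECASE)
]
print(hits)
```

The "i" flag in the SPARQL regex corresponds to `re.IGNORECASE` here, which is why "Bell" and "church bell" both pass while "jar" and the German label do not.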


SPARQL + FILTER

PREFIX dct: <http://purl.org/dc/terms/>

SELECT * WHERE {
  ?s dct:dateCreated ?dateCreated .
  FILTER (?dateCreated > '1900-01-01')
}


Practical: British Museum Sarcophagi

Q: Get the find spots of all of the sarcophagi in the British Museum collection

SPARQL endpoint: http://collection.britishmuseum.org/sparql


Practical: Archaeological stratigraphy

Q: Get the stratigraphic relationships between the contexts excavated at Priniatikos Pyrgos

SPARQL endpoint: http://linkedarc.net/sparql


Stratigraphy explained (very briefly…)

Sample stratigraphic sequence (source: www.lparchaeology.com)


The Priniatikos Pyrgos ontology


Practical: Archaeological stratigraphy

Q: Get the stratigraphic relationships between the contexts excavated at Priniatikos Pyrgos

SPARQL endpoint: http://linkedarc.net/sparql

Hint: you will need to traverse 2 levels of the ontology’s hierarchy to get at the stratigraphy data


Practical: Nomisma and Ancient Coins

Q: Get the geo-coordinates of all of the coin hoards stored in the Nomisma triplestore

SPARQL endpoint: http://nomisma.org/sparql


Geo-coding the Find Spots with Google Refine


The Google Maps API

Address String

Geo-coordinates as JSON
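The address-to-coordinates step comes down to reading latitude and longitude out of a JSON reply. The sketch below parses a hand-written sample laid out like a Google Geocoding API response (`results`, `geometry`, `location`, `status`); it makes no network call, the coordinates are illustrative placeholders rather than a recorded reply, and `first_location` is a made-up helper name.

```python
import json

# Hand-written sample, trimmed to the fields used below. The coordinates are
# illustrative placeholders, not a recorded API response.
sample = """
{
  "results": [
    {"formatted_address": "Galway, Ireland",
     "geometry": {"location": {"lat": 53.27, "lng": -9.05}}}
  ],
  "status": "OK"
}
"""


def first_location(response_text):
    """Return (lat, lng) of the first result, or None if nothing was found."""
    data = json.loads(response_text)
    if data.get("status") != "OK" or not data.get("results"):
        return None
    location = data["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]


print(first_location(sample))
```

Checking the status before indexing into the results avoids an error when an address string cannot be geo-coded.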


Export as CSV
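The same export can also be done outside Google Refine. As a rough sketch, assuming query results are held in the standard SPARQL JSON results layout (`head.vars` and `results.bindings`), Python's `csv` module can write one row per binding. The result data below is invented for illustration and `to_csv` is a made-up helper name.

```python
import csv
import io

# Hypothetical result in the SPARQL JSON results layout; the values are invented.
result = {
    "head": {"vars": ["letterName", "longitude", "latitude"]},
    "results": {"bindings": [
        {"letterName": {"type": "literal", "value": "Sample letter"},
         "longitude": {"type": "literal", "value": "-9.05"},
         "latitude": {"type": "literal", "value": "53.27"}},
    ]},
}


def to_csv(result):
    """Return the result set as CSV text with a header row."""
    columns = result["head"]["vars"]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    for binding in result["results"]["bindings"]:
        # An unbound variable is simply absent from the binding.
        writer.writerow([binding.get(col, {}).get("value", "") for col in columns])
    return buffer.getvalue()


print(to_csv(result))
```

A CSV with longitude and latitude columns is a common input for the web-mapping tools used in the afternoon session.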


Practical: Getty Concepts

Q: Get all of the Getty URIs that represent concepts related to amphorae

SPARQL endpoint: http://vocab.getty.edu/sparql


Additional Linked Data Resources

http://www.franklynam.com/blog.aspx?id=89


One final quick survey

• Please rank the practicals by how easy they were to complete (1 for hardest and 5 for easiest)
– Making your FOAF profile
– DBpedia universities
– British Museum sarcophagi hunting
– Getty vocabularies
– Nomisma coin hoards


One final quick survey

• Would you consider publishing Linked Open Data in the future?
– 1 – Absolutely not
– 2
– 3
– 4
– 5 – Definitely


One final quick survey

• Would you consider using Linked Open Data resources (using SPARQL or otherwise) in the future?
– 1 – Absolutely not
– 2
– 3
– 4
– 5 – Definitely


One final quick survey

• Is Linked Open Data a feasible platform on which to undertake humanities research?
– 1 – Absolutely not
– 2
– 3
– 4
– 5 – Definitely


One final quick survey

• Any final comments?


Thank you!

Martin Lemay (source: twitter.com/martinlemay)
