TRANSCRIPT
-
1 August 2019 © MARKLOGIC CORPORATION
John Snelson, Principal Engineer
Semantic Graphs in the Multi-Model Context
-
Data Hub Semantic Graph
-
01 Technical Foundations
SEMANTIC GRAPHS IN A MULTI-MODEL DATA HUB
-
Document Model
Natural way to model an entity
Self-describing / Human-readable
Hierarchy, sequences, sparse data
Schema is flexible within / across documents
{
  "Customer_ID": 1001,
  "Fname": "Paul",
  "Lname": "Jackson",
  "Phone": "415-555-1212",
  "SSN": "123-45-6789",
  "Addr": "123 Avenue ",
  "City": "Someville",
  "State": "CA",
  "Zip": 94111
}
{
  "Cust_ID": 2001,
  "Given_Name": "Karen",
  "Family_Name": "Bender",
  "Shipping_Address": {
    "Street": "324 Some Road",
    "City": "San Francisco",
    "State": "CA",
    "Postal": "94111",
    "Country": "USA"
  },
  "Billing_Address": {
    "Street": "847 Another Ave",
    "City": "San Carlos",
    "State": "CA",
    "Postal": "94070",
    "Country": "USA"
  }
}
JSON/XML DOCUMENTS
-
Document Queries
Query through search ("cts" queries)
Score, document positions, etc.
Range indexes
Fetch whole documents
-
Graph Model
Everything is a relationship
Sparse data, simple merging
No sequences, hierarchy by convention
Super-normalized
[Figure: graph connecting nodes A and B with Locations, Field Report, Finances, Investigation, and Criminal Operations]
-
Graph Queries
Query with SPARQL
Graph pattern matching
Triple index
Returns tables
[Figure: graph connecting nodes A and B with Locations, Field Report, Finances, Investigation, and Criminal Operations]
-
Which authors speaking at SemTechBiz wrote a book I like?

select * where {
  ?author :wrote ?book ;
          :speaking-at "SemTechBiz" .
  :jsnelson :likes ?book .
}

[Figure: graph of Dean Allemang, Stephen Buxton, John Snelson, the book "Semantic Web for the Working Ontologist", Sem Web Meetup, SemTechBiz, and MarkLogic, connected by "wrote", "likes", "knows", "attends", "speaking at", and "works for" edges. The query pattern is overlaid on it: ?author wrote ?book, ?author speaking at SemTechBiz, John Snelson likes ?book.]

This slide is derived from a presentation by Dean Allemang, and is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
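To make "graph pattern matching" concrete, here is a minimal Python sketch that evaluates the three triple patterns of the query above against a tiny in-memory set of triples. It is a toy illustration, not how MarkLogic or any SPARQL engine is implemented. The first three triples are implied by the slide's question and answer; the remaining two, and all function names, are assumptions made for the example.

```python
# Toy triple store: each triple is (subject, predicate, object).
TRIPLES = {
    (":dallemang", ":wrote", "Semantic Web for the Working Ontologist"),
    (":dallemang", ":speaking-at", "SemTechBiz"),
    (":jsnelson", ":likes", "Semantic Web for the Working Ontologist"),
    (":sbuxton", ":speaking-at", "SemTechBiz"),  # hypothetical distractor
    (":jsnelson", ":knows", ":sbuxton"),         # hypothetical distractor
}


def is_var(term):
    """Variables are written SPARQL-style, with a leading question mark."""
    return term.startswith("?")


def match_pattern(pattern, binding, triples):
    """Yield copies of `binding` extended so that `pattern` matches a triple."""
    for triple in triples:
        new = dict(binding)
        for term, value in zip(pattern, triple):
            if is_var(term):
                # Bind the variable, or check it against its earlier binding.
                if new.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield new


def solve(patterns, triples):
    """Join the patterns one by one, carrying variable bindings along."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for extended in match_pattern(pattern, binding, triples)
        ]
    return bindings


QUERY = [
    ("?author", ":wrote", "?book"),
    ("?author", ":speaking-at", "SemTechBiz"),
    (":jsnelson", ":likes", "?book"),
]

results = solve(QUERY, TRIPLES)
for row in results:
    print(row["?author"], "|", row["?book"])
```

Each solution is one row of variable bindings, which is why a graph query "returns tables".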
-
Which authors speaking at SemTechBiz wrote a book I like?

Result:
?author      ?book
:dallemang   "Semantic Web for the Working Ontologist"
-
02 Data Modelling in a Multi-Model Database
SEMANTIC GRAPHS IN A MULTI-MODEL DATA HUB
-
Where should my data go?
In Triples
• If the information is a relationship
• If the information belongs to more than one entity
• If a “delete” shouldn’t remove other information
• If you need to do graph queries
• If the data is already in triples
In Documents
• If the information is about a single entity
• If a “delete” will need to remove all the information together
• If you need to do document search
• If you often need to fetch all the information together
• By default
Remember: Progressive Enhancement
-
Working Together
[Figure: graph with nodes S. Shady, D12, Blue Van, Noise Comp., Suspicious Activity, and Event, and edges labelled "is member of", "claims credit for", "rented in the vicinity of", and "subclass of"]
-
Working Together
S. Shady is member of D12
-
Document Data as Triples
Template Driven Extraction
Use TDE to turn document content into triples
Index only – no document modifications
Removed when the document is removed
Multiple formats can create the same kinds of triples
-
Documents
{
  "name": "10K",
  "runs": [
    { "name": "John", "age": 36 }
  ]
}

Template-Driven Extraction
/runs/name   Race   Runs
subject:   sem:iri( xdmp:node-uri(.) )
predicate: sem:iri($EX||"hasAge" )
object:    xs:integer(./age)
…

Triples
select distinct ?p {
  ?s ?p ?o
}

op:from-triples((
  op:pattern(op:col("s"), op:col("p"), op:col("o"))
))
=> op:select("p")
=> op:where-distinct()
=> op:result()
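The idea that extracted triples live only in the index and disappear with their document can be sketched in a few lines of Python. This is a simulation of the concept, not MarkLogic's TDE engine or template syntax. The `EX` prefix value, the document URI, and the class and function names are all hypothetical; the triple shape (document URI, `hasAge`, integer age) follows the template fragments on the slide.

```python
import json

EX = "http://example.org/"  # hypothetical stand-in for the slide's $EX prefix


def extract_triples(doc_uri, doc_text):
    """Emit one (subject, predicate, object) triple per entry in "runs"."""
    doc = json.loads(doc_text)
    triples = []
    for run in doc.get("runs", []):
        if "age" in run:
            triples.append((doc_uri, EX + "hasAge", int(run["age"])))
    return triples


class TripleIndex:
    """Holds extracted triples per document; the documents are never modified."""

    def __init__(self):
        self._by_doc = {}

    def insert(self, doc_uri, doc_text):
        self._by_doc[doc_uri] = extract_triples(doc_uri, doc_text)

    def delete(self, doc_uri):
        # Removing the document removes every triple extracted from it.
        self._by_doc.pop(doc_uri, None)

    def distinct_predicates(self):
        """Rough equivalent of: select distinct ?p { ?s ?p ?o }"""
        return sorted({p for ts in self._by_doc.values() for (_, p, _) in ts})


index = TripleIndex()
index.insert(
    "/races/10k.json",  # hypothetical document URI
    '{"name": "10K", "runs": [{"name": "John", "age": 36}]}',
)
print(index.distinct_predicates())
index.delete("/races/10k.json")
print(index.distinct_predicates())
```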
-
Document Data as Tables
Template Driven Extraction
Use TDE to turn document content into table rows
Index only – no document modifications
Removed when the document is removed
Multiple formats can contribute rows to the same table
Tables also available from SPARQL
-
Documents
{
  "name": "10K",
  "runs": [
    { "name": "John", "age": 36 }
  ]
}

Template-Driven Extraction
… /runs/name …
Race   Runs
age   integer   ../age
…

Tables
select age
from Race.Runs

op:from-view("Race", "Runs")
=> op:select("age")
=> op:result()
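The point that documents in different formats can contribute rows to the same table can be illustrated with a small Python simulation. It is only a sketch of the idea, not TDE itself. The XML document, both document URIs, and the function names are hypothetical; the single integer column `age` and the `Race.Runs` view come from the slide.

```python
import json
import xml.etree.ElementTree as ET


def rows_from_json(doc_text):
    """One row per entry in "runs", with a single integer column "age"."""
    doc = json.loads(doc_text)
    return [{"age": int(run["age"])} for run in doc.get("runs", []) if "age" in run]


def rows_from_xml(doc_text):
    """Same row shape, extracted from a hypothetical XML rendering of a race."""
    root = ET.fromstring(doc_text)
    rows = []
    for run in root.findall("./runs/run"):
        age = run.findtext("age")
        if age is not None:
            rows.append({"age": int(age)})
    return rows


def select_age(view):
    """Rough equivalent of: select age from Race.Runs (sorted for stable output)."""
    return sorted(row["age"] for rows in view.values() for row in rows)


# The "Race.Runs" view, kept as document URI -> rows so that rows
# disappear together with the document they were extracted from.
race_runs = {}
race_runs["/races/10k.json"] = rows_from_json(
    '{"name": "10K", "runs": [{"name": "John", "age": 36}]}'
)
race_runs["/races/5k.xml"] = rows_from_xml(
    "<race><name>5K</name><runs><run><name>Ann</name><age>41</age></run></runs></race>"
)

print(select_age(race_runs))
del race_runs["/races/5k.xml"]
print(select_age(race_runs))
```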
-
Multi-Model Queries
Optic API

op:from-view("Race", "Runs")
=> op:join-inner(
     op:from-sparql("select * { ?race ?p ?o }")
   )
=> op:select(("race", "p", "o"))
=> op:result()
-
03 Triples Inside MarkLogic
SEMANTIC GRAPHS IN A MULTI-MODEL DATA HUB
-
MarkLogic Multi-Model
Everything is a Document

Acme Corp
3427 USD annual
http://youruri.com/orders/12345   http://youruri.com/predicates/expires   2016-12-31
http://youruri.com/orders/12345   http://youruri.com/predicates/TsAndCs   http://youruri.com/terms/34567
....
-
Triple Index

subject    predicate     object   doc ID   position
:person4   :first-name   "John"   11       5 - 9
:person5   :alma-mater   :Brown   4        25 - 40
:person5   :birth-year   1929     9        13 - 17
…
-
Triple Data and Triple Values

Triple data:
subject   predicate   object   doc ID   position
4         3           6        11       5 - 9
5         0           2        4        25 - 40
5         1           7        9        13 - 17
…

Triple values:
ordinal   tag       value
…
3         IRI       :first-name
4         IRI       :person4
5         IRI       :person5
6         STRING    "John"
7         DECIMAL   1929
…
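The split between triple data and triple values is a form of dictionary encoding: each distinct value is stored once and triples refer to it by ordinal. The Python sketch below reproduces the ordinals shown on the slide from the three example triples. The sort order used to assign ordinals (IRIs, then strings, then decimals, each compared case-insensitively) is an assumption chosen because it happens to match the slide's numbers; the slide does not say how MarkLogic actually assigns them.

```python
# Assumed tag ordering, picked only because it reproduces the slide's ordinals.
TAG_ORDER = {"IRI": 0, "STRING": 1, "DECIMAL": 2}

# The three example triples from the Triple Index slide, as (tag, value) pairs.
TRIPLES = [
    (("IRI", ":person4"), ("IRI", ":first-name"), ("STRING", "John")),
    (("IRI", ":person5"), ("IRI", ":alma-mater"), ("IRI", ":Brown")),
    (("IRI", ":person5"), ("IRI", ":birth-year"), ("DECIMAL", 1929)),
]


def sort_key(value):
    tag, lexical = value
    return (TAG_ORDER[tag], str(lexical).lower())


def build_value_table(triples):
    """Collect every distinct value once; its list position is its ordinal."""
    distinct = {value for triple in triples for value in triple}
    return sorted(distinct, key=sort_key)


def encode(triples, values):
    """Replace each value in each triple by its ordinal."""
    ordinal = {value: i for i, value in enumerate(values)}
    return [tuple(ordinal[v] for v in triple) for triple in triples]


def decode(encoded, values):
    """Look the ordinals back up in the value table."""
    return [tuple(values[i] for i in triple) for triple in encoded]


values = build_value_table(TRIPLES)
triple_data = encode(TRIPLES, values)

for ordinal, (tag, value) in enumerate(values):
    print(ordinal, tag, value)
print(triple_data)
```

Repeated values such as `:person5` are stored once in the value table, and the triple data becomes fixed-width tuples of small integers.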
-
Tables in the Triple Index

Triple as a table cell: Row ID (subject), Column ID (predicate), Value (object)

Triple Index Permutation   Major Order   Minor Order   Use Case
PSO                        Column ID     Row ID        Column access ordered by row ID
SOP                        Row ID                      Row access
POS *new*                  Column ID     Value         Column access ordered by value
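Why several sort orders of the same triples are useful can be shown with a short Python sketch: each permutation is the same data sorted by a different key order, so a different kind of access becomes a contiguous range scan. This is an illustration only, not MarkLogic's index format. The sample rows, the `bib` column, and the function names are hypothetical.

```python
from bisect import bisect_left

# (subject, predicate, object) read as (row ID, column ID, value).
# Hypothetical sample data; values are all integers to keep sorting simple.
TRIPLES = [
    (1, "age", 36),
    (1, "bib", 104),
    (2, "age", 29),
    (2, "bib", 7),
    (3, "age", 36),
]

POSITION = {"S": 0, "P": 1, "O": 2}


def permute(triples, order):
    """Reorder each triple's components (e.g. "PSO") and sort the result."""
    return sorted(tuple(t[POSITION[c]] for c in order) for t in triples)


def scan(index, prefix):
    """Return the contiguous run of entries that start with `prefix`."""
    n = len(prefix)
    start = bisect_left(index, prefix)  # a prefix sorts before its extensions
    end = start
    while end < len(index) and index[end][:n] == prefix:
        end += 1
    return index[start:end]


PSO = permute(TRIPLES, "PSO")
SOP = permute(TRIPLES, "SOP")
POS = permute(TRIPLES, "POS")

print(scan(PSO, ("age",)))     # column access, ordered by row ID
print(scan(SOP, (2,)))         # row access
print(scan(POS, ("age",)))     # column access, ordered by value
print(scan(POS, ("age", 36)))  # rows whose "age" column equals 36
```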
-
04 Use Cases and Demo
SEMANTIC GRAPHS IN A MULTI-MODEL DATA HUB
-
Relationships and Linking
School Report: Mark has been absent several times this term and his teacher is concerned that …
-
Semantic Search
BSI Compliance Navigator
-
Entity Model
Entities: A customer is something that exists as part of my business/mission
Properties: Customer entities have a name that is of type string and is required
Relationships: Customers place Orders
Entities, Properties, Relationships, Governance, Provenance, …anything else
Entity Services
[Figure: an Entity Model Descriptor alongside its Derived Graph]
-
Provenance Graph
[Figure: provenance graph. Nodes include Harmonize-1, Source-1, Source-2, Eng-1, Sr. Data Scientist, and the timestamp 2018-01-10 11:45; edge labels include GENERATED BY, APPROVED BY, RAN AT, DEPT, ROLE, FROM, and CONSUMED.]
-
Conclusion
Semantic graphs complement a data integration project
Choose the right data model for your data and application needs
- There are good ways to get the benefit of all data models
Use cases:
- Linking and Relationships
- Semantic Search
- Extending entity models
- Provenance