scalable nonmonotonic reasoning over rdf data using mapreduce

24
Ilias Tachmazidis 1,2 , Grigoris Antoniou 1,2,3 , Giorgos Flouris 2 , Spyros Kotoulas 4 1 University of Crete 2 Foundation for Research and Technolony, Hellas (FORTH) 3 University of Huddersfield 4 Smarter Cities Technology Centre, IBM Research, Ireland

Upload: planetdata-network-of-excellence

Post on 26-Dec-2014

155 views

Category:

Technology


2 download

DESCRIPTION

by IliasTachmazidis, Grigoris Antoniou, Giorgos Flouris, Spyros Kotoulas

TRANSCRIPT

Page 1: Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce

Ilias Tachmazidis1,2, Grigoris Antoniou1,2,3, Giorgos Flouris2, Spyros Kotoulas4

1 University of Crete2 Foundation for Research and Technolony, Hellas (FORTH)

3 University of Huddersfield4 Smarter Cities Technology Centre, IBM Research, Ireland

Page 2: Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce

MotivationBackground◦ Defeasible Logic◦ MapReduce Framework◦ RDFMulti-Argument Implementation over RDFExperimental EvaluationFuture Work

Page 3: Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce

Huge data set coming from◦ the Web, government authorities, scientific

databases, sensors and moreDefeasible logic ◦ is suitable for encoding commonsense knowledge

and reasoning◦ avoids triviality of inference due to low-quality dataDefeasible logic has low complexity◦ The consequences of a defeasible theory D can be

computed in O(N) time, where N is the number of symbols in D

Page 4: Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce

Reasoning is performed in the presence of defeasible rulesDefeasible logic has been implemented for in-memory reasoning, however, it was not applicable for huge data setsSolution: scalability/parallelization using the MapReduce framework

Page 5: Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce

Facts ◦ e.g. bird(eagle)Strict Rules◦ e.g. bird(X) → animal(X)Defeasible Rules◦ e.g. bird(X) ⇒ flies(X)

Page 6: Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce

Defeaters◦ e.g. brokenWing(X) ↝ ¬ flies(X)

Priority Relation (acyclic relation on the set of rules)◦ e.g. r: bird(X) ⇒ flies(X)

r’: brokenWing(X) ⇒ ¬ flies(X)r’ > r

Page 7: Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce

Inspired by similar primitives in LISP and other functional languagesOperates exclusively on <key, value> pairsInput and Output types of a MapReduce job:◦ Input: <k1, v1>◦ Map(k1,v1) → list(k2,v2)◦ Reduce(k2, list (v2)) → list(k3,v3)◦ Output: list(k3,v3)

Page 8: Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce

Provides an infrastructure that takes care of◦ distribution of data◦ management of fault tolerance◦ results collectionFor a specific problem◦ developer writes a few routines which are following

the general interface

Page 9: Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce

Rule sets can be divided into two categories:◦ Stratified◦ Non-stratifiedPredicate Dependency Graph

Page 10: Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce

Consider the following rule set:◦ r1: X sentApplication A, A completeFor D ⇒ X acceptedBy D.◦ r2: X hasCerticate C, C notValidFor D ⇒ X ¬acceptedBy D.◦ r3: X acceptedBy D, D subOrganizationOf U ⇒

X studentOfUniversity U.◦ r1 > r2.Both acceptedBy and ¬acceptedBy are represented by acceptedBySuperiority relation is not part of the graph

Page 11: Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce

Initial pass: ◦ Transform facts into <fact, (+Δ, +∂)> pairsNo reasoning needs to be performed for the lowest stratum (stratum 0)For each stratum from 1 to N◦ Pass1: Calculate fired rules◦ Pass2: Perform defeasible reasoning

Page 12: Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce

INPUTLiterals in multiple files

File02‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐

<John hasCerticate Cert, (+Δ, +∂)><Cert notValidFor Dep, (+Δ, +∂)>

<Dep subOrganizationOf Univ, (+Δ, +∂)>

File01‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐

<John sentApplication App, (+Δ, +∂)> <App completeFor Dep, (+Δ, +∂)> 

<key, < John sentApplication App, (+Δ, +∂)>>

< key, < App completeFor Dep, (+Δ, +∂)>>

< key, < John hasCerticate Cert, (+Δ, +∂)>>

< key, < Cert notValidFor Dep, (+Δ, +∂)>>

< key, < Dep subOrganizationOf Univ, (+Δ, +∂)>>

MAP phase Input<position in file, literal and knowledge>

Page 13: Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce

Group

ing/Sorting

<App, <(John,sentApplication,+Δ,+∂),(Dep,completeFor,+Δ,+∂)>>

<Cert,<(John,hasCerticate,+Δ,+∂),(Dep,notValidFor,+Δ,+∂)>>

Reduce phase Input<matchingArgValue, 

List(Non‐MatchingArgValue,Predicate, knowledge)>

MAP phase Output<matchingArgValue, 

(Non‐MatchingArgValue,Predicate, knowledge)>

<App,(John,sentApplication,+Δ,+∂)>

<App,(Dep,completeFor,+Δ,+∂)>

<Cert, (John,hasCerticate,+Δ,+∂)>

<Cert, (Dep,notValidFor,+Δ,+∂)>

Page 14: Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce

<John acceptedBy Dep, (+∂, r1)>

<John acceptedBy Dep,(¬, +∂,r2)>

Reduce phase Output (Final Output)

<literal and knowledge>

Page 15: Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce

INPUTLiterals in multiple files

File01‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐

<John sentApplication App, (+Δ, +∂)> <App completeFor Dep, (+Δ, +∂)> <John hasCerticate Cert, (+Δ, +∂)><Cert notValidFor Dep, (+Δ, +∂)>

<Dep subOrganizationOf Univ, (+Δ, +∂)>

File02‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐

<John acceptedBy Dep, (+∂, r1)><John acceptedBy Dep, (¬, +∂, r2)>

MAP phase Input<position in file, literal and knowledge>

<key, < Dep subOrganizationOf Univ, (+Δ,+∂)>>

< key, < John acceptedBy Dep, (+∂, r1)>>

< key, < John acceptedBy Dep, (¬, +∂, r2)>>

<key, < John sentApplication App, (+Δ, +∂)>>

< key, < App completeFor Dep, (+Δ, +∂)>>

< key, < John hasCerticate Cert, (+Δ, +∂)>>

< key, < Cert notValidFor Dep, (+Δ, +∂)>>

Page 16: Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce

< Dep subOrganizationOf Univ, (+Δ,+∂)>

< John acceptedBy Dep, (+∂, r1)>

< John acceptedBy Dep, (¬, +∂, r2)>

MAP phase Output<literal, knowledge>

Group

ing/Sorting

< Dep subOrganizationOf Univ, (+Δ,+∂)>

< John acceptedBy Dep, <(+∂, r1), (¬, +∂, r2)>>

Reduce phase Input<literal, list(knowledge)>

Page 17: Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce

< John acceptedBy Dep, (+∂)>

Reduce phase Output (Final Output)

<Conclusions after reasoning>

No output

Page 18: Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce

LUBM (up to 1B)Custom defeasible ruleset

IBM Hadoop Cluster v1.3 (Apache Hadoop0.20.2)40-core serverXIV storage SAN

Page 19: Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce
Page 20: Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce
Page 21: Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce
Page 22: Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce

Challenges of Non-Stratified Rule SetsAn efficient mechanism is need for –Δ and -∂◦ all the available information for the literal must be

processed by a single node causing:main memory insufficiencyskewed load balancing

Storing conclusions for +/–Δ and +/-∂ is not feasible◦ Consider the cartesian product of X, Y, Z for

X Predicate1 Y, Y Predicate2 Z.

Page 23: Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce

Run extensive experiments to test the efficiency of multi-argument defeasible logicApplications on real datasets, with low-quality dataMore complex knowledge representation methods such as: ◦ Answer-Set programming◦ Ontology evolution, diagnosis and repairAI Planning

Page 24: Scalable Nonmonotonic Reasoning over RDF Data Using MapReduce