federated query processing universidad simón bolívar...
TRANSCRIPT
![Page 1: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/1.jpg)
Dagstuhl Seminar “Federated Semantic Data Management”, 26-30 June 2017
Federated Query Processing over RDF Graphs
Maria-Esther VidalFraunhofer IAIS, Germany
Universidad Simón Bolívar, Venezuela
1
![Page 2: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/2.jpg)
Motivating Example
Vessels Vessel Observations
Ports
Vessels that have arrived at port of Funchal on 2017-06-21
2
![Page 3: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/3.jpg)
Motivating ExampleDistributed Database
Vessels
Vessel Observations
Ports
mmsi country type
v1 ES tanker
v2 PT tanker
v3 US shipmmsi lat long date
v1 -26.3 56.9 2017-06-21
v1 30.8 70.8 2017-06-23
v2 -26.3 56.9 2017-06-23
v3 -26.3 56.9 2017-06-21
port lat long name
p1 -26.3 56.9 Funchal
p2 30.8 70.8 Lisbon
p3 40.9 89.8 Vigo
Interrelated databases distributed over a computer networkComponents are designed and tuned to work together.
3
![Page 4: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/4.jpg)
Dimensions of Distributed Database Systems
Vessels
Vessel Observations
Ports
Greece
Canada World-Port Index USA
Distribution: Physical distribution of the data over multiple sites.
4
![Page 5: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/5.jpg)
Dimensions of Distributed Database Systems
Vessels
Vessel Observations
Ports
Maritime Traffic
Evergreen Marine World-Port Index
Autonomy: Distribution of the control; degree to which individual systems can operate independently.
5
![Page 6: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/6.jpg)
Dimensions of Distributed Database Systems
Vessels
Vessel Observations
Ports
Maritime Traffic
Evergreen Marine World-Port Index
CSV
JSON
RDF Heterogeneity: Different forms ranging from hardware, differences in network protocols, data models, query languages.
6
![Page 7: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/7.jpg)
Dimensions of Distributed Database Systems [1]Distribution
AutonomyHeterogeneity
7[1] Tamer Ozsu and Patrick Valduriez. Principles of Distributed Database Systems (Third Edition). Springer, 2011.
![Page 8: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/8.jpg)
Agenda1) Distributed Data Management Systems
a) Client-Server Systemsb) Peer-to-Peer Systemsc) Federated Data Management Systems
2) Challenges in Federated Data Management3) Distributed SPARQL Systems 4) Conclusions
8
![Page 9: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/9.jpg)
Distributed Data Management Systems
9
![Page 10: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/10.jpg)
Client-Server SystemsDistribution
AutonomyHeterogeneity
Client-Server Systems: ● Clients run user
applications and interfaces.
● Servers run data management tasks, e.g., query processing and storage.
10
![Page 11: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/11.jpg)
Client-Server Systems
Server Engine
Vessels
Vessel Observations
Ports
Client
Queries Query Answers
11
![Page 12: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/12.jpg)
Peer-to-Peer SystemsDistribution
AutonomyHeterogeneity
Peer-to-Peer Systems: ● Massive distribution● May used different data
models.● Each system manages a
different dataset. ● Peers can
communicate.
12
![Page 13: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/13.jpg)
Peer-to-Peer Systems
Vessels
Vessel Observations
Ports
Maritime Traffic
Evergreen Marine World-Port Index
Peer Peer
Peer
Peers communicate.
13
![Page 14: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/14.jpg)
Federated Data Management (FDM) SystemsDistribution
AutonomyHeterogeneity
FDM Systems: ● Fully autonomous and
have no concept of cooperation.
● May use different data models.
● Each system manages a different database.
14
![Page 15: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/15.jpg)
Federated Data Management (FDM) Systems
Vessels
Vessel Observations
Ports
Maritime Traffic
Evergreen Marine World-Port Index
Federated Engine
CSV RDF
JSON
15
![Page 16: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/16.jpg)
Challenges in Federated Data Management
16
![Page 17: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/17.jpg)
Challenges for Query Processing in FDM
17
Given a query Q in a formal language, i.e., SPARQL● Identify the relevant data sources for Q (Source Selection)● Decompose Q into subqueries on relevant data sources (Query Decomposition)● Plan evaluation of subqueries against relevant data sources (Query Planning)● Merge data collected from relevant data sources (Query Execution)
![Page 18: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/18.jpg)
Challenges for Query Processing in FDMVessels that have arrived at port of Funchal on 2017-06-21 SELECT ?mmsi ?name ?flagWHERE { ?vessel bdo:name ?name. ?vessel bdo:mmsi ?mmsi. ?vessel bdo:flag ?flag. ?observ bdo:lat ?lat. ?observ bdo:long ?long. ?observ bdo:date "2017-06-21" . ?observ bdo:mmsi ?mmsi. ?port bdo:lat ?lat. ?port bdo:long ?long. ?port bdo:name “Funchal” }
18
![Page 19: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/19.jpg)
Example
SELECT ?mmsi ?name ?flagWHERE { ?vessel bdo:name ?name. ?vessel bdo:mmsi ?mmsi. ?vessel bdo:flag ?flag. ?observ bdo:lat ?lat. ?observ bdo:long ?long. ?observ bdo:date "2017-06-21" . ?observ bdo:mmsi ?mmsi. ?port bdo:lat ?lat. ?port bdo:long ?long. ?port bdo:name “Funchal” }
19
SELECT ?mmsi ?name ?flagWHERE { SERVICE C1 { ?vessel bdo:name ?name. ?vessel bdo:mmsi ?mmsi. ?vessel bdo:flag ?flag.} SERVICE C2 { ?observ bdo:lat ?lat. ?observ bdo:long ?long. ?observ bdo:date "2017-06-21" . ?observ bdo:mmsi ?mmsi. } SERVICE C3 { ?port bdo:lat ?lat. ?port bdo:long ?long. ?port bdo:name “Funchal” }}
![Page 20: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/20.jpg)
Challenges for Query Processing in FDMVessels that have arrived at port of Funchal on 2017-06-21 SELECT ?mmsi ?name ?flagWHERE { ?vessel bdo:name ?name. ?vessel bdo:mmsi ?mmsi. ?vessel bdo:flag ?flag. ?observ bdo:lat ?lat. ?observ bdo:long ?long. ?observ bdo:date "2017-06-21" . ?observ bdo:mmsi ?mmsi. ?port bdo:lat ?lat. ?port bdo:long ?long. ?port bdo:name “Funchal” }
20
![Page 21: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/21.jpg)
Challenges for Query Processing in FDMVessels that have arrived at port of Funchal on 2017-06-21 SELECT ?mmsi ?name ?flagWHERE { ?vessel bdo:name ?name. ?vessel bdo:mmsi ?mmsi. ?vessel bdo:flag ?flag. ?observ bdo:lat ?lat. ?observ bdo:long ?long. ?observ bdo:date "2017-06-21" . ?observ bdo:mmsi ?mmsi. ?port bdo:lat ?lat. ?port bdo:long ?long. ?port bdo:name “Funchal” }
21
![Page 22: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/22.jpg)
Challenges for Query Processing in FDM
22
Given a query Q in a formal language, i.e., SPARQL● Identify the relevant data sources for Q (Source Selection)● Decompose Q into subqueries on relevant data sources (Query Decomposition)● Plan evaluation of subqueries against relevant data sources (Query Planning)● Merge data collected from relevant data sources (Query Execution)
![Page 23: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/23.jpg)
Challenges for Query Processing in FDMVessels that have arrived at port of Funchal on 2017-06-21 SELECT ?mmsi ?name ?flagWHERE { ?vessel bdo:name ?name. ?vessel bdo:mmsi ?mmsi. ?vessel bdo:flag ?flag. ?observ bdo:lat ?lat. ?observ bdo:long ?long. ?observ bdo:date "2017-06-21" . ?observ bdo:mmsi ?mmsi. ?port bdo:lat ?lat. ?port bdo:long ?long. ?port bdo:name “Funchal” }
23
![Page 24: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/24.jpg)
Challenges for Query Processing in FDM
24
Given a query Q in a formal language, i.e., SPARQL● Identify the relevant data sources for Q (Source Selection)● Decompose Q into subqueries on relevant data sources (Query Decomposition)● Plan evaluation of subqueries against relevant data sources (Query Planning)● Merge data collected from relevant data sources (Query Execution)
![Page 25: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/25.jpg)
Challenges for Query Processing in FDM
?vessel bdo:name ?name. ?vessel bdo:mmsi ?mmsi.?vessel bdo:flag ?flag.
?observ bdo:lat ?lat.?observ bdo:long ?long.?observ bdo:date "2017-06-21" .?observ bdo:mmsi ?mmsi.
?port bdo:lat ?lat.?port bdo:long ?long.?port bdo:name “Funchal”
25
![Page 26: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/26.jpg)
Challenges for Query Processing in FDM
26
Given a query Q in a formal language, i.e., SPARQL● Identify the relevant data sources for Q (Source Selection)● Decompose Q into subqueries on relevant data sources (Query Decomposition)● Plan evaluation of subqueries against relevant data sources (Query Planning)● Merge data collected from relevant data sources (Query Execution)
![Page 27: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/27.jpg)
Challenges for Query Processing in FDM?vessel bdo:name ?name. ?vessel bdo:mmsi ?mmsi.?vessel bdo:flag ?flag.
?observ bdo:lat ?lat.?observ bdo:long ?long.?observ bdo:date "2017-06-21" .?observ bdo:mmsi ?mmsi.
?port bdo:lat ?lat.?port bdo:long ?long.?port bdo:name “Funchal”
CSV
RDF
JSON
MongoDB
MongoDB
Virtuoso
27
SELECT name, mmsi, flagFROM VESSELS
SELECT lat, long, mmsiFROM OBSERVATIONSWHERE date=”2017-06-21”
SELECT ?lat, ?long, ?mmsiWHERE {?port bdo:lat ?lat.?port bdo:long ?long.?port bdo:name “Funchal”}
![Page 28: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/28.jpg)
Challenges for Query Processing in FDM
?vessel bdo:name ?name. ?vessel bdo:mmsi ?mmsi.?vessel bdo:flag ?flag.
?observ bdo:lat ?lat.?observ bdo:long ?long.?observ bdo:date "2017-06-21" .?observ bdo:mmsi ?mmsi.
?port bdo:lat ?lat.?port bdo:long ?long.?port bdo:name “Funchal”
28
MappingintoSPARQL answers
MappingintoSPARQL answers
Operators consume tuples from children and pass output tuples to their parents.
![Page 29: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/29.jpg)
Complexity of FDM Query Processing Tasks
NP-Hard Finding Optimal Plan for Distributed Data Sources [1]
Evaluation of SPARQL Queries is NP-Complete [2]
NP-Hard Finding Optimal Query Decomposition over Distributed RDF Data Sources [3]
29
[1] Tamer Ozsu and Patrick Valduriez. Principles of Distributed Database Systems (Third Edition). Springer, 2011.[2] Jorge Pérez, Marcelo Arenas, Claudio Gutierrez: Semantics and complexity of SPARQL. ACM Trans. Database Syst. 34(3): 16:1-16:45 (2009)[3] Maria-Esther Vidal, Simón Castillo, Maribel Acosta, Gabriela Montoya, Guillermo Palma: On the Selection of SPARQL Endpoints to Efficiently Execute Federated SPARQL Queries. Trans. Large-Scale Data- and Knowledge-Centered Systems 25: 109-149 (2016)
![Page 30: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/30.jpg)
Traditional Federated Query Processing
30
Source Selection and Query Decomposition
Query Optimizer
Query Execution Engine
Statistics Generator
CatalogTraditional Optimize-Then-ExecuteArchitecture
Query
![Page 31: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/31.jpg)
Ideal Federated Management Systems
● Systems able to change their behavior by learning behavior of data providers.
● Receive information from the environment.● Use up-to-date information to change their behavior.● Keep iterating over time to adapt their behavior based
on the environment conditions.
31
![Page 32: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/32.jpg)
Adaptive Query Processing [4]
Adaptivity is needed:▪ Misestimated or missing statistics.▪ Unexpected correlations.▪ Unpredictable costs.▪ Dynamically changing data, workload, and source availability.▪ Changes at rates at which tuples arrive from sources
• Initial Delays.• Slow Delivery.• Bursty Arrivals.
32[4 ] Amol Deshpande, Zachary G. Ives, Vijayshankar Raman: Adaptive Query Processing. Foundations and Trends in Databases, (2007)
![Page 33: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/33.jpg)
Adaptive Federated Query Processing
33
Source Selection and Query Decomposition
Query Optimizer
Query Execution Engine
Statistics Generator
Catalog
Query
Re-optimize
Re-optimize the original plan on-the-fly according to the source conditions.
![Page 34: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/34.jpg)
Levels of Adaptation
34
AdaptationAdaptation Levels
Source Selection
Query Execution
![Page 35: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/35.jpg)
Adaptation Levels
35
Source Selection: searching strategies to select the sources for answering a query according to the real-time source conditions:
● Schema changes● Source availability ● Data distribution changes
![Page 36: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/36.jpg)
Adaptation Levels
36
Query Execution: strategies to adapt query processing to the environment conditions. Adaptation can be at different granularities:
● Fine-grained adaptation of small processes, e.g., ○ Per tuple
● Coarse-grained is attempted for large processes, e.g.,○ Several queries are executed under certain
assumptions about the conditions of the environment
![Page 37: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/37.jpg)
Adaptive Query Processing Strategies
Adaptive Join Processing (Intra-Operator):● Adaptivity performed at tuple level. ● Query operators are able to adapt to the environment
changes, even in the context of a fixed query plan.Non-Pipelined Query Execution (Inter-Operator):● Sub-queries are re-scheduled based on uncertainties:
○ Execution cost of remote access, ○ Size of intermediate results, and ○ Unexpected delays.
37
![Page 38: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/38.jpg)
Adaptive Query Processing Strategies
Adaptive Join Processing (Intra-Operator):● Adaptivity performed at tuple level. ● Query operators are able to adapt to the environment
changes, even in the context of a fixed query plan.Non-Pipelined Query Execution (Inter-Operator):● Sub-queries are re-scheduled based on uncertainties:
○ Execution cost of remote access, ○ Size of intermediate results, and ○ Unexpected delays.
38
![Page 39: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/39.jpg)
Distributed SPARQL Systems
39
![Page 40: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/40.jpg)
SPARQL Web-based Interfaces
40
SPARQL Endpoints: RESTful services that accept queries over HTTP written in the SPARQL query language and respecting the SPARQL protocol.
SELECT ?long WHERE { ?port <http://bigdataocean.eu/name> <http://bigdataocean.eu/Funchal> ?port <http://bigdataocean.eu/long> ?long.}
http://bigdataocean.eu/sparql?query=SELECT%20%3Flong%20WHERE%20%7B%20%3Fport%20%3Chttp%3A%2F%2Fbigdataocean.eu%2Fproperty%2Fname%3E%20%3Chttp%3A%2F%2Fbigdataocean.eu%2Fresource%2FFunchal%3E.%20%3Fport%20%3Chttp%3A%2F%2Fbigdataocean.eu%2Fproperty%2Flong%3E%20%3Flong%20%7D
![Page 41: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/41.jpg)
SPARQL 1.1 Federated Extension
41
SERVICE <http://bigdataocean.de/sparql> {SELECT ?long WHERE { ?port <http://bigdataocean.eu/name> <http://bigdataocean.eu/Funchal> ?port <http://bigdataocean.eu/long> ?long.}}
Q1
Q0
The result of evaluating Q0 corresponds to the result of evaluating Q1 in the RDF dataset accessible through the endpoint <http://bigdataocean.de/sparql>
Carlos Buil Aranda, Marcelo Arenas, Óscar Corcho, Axel Polleres: Federating queries in SPARQL 1.1: Syntax, semantics and evaluation. J. Web Sem. 18(1): 1-17 (2013)
![Page 42: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/42.jpg)
SPARQL 1.1 Federation Extension
42Carlos Buil Aranda, Marcelo Arenas, Óscar Corcho, Axel Polleres: Federating queries in SPARQL 1.1: Syntax, semantics and evaluation. J. Web Sem. 18(1): 1-17 (2013)
![Page 43: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/43.jpg)
SPARQL Web-based Interfaces
43
Triple Pattern Fragment:
Web-based access interface that allows for evaluating a triple pattern over an RDF dataset.
“Accessing a TPF interface is equivalent to accessing a SPARQL endpoint with queries that contain a single triple pattern only” [5]
[5] Ruben Verborgh, Miel Vander Sande, Olaf Hartig, Joachim Van Herwegen, Laurens De Vocht, Ben De Meester, Gerald Haesendonck, Pieter Colpaert: Triple Pattern Fragments: A low-cost knowledge graph interface for the Web. J. Web Sem.( 2016)
![Page 44: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/44.jpg)
Client-Server SPARQL Query Engines [5]
44
Server of Triple Pattern Fragments (TPFs)
Clients of Triple Pattern Fragments (TPFs)
TPF: Web-based interface to execute SPARQL triple patterns ?vessel bdo:mmsi ?mmsi
TPF Clients: Execute SPARQL queries over a TPF Server
SPARQL Query Q
[5] Ruben Verborgh, Miel Vander Sande, Olaf Hartig, Joachim Van Herwegen, Laurens De Vocht, Ben De Meester, Gerald Haesendonck, Pieter Colpaert: Triple Pattern Fragments: A low-cost knowledge graph interface for the Web. J. Web Sem.( 2016)
![Page 45: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/45.jpg)
Adaptive Client for Triple Pattern Fragments
45
SPARQL Query Q
Query Optimizer
Adaptive Engine
Routing Policies
Input
Optimized plan
Results for QOutput
TPF Server
nLDE [6]: ● Optimization techniques
tailored for TPFs● Autonomous Eddies● Routing policies for
SPARQL
[6] Maribel Acosta, Maria-Esther Vidal: Networks of Linked Data Eddies: An Adaptive Web Query Processing Engine for RDF Data. International Semantic Web Conference (2015)
![Page 46: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/46.jpg)
Peer-to-Peer SPARQL Query Engines
46
SPARQL Query Q
Avalanche SPARQL Endpoint
Endpoint Directoryor Search Engine
Request Statistics
Distributed join
Avalanche [7]: ● Endpoints may
dereference data from other endpoints.
● On-line statistical information about relevant data sources.
● Queries are executed in a concurrent and distributed fashion.
[7] Cosmin Basca, Abraham Bernstein: Querying a messy web of data with Avalanche. J. Web Sem. (2014)
![Page 47: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/47.jpg)
Federated SPARQL Query Engines
47
SPARQL endpoints: Web-access interfaces that allow for querying RDF data following SPARQL protocol.
![Page 48: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/48.jpg)
Federated SPARQL Query Engines
48
Distribution
AutonomyHeterogeneity
![Page 49: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/49.jpg)
Federated SPARQL Query Engines
49
Federated SPARQL Query Engine
ANAPSID[8]
SPLENDID [9]
Distribution
AutonomyHeterogeneity
[10]
[16]
![Page 50: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/50.jpg)
Federated SPARQL Query Engines
50
Federated SPARQL Query Engine
Distribution
AutonomyHeterogeneity
LILAC[11] FEDRA[12]Fed-DESATUR[3]
MULDER[14]
Extensions
DAW[13]HIBISCUS[15]
ANAPSID[8]
SPLENDID [9]
[10]
[16]
![Page 51: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/51.jpg)
Adaptivity During Source Selection
51
Fine-Grained Adaptivity
ANAPSID SPLENDID
Coarse-Grained Adaptivity
No Adaptivity
Fed-DESATURMULDERDAW
HIBISCUS
LILAC FEDRA
![Page 52: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/52.jpg)
Adaptivity During Source Selection
52
Fine-Grained Adaptivity
Coarse-Grained Adaptivity
No Adaptivity
MULDER
![Page 53: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/53.jpg)
Adaptivity During Source Selection-FedX
SELECT ?mmsi ?name ?flagWHERE { ?vessel bdo:name ?name. ?vessel bdo:mmsi ?mmsi. ?vessel bdo:flag ?flag. ?observ bdo:lat ?lat. ?observ bdo:long ?long. ?observ bdo:date "2017-06-21" . ?observ bdo:mmsi ?mmsi. ?port bdo:lat ?lat. ?port bdo:long ?long. ?port bdo:name “Funchal” }
53
For triple pattern, (s p o): A query is sent to the SPARQL endpoints for identifying relevant sources
![Page 54: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/54.jpg)
Adaptivity During Source Selection-FedX
SELECT ?mmsi ?name ?flagWHERE { ?vessel bdo:name ?name. ?vessel bdo:mmsi ?mmsi. ?vessel bdo:flag ?flag. ?observ bdo:lat ?lat. ?observ bdo:long ?long. ?observ bdo:date "2017-06-21" . ?observ bdo:mmsi ?mmsi. ?port bdo:lat ?lat. ?port bdo:long ?long. ?port bdo:name “Funchal” }
54
Exclusive groups:Subqueries are executed by only one source
sq1sq2
sq3
sq4
sq5
![Page 55: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/55.jpg)
Adaptivity During Source Selection-FedX
55
sq1
sq2
sq3
sq4 sq5
Exclusive groups are executed first in the plan
![Page 56: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/56.jpg)
Adaptivity During Source Selection-Mulder
56
Sources are described in terms of Classes and properties.
![Page 57: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/57.jpg)
Adaptivity During Source Selection-Mulder
57
Sources are described in terms of Classes and properties.
Source descriptions are exploited during query decomposition.
SELECT ?mmsi ?name ?flagWHERE { ?vessel bdo:name ?name. ?vessel bdo:mmsi ?mmsi. ?vessel bdo:flag ?flag. ?observ bdo:lat ?lat. ?observ bdo:long ?long. ?observ bdo:date "2017-06-21" . ?observ bdo:mmsi ?mmsi. ?port bdo:lat ?lat. ?port bdo:long ?long. ?port bdo:name “Funchal” }
![Page 58: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/58.jpg)
?port bdo:lat ?lat .?port bdo:long ?long . ?port bdo:name "Funchal"
?vessel bdo:name ?name .?vessel bdo:mmsi ?mmsi .?vessel bdo:flag ?flag .
Query Planning-Mulder
?observation bdo:lat ?lat .?observation bdo:long ?long .?observation bdo:date "2017-06-21" .?observation bdo:mmsi ?vessel .
![Page 59: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/59.jpg)
Empirical Evaluation
59
FedBench Benchmark:● Ten RDF datasets (314K to 44M)
25 Queries:● Cross Domain (CD)● Linked Data (LD)● Life Science (LS)
Ten Complex (C) queries
Completeness versus Execution Time
![Page 60: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/60.jpg)
Empirical Evaluation
60
FedBench Benchmark:● Ten RDF datasets (314K to 44M)
25 Queries:● Cross Domain (CD)● Linked Data (LD)● Life Science (LS)
Ten Complex (C) queries
Completeness versus Execution Time
![Page 61: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/61.jpg)
Empirical Evaluation
61
FedBench Benchmark:● Ten RDF datasets (314K to 44M)
25 Queries:● Cross Domain (CD)● Linked Data (LD)● Life Science (LS)
Ten Complex (C) queries
Completeness versus Execution Time
Adaptivity is a costly task
![Page 62: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/62.jpg)
Adaptivity During Query Execution
62
Fine-Grained Adaptivity
ANAPSIDSPLENDID
No Adaptivity
![Page 63: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/63.jpg)
Adaptivity During Query Execution
63
Fine-Grained Adaptivity
ANAPSIDSPLENDID
No Adaptivity
Fed-DESATURMULDER
DAW HIBISCUS
LILAC FEDRA
![Page 64: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/64.jpg)
Intra-Operator Adaptivity
Intra-Operator Strategies:
● SPARQL operators able to detect when sources become blocked or data traffic is bursty
● Opportunistically produce results as quickly as data arrives from the sources
● Results are produced incrementally
64
JOIN
![Page 65: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/65.jpg)
Inter-Operator AdaptivityInter-Operator Strategies:
● Produce an answer as soon as it is computed
● Can keep producing intermediate results even when data a source becomes blocked
65
JOIN JOIN
JOIN
![Page 66: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/66.jpg)
Inter-Operator AdaptivityInter-Operator Strategies:
● Produce an answer as soon as it is computed
● Can keep producing intermediate results even when data a source becomes blocked
66
JOIN
JOIN
JOIN
![Page 67: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/67.jpg)
Empirical Evaluation
67
FedBench Benchmark:● Ten RDF datasets (314K to 44M)
25 Queries:● Cross Domain (CD)● Linked Data (LD)● Life Science (LS)
Ten Complex (C) queries
Completeness versus Execution Time
![Page 68: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/68.jpg)
Empirical Evaluation
68
FedBench Benchmark:● Ten RDF datasets (314K to 44M)
25 Queries:● Cross Domain (CD)● Linked Data (LD)● Life Science (LS)
Ten Complex (C) queries
Completeness versus Execution Time
![Page 69: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/69.jpg)
Empirical Evaluation
69
FedBench Benchmark:● Ten RDF datasets (314K to 44M)
25 Queries:● Cross Domain (CD)● Linked Data (LD)● Life Science (LS)
Ten Complex (C) queries
Completeness versus Execution Time
![Page 70: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/70.jpg)
Empirical Evaluation
70
FedBench Benchmark:● Ten RDF datasets (314K to 44M)
25 Queries:● Cross Domain (CD)● Linked Data (LD)● Life Science (LS)
Ten Complex (C) queries
Completeness versus Execution Time
![Page 71: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/71.jpg)
Empirical Evaluation
71
FedBench Benchmark:● Ten RDF datasets (314K to 44M)
25 Queries:● Cross Domain (CD)● Linked Data (LD)● Life Science (LS)
Ten Complex (C) queries
Completeness versus Execution Time
Adaptivity is a costly task
![Page 72: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/72.jpg)
Adaptive Query Processing
72
No Inter-Operator Strategy
Inter-Operator Strategy
Delays simulated with a Gamma distribution ( =1, =0.3)
Twenty-five queries against DBpedia; basic graph patterns with up to 15 triple patterns. Adaptive query processing strategies are able to speed up query execution in presence of delays
[6] M. Acosta, M.E. Vidal: Networks of Linked Data Eddies: An Adaptive Web Query Processing Engine for RDF Data. ISWC 2015
![Page 73: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/73.jpg)
Data Replication-Aware
73
Replication-Aware
ANAPSID
SPLENDID
No Replication
Fed-DESATURMULDER HIBISCUS
LILAC
FEDRA
DAW
![Page 74: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/74.jpg)
Experimental Study
74
Each dataset is accessible through 10 endpoints. Each endpoint has data for 100 random queries.The same fragment is replicated in at least 3 endpointsWatDiv datasets: 50,000 queriesRest of the datasets: 10,000 queries
Diseasome: 72,445 SWDF: 198,797DBpedia Geo: 1,900,004LinkedMDB: 3,579,610WatDiv1: 104,532WatDiv100: 10,934,518
[11] Gabriela Montoya, Hala Skaf-Molli, Pascal Molli, Maria-Esther Vidal: Decomposing federated queries in presence of replicated fragments. J. Web Sem. (2017)
![Page 75: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/75.jpg)
Experimental Study
75
Each dataset is accessible through 10 endpoints. Each endpoint has data for 100 random queries.The same fragment is replicated in at least 3 endpointsWatDiv datasets: 50,000 queriesRest of the datasets: 10,000 queries
Diseasome: 72,445 SWDF: 198,797DBpedia Geo: 1,900,004LinkedMDB: 3,579,610WatDiv1: 104,532WatDiv100: 10,934,518
![Page 76: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/76.jpg)
Experimental Study
76
Consider replication allows for producing complete answers while the execution time is reduced
![Page 77: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/77.jpg)
Open Issues
77
Distribution
AutonomyHeterogeneity
Distribution
AutonomyHeterogeneity
Diverse types of heterogeneity:Data models, query capabilities,availability.
![Page 78: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/78.jpg)
Open Issues
● Formal Definition
● Access Control● Security ● Privacy
78
● Data Evolution● Synchronization ● Big Data
![Page 79: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/79.jpg)
Conclusions
79
![Page 80: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/80.jpg)
References[1] Tamer Ozsu and Patrick Valduriez. Principles of Distributed Database Systems (Third Edition). Springer (2011)[2] Jorge Pérez, Marcelo Arenas, Claudio Gutierrez: Semantics and complexity of SPARQL. ACM Trans. Database Syst. 34(3): 16:1-16:45 (2009)[3] Maria-Esther Vidal, Simón Castillo, Maribel Acosta, Gabriela Montoya, Guillermo Palma: On the Selection of SPARQL Endpoints to Efficiently Execute Federated SPARQL Queries. Trans. Large-Scale Data- and Knowledge-Centered Systems 25: 109-149 (2016)[4 ] Amol Deshpande, Zachary G. Ives, Vijayshankar Raman: Adaptive Query Processing. Foundations and Trends in Databases, (2007)[5] Ruben Verborgh, Miel Vander Sande, Olaf Hartig, Joachim Van Herwegen, Laurens De Vocht, Ben De Meester, Gerald Haesendonck, Pieter Colpaert: Triple Pattern Fragments: A low-cost knowledge graph interface for the Web. J. Web Sem.( 2016)[6] Maribel Acosta, Maria-Esther Vidal: Networks of Linked Data Eddies: An Adaptive Web Query Processing Engine for RDF Data. International Semantic Web Conference (2015)
80
![Page 81: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/81.jpg)
References[7] Cosmin Basca, Abraham Bernstein: Querying a messy web of data with Avalanche. J. Web Sem. (2014)[8] Maribel Acosta, Maria-Esther Vidal, Tomas Lampo, Julio Castillo, Edna Ruckhaus: ANAPSID: An Adaptive Query Processing Engine for SPARQL Endpoints. International Semantic Web Conference (2011)[9] Olaf Görlitz, Steffen Staab: SPLENDID: SPARQL Endpoint Federation Exploiting VOID Descriptions. COLD (2011)[10] Andreas Schwarte, Peter Haase, Katja Hose, Ralf Schenkel, Michael Schmidt: FedX: Optimization Techniques for Federated Query Processing on Linked Data. International Semantic Web Conference (2011)[11] Gabriela Montoya, Hala Skaf-Molli, Pascal Molli, Maria-Esther Vidal: Decomposing federated queries in presence of replicated fragments. J. Web Sem. (2017)[12] Gabriela Montoya, Hala Skaf-Molli, Pascal Molli, Maria-Esther Vidal: Federated SPARQL Queries Processing with Replicated Fragments. International Semantic Web Conference (2015)
81
![Page 82: Federated Query Processing Universidad Simón Bolívar ...materials.dagstuhl.de/files/17/17262/17262.Maria... · Complexity of FDM Query Processing Tasks NP-Hard Finding Optimal Plan](https://reader036.vdocuments.us/reader036/viewer/2022071023/5fd7ca416ae4a1367e0e5c49/html5/thumbnails/82.jpg)
References[13] Muhammad Saleem, Axel-Cyrille Ngonga Ngomo, Josiane Xavier Parreira, Helena F. Deus, Manfred Hauswirth: DAW: Duplicate-AWare Federated Query Processing over the Web of Data. International Semantic Web Conference (2013)[14] Kemele M. Endris, Mikhail Galkin, Ioanna Lytra, Mohamed Nadjib Mami, Maria-Esther Vidal, Sören Auer: MULDER: Querying the Linked Data Web by Bridging RDF Molecule Templates. International Conference on Database and Expert Systems Applications (2017)[15] Muhammad Saleem, Axel-Cyrille Ngonga Ngomo: HiBISCuS: Hypergraph-Based Source Selection for SPARQL Endpoint Federation. Extended Semantic Web Conference (2014)[16] SemaGrow: Optimizing federated SPARQL queries Angelos Charalambidis, Antonis Troumpoukis and Stasinos Konstantopoulos In Proceedings of the 11th International Conference on Semantic Systems (SEMANTiCS 2015)
82