Finding Plans from Proofs

PDQ: Proof-driven Query Answering over Web-based Data

Michael Benedikt, Julien Leblay, Efthymia Tsamoura - Oxford University

Supported by EPSRC grant EP/H017690/1, Query-driven Data Acquisition from Web-based Data Sources

Project homepage: http://www.cs.ox.ac.uk/projects/pdq/

Contact: [email protected]

Example: online services for geographic information

r1: Places(id, name, type, coordinates, ...): information about places (e.g. city, country, continent, lake).

r2: BelongsTo(source, target): containment between places, e.g. "China belongs to Asia".

r3: Countries(id, name, iso_code, ...): information about countries.

φ1: Places(x, y, "Country", ...) ↔ Countries(x, y, ...)

Query for countries in Asia (not answerable without considering the constraints):

SELECT p1.name
FROM BelongsTo AS bt
JOIN Places AS p1 ON p1.id = bt.source
JOIN Places AS p2 ON p2.id = bt.target
WHERE p1.type = 'Country' AND p2.name = 'Asia'

Pre-processing creates an auxiliary schema by adding the relations InferredAccPlaces, InferredAccBelongsTo, InferredAccCountries and Accessible, together with the constraints:

φ'1: InferredAccPlaces(x, y, "Country", ...) ↔ InferredAccCountries(x, y, ...)

α1: Accessible(y) ∧ Places(x, y, z, ...) → InferredAccPlaces(x, y, z, ...) ∧ Accessible(x) ∧ Accessible(z) ∧ ...

α2: Accessible(x) ∧ BelongsTo(x, y) → InferredAccBelongsTo(x, y) ∧ Accessible(y)

α3: Countries(x, y, z, ...) → InferredAccCountries(x, y, z, ...) ∧ Accessible(x) ∧ …

α4: …
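The axioms α1 to α3 share one shape: the input positions of an access method must be Accessible, and a matching tuple then becomes "inferred accessible" together with its remaining values. The Python sketch below generates that shape as text. It is only an illustration: the AccessMethod class, the 0-based input positions and the variable names x0, x1, ... are assumptions made here, not PDQ's actual API, and Places and Countries are cut to three attributes because the poster elides the rest.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessMethod:
    """Hypothetical description of one access method on a relation."""
    relation: str
    arity: int
    inputs: tuple  # 0-based positions that must be bound before the call


def accessibility_axiom(m: AccessMethod) -> str:
    """Render the accessibility axiom for an access method as text."""
    variables = [f"x{i}" for i in range(m.arity)]
    args = ", ".join(variables)
    # Body: every input value is accessible and the tuple exists.
    body = [f"Accessible({variables[i]})" for i in m.inputs]
    body.append(f"{m.relation}({args})")
    # Head: the tuple is inferred accessible, and so are its output values.
    head = [f"InferredAcc{m.relation}({args})"]
    head += [f"Accessible({v})" for i, v in enumerate(variables)
             if i not in m.inputs]
    return " ∧ ".join(body) + " → " + " ∧ ".join(head)


# Access methods mirroring α1, α2 and α3 (arities are illustrative).
places_by_name = AccessMethod("Places", 3, (1,))
belongs_to_by_source = AccessMethod("BelongsTo", 2, (0,))
countries_free = AccessMethod("Countries", 3, ())

for method in (places_by_name, belongs_to_by_source, countries_free):
    print(accessibility_axiom(method))
```

A free access method has no input positions, so its axiom has no Accessible atom in the body, which is the shape of α3.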

Context

Web data sources may have:
• overlapping information,
• access restrictions.

As a result:
• There may be no web query plan for a given user query.
• There may be many plans using different sources with different costs.

We need to reason about integrity constraints and access limitations.

PDQ

System for determining a query plan in the presence of web-based sources. It is:
i. constraint-aware,
ii. access-aware: abiding by access restrictions,
iii. cost-aware: making use of any cost information.

Approach: generating query plans from proofs that a query is answerable.

Input
S: schema ⟨R, Σ⟩, where R is a set of relations with access methods (free, limited, inaccessible) and Σ is a set of integrity constraints (TGDs).
Q: conjunctive query over S.
f: cost function on evaluation plans.

Output
Pbest: plan with minimal cost.

Step 1: Pre-processing
S is augmented with new relations and axioms modelling the access restrictions. A goal query Qinferred is created based on the relations of the augmented schema. Q is grounded to form the initial state of the plan search.
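Grounding can be pictured as freezing each query variable into a fresh constant and marking the constants that appear in the query as accessible. The sketch below does this for the example query. The Var class, the "c_" prefix for frozen variables and the tuple encoding of facts are assumptions made for illustration, and Places is again cut to three attributes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """A query variable (anything else in an atom is a constant)."""
    name: str


def ground(atoms):
    """Freeze variables into constants; query constants become Accessible."""
    facts = set()
    for relation, args in atoms:
        frozen = tuple(f"c_{a.name}" if isinstance(a, Var) else a
                       for a in args)
        facts.add((relation, frozen))
        for a in args:
            if not isinstance(a, Var):
                facts.add(("Accessible", (a,)))
    return facts


# The "countries in Asia" query as a conjunction of atoms.
query = [
    ("Places", (Var("id1"), Var("name1"), "Country")),
    ("Places", (Var("id2"), "Asia", Var("type2"))),
    ("BelongsTo", (Var("id1"), Var("id2"))),
]

initial_state = ground(query)
for fact in sorted(initial_state):
    print(fact)
```

The resulting five facts correspond to the initial state of the example: two Places facts, one BelongsTo fact, and Accessible facts for "Asia" and "Country".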

Step 2: Basic search step
Each state is closed under the firing of rules (blue arrows) other than accessibility axioms (denoted αi). Every possible firing of an accessibility axiom (red arrows) gives a new candidate state, inheriting all the facts of its ancestors.

Step 3: Plans and costs
Each new state gives a plan, to which a cost is assigned (orange circles). If a state corresponds to a match with Qinferred and its plan's cost is lower than the best so far, it becomes the new best state.
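Steps 2 and 3 can be sketched as a small search over sets of facts. The version below is a deliberate simplification, not PDQ's planner: facts are ground strings, rules are already instantiated for the running example, only one direction of φ1 and φ'1 is encoded, a cost is computed only for states that match the goal, and the per-access costs are invented for illustration.

```python
def close(facts, rules):
    """Fire non-accessibility rules (body -> head) until a fixpoint."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if body <= facts and not head <= facts:
                facts |= head
                changed = True
    return frozenset(facts)


def search(initial, rules, axioms, goal, cost):
    """Explore every firing order of accessibility axioms; keep the cheapest."""
    best_plan, best_cost = None, float("inf")
    stack = [(close(initial, rules), ())]
    while stack:
        facts, plan = stack.pop()
        if goal <= facts:  # the state matches the goal query
            c = cost(plan)
            if c < best_cost:
                best_plan, best_cost = plan, c
            continue
        for name, body, head in axioms:
            if body <= facts and not head <= facts:
                # The child state inherits all facts of its ancestors.
                stack.append((close(facts | head, rules), plan + (name,)))
    return best_plan, best_cost


INITIAL = {"Places(id1)", "Places(id2)", "BelongsTo(id1,id2)",
           "Accessible(Asia)", "Accessible(Country)"}
RULES = [
    ({"Places(id1)"}, {"Countries(id1)"}),                     # φ1
    ({"InferredAccCountries(id1)"}, {"InferredAccPlaces(id1)"}),  # φ'1
]
AXIOMS = [
    ("α1", {"Accessible(Asia)", "Places(id2)"},
     {"InferredAccPlaces(id2)", "Accessible(id2)"}),
    ("α2", {"Accessible(id1)", "BelongsTo(id1,id2)"},
     {"InferredAccBelongsTo(id1,id2)", "Accessible(id2)"}),
    ("α3", {"Countries(id1)"},
     {"InferredAccCountries(id1)", "Accessible(id1)"}),
]
GOAL = {"InferredAccPlaces(id1)", "InferredAccPlaces(id2)",
        "InferredAccBelongsTo(id1,id2)"}
ACCESS_COST = {"α1": 2.0, "α2": 5.0, "α3": 1.0}  # hypothetical numbers

best_plan, best_cost = search(
    INITIAL, RULES, AXIOMS, GOAL,
    lambda plan: sum(ACCESS_COST[a] for a in plan))
print(best_plan, best_cost)
```

In this toy encoding α2 can only fire after α3, because Accessible(id1) is produced by the free access on Countries. With an additive cost all complete plans cost the same; a cost function that depends on the order of accesses would separate them.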

Queries over Web Data

Architecture & User Experience

User interface for creating and editing schemas and queries

Interactive exploration of the planner’s search space. Online execution of plans.

User interface for creating and configuring planning sessions.

Dashboard

Architecture Runtime Planner

Example search space (figure)

Initial State: Places(id1, name1, "Country", …), Places(id2, "Asia", …), BelongsTo(id1, id2), Accessible("Asia"), Accessible("Country")
φ1 adds: Countries(id1, name1, c1, …)

Goal: Qinferred(name) ← InferredAccPlaces(id1, name1, "Country", …) ∧ InferredAccPlaces(id2, "Asia", …) ∧ InferredAccBelongsTo(id1, id2)

Branch with plan T1 to T5:
α3 (models free access on Countries): InferredAccCountries(id1, name1, c1, …), Accessible(id1), Accessible(name1), Accessible(c1)
    T1 ⇐ Countries ⇐ Ø
φ'1: InferredAccPlaces(id1, name1, "Country", …)
α1: InferredAccPlaces(id2, "Asia", c2, …), Accessible(id2), Accessible(c2), …
    T2 ⇐ Places ⇐ ("Asia")
    T3 := T1 ⋈ T2
α2: InferredAccBelongsTo(id1, id2)
    T4 ⇐ BelongsTo ⇐ π source (T3)
    T5 := π name (T3 ⋈ T4)

Branch with plan T'1 to T'5:
α1: InferredAccPlaces(id2, "Asia", c2, …), Accessible(id2), Accessible(c2), …
    T'1 ⇐ Places ⇐ ("Asia")
α3: InferredAccCountries(id1, name1, c1, …), Accessible(id1), Accessible(name1), …
    T'2 ⇐ Countries ⇐ Ø
    T'3 := T'1 ⋈ T'2
φ'1: InferredAccPlaces(id1, name1, "Country", …)
α2: InferredAccBelongsTo(id1, id2)
    T'4 ⇐ BelongsTo ⇐ π source (T'3)
    T'5 := π name (T'3 ⋈ T'4)

State costs shown in the figure: 3, 2, 25, 35, 45, 55.
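To make the plan T1 to T5 concrete, the sketch below runs it over a small in-memory dataset. Everything here is illustrative: the sample rows are invented, the three functions stand in for the web access methods (free access on Countries, Places by name, BelongsTo by source), and "T ⇐ R ⇐ input" is read as "call the access method on R with the given input and store the result in T". Columns are named after the constants of the proof (id1, name1, id2, ...), so the join T1 ⋈ T2 shares no column and acts as a cross product.

```python
# Invented sample data standing in for the remote sources.
COUNTRIES = [(1, "China", "CN"), (2, "France", "FR"), (3, "Japan", "JP")]
PLACES = [(10, "Asia", "Continent"), (11, "Europe", "Continent"),
          (1, "China", "Country"), (2, "France", "Country"),
          (3, "Japan", "Country")]
BELONGS_TO = [(1, 10), (2, 11), (3, 10)]


def access_countries():
    """Free access: no input required."""
    return [{"id1": i, "name1": n, "c1": c} for i, n, c in COUNTRIES]


def access_places_by_name(name):
    """Limited access: the place name must be supplied."""
    return [{"id2": i, "name2": n, "type2": t}
            for i, n, t in PLACES if n == name]


def access_belongs_to(source):
    """Limited access: the source id must be supplied."""
    return [{"id1": s, "id2": t} for s, t in BELONGS_TO if s == source]


def natural_join(left, right):
    """Join rows that agree on all shared column names."""
    out = []
    for l in left:
        for r in right:
            if all(l[k] == r[k] for k in l.keys() & r.keys()):
                out.append({**l, **r})
    return out


T1 = access_countries()                    # T1 ⇐ Countries ⇐ Ø
T2 = access_places_by_name("Asia")         # T2 ⇐ Places ⇐ ("Asia")
T3 = natural_join(T1, T2)                  # T3 := T1 ⋈ T2
sources = sorted({row["id1"] for row in T3})   # π source (T3)
T4 = [row for s in sources for row in access_belongs_to(s)]
T5 = sorted({row["name1"] for row in natural_join(T3, T4)})  # π name
print(T5)  # ['China', 'Japan']
```

The plan never calls Places with a type as input and never scans BelongsTo without a source, so it respects the access restrictions that make the original SQL query unanswerable as written.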
