Implementing Database Coordination in P2P Networks * Ilya Zaihrayeu SemPGRID-04, 18 May 2004, New York, USA * work with Fausto Giunchiglia

Page 1: Implementing Database Coordination in P2P Networks  *

Implementing Database Coordination in P2P Networks *

Ilya Zaihrayeu

SemPGRID-04, 18 May 2004, New York, USA

* work with Fausto Giunchiglia

Page 2

Why P2P Databases

• P2P data sharing: files … relational data?
• File sharing: KaZaa + Morpheus = more than 460 million downloads (download.com, May 2004)
• P2P databases: academic testbeds so far
• Promises: a large-scale, fault-tolerant multi-database system with low start-up and maintenance costs, and high “output” for an individual party
• Difficulties: existing data integration solutions are not applicable because of their centralized nature
• Challenges: new methodologies, theories, algorithms, models, mechanisms and tools need to be developed

Page 3

Why P2P Databases, cont’d

• Application: non-performance-critical domains, where the local autonomy of each party is essential
• Medical care scenario
– John goes skiing and suffers an accident
– John is taken to a local clinic for treatment; the doctors need to know whether John has contraindications to some drugs
– John does not know these details, but his database layer has a link to his family doctor’s databases
• Cooperating real estate agents example
– Agents coordinate their data to push sales
– When on the site of a customer who wants to sell, an agent updates his database and makes the data available to other agents
– When on the site of a customer who may want to buy, an agent shows details from his database, and may query other agents’ databases
• Other examples: scientific databases (genomic data), tourism, etc.

Page 4

Data Coordination Model

• Interest Groups – groups of peers able to answer queries about a certain topic
– e.g., group topic – “Tourism in Trentino”, “Real Estate in Scotland”, etc.
– each Interest Group has a group manager (GM), which helps maintain the group
• Acquaintances – “known” nodes that contribute data
– acquaintance query – a query over the relations of an acquaintance whose results satisfy some local relation
• Correspondence Rules – solve the heterogeneity problem at the instance level
– semantic heterogeneity at the structure level is solved by acquaintance queries
• Coordination Rules – coordinate data (queries and updates) with acquaintances
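
One way to picture how interest groups, acquaintances and acquaintance queries relate is as plain data types. The sketch below is illustrative only: every class and field name is an assumption made for this example and not the system's actual data structures, and correspondence and coordination rules are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AcquaintanceQuery:
    """A query over an acquaintance's relations that feeds a local relation."""
    head: str            # local relation the results must satisfy
    acquaintance: str    # the "known" node that contributes the data
    body: List[str]      # relations of the acquaintance used in the query


@dataclass
class Peer:
    name: str
    relations: List[str]
    acquaintance_queries: List[AcquaintanceQuery] = field(default_factory=list)

    def acquaintances(self) -> List[str]:
        """Nodes this peer knows, derived from its acquaintance queries."""
        return sorted({q.acquaintance for q in self.acquaintance_queries})


@dataclass
class InterestGroup:
    topic: str           # e.g. "Tourism in Trentino"
    manager: str         # name of the group manager (GM)
    members: List[str] = field(default_factory=list)


# Hypothetical peers and group, used only to exercise the types.
p1 = Peer("P1", ["films"], [AcquaintanceQuery("films", "P2", ["movie", "genres"])])
group = InterestGroup("Movies", manager="P2", members=["P1", "P2"])
```

In this reading a peer's acquaintances are not stored separately; they are simply the nodes named in its acquaintance queries.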

Page 5

Interest Groups

• Help to cope with large number of nodes by clustering the network

• Nodes self-organize into interest groups

• A node may form a child interest group

• One node may belong to multiple groups

• Use schema matching to monitor group constitution

• The GM supports group constitution, “talks” to other GMs and provides information about the group to newcomers

[Figure: example hierarchy of interest-group topics rooted at “All topics”, with the nodes Arts, Shopping, Movies, Music, …, Publications, Computers, …, Lyrics and Books]

Page 6

Acquaintance query

• An acquaintance query is a conjunctive query: q(X) :- r1(X1), …, rn(Xn)
– q(X) – the head, refers to a local relation;
– r1(X1), …, rn(Xn) – the subgoals of the body, which refer to relations of an acquaintance, plus comparison predicates;
– X, X1, …, Xn – variables or constants.
• E.g., P1: films (title, year, genre) :- P2: movie (title, year, director); genres (title, genre); year>1995

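[Figure: four nodes (1–4) holding the relations A, B, C, D, E, F, G and I, linked by the acquaintance queries I :- A,B; B :- C,D; D :- I,G; C :- E,F; F :- G. The dependencies I → B → D → I form a loop.]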
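
The example acquaintance query for films can be read as a join of P2's two relations followed by a filter on the year. The following is a minimal sketch of that reading in Python; the function name and the sample rows are assumptions made for illustration and are not taken from the slides.

```python
from typing import Iterable, Set, Tuple

Movie = Tuple[str, int, str]   # (title, year, director), as held by P2
Genre = Tuple[str, str]        # (title, genre), as held by P2
Film = Tuple[str, int, str]    # (title, year, genre), as expected by P1


def films_from_p2(movie: Iterable[Movie], genres: Iterable[Genre]) -> Set[Film]:
    """Evaluate films(title, year, genre) :- movie, genres, year > 1995."""
    # Index genres by title so the join does not rescan the relation.
    genres_by_title = {}
    for title, genre in genres:
        genres_by_title.setdefault(title, []).append(genre)
    return {
        (title, year, genre)
        for title, year, _director in movie      # director is projected away
        if year > 1995                           # comparison predicate
        for genre in genres_by_title.get(title, [])
    }


# Illustrative rows only.
p2_movie = [("Cast Away", 2000, "Robert Zemeckis"),
            ("Forrest Gump", 1994, "Robert Zemeckis")]
p2_genres = [("Cast Away", "Drama"), ("Forrest Gump", "Drama")]
films = films_from_p2(p2_movie, p2_genres)
```

With these rows only the 2000 title survives the year filter, and the director column does not appear in the result because the head relation has no such attribute.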

Page 7

Correspondence Rules and Coordination Rules

• Correspondence rules define how constants from the local domain are translated into constants in the domain of an acquaintance (forward translation) and vice versa (backward translation)
– not necessarily symmetric, e.g. currency translation
• The goal of Coordination Rules is data coordination with acquaintances and acquainted nodes
– activated by the user (user query) or from the network (network query, results, update)
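
The remark that forward and backward translation need not be symmetric can be made concrete with the currency case. The sketch below assumes a correspondence rule is just a pair of functions; the class name and both exchange rates are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class CorrespondenceRule:
    forward: Callable[[int], int]    # local constant -> acquaintance constant
    backward: Callable[[int], int]   # acquaintance constant -> local constant


# Hypothetical exchange rates in thousandths; amounts are integer cents.
LOCAL_TO_REMOTE_PER_MILLE = 1200
REMOTE_TO_LOCAL_PER_MILLE = 820

price_rule = CorrespondenceRule(
    forward=lambda cents: cents * LOCAL_TO_REMOTE_PER_MILLE // 1000,
    backward=lambda cents: cents * REMOTE_TO_LOCAL_PER_MILLE // 1000,
)

local_price = 10_000                              # 100.00 in the local currency
remote_price = price_rule.forward(local_price)    # value seen by the acquaintance
round_trip = price_rule.backward(remote_price)    # translated back again
```

Because the two rates are not inverses of each other, translating a price forward and then backward does not return the original value, which is the asymmetry the slide points at.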

Page 8

Algorithmic notes

• Query answering algorithm
– Use acquaintance queries and correspondence rules to translate queries and data
– Propagate to acquaintances if acquaintance queries are relevant
– Compute only new tuples, reconcile results
– Process loops in query propagation, define a termination point (no propagation using acquaintance queries that have already been used)
• “Getting acquainted” protocol
– Retrieve database schemas and then apply a matching operator to them
– Based on the matching results, generate (with the help of the user) acquaintance queries and correspondence rules, and tune coordination rules
• Updates handling (work with E. Franconi, G. Kuper, A. Lopatenko)
– Data may go through a loop more than once, define a termination point
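
The termination condition for query propagation can be sketched on the looping rule set from the acquaintance-query slide (I :- A,B; B :- C,D; D :- I,G; C :- E,F; F :- G). The code below is a simplified reading, not the actual algorithm: it assumes one acquaintance query per head relation, ignores which node holds which relation, and tracks used queries in a single shared list. The function name is hypothetical.

```python
from typing import Dict, List, Optional

# Head relation -> relations in the body of its acquaintance query.
RULES: Dict[str, List[str]] = {
    "I": ["A", "B"],
    "B": ["C", "D"],
    "D": ["I", "G"],
    "C": ["E", "F"],
    "F": ["G"],
}


def propagate(relation: str, rules: Dict[str, List[str]],
              used: Optional[List[str]] = None) -> List[str]:
    """Return the heads of the acquaintance queries fired, in firing order."""
    used = [] if used is None else used
    # Stop at base relations and at queries that were already used.
    if relation not in rules or relation in used:
        return used
    used.append(relation)
    for subgoal in rules[relation]:
        propagate(subgoal, rules, used)
    return used


fired = propagate("I", RULES)
```

Starting from I, the walk reaches D, whose body refers back to I; since the query for I is already marked as used, propagation stops there instead of going around the loop again.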

Page 9

Implementing P2P databases on top of JXTA

• Benefits
– system platform and networking protocol independence
– IP-independence (location independence)
– gives basic blocks for building P2P applications
• We implement Interest Groups and Acquaintances in JXTA
• We encode database-related functionality into a set of custom JXTA services (DB-related services)

[Figure: DB-related services, divided into node-level services and group-level services; labelled boxes: Queries handler, DB operations, …, Screening service, GM service]

Page 10

Architecture

[Figure: architecture of a node and of a P2P database network. Node labels: PDBMS, User Interface (UI), Database Manager (DBM), Wrapper, Source Database (SDB), JXTA Layer, SS, User. Network labels: User-1, User-2, User-n, nodes on the network.]

Page 11

Architecture, cont’d

[Figure: internal architecture of a node. Labels: User Interface (UI), Wrapper, DBM (Query Planner, Query Propagation, Results Handler, Updates Handler, P2P Management, Coordination Rules, Acquaintances, Acquaintance queries, Correspondence Rules), JXTA Layer (In, Out, Discovery, Pipes, Peer Groups, Services, JXTA Core Services, DB-related services, GM in-pipe adv), Advertisements (Peer Adv, Peer Gr. Adv, Gr. topic, Pipe Adv), SS]

Page 12

Demo: toy databases and topology

Relations:
(1) Movie (title, year, genre)
(2) Credits (name, title, role)
(3) Movie2 (title, year, director)
(4) Genre (title, genre)

[Figure: demo topology of six peers (0–5) and the query Q. The peers hold the relation sets [1,2], [1,2], [2], [2,3,4], [3] and [4]; the edge labels are (1:-1), (2:-2), (3:-3), (4:-4), (1:-3,4), (2:-2), (2:-2) and (4:-1). The legend marks a rendezvous peer and a mediator peer.]

Page 13

Query example 1

“List titles of movies featuring Tom Hanks”

Q(t) :- Credits (n,t,r); n=“Tom Hanks”

[Figure: the demo topology from page 12, repeated for this query]

Page 14

Query example 2

“Titles of drama movies released after 1995”

Q(t) :- Movie (t,y,g); g=“Drama”; y>1995

[Figure: the demo topology from page 12, repeated for this query]

Page 15

Query example 3

“Names of actors playing in action movies in 2003”

Q(n) :- Movie (t,y,g); Credits (n,t,r); r=“Actor”; g=“Action”; y=2003

[Figure: the demo topology from page 12, repeated for this query]
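
Query example 3 joins Movie and Credits on the title and applies three selections. A minimal sketch of evaluating it over rows that have already been collected at one node is given below; it does not model propagation through the network, and the function name and all sample rows are hypothetical placeholders.

```python
from typing import Iterable, Set, Tuple

MovieRow = Tuple[str, int, str]     # Movie (title, year, genre)
CreditRow = Tuple[str, str, str]    # Credits (name, title, role)


def actors_in_action_movies_2003(movie: Iterable[MovieRow],
                                 credits: Iterable[CreditRow]) -> Set[str]:
    """Q(n) :- Movie(t,y,g); Credits(n,t,r); r="Actor"; g="Action"; y=2003."""
    # Selection on Movie first, keeping only the join attribute (title).
    titles = {t for t, y, g in movie if g == "Action" and y == 2003}
    # Join with Credits on title, select actors, project the name.
    return {n for n, t, r in credits if r == "Actor" and t in titles}


# Placeholder rows, invented for this example.
movie_rows = [("Film A", 2003, "Action"),
              ("Film B", 2003, "Drama"),
              ("Film C", 2001, "Action")]
credit_rows = [("Actor One", "Film A", "Actor"),
               ("Director One", "Film A", "Director"),
               ("Actor Two", "Film B", "Actor"),
               ("Actor Three", "Film C", "Actor")]
names = actors_in_action_movies_2003(movie_rows, credit_rows)
```

Only the credit whose title matches a 2003 action movie and whose role is “Actor” contributes a name to the answer.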

Page 16

References

• F. Giunchiglia and I. Zaihrayeu, “Making peer databases interact - a vision for an architecture supporting data coordination,” 6th International Workshop on Cooperative Information Agents (CIA-2002), Madrid, Spain, September 18-20, 2002.
• P. Bernstein, F. Giunchiglia, A. Kementsietsidis, J. Mylopoulos, L. Serafini, and I. Zaihrayeu, “Data management for peer-to-peer computing: A vision,” WebDB, 2002.
• A. Halevy, Z. Ives, D. Suciu, and I. Tatarinov, “Schema mediation in a peer data management system,” ICDE, 2003.
• V. Kantere, I. Kiringa, J. Mylopoulos, A. Kementsietsidis, and M. Arenas, “Coordinating peer databases using ECA rules,” DBISP2P, September 2003.
• E. Franconi, G. Kuper, A. Lopatenko, and I. Zaihrayeu, “The coDB Robust Peer-to-Peer Database System,” Proc. of the 2nd Workshop on Semantics in Peer-to-Peer and Grid Computing (SemPGrid'04), 2004.
• JXTA project, see http://www.jxta.org

Page 17

Announcement

Submission deadline: 30 June, 2004

www.p2pkm.org

Page 18

Thank you