PeerDB : A P2P-based System for Distributed Data Sharing DB Lab. M.S. 3 LEE MIN YOUNG Wee Siong Ng, Beng Chin Ooi, Kian-Lee Tan, Aoying Zhou, 19th International Conference on Data Engineering, 2003

PeerDB : A P2P-based System for Distributed Data Sharing

DB Lab. M.S. 3

LEE MIN YOUNG

Wee Siong Ng, Beng Chin Ooi, Kian-Lee Tan, Aoying Zhou, 19th International Conference on Data Engineering, 2003


Overview

Introduction
BestPeer
PeerDB Implementation
Performance Study
Conclusion


Introduction


Background: P2P vs DDBS

[Figure: CLIENT nodes and a DBMS, contrasting the P2P and DDBS layouts]

                    P2P                 DDBS
Membership          Ad-hoc              Stable nodes
Schema              No global schema    Shared schema
Query result set    Incomplete          Complete
Content location    “Word of mouth”     Shared catalog


Limitations of existing P2P systems
    Coarse-grained sharing (file-level granularity)
    Limited extensibility and flexibility

PeerDB
    Each node is a DBMS
    No global schema
    Incomplete answers possible
    Fine granularity, content searching


BestPeer


BestPeer

A general-purpose peer-to-peer platform

Key features:
    LIGLO server (Location-Independent Global Names Lookup Server)
    Self-configurable neighborhood maintenance policy
    Agent-based service mechanism

Shared resources:
    Data and information
    Storage power
    Computational power


BestPeer - LIGLO Server


BestPeer - Self-configurable Network


PeerDB Implementation


PeerDB - Architecture

[Figure labels: Local Data (schema, keywords), Database, Sharable Data]

Data Management System: MySQL
    Metadata is stored in the Local Dictionary and the Export Dictionary

DBAgent: support for mobile agents
    A master agent manages the queries
    The master agent dispatches worker agents to neighboring nodes


PeerDB - no global schema

Problems with supporting SQL queries:
    No global schema information
    Different nodes may name the same table or attribute differently (“len”, “length”)

Solution:
    The user supplies metadata (keywords) for each relation name and attribute

[Figure: four peers, P1 to P4, each sharing protein data under its own relation and attribute names. The Local/Export Dictionary at each peer records the keywords listed below as "name: keywords".]

P1  Kinases: Protein, human
        SeqID: Key, identifier, ID
        Length: Length
        proteinSeq: Sequence

P2  Protein: Protein
        SeqNo: number, identifier
        Len: Length
        sequence: Sequence

P3  ProteinKLen: Protein, kinases, length
        ID: Number, identifier
        seqLen: Length
    ProteinKSeq: Protein, sequence
        ID: Number, identifier
        sequence: Sequence

P4  Protein: Protein, kinase
        name: name
        char: Features, functions
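
The keyword metadata above is what lets one peer recognise another peer's differently named relation. A minimal sketch of such a match follows, assuming a simple score: the fraction of query keywords found among a relation's keywords. The scoring rule, the threshold, and the names `DICTIONARIES`, `match_score` and `promising_relations` are illustrative assumptions, not PeerDB's actual matching function; the dictionary contents follow the slide's example.

```python
# Hypothetical dictionary contents, transcribed from the slide's example.
# Structure: peer -> relation -> (relation keywords, {attribute: keywords}).
DICTIONARIES = {
    "P1": {"Kinases": ({"protein", "human"},
                       {"SeqID": {"key", "identifier", "id"},
                        "Length": {"length"},
                        "proteinSeq": {"sequence"}})},
    "P2": {"Protein": ({"protein"},
                       {"SeqNo": {"number", "identifier"},
                        "Len": {"length"},
                        "sequence": {"sequence"}})},
    "P3": {"ProteinKLen": ({"protein", "kinases", "length"},
                           {"ID": {"number", "identifier"},
                            "seqLen": {"length"}}),
           "ProteinKSeq": ({"protein", "sequence"},
                           {"ID": {"number", "identifier"},
                            "sequence": {"sequence"}})},
    "P4": {"Protein": ({"protein", "kinase"},
                       {"name": {"name"},
                        "char": {"features", "functions"}})},
}


def match_score(query_keywords, relation_entry):
    """Fraction of query keywords found in the relation's or its attributes' keywords.

    This scoring rule is an assumption made for illustration.
    """
    relation_keywords, attributes = relation_entry
    available = set(relation_keywords)
    for attribute_keywords in attributes.values():
        available |= attribute_keywords
    wanted = {keyword.lower() for keyword in query_keywords}
    if not wanted:
        return 0.0
    return len(wanted & available) / len(wanted)


def promising_relations(query_keywords, dictionaries, threshold=0.5):
    """Return (score, peer, relation) tuples at or above threshold, best first."""
    hits = []
    for peer, relations in dictionaries.items():
        for relation, entry in relations.items():
            score = match_score(query_keywords, entry)
            if score >= threshold:
                hits.append((score, peer, relation))
    return sorted(hits, key=lambda hit: (-hit[0], hit[1], hit[2]))
```

With the query keywords "protein" and "length", `Kinases` at P1, `Protein` at P2 and `ProteinKLen` at P3 all match both keywords even though none of them shares attribute names with the others.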


Local Query Processing

Phase 1
    Parse the query to extract the list of relation and attribute names
    Check the Local Dictionary for matching relations
        Return promising relations to the user
    Flood the request to all neighbors: query, IP, TTL
        Wait for responses
    Display results to the user as they arrive

Phase 2
    The user selects the relations he/she wants
        Create a “Data Retrieval Agent”
        Rewrite the query in terms of the new relations
    If the relation is local, submit the SQL to the local database
    Otherwise, contact the remote nodes directly to access the data
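
The "rewrite query in terms of new relations" step can be sketched as a rename of identifiers. The sketch below assumes the mapping from the user's names to the selected relation's names is already known; `rewrite_query` and the example mapping are hypothetical, and the whole-word text substitution is a simplification (it does not parse SQL, so it would also rename matching words inside string literals).

```python
import re


def rewrite_query(sql, name_map):
    """Rename whole-word identifiers in sql according to name_map.

    All renames happen in a single pass, so a replacement is never
    renamed again by another entry of the map.
    """
    if not name_map:
        return sql
    # Longest names first so a longer identifier wins over its prefix.
    names = sorted(name_map, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    return pattern.sub(lambda match: name_map[match.group(1)], sql)


# Hypothetical mapping from P1's Kinases relation to P2's Protein relation.
P1_TO_P2 = {
    "Kinases": "Protein",
    "SeqID": "SeqNo",
    "Length": "Len",
    "proteinSeq": "sequence",
}

rewritten = rewrite_query(
    "SELECT SeqID, Length FROM Kinases WHERE Length > 100", P1_TO_P2
)
```

Here `rewritten` holds the query expressed in P2's names, ready to be sent to P2.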


Local Query Processing

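[Figure: local query processing among peers P1 to P6. Phase 1: (1) the user submits a query to P1; (2) the query is flooded with IP and TTL; (3) schemas are returned; (4) the schemas are shown to the user (P1: Kinases, P2: Protein, P3: ProteinKLen, P4: Protein). Phase 2 at P1: (1) SQL, (2) result, using the local relation Kinases (SeqID, Length, proteinSeq) and its Local Dictionary keywords (Protein, human; Key, identifier, ID; Length; Sequence).]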


Remote Query Processing

Phase 1: Find relations
    Relation Matching Agents are flooded with a TTL
    Each agent checks the Export Dictionary for a match
    Matches are returned directly

Phase 2: Get data
    The Data Retrieval Agent submits SQL to the DBMS
    Data is returned to the requesting node directly
    The agent can run further data processing before returning
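
The TTL-bounded flood of Phase 1 can be sketched as a breadth-first traversal of the neighbor graph. This is a centralised simulation under stated assumptions: the real system dispatches agents and peers reply directly to the requester, whereas here one function walks the graph. The topology, the `flood` function and the `has_match` callback are all hypothetical.

```python
from collections import deque


def flood(neighbors, origin, ttl, has_match):
    """Return peers within ttl hops of origin for which has_match(peer) is true.

    neighbors maps each peer to the peers it forwards to. Each peer is
    visited at most once, and forwarding stops when the TTL reaches zero.
    """
    seen = {origin}
    matches = []
    queue = deque([(origin, ttl)])
    while queue:
        peer, remaining = queue.popleft()
        if remaining == 0:
            continue  # TTL exhausted: this peer does not forward further
        for neighbor in neighbors.get(peer, ()):
            if neighbor in seen:
                continue
            seen.add(neighbor)
            if has_match(neighbor):
                matches.append(neighbor)
            queue.append((neighbor, remaining - 1))
    return matches


# Hypothetical topology and match set, for illustration only.
TOPOLOGY = {"P1": ["P2", "P3"], "P2": ["P4"], "P3": ["P5"], "P5": ["P6"]}
PEERS_WITH_MATCH = {"P2", "P4", "P6"}

within_two_hops = flood(TOPOLOGY, "P1", 2, PEERS_WITH_MATCH.__contains__)
within_three_hops = flood(TOPOLOGY, "P1", 3, PEERS_WITH_MATCH.__contains__)
```

With a TTL of 2 the flood never reaches P6, which is one way the incomplete answers noted earlier can arise.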


Remote Query Processing

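[Figure: remote query processing among peers P1 to P6. Phase 1: (1) the user submits a query to P1; (2) the query is flooded with IP and TTL; (3) schemas are returned; (4) the schemas are shown to the user (P1: Kinases, P2: Protein, P3: ProteinKLen, P4: Protein). Phase 2 at P2: (1) SQL, (2) result, using the relation Protein (SeqNo, Len, sequence) and its Export Dictionary keywords (Protein; number, identifier; Length; Sequence).]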


PeerDB

Statistics
    Reconfiguration is based on the policy selected by the user
    Relation information, number of answer objects returned

Caches
    Cache all query results locally
    LRU replacement
    Users choose which copy they want
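
LRU replacement for locally cached query results can be sketched with an ordered dictionary. The class below is a generic LRU cache, not PeerDB's code; the name `QueryResultCache`, the key format and the capacity are assumptions for illustration.

```python
from collections import OrderedDict


class QueryResultCache:
    """Keeps at most `capacity` query results, evicting the least recently used."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._entries = OrderedDict()  # oldest entry first

    def get(self, key):
        """Return the cached rows for key, or None; a hit marks the entry as recent."""
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)
        return self._entries[key]

    def put(self, key, rows):
        """Store rows under key, evicting the least recently used entry if full."""
        self._entries[key] = rows
        self._entries.move_to_end(key)
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)

    def keys(self):
        """Cached keys from least to most recently used."""
        return list(self._entries)


cache = QueryResultCache(capacity=2)
cache.put(("P2", "Protein"), [("s1", 120)])
cache.put(("P3", "ProteinKLen"), [("k7", 340)])
cache.get(("P2", "Protein"))                 # P2's result is now the most recent
cache.put(("P4", "Protein"), [("abc", "x")])  # evicts P3's result
```

Keying entries by peer and relation keeps results from different sources apart, which fits the slide's point that users choose which copy they want.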


Performance Study


Performance Study

Rate of Returning Answers

Client/Server: answers return via the search path

PDMR: after each query, PDMR reconfigures itself to reach out to more promising nodes directly
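
One way to picture the reconfiguration step: after a query, rank peers by the number of answers they returned and keep the best ones as direct neighbors. The slides say only that the policy is user-selected and that the number of answer objects returned is recorded, so the ranking rule, the `reconfigure` function and the example figures below are assumptions.

```python
def reconfigure(current_neighbors, answers_by_peer, max_neighbors):
    """Pick the next neighbor set: the peers that returned the most answers.

    Current neighbors that returned nothing compete with a count of zero.
    Ties are broken by peer name so the result is deterministic.
    """
    counts = {peer: 0 for peer in current_neighbors}
    counts.update(answers_by_peer)
    ranked = sorted(counts, key=lambda peer: (-counts[peer], peer))
    return ranked[:max_neighbors]


# Hypothetical statistics gathered after one query.
next_neighbors = reconfigure(
    current_neighbors=["P2", "P3"],
    answers_by_peer={"P2": 1, "P5": 7, "P6": 4},
    max_neighbors=2,
)
```

In this example P5 and P6, previously reachable only through other peers, become direct neighbors for the next query.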


Performance Study

[Figures: first search query; % answers returned]


Performance Study

[Figures: completion time vs. data size; communication overhead]


Conclusion

PeerDB
    Each node is a DBMS
    No global schema
    Incomplete answers possible
    Fine granularity, content searching