highly available distributed databases (poster)

Project Goals

Implement a reliable, scalable, portable, full-featured high availability solution for distributed

databases, conformant with open standards.

18TH ACM CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, HONG KONG, 2009

Optimizeparityupdates usingJserver

Performance Test usingTPC-C benchin both failureand safemodes [Khediri MScProject]

Automaticincrease of a group highavailabilitylevel

DistributedhighlyavailableDB whichautoscaleover a cluster

Future Work

• Script to createtable fragments fon each DB instance.

• DB population.

DB Set up & Population

• Search item withkey i_id

Key Search• Either by deleting of

up to k fragments contents or by shutting down corresponding DB instances

Simulate k Servers Failure

• Recover item withkey i_id

Record Recovery • Set up k spares

• Query alive servers

• Decode

• Insert recovereddata into spareservers

Recover kServers

References

Further Information

Demonstration Outline

1. CPR, EAR, http://www.contingencyplanningresearch.com/cod.htm

2. Litwin, W., Moussa, R., Schwarz, T.J.E.: LH*RS - a highly available scalable

distributed data structure. ACM Trans., (2005)

3. Weatherspoon, H., Kubiatowicz, J.D.: Erasure Coding vs. Replication: A

quantitative Comparison. Proc. of the 1st International Workshop on P2P

Systems, (2002)

4. Cecchet, E., Marguerite, J., Zwaenepoel, W.: C-JDBC Flexible Database

Clustering Middleware. USENIX, (2004)

URL: http://rim.moussa.googlepages.com/hddbrs_mid_project.html

Email: rim.moussa@googlepages.com

Performances Testbed: Oracle DBMS, 1.7GHz CPU on DB

backends, 2.7GHz mid-tier, all connected through

a100Mbps router,

Insert Performances: 65ms, 140ms, 160ms for

respectively k = 0, 1, 2.

Record Recovery Performances:

130ms for a 3KB record,

only 0.18ms for decoding.

Fragments Recovery:

One data fragment of 7.52MB recovered at a

rate of 720KBps.

Two data fragments of 15.04MB recovered at

a rate of 690KBps.

Decoding overhead is 6% of recovery time.

HDDBRS MIDDLEWARE FOR IMPLEMENTING HIGHLY AVAILABLE DISTRIBUTED DATABASES

RIM MOUSSA, PHD

UTIC LAB. , TUNISIA

Middleware Architecture

Multithreading

DB connection Thread for each DB instance

Query Handler Thread

Distributed Transaction Handler Thread

Recovery Thread

RMI Thread …

Threads communicate through concurrent

queues (working and response queue for

each thread) and sleep and notify primitives.

JDBC Interface with DB backends.

XA/open standard (2-PC protocol) for

distributed transaction management.

RMI for client transaction processing.

System Design

3-tier distributed architecture

1. Client

Not aware of data distribution

Not aware of data redundancy

2. HDDBRS Mid-tier

Redundant data management

Recovery process (records and fragments)

Failure detection …

3. DB backends

DB group k-available composed of m source

DB instances and k parity DB instances.

Tables horizontally fragmented.

Surjective function for record grouping.

Downtime Cost

High-availability Methods

Demonstrated Configuration:

2-available group of 4 source

DB instances and 2 parity DB

instances (m = 4, k = 2) .

Each item is 3KB.

Oracle DBMS instances

up to $50KpH, 46%

between $51KpH and $250KpH, 28%

between $251KpH and $1MpH, 18%

more than $1MpH, 8%

A survey conducted by the CPR and ERA, in

2001 shows important downtime cost per

hour for questionned companies [1].

REED SOLOMON

ERASURE

REPLICATION

High Storage Cost

Quick Recovery

Load Balancing

Minimal Storage Overhead

Encoding/ DecodingOverhead

Data Stripping

DBCT0 Work. Queue

DBCTm-1 Work. Queue

DBCTmWork. Queue

DBCTnWork. Queue

Spare Spare

JDBC Driver JDBC Driver

DBCTjWork. QueueDBCTiWork. Queue

Fig. 1. Middleware Architecture.

highly available distributed databases (poster)

Education

cover feature mining very large databases be efﬁcient, the...

· poster #1333 poster #1305 poster #339 poster #285...

tm -...

spatio-temporal databases spatio-temporal databases

databases and types of databases

client failover best practices for highly available oracle...

temporal databases. outline spatial databases indexing,...

ulb4.3 relational databases vs non relational databases...

text databases. outline spatial databases temporal databases...

module 7 implementing high availability. module overview...

nce-21 poster presentation poster presentation...

poster poster poster poster poster poster poster poster

databases for big data part 2 - ida >...

online databases for academic libraries databases for the...

the use of highly validated recombinant rabbit monoclonal...

maximum availability architecture · client failover best...

nosql: graph databases. databases why nosql databases?

visual summar y privacy, security, and...

online databases for academic libraries ebsco publishing...

enhanced data querying for neuroinformatics...