highly available distributed databases (poster)




Project Goals

Implement a reliable, scalable, portable, full-featured high-availability solution for distributed databases, conformant with open standards.

18TH ACM CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, HONG KONG, 2009

Future Work

• Optimize parity updates using Jserver.
• Performance tests using the TPC-C benchmark in both failure and safe modes [Khediri MSc project].
• Automatic increase of a group's high-availability level.
• Distributed highly available DB that autoscales over a cluster.

Demonstration Outline

1. DB Set up & Population
• Script to create table fragments on each DB instance.
• DB population.

2. Key Search
• Search item with key i_id.

3. Simulate k Servers Failure
• Either by deleting up to k fragments' contents or by shutting down the corresponding DB instances.

4. Record Recovery
• Recover item with key i_id.

5. Recover k Servers (see the sketch after this outline)
• Set up k spares.
• Query alive servers.
• Decode.
• Insert recovered data into spare servers.
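The outlined recovery flow can be pictured with a minimal sketch. It assumes a hypothetical item table with columns i_id and i_data, a placeholder rsDecode standing in for the middleware's Reed-Solomon decoder, and (as a simplification) that the images needed for decoding can be fetched by the same key from every alive instance of the group; it shows only the query-alive-servers / decode / insert-into-spare flow, not the actual HDDBRS implementation.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.ArrayList;
import java.util.List;

public class RecordRecoverySketch {

    // Placeholder for the middleware's Reed-Solomon decoder: rebuilds the
    // missing record image from the surviving images of its record group.
    static byte[] rsDecode(List<byte[]> survivingImages, int missingIndex) {
        throw new UnsupportedOperationException("decoder not shown");
    }

    // Recover the record with key iId whose source instance has failed,
    // then insert it into a spare instance (steps 4 and 5 of the outline).
    static byte[] recoverRecord(int iId, int missingIndex,
                                List<Connection> aliveInstances,
                                Connection spare) throws Exception {
        // 1. Query the alive source/parity instances of the group for the
        //    images belonging to the same record group as the lost record.
        List<byte[]> images = new ArrayList<>();
        for (Connection c : aliveInstances) {
            try (PreparedStatement ps =
                     c.prepareStatement("SELECT i_data FROM item WHERE i_id = ?")) {
                ps.setInt(1, iId);
                try (ResultSet rs = ps.executeQuery()) {
                    if (rs.next()) {
                        images.add(rs.getBytes(1));
                    }
                }
            }
        }
        // 2. Decode the missing record image from the survivors.
        byte[] recovered = rsDecode(images, missingIndex);

        // 3. Insert the recovered record into the spare DB instance.
        try (PreparedStatement ins = spare.prepareStatement(
                 "INSERT INTO item (i_id, i_data) VALUES (?, ?)")) {
            ins.setInt(1, iId);
            ins.setBytes(2, recovered);
            ins.executeUpdate();
        }
        return recovered;
    }
}
```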

References

1. CPR, EAR: http://www.contingencyplanningresearch.com/cod.htm
2. Litwin, W., Moussa, R., Schwarz, T.J.E.: LH*RS - A Highly Available Scalable Distributed Data Structure. ACM Transactions on Database Systems (2005)
3. Weatherspoon, H., Kubiatowicz, J.D.: Erasure Coding vs. Replication: A Quantitative Comparison. Proc. of the 1st International Workshop on Peer-to-Peer Systems (2002)
4. Cecchet, E., Marguerite, J., Zwaenepoel, W.: C-JDBC: Flexible Database Clustering Middleware. USENIX Annual Technical Conference (2004)

Further Information

URL: http://rim.moussa.googlepages.com/hddbrs_mid_project.html

Email: [email protected]

Performance

Testbed: Oracle DBMS, 1.7GHz CPUs on the DB backends, a 2.7GHz mid-tier, all connected through a 100Mbps router.

Insert performance: 65ms, 140ms and 160ms for k = 0, 1 and 2 respectively.

Record recovery performance: 130ms for a 3KB record, of which only 0.18ms is decoding.

Fragment recovery: one data fragment of 7.52MB recovered at a rate of 720KBps; two data fragments of 15.04MB recovered at a rate of 690KBps. Decoding overhead is 6% of recovery time.
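As a rough consistency check, and assuming the reported rates are end-to-end transfer rates, single-fragment recovery takes about 7.52 MB / 720 KBps ≈ 10.7 s and two-fragment recovery about 15.04 MB / 690 KBps ≈ 22.3 s, so the 6% decoding share amounts to well under two seconds in both cases.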

HDDBRS MIDDLEWARE FOR IMPLEMENTING HIGHLY AVAILABLE DISTRIBUTED DATABASES

RIM MOUSSA, PHD

UTIC LAB., TUNISIA

Middleware Architecture

Multithreading:
• A DB connection thread for each DB instance
• Query handler thread
• Distributed transaction handler thread
• Recovery thread
• RMI thread
• …

Threads communicate through concurrent queues (a working queue and a response queue per thread) and sleep/notify primitives; a sketch of this pattern follows.
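A minimal sketch of the working-queue/response-queue pattern, using java.util.concurrent.BlockingQueue in place of the poster's sleep/notify primitives; class and method names are illustrative, not the middleware's actual API.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// One DB connection thread (DBCT) per DB instance: it blocks on its own
// working queue, executes each request against its backend, and hands the
// result back on the requester's response queue.
class DbConnectionThread extends Thread {
    // Request paired with the queue on which its result must be returned.
    record Request(String sql, BlockingQueue<String> responseQueue) {}

    private final BlockingQueue<Request> workingQueue = new LinkedBlockingQueue<>();

    void submit(Request r) {
        workingQueue.add(r);   // wakes the thread if it is blocked on take()
    }

    @Override
    public void run() {
        try {
            while (true) {
                Request r = workingQueue.take();           // sleep until work arrives
                String result = executeOnBackend(r.sql()); // e.g. via the JDBC driver
                r.responseQueue().put(result);             // notify the requester
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();            // shut down cleanly
        }
    }

    private String executeOnBackend(String sql) {
        return "result of " + sql;  // placeholder for the real JDBC call
    }
}
```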

JDBC interface with the DB backends.

XA/Open standard (2-PC protocol) for distributed transaction management, as sketched below.

RMI for client transaction processing.
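A simplified sketch of the two-phase commit flow over the standard javax.transaction.xa interfaces; how the middleware actually obtains and drives the backends' XAResource objects is not shown on the poster, so this is only the generic pattern (XA_RDONLY handling and failure recovery are omitted).

```java
import javax.transaction.xa.XAException;
import javax.transaction.xa.XAResource;
import javax.transaction.xa.Xid;
import java.util.List;

// Simplified 2-PC coordinator over the XA resources of the DB backends
// participating in one distributed transaction (one XAResource per backend,
// on which start(xid, TMNOFLAGS) was called when the work began).
class TwoPhaseCommitSketch {

    static void commitAll(List<XAResource> resources, Xid xid) throws XAException {
        // Phase 1: end each transaction branch and ask it to prepare.
        boolean allPrepared = true;
        for (XAResource r : resources) {
            r.end(xid, XAResource.TMSUCCESS);
            if (r.prepare(xid) != XAResource.XA_OK) {   // XA_RDONLY optimization omitted
                allPrepared = false;
            }
        }
        // Phase 2: commit everywhere if all prepared, otherwise roll back.
        for (XAResource r : resources) {
            if (allPrepared) {
                r.commit(xid, false);   // false = two-phase, not one-phase, commit
            } else {
                r.rollback(xid);
            }
        }
    }
}
```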

System Design

3-tier distributed architecture:

1. Client
• Not aware of data distribution
• Not aware of data redundancy

2. HDDBRS mid-tier
• Redundant data management
• Recovery process (records and fragments)
• Failure detection
• …

3. DB backends
• A DB group is k-available, composed of m source DB instances and k parity DB instances.
• Tables are horizontally fragmented.
• Surjective function for record grouping (an illustrative sketch follows).
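The poster does not detail the grouping function, so the following is only an illustrative sketch of one surjective choice (modulo placement of keys over the m source instances); the class and method names are hypothetical.

```java
// Illustrative surjective mapping of a record key onto one of the m source
// fragments of its group: every key maps to exactly one fragment, and every
// fragment receives some keys (hence surjective over a large key space).
class FragmentPlacementSketch {
    private final int m;   // number of source DB instances in the group

    FragmentPlacementSketch(int m) {
        this.m = m;
    }

    int fragmentOf(int key) {
        return Math.floorMod(key, m);   // e.g. key 10 with m = 4 lands on fragment 2
    }
}
```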

Downtime Cost

A survey conducted by CPR and ERA in 2001 shows significant downtime costs per hour for the surveyed companies [1]:
• up to $50K per hour: 46%
• between $51K and $250K per hour: 28%
• between $251K and $1M per hour: 18%
• more than $1M per hour: 8%

Demonstrated Configuration

A 2-available group of 4 source DB instances and 2 parity DB instances (m = 4, k = 2), all Oracle DBMS instances. Each item is 3KB.

High-availability Methods

Replication: high storage cost, quick recovery, load balancing.

Reed-Solomon erasure codes: minimal storage overhead, encoding/decoding overhead, data striping (a minimal sketch of the idea follows).
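To make the trade-off concrete, here is a minimal sketch of the erasure-coding idea for the special case k = 1, where the single parity fragment is the bytewise XOR of the m source fragments; Reed-Solomon codes, as used in LH*RS [2], generalize this to arbitrary k at the cost of Galois-field encoding and decoding.

```java
// Single-parity (k = 1) erasure coding: the parity buffer is the bytewise
// XOR of the m source buffers, so any one lost source buffer can be rebuilt
// by XOR-ing the parity with the m-1 surviving sources.
class XorParitySketch {

    static byte[] encode(byte[][] sources) {
        byte[] parity = new byte[sources[0].length];
        for (byte[] src : sources) {
            for (int i = 0; i < parity.length; i++) {
                parity[i] ^= src[i];
            }
        }
        return parity;
    }

    static byte[] recover(byte[][] survivors, byte[] parity) {
        // XOR of the parity and all surviving sources yields the lost source.
        byte[] lost = parity.clone();
        for (byte[] src : survivors) {
            for (int i = 0; i < lost.length; i++) {
                lost[i] ^= src[i];
            }
        }
        return lost;
    }
}
```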

Fig. 1. Middleware Architecture: DB connection threads (DBCT0 … DBCTm-1, DBCTm … DBCTn, and DBCTi, DBCTj for the spares), each with its own working queue and JDBC driver, connecting the mid-tier to the DB backends and the spare instances.