
Page 1:

Parallel Database Systems

Instructor: Dr. Yingshu Li
Student: Chunyu Ai

Page 2:

Main Message

Technology trends give:
– many processors and storage units
– inexpensively

To analyze large quantities of data:
– sequential (regular) access patterns are 100x faster
– parallelism is 1000x faster (trades time for money)
– relational systems show many parallel algorithms

Page 3:

Implications of Hardware Trends

[Figure: basic building block – 1k SPECint CPU, 2 GB RAM, 500 GB Disc]

Large disc farms will be inexpensive (10 k$/TB).
Large RAM databases will be inexpensive (1 k$/GB).
Processors will be inexpensive.

So the building block will be a processor with:
– large RAM
– lots of disc
– lots of network bandwidth

Page 4:

Why Parallel Access To Data?

At 10 MB/s, scanning 1 Terabyte takes about 1.2 days.
With 1,000-way parallelism, the same 1 Terabyte scan takes about 1.5 minutes.

Parallelism: divide a big problem into many smaller ones to be solved in parallel.
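
As a quick check of these numbers, here is a small Python sketch; the decimal-unit assumption (1 TB = 10^12 bytes) is mine, the 10 MB/s rate and 1 TB size come from the slide.

```python
# Back-of-the-envelope scan times from the slide's numbers.
# Assumes decimal units: 1 TB = 10**12 bytes, 10 MB/s = 10**7 bytes/s.

TERABYTE = 10**12          # bytes
SCAN_RATE = 10 * 10**6     # bytes per second per disc

def scan_seconds(bytes_to_scan: int, streams: int = 1) -> float:
    """Time to scan `bytes_to_scan` with `streams` discs read in parallel."""
    return bytes_to_scan / (SCAN_RATE * streams)

serial = scan_seconds(TERABYTE)                   # one disc
parallel = scan_seconds(TERABYTE, streams=1000)   # 1,000-way parallel

print(f"serial:   {serial / 86400:.1f} days")     # ~1.2 days
print(f"parallel: {parallel / 60:.1f} minutes")   # ~1.7 minutes (the slide rounds to 1.5)
```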

Page 5:

Implication of Hardware Trends: Clusters

Future servers are CLUSTERS of processors and discs.

[Figure: cluster node – CPU, 5 GB RAM, 50 GB Disc]

Thesis: Many Little will Win over Few Big.

Page 6:

Parallel Database Architectures

[Figure: the architectures contrasted later in the deck – shared-memory, shared-disc, and shared-nothing]

Page 7:

Parallelism: Performance is the Goal

The goal is to get 'good' performance.

Law 1: a parallel system should be faster than a serial system.
Law 2: a parallel system should give near-linear scaleup, near-linear speedup, or both.

Page 8:

Speed-Up and Scale-Up

Speedup: a fixed-sized problem executing on a small system is given to a system that is N times larger.
– Measured by:

    speedup = small system elapsed time / large system elapsed time

– Speedup is linear if the ratio equals N.

Scaleup: increase the size of both the problem and the system.
– An N-times larger system is used to perform an N-times larger job.
– Measured by:

    scaleup = small system, small problem elapsed time / big system, big problem elapsed time

– Scaleup is linear if the ratio equals 1.
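
To make the two ratios concrete, here is a minimal Python sketch; the elapsed times in the usage example are invented for illustration.

```python
def speedup(small_system_time: float, large_system_time: float) -> float:
    """Same problem, N-times larger system: linear if the result equals N."""
    return small_system_time / large_system_time

def scaleup(small_problem_time: float, big_problem_time: float) -> float:
    """N-times larger problem on an N-times larger system: linear if the result equals 1."""
    return small_problem_time / big_problem_time

# Hypothetical measurements: 1 node takes 1000 s, 10 nodes take 120 s on the same job.
print(speedup(1000.0, 120.0))   # 8.33 -> sub-linear (ideal would be 10)

# Hypothetical: a 10x bigger job on 10x more nodes takes 1100 s vs. 1000 s originally.
print(scaleup(1000.0, 1100.0))  # 0.91 -> slightly sub-linear (ideal would be 1.0)
```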

Page 9:

Kinds of Parallel Execution

Pipeline: any sequential program feeds its output directly into the next sequential program.

Partition: inputs are split N ways, each partition runs through its own copy of the sequential program, and the outputs are merged M ways.
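
A minimal sketch of the two kinds, using Python generators for the pipeline and list slicing for the partition split/merge; the filter and square operators are made-up stand-ins for "any sequential program".

```python
from typing import Iterable, Iterator, List

def scan(rows: List[int]) -> Iterator[int]:
    yield from rows

def keep_even(rows: Iterable[int]) -> Iterator[int]:
    return (r for r in rows if r % 2 == 0)

def square(rows: Iterable[int]) -> Iterator[int]:
    return (r * r for r in rows)

data = list(range(10))

# Pipeline parallelism: each stage consumes the previous stage's output
# as it is produced (simulated here with Python generators).
pipelined = list(square(keep_even(scan(data))))

# Partition parallelism: split the input N ways, run the same sequential
# program on each partition, then merge the outputs.
N = 2
partitions = [data[i::N] for i in range(N)]
merged = [x for part in partitions for x in square(keep_even(scan(part)))]

print(sorted(pipelined) == sorted(merged))  # True: same answer either way
```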

Page 10:

The Drawbacks of Parallelism

    Speedup = OldTime / NewTime

[Figure: speedup vs. processors & discs – the good speedup curve tracks linearity; the bad curves flatten out (no parallelism benefit) or turn down because of three factors: startup, interference, skew]

Startup: creating processes, opening files, optimization.
Interference: device contention (CPU, disc, bus) and logical contention (locks, hotspots, server, log, ...).
Communication: sending data among nodes.
Skew: if tasks get very small, variance exceeds service time.
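
To see how these factors bend the speedup curve, here is a toy model in Python; the fixed per-process startup cost and the skew factor are my own illustrative assumptions, not numbers from the slides.

```python
def modeled_speedup(work: float, n: int,
                    startup_per_proc: float = 0.5, skew: float = 0.2) -> float:
    """Speedup = OldTime / NewTime under a toy cost model.

    Assumptions (mine, not the slides'): each extra processor adds a fixed
    startup cost, and the slowest partition gets `skew` more than its fair
    share of the work, so it determines the elapsed time.
    """
    old_time = work                                    # one processor, no overheads
    slowest_share = (work / n) * (1 + (skew if n > 1 else 0.0))
    new_time = startup_per_proc * n + slowest_share
    return old_time / new_time

for n in (1, 2, 4, 8, 16, 32, 64):
    print(n, round(modeled_speedup(100.0, n), 2))
# The curve rises, flattens, and finally turns down as startup and skew
# come to dominate the shrinking per-partition work.
```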

Page 11:

Parallelism: Speedup & Scaleup

Speedup: same job (100 GB), more hardware, less time.
Scaleup: bigger job (100 GB → 1 TB), more hardware, same time.
Transaction scaleup: more clients and more servers (1 k clients → 10 k clients), same response time.

Page 12:

Database Systems “Hide” Parallelism

Automate system management via tools:
• data placement
• data organization (indexing)
• periodic tasks (dump / recover / reorganize)

Automatic fault tolerance:
• duplex & failover
• transactions

Automatic parallelism:
• among transactions (locking)
• within a transaction (parallel execution)

Page 13:

Automatic Data Partitioning

Split a SQL table across a subset of nodes & disks. Partition within the set by:
• Range (A...E | F...J | K...N | O...S | T...Z): good for equi-joins, range queries, group-by
• Hash: good for equi-joins
• Round Robin: good to spread load

Shared-disk and shared-memory systems are less sensitive to partitioning; shared-nothing benefits from "good" partitioning.
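
A minimal sketch of the three partitioning schemes, assuming string keys and five nodes; the boundary letters mirror the A...E / F...J / ... ranges above.

```python
# Illustrative partitioning functions; node count and boundaries are made up.
NODES = 5
RANGES = ["E", "J", "N", "S", "Z"]   # A...E | F...J | K...N | O...S | T...Z

def range_partition(key: str) -> int:
    """Good for equi-joins, range queries, group-by."""
    first = key[0].upper()
    return next(i for i, hi in enumerate(RANGES) if first <= hi)

def hash_partition(key: str) -> int:
    """Good for equi-joins. (Python randomizes str hashes per run;
    a real system would use a stable hash function.)"""
    return hash(key) % NODES

_rr_counter = 0
def round_robin_partition(_key: str) -> int:
    """Good for spreading load evenly, regardless of key values."""
    global _rr_counter
    node = _rr_counter % NODES
    _rr_counter += 1
    return node

for name in ["Adams", "Garcia", "Nguyen", "Smith", "Zhou"]:
    print(name, range_partition(name), hash_partition(name), round_robin_partition(name))
```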

Page 14:

Index Partitioning

Hash indices partition by hash.
B-tree indices partition as a forest of trees, one tree per range (e.g. 0...9 | 10...19 | 20...29 | 30...39 | 40..., or A...C | D...F | G...M | N...R | S...Z).
The primary index clusters the data.

Page 15:

Partitioned Execution

[Figure: a table partitioned A...E | F...J | K...N | O...S | T...Z, with a Count operator on each partition feeding one final Count]

Spreads computation and IO among processors.
Partitioned data gives NATURAL parallelism.
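
A minimal sketch of this partitioned count, assuming the table is just a Python list and using one worker process per partition in place of the per-partition nodes.

```python
# Partitioned counting: one worker per partition computes a local count,
# then a final step combines the partial counts.
from multiprocessing import Pool

def local_count(partition: list) -> int:
    """The per-partition Count operator."""
    return len(partition)

if __name__ == "__main__":
    table = [f"row{i}" for i in range(1000)]
    n_parts = 5
    partitions = [table[i::n_parts] for i in range(n_parts)]

    with Pool(n_parts) as pool:
        partial_counts = pool.map(local_count, partitions)

    print(sum(partial_counts))  # the final Count combines the partial results -> 1000
```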

Page 16:

N x M way Parallelism

[Figure: data partitioned A...E | F...J | K...N | O...S | T...Z flows through five Join/Sort pairs, whose outputs feed the Merge operators]

N inputs, M outputs, no bottlenecks.

Partitioned data.
Partitioned and pipelined data flows.

Page 17:

Blocking Operators = Short Pipelines

An operator is blocking if it does not produce any output until it has consumed all its input.
Examples: Sort, Aggregates, Hash-Join (reads all of one operand).
Blocking operators kill pipeline parallelism and make partition parallelism all the more important.

[Figure: database load template with three blocked phases – scan the tape file and sort runs, merge the runs and insert into the SQL table, then insert into Index 1, Index 2, and Index 3]
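
As a small illustration in Python generators (the scan, filter, and sort operators are made up for this sketch): the filter streams rows through as they arrive, while the sort must buffer its entire input before emitting anything, ending the pipeline.

```python
from typing import Iterable, Iterator, List

def scan() -> Iterator[int]:
    """Source operator: yields rows one at a time."""
    yield from [5, 3, 8, 1, 9, 2]

def keep_odd(rows: Iterable[int]) -> Iterator[int]:
    """Non-blocking: emits each qualifying row as soon as it sees it."""
    for r in rows:
        if r % 2 == 1:
            yield r

def sort_all(rows: Iterable[int]) -> Iterator[int]:
    """Blocking: must consume *all* input before producing any output."""
    buffered: List[int] = list(rows)   # the whole input is materialized here
    buffered.sort()
    yield from buffered

# keep_odd pipelines with scan; sort_all ends that pipeline and starts a new,
# shorter one, which is why blocking operators push us toward partition parallelism.
print(list(sort_all(keep_odd(scan()))))   # [1, 3, 5, 9]
```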

Page 18:

Parallel Aggregates

For each aggregate function, we need a decomposition strategy:

    count(S) = Σ count(s(i)), ditto for sum()
    avg(S)   = ( Σ sum(s(i)) ) / ( Σ count(s(i)) )
    and so on...

For groups:
– sub-aggregate groups close to the source
– drop sub-aggregates into a hash river

[Figure: the partitioned table A...E | F...J | K...N | O...S | T...Z with per-partition Count operators feeding one final Count, as on the Partitioned Execution slide]
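
A minimal sketch of the avg() decomposition above, assuming each partition is a Python list: every partition reports (sum, count), and only those partial results are combined.

```python
# Decomposed aggregation: sub-aggregate close to the source, combine at the end.
from typing import List, Tuple

def partial_aggregate(partition: List[float]) -> Tuple[float, int]:
    """Per-partition sub-aggregate: (sum(s(i)), count(s(i)))."""
    return sum(partition), len(partition)

def combine_avg(partials: List[Tuple[float, int]]) -> float:
    """avg(S) = (sum of partial sums) / (sum of partial counts)."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count

partitions = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
partials = [partial_aggregate(p) for p in partitions]   # done near the source
print(combine_avg(partials))   # 3.5, same as averaging the whole table at once
```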

Page 19:

Parallel Sort

[Figure: a scan or other source feeds a range- or hash-partitioned river; sub-sorts generate runs, then the runs are merged – M inputs, N outputs]

Disk and merge are not needed if the sort fits in memory.

Sort is the benchmark from hell for shared-nothing machines: net traffic equals disk bandwidth, and there is no data filtering at the source.
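
A minimal sketch of a range-partitioned parallel sort in Python, assuming integer keys and made-up split points. This is the simple case where each partition fits in memory, so the run-generation and merge phases collapse into a single local sort, and concatenating the sorted partitions in range order gives the final result.

```python
# Range-partitioned parallel sort: route rows by key range, sort each
# partition in a separate process, concatenate in partition order.
from multiprocessing import Pool
import random

BOUNDARIES = [250, 500, 750]   # illustrative split points for keys in [0, 1000)

def partition_of(key: int) -> int:
    return sum(key >= b for b in BOUNDARIES)

if __name__ == "__main__":
    rows = [random.randrange(1000) for _ in range(10_000)]

    parts = [[] for _ in range(len(BOUNDARIES) + 1)]
    for r in rows:                      # the "river": route rows by range
        parts[partition_of(r)].append(r)

    with Pool(len(parts)) as pool:      # sub-sorts run in parallel
        sorted_parts = pool.map(sorted, parts)

    result = [r for part in sorted_parts for r in part]
    print(result == sorted(rows))       # True
```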

Page 20:

Hash Join: Combining Two Tables

Hash the smaller table into N buckets (hope N = 1).
If N = 1, read the larger table and hash-probe against the smaller one.
Else, hash the outer table to disk, then do a bucket-by-bucket hash join.

Purely sequential data behavior.
Always beats sort-merge and nested-loops joins unless the data is clustered.
Good for equi-, outer-, and exclusion joins. Lots of papers.
Hashing reduces skew.

[Figure: left table and right table hashed into matching hash buckets]
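
A minimal sketch of the N = 1 case in Python (table contents and column layout are made up): build a hash table on the smaller table, then stream the larger one past it once.

```python
# In-memory hash join, N = 1: build on the smaller table, probe with the larger.
from collections import defaultdict
from typing import Dict, List, Tuple

right = [(1, "red"), (2, "green"), (3, "blue")]            # smaller table: (key, colour)
left  = [(2, "ball"), (3, "cube"), (3, "disc"), (9, "x")]  # larger table: (key, item)

# Build phase: hash the smaller table into an in-memory hash table.
buckets: Dict[int, List[str]] = defaultdict(list)
for key, colour in right:
    buckets[key].append(colour)

# Probe phase: read the larger table once, purely sequentially.
result: List[Tuple[int, str, str]] = []
for key, item in left:
    for colour in buckets.get(key, []):   # equi-join on key
        result.append((key, item, colour))

print(result)   # [(2, 'ball', 'green'), (3, 'cube', 'blue'), (3, 'disc', 'blue')]
```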

Page 21:

Q&A

Thank you!
