lecture 11 main memory databases midterm review. time breakdown for shore dbms source: “oltp under...

Lecture 11

Main Memory DatabasesMidterm Review

Time breakdown for Shore DBMS

Source: “OLTP Under the Looking Glass”, SIGMOD 2008

Systematically removed code for locking, logging, and disk I/O

Remaining useful work

Cost Sources

• Managing disk contents– Logging– Locking

• Managing memory and short-term– Latching– Buffer pool manager

Solution: In-Memory, Single-Threaded Execution Model

What about concurrency?

• Need to make of multi-core• But want to minimize overhead

Solution: Partition the data, most xactions are single-partition

TPC-C

Most queries go to one warehouse

Midterm Review

• Schema design• Relational Algebra• DB Architecture• Buffer Pools• Indexing• Join Algos• Query optimization

Entity-Relationship DiagramSSN Name Address Hobby Cost

123 john main st dolls $

123 john main st bugs $

345 mary lake st tennis $$

456 joe first st dolls $

“Wide” schema – has redundancy and anomalies in the presence of updates, inserts, and deletes

Table key is Hobby, SSN

Person Hobby

SSN

Address

Name

Name

Cost

n:n

Entity Relationship Diagram

BCNFifyStart with one "universal relation”

While some relation R is not in BCNFFind a FD F=XY that violates BCNF on R

Split R into R1 = (X U Y), R2 = R – Y

Relational AlgebraProjection π(R,c1, …, cn) = πc1…cnR

select a subset c1 … cn of columns of RSelection σ(R, pred) = σpredR

select a subset of rows that satisfy predCross Product (||R|| = #attrs in R, |R| = #rows in row) R1 X R2 (aka Cartesian product)

combine R1 and R2, producing a new relation with ||R1|| + ||R2|| attrs, |R1| * |R2| rows

Join ⨝(R1, R2, pred) = R1 ⨝pred R2 = σpred (R1 X R2)

Database Internals OverviewFront End

Admission ControlConnection Management

(sql)Parser

(parse tree)Rewriter

(parse tree) Planner & Optimizer

(query plan) Executor

Query System

Storage System

Access MethodsLock ManagerBuffer PoolLog Manager

Flattening Example

SELECT emp.* FROM empWHERE EXISTS (

SELECT * FROM dept WHERE emp.deptNo = dept.deptNo AND dept.building = ‘Tech’);

SELECT emp.* FROM emp, deptWHERE emp.deptNo = dept.deptNo AND dept.building = ‘Tech’;

Select employees in a department located in the Tech:

B+ Tree Indexes

• Balanced wide tree• Fast value lookup and range scans• Each node is a disk page (except root)• Leafs point to tuple pages

Study Break: B+ Tree

• Build a B+ Tree with values (8, 17, 21, 2, 8, 21, 25, 3, 19, 7) and 4 node slots

• Insert 9 into the tree• Insert 3 into the tree• Delete 8 from tree

Indexes RecapHeap File Bitmap Hash File B+Tree

Insert O(1) O(1) O(1) O( logB n )

Delete O(P) O(1) O(1) O( logB n )

RangeScan

O(P) -- / O(P) -- / O(P) O( logB n + R )

Lookup O(P) O(C) O(1) O( logB n )

n : number of tuplesP : number of pages in fileB : branching factor of B-Tree (keys / node)R : number of pages in rangeC: cardinality (#) of unique values on key

Buffer Pool Management

• Eviction strategies– Least recently used (LRU)– Most recently used (MRU)

• Manager policies– Work by file instance – capture access

pattern and table usage– Memory allotment depends on access

method

Join Algorithms SummarySort-Merge Simple Hash Grace Hash

I/O: 3 (|R| + |S|)CPU: O(P x {S}/P log {S}/P)

I/O: P (|R| + |S|)CPU: O({R} + {S})

I/O: 3 (|R| + |S|)CPU: O({R} + {S})

Notation: P partitions / passes over data; assuming hash is O(1)

Grace hash is generally a safe bet, unless memory is close to size of tables, in which case simple can be preferable

Extra cost of sorting makes sort merge unattractive unless there is a way to access tables in sorted order (e.g., a clustered index), or a need to output data in sorted order (e.g., for a subsequent ORDER BY)

Query Planning Cost Estimation

• Use analytical cost to estimate time needed for a query execution plan tree

• Selectivity (fraction of tuples returned from input):– col = value: 1/ICARD– 1/nth of # of unique col

values, 1/10 if no index– col > value: (value – max) / (max – min) or 1/3– col1 = col2: 1/max(ICARD(c1), ICARD(c2)) or 1/10

Selinger Optimizer Algorithm

• algorithm: compute optimal way to generate every sub-join: size 1, size 2, ... n (in that order)

e.g. {A}, {B}, {C}, {AB}, {AC}, {BC}, {ABC}

R set of relations to joinFor i in {1...|R|}:

for S in {all length i subsets of R}:optjoin(S) = a join (S-a), where a is the relation that

minimizes:cost(optjoin(S-a)) + min. cost to join optjoin(S-a) to a + min. access cost for a

Precomputed in previous iteration!

lecture 11 main memory databases midterm review. time breakdown for shore dbms source: “oltp under...

Documents