lecture 11 main memory databases midterm review. time breakdown for shore dbms source: “oltp under...
TRANSCRIPT
Lecture 11
Main Memory DatabasesMidterm Review
Time breakdown for Shore DBMS
Source: “OLTP Under the Looking Glass”, SIGMOD 2008
Systematically removed code for locking, logging, and disk I/O
Remaining useful work
Cost Sources
• Managing disk contents– Logging– Locking
• Managing memory and short-term– Latching– Buffer pool manager
Solution: In-Memory, Single-Threaded Execution Model
What about concurrency?
• Need to make of multi-core• But want to minimize overhead
Solution: Partition the data, most xactions are single-partition
TPC-C
Most queries go to one warehouse
Midterm Review
• Schema design• Relational Algebra• DB Architecture• Buffer Pools• Indexing• Join Algos• Query optimization
Entity-Relationship DiagramSSN Name Address Hobby Cost
123 john main st dolls $
123 john main st bugs $
345 mary lake st tennis $$
456 joe first st dolls $
“Wide” schema – has redundancy and anomalies in the presence of updates, inserts, and deletes
Table key is Hobby, SSN
Person Hobby
SSN
Address
Name
Name
Cost
n:n
Entity Relationship Diagram
BCNFifyStart with one "universal relation”
While some relation R is not in BCNFFind a FD F=XY that violates BCNF on R
Split R into R1 = (X U Y), R2 = R – Y
Relational AlgebraProjection π(R,c1, …, cn) = πc1…cnR
select a subset c1 … cn of columns of RSelection σ(R, pred) = σpredR
select a subset of rows that satisfy predCross Product (||R|| = #attrs in R, |R| = #rows in row) R1 X R2 (aka Cartesian product)
combine R1 and R2, producing a new relation with ||R1|| + ||R2|| attrs, |R1| * |R2| rows
Join ⨝(R1, R2, pred) = R1 ⨝pred R2 = σpred (R1 X R2)
Database Internals OverviewFront End
Admission ControlConnection Management
(sql)Parser
(parse tree)Rewriter
(parse tree) Planner & Optimizer
(query plan) Executor
Query System
Storage System
Access MethodsLock ManagerBuffer PoolLog Manager
Flattening Example
SELECT emp.* FROM empWHERE EXISTS (
SELECT * FROM dept WHERE emp.deptNo = dept.deptNo AND dept.building = ‘Tech’);
SELECT emp.* FROM emp, deptWHERE emp.deptNo = dept.deptNo AND dept.building = ‘Tech’;
Select employees in a department located in the Tech:
B+ Tree Indexes
• Balanced wide tree• Fast value lookup and range scans• Each node is a disk page (except root)• Leafs point to tuple pages
Study Break: B+ Tree
• Build a B+ Tree with values (8, 17, 21, 2, 8, 21, 25, 3, 19, 7) and 4 node slots
• Insert 9 into the tree• Insert 3 into the tree• Delete 8 from tree
Indexes RecapHeap File Bitmap Hash File B+Tree
Insert O(1) O(1) O(1) O( logB n )
Delete O(P) O(1) O(1) O( logB n )
RangeScan
O(P) -- / O(P) -- / O(P) O( logB n + R )
Lookup O(P) O(C) O(1) O( logB n )
n : number of tuplesP : number of pages in fileB : branching factor of B-Tree (keys / node)R : number of pages in rangeC: cardinality (#) of unique values on key
Buffer Pool Management
• Eviction strategies– Least recently used (LRU)– Most recently used (MRU)
• Manager policies– Work by file instance – capture access
pattern and table usage– Memory allotment depends on access
method
Join Algorithms SummarySort-Merge Simple Hash Grace Hash
I/O: 3 (|R| + |S|)CPU: O(P x {S}/P log {S}/P)
I/O: P (|R| + |S|)CPU: O({R} + {S})
I/O: 3 (|R| + |S|)CPU: O({R} + {S})
Notation: P partitions / passes over data; assuming hash is O(1)
Grace hash is generally a safe bet, unless memory is close to size of tables, in which case simple can be preferable
Extra cost of sorting makes sort merge unattractive unless there is a way to access tables in sorted order (e.g., a clustered index), or a need to output data in sorted order (e.g., for a subsequent ORDER BY)
Query Planning Cost Estimation
• Use analytical cost to estimate time needed for a query execution plan tree
• Selectivity (fraction of tuples returned from input):– col = value: 1/ICARD– 1/nth of # of unique col
values, 1/10 if no index– col > value: (value – max) / (max – min) or 1/3– col1 = col2: 1/max(ICARD(c1), ICARD(c2)) or 1/10
Selinger Optimizer Algorithm
• algorithm: compute optimal way to generate every sub-join: size 1, size 2, ... n (in that order)
e.g. {A}, {B}, {C}, {AB}, {AC}, {BC}, {ABC}
R set of relations to joinFor i in {1...|R|}:
for S in {all length i subsets of R}:optjoin(S) = a join (S-a), where a is the relation that
minimizes:cost(optjoin(S-a)) + min. cost to join optjoin(S-a) to a + min. access cost for a
Precomputed in previous iteration!