CSEP 544: Lecture 10
Column-Oriented Databases and NoSQL
CSEP544 - Fall 2015 1
Announcement
Take home final: 12/9-10
• Online Webquiz
  – Need your UW NET ID, check that it works!
  – I will also email the final in pdf form (e.g. to print)
• Opens Wed. morning, closes Thursday night
• No time limits:
  – Work, save, take a break, return later…
• No need to run code
• Questions?
  – Email me and cc Laurel
• Watch your email
  – E.g. corrections
• No discussion of the final with colleagues
• When you are done:
  – Submit and receive confirmation code!
Today’s Agenda
• Column-oriented databases
• No-SQL
Column-Oriented Databases
Brief discussion of the paper: The Design and Implementation of Modern Column-Oriented Database Systems
Column-Oriented Databases
• Main idea:
  – Physical storage: complete vertical partition; each column stored separately: R.A, R.B, R.C
  – Logical schema: remains the same R(A,B,C)
• Main advantage:
  – Improved transfer rate: disk to memory, memory to CPU, better cache locality
  – Other advantages (next)
Data Layout
Basic tradeoffs:
• Reading all attributes of one record, vs.
• Reading some attributes of many records
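The tradeoff can be illustrated with a minimal Python sketch. The table contents and the function names (`read_record`, `sum_column_c`, `reconstruct`) are made up for illustration; real systems store columns in disk pages, not Python lists.

```python
# Hypothetical three-row table R(A, B, C); the data is invented.
rows = [(1, "x", 10.0), (2, "y", 20.0), (3, "z", 30.0)]

# Row-oriented layout: one tuple per record.
row_store = list(rows)

# Column-oriented layout: one array per attribute. Position i in every
# array belongs to the same logical record.
column_store = {
    "A": [r[0] for r in rows],
    "B": [r[1] for r in rows],
    "C": [r[2] for r in rows],
}

def read_record(position):
    """All attributes of one record: a single access in the row store."""
    return row_store[position]

def sum_column_c():
    """One attribute of many records: touches only column C."""
    return sum(column_store["C"])

def reconstruct(position):
    """Rebuilding a record from the column store needs one access per column."""
    return tuple(column_store[name][position] for name in ("A", "B", "C"))
```

The row store favors `read_record`, while the column store favors `sum_column_c`; `reconstruct` shows the extra work a column store pays when a whole record is needed.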
Key Architectural Trends (Sec.1)
• Virtual IDs
  – Offsets (arrays) instead of keys
• Block-oriented and vertical processing
  – Iterator model: one tuple → one block of tuples
• Late materialization
  – Postpone tuple reconstruction in query plan
• Column-specific compression
  – Much better than row-compression (why?)
Fig. 1.2
Discussion
• What are “covering indexes” (p. 204)? And what is their connection to column-oriented databases?
  – A set of indexes that can completely answer the query; one index ≈ one column
• What is the main takeaway from Fig. 1.2?
  – Column-oriented databases don’t work! Unless you really optimize them well
Vectorized Processing
Review:
• Volcano-style iterator model
  – Next() method
  – Pipelining
• Materialization of all intermediate results
• Discuss in class:
  select avg(A) from R where A < 100
Vectorized Processing
• Vectorized processing:
  – Next() returns a block of tuples (e.g. N=1000) instead of a single tuple
• Pros:
  – No more large intermediate results
  – Tight inner loop for selection and/or avg
• Discuss in class:
  select avg(A) from R where A < 100
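One way to picture the block-at-a-time evaluation of this query is the Python sketch below. The names `scan_blocks` and `avg_where_less_than` and the sample column are assumptions for illustration; a real engine would run the inner loop over typed arrays in compiled code.

```python
def scan_blocks(column, block_size=1000):
    """Block-style Next(): yield slices of the column, not single values."""
    for start in range(0, len(column), block_size):
        yield column[start:start + block_size]

def avg_where_less_than(column, bound, block_size=1000):
    """select avg(A) from R where A < bound, evaluated one block at a time."""
    total = 0
    count = 0
    for block in scan_blocks(column, block_size):
        # Tight inner loop over one block; only two running totals are kept,
        # so no large intermediate result is materialized.
        for value in block:
            if value < bound:
                total += value
                count += 1
    return total / count if count else None

# Hypothetical column A: 0, 3, 6, ..., 297
A = list(range(0, 300, 3))
```

Calling `avg_where_less_than(A, 100, block_size=16)` walks the column in blocks of 16 values and averages the 34 values below 100.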
Compression (Sec. 4)
• What is the advantage of compression in databases?
• Discuss main column-at-a-time compression techniques
  – Run-length encoding: F,F,F,F,M,M → 4F,2M
  – Bit-vector (see also bit-map indexes)
  – Dictionary. More generally: Ziv-Lempel
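The run-length example can be sketched in a few lines of Python. The function names are illustrative, and `count_equal` is an added example of operating directly on compressed data rather than something stated on the slide.

```python
from itertools import groupby

def rle_encode(column):
    """Collapse each run of equal adjacent values into (run_length, value)."""
    return [(len(list(run)), value) for value, run in groupby(column)]

def rle_decode(runs):
    """Expand (run_length, value) pairs back into the original column."""
    return [value for count, value in runs for _ in range(count)]

def count_equal(runs, target):
    """Count occurrences of target without decompressing the column."""
    return sum(count for count, value in runs if value == target)

column = ["F", "F", "F", "F", "M", "M"]
runs = rle_encode(column)  # [(4, "F"), (2, "M")]
```

Run-length encoding pays off when a column is sorted or has long runs, which is far more common within one column than across the mixed attributes of a row.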
Late Materialization (Sec. 4)
• What is it?
  – The result is an array of positions
• Discuss ΠC(σA=‘a’ ∧ B=‘b’(R(A,B,C,D,…)))
  – Retrieve positions in column A: 2, 4, 5, 9, 25, …
  – Retrieve positions in column B: 3, 4, 7, 9, 12, …
  – Intersect: 4, 9, …
  – Lookup values in column C: C[4], C[9], …
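The four steps can be sketched in Python as follows. The column contents are invented, positions are 0-based here, and the helper names are assumptions for illustration.

```python
def positions_where(column, value):
    """Scan one column and return the sorted positions that match."""
    return [i for i, v in enumerate(column) if v == value]

def intersect_sorted(xs, ys):
    """Merge-style intersection of two sorted position lists."""
    i = j = 0
    out = []
    while i < len(xs) and j < len(ys):
        if xs[i] == ys[j]:
            out.append(xs[i])
            i += 1
            j += 1
        elif xs[i] < ys[j]:
            i += 1
        else:
            j += 1
    return out

def project_c(col_a, col_b, col_c, a, b):
    """Evaluate the selection on positions only; touch column C last."""
    positions = intersect_sorted(positions_where(col_a, a),
                                 positions_where(col_b, b))
    return [col_c[p] for p in positions]

# Hypothetical columns of R(A, B, C)
col_a = ["a", "x", "a", "a"]
col_b = ["b", "b", "y", "b"]
col_c = [10, 20, 30, 40]
```

Only positions flow between the selection steps; values of C are fetched once, for the surviving positions, and columns D and beyond are never read.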
Joins (Sec. 4)
The result of a join R.A ⋈ S.A is an array of positions in R.A and S.A. Note: sorted on R.A only.

R.A (position, value): 1 Value42; 2 Value36; 3 Value42; 4 Value44; 5 Value38
S.A (position, value): 1 Value38; 2 Value42; 3 Value46; 4 Value36

R.A ⋈ S.A = (row: position in R.A, position in S.A)
1: 1, 2
2: 2, 4
3: 3, 2
4: 5, 1
Positions in R.A are sorted; positions in S.A are unsorted.
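A position-producing join over the slide's two columns might look like the sketch below. The hash-join strategy and the name `join_positions` are assumptions; the slide does not say which join algorithm is used. Positions are 1-based to match the slide, and the values are written as plain integers.

```python
def join_positions(r_col, s_col):
    """Return (position in R.A, position in S.A) pairs, 1-based."""
    # Build a hash index on S.A: value -> list of positions.
    index = {}
    for s_pos, value in enumerate(s_col, start=1):
        index.setdefault(value, []).append(s_pos)
    # Probe with R.A in order, so the output is sorted on R positions only.
    pairs = []
    for r_pos, value in enumerate(r_col, start=1):
        for s_pos in index.get(value, []):
            pairs.append((r_pos, s_pos))
    return pairs

R_A = [42, 36, 42, 44, 38]
S_A = [38, 42, 46, 36]
pairs = join_positions(R_A, S_A)  # [(1, 2), (2, 4), (3, 2), (5, 1)]
```

The S positions come out as 2, 4, 2, 1, which is why fetching further S columns in this order has poor locality.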
Jive-Join (Sec. 4)
Problem: accessing the values in the second table has poor memory locality
Solution: re-sort by the second column, fetch, sort back
E.g. ΠS.C(R(A,…) ⋈ S(B,C,…))

S.C (position, value): 1 Smith; 2 Johnson; 3 Williams; 4 Jones

Join result (row: position in R.A, position in S.B):
1: 1, 2
2: 2, 4
3: 3, 2
4: 5, 1

Sort on positions in S.B:
4: 5, 1
1: 1, 2
3: 3, 2
2: 2, 4

Lookup S.C (this is a merge-join; why?):
4: 5, 1, Smith
1: 1, 2, Johnson
3: 3, 2, Johnson
2: 2, 4, Jones

Re-sort on positions in R.A:
1: 1, 2, Johnson
2: 2, 4, Jones
3: 3, 2, Johnson
4: 5, 1, Smith
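The sort, fetch, sort-back sequence can be sketched in Python using the slide's data. The function name `jive_lookup` is an assumption, and list indexing stands in for the sequential merge against S.C that a real system would perform.

```python
def jive_lookup(pairs, s_c):
    """pairs: (R position, S position), 1-based, sorted on R position.
    Returns (R position, S position, S.C value) in the original order."""
    # Tag each pair with its row number so the order can be restored.
    tagged = list(enumerate(pairs))
    # Step 1: sort on positions in S so S.C is read front to back.
    by_s = sorted(tagged, key=lambda t: t[1][1])
    # Step 2: fetch S.C; both inputs are ordered by S position, so this
    # is effectively a merge of two sorted sequences.
    fetched = [(row, r_pos, s_pos, s_c[s_pos - 1])
               for row, (r_pos, s_pos) in by_s]
    # Step 3: re-sort on the original row order (positions in R).
    fetched.sort(key=lambda t: t[0])
    return [(r_pos, s_pos, value) for _, r_pos, s_pos, value in fetched]

pairs = [(1, 2), (2, 4), (3, 2), (5, 1)]
S_C = ["Smith", "Johnson", "Williams", "Jones"]
result = jive_lookup(pairs, S_C)
```

The two extra sorts are the price for turning random accesses into one sequential pass over S.C.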
Late Materialization
select sum(R.a) from R, S where R.c = S.b and 5<R.a<20 and 40<R.b<50 and 30<S.a<40
NoSQL Databases
Based on paper by Cattell, in SIGMOD Record 2010
NoSQL: Overview
• Main objective: implement distributed state
  – Different objects stored on different servers
  – Same object replicated on different servers
• Main idea: give up some of the ACID constraints to improve performance
• Simple interface:
  – Write (=Put): needs to write all replicas
  – Read (=Get): may get only one
• Eventual consistency ← Strong consistency
NoSQL
“Not Only SQL” or “Not Relational”. Six key features:
1. Scale horizontally “simple operations”
2. Replicate/distribute data over many servers
3. Simple call level interface (contrast w/ SQL)
4. Weaker concurrency model than ACID
5. Efficient use of distributed indexes and RAM
6. Flexible schema
Cattell, SIGMOD Record 2010
Outline of this Lecture
• Main techniques and concepts:
  – Distributed storage using DHTs
  – Consistency: 2PC, vector clocks
  – The CAP theorem
• Overview of No-SQL systems (Cattell)
• Critique (cf. Stonebraker)
Main Techniques and Concepts
Main Techniques, Concepts
• Distributed Hash Tables
• Consistency: 2PC, Vector Clocks
• The CAP theorem
A Note
• These techniques belong to a course on distributed systems, and not databases
• We will mention them because they are very relevant to NoSQL, but this is not an exhaustive treatment
Distributed Hash Table
Implements a distributed storage
• Each key-value pair (k,v) is stored at some server h(k)
• API: write(k,v); read(k)

Use standard hash function: service key k by server h(k)
• Problem 1: a client knows only one server, doesn’t know how to access h(k)
• Problem 2: if new server joins, then N → N+1, and the entire hash table needs to be reorganized
• Problem 3: we want replication, i.e. store the object at more than one server
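Problem 2 can be made concrete with a short Python sketch. It assumes the "standard hash function" places key k on server h(k) mod N, which the slide does not spell out; the SHA-256 based `h` and the key names are likewise illustrative.

```python
import hashlib

def h(key):
    """Deterministic hash of a string key (illustrative choice of SHA-256)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

def server_for(key, num_servers):
    """Assumed placement rule: server number h(key) mod N."""
    return h(key) % num_servers

def fraction_moved(keys, old_n, new_n):
    """Share of keys whose server changes when N goes from old_n to new_n."""
    moved = sum(1 for k in keys
                if server_for(k, old_n) != server_for(k, new_n))
    return moved / len(keys)

keys = [f"key{i}" for i in range(1000)]
moved = fraction_moved(keys, 4, 5)
```

For a uniform hash, going from 4 to 5 servers is expected to relocate about four out of five keys, since a key stays put only when h(k) mod 4 equals h(k) mod 5.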
Distributed Hash Table
[Figure: hash ring from h=0 to h=2^n-1 with servers A, B, C, D; each arc of the ring is labeled as the responsibility of one server (A, B, C)]
Problem 1: Routing
A client doesn’t know server h(k), but some other server
• Naive routing algorithm:
  – Each node knows its neighbors
  – Send message to nearest neighbor
  – Hop-by-hop from there
  – Obviously this is O(n), so no good
• Better algorithm: “finger table”
  – Memorize locations of other nodes in the ring
  – a, a + 2, a + 4, a + 8, a + 16, ... a + 2^(n-1)
  – Send message to closest node to destination
  – Hop-by-hop again: this is O(log n)
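The hop count of finger-table routing can be checked with a simplified Python sketch. It assumes every one of the 2^n ring positions hosts a node and uses Chord-style fingers a + 2^i for i = 0..n-1; both are simplifications, and the function names are illustrative.

```python
def finger_table(a, n):
    """Ring positions that node a memorizes: a + 2^i (mod 2^n)."""
    size = 2 ** n
    return [(a + 2 ** i) % size for i in range(n)]

def route(start, dest, n):
    """Greedy routing: always jump to the finger closest to dest
    without passing it. Returns the list of visited positions."""
    size = 2 ** n
    path = [start]
    current = start
    while current != dest:
        distance = (dest - current) % size
        # Largest power of two not exceeding the remaining distance;
        # that offset is always present in current's finger table.
        step = 1 << (distance.bit_length() - 1)
        current = (current + step) % size
        path.append(current)
    return path

path = route(0, 13, 4)  # [0, 8, 12, 13]
```

Each hop clears the highest set bit of the remaining distance, so no route takes more than n hops on a ring of 2^n positions, compared with up to 2^n - 1 hops for neighbor-by-neighbor routing.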
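Problem 1: Routing
[Figure: hash ring from h=0 to h=2^n-1 with servers A to G. The client only “knows” server A and sends Read(k) there. The request is redirected to A + 2^m, then to D + 2^p, then to F + 1, where Read(k) is found: h(k) is handled by server G. Total cost: O(log n).]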
Problem 2: Joining
[Figure: hash ring from h=0 to h=2^n-1 with servers A, B, C, D]
When X joins: select random ID
• Before the join: the arc where X lands is the responsibility of D
• After the join: redistribute the load at D; part of the arc becomes the responsibility of X, the rest stays the responsibility of D
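A join on the ring can be sketched in Python as below. The 16-bit ring, the SHA-256 based `h`, and deriving X's ID by hashing its name (rather than drawing it at random) are all assumptions made to keep the example deterministic.

```python
import bisect
import hashlib

RING_BITS = 16  # hypothetical small ring: IDs in [0, 2**16)

def h(name):
    """Map a server name or a key onto the ring."""
    digest = hashlib.sha256(name.encode()).digest()
    return int.from_bytes(digest[:8], "big") % (2 ** RING_BITS)

def owner(node_ids, key):
    """First node clockwise from h(key); node_ids must be sorted."""
    i = bisect.bisect_left(node_ids, h(key))
    return node_ids[i % len(node_ids)]

def join(node_ids, new_id):
    """Return the sorted ring after a new node picks its ID."""
    return sorted(node_ids + [new_id])

nodes = sorted(h(name) for name in ["A", "B", "C", "D"])
keys = [f"key{i}" for i in range(1000)]

before = {k: owner(nodes, k) for k in keys}
x = h("X")  # assumption: stands in for the randomly selected ID
after_nodes = join(nodes, x)
after = {k: owner(after_nodes, k) for k in keys}

# Keys whose owner changed because of the join.
moved = [k for k in keys if before[k] != after[k]]
```

Every key in `moved` now belongs to X and previously belonged to one single server, X's clockwise successor; all other servers are untouched, in contrast to rehashing the whole table.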
Problem 3: Replication
• Need to have some degree of replication to cope with node failure
• Let N=degree of replication
• Assign key k to h(k), h(k)+1, …, h(k)+N-1
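The assignment rule reads naturally as "the server responsible for k plus the next N-1 servers on the ring". A minimal Python sketch, with a made-up five-server ring and an illustrative function name, assuming N does not exceed the number of servers:

```python
import bisect

def replica_nodes(ring, key_position, n):
    """ring: list of (ring position, server name), sorted by position.
    Returns the n consecutive servers, clockwise, that store the key."""
    positions = [position for position, _ in ring]
    first = bisect.bisect_left(positions, key_position) % len(ring)
    return [ring[(first + i) % len(ring)][1] for i in range(n)]

# Hypothetical ring positions for five servers.
ring = [(10, "A"), (20, "B"), (30, "C"), (40, "D"), (50, "E")]
```

With N=3, a key at position 15 is stored on B, C, D, and a key at position 45 wraps around the ring to E, A, B.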
Problem 3: Replication
[Figure: hash ring from h=0 to h=2^n-1 with servers A, B, C, D; each arc is now the responsibility of three consecutive servers: A,B,C; B,C,D; C,D,E]
Consistency
• ACID
  – Two phase commit
  – Paxos (will not discuss)
• Eventual consistency
  – Vector clocks
ACID and 2PC
• Need to partition the db across machines
• If a transaction touches one machine
  – Life is good
• If a transaction touches multiple machines
  – ACID becomes extremely expensive!
  – Need two-phase commit
Two-Phase Commit: Motivation
[Figure: a coordinator connected to Subordinate 1, Subordinate 2, Subordinate 3]
Each subordinate holds fraction of database
Example: Each node holds some subset of bank accounts. Transaction transfers money.
1) User decides to commit
2) COMMIT
3) COMMIT
4) Coordinator crashes
“But I already committed!”
What do we do now?
2PC: Phase 1 Illustrated
[Figure: a coordinator connected to Subordinate 1, Subordinate 2, Subordinate 3]
1) User decides to commit
2) PREPARE (sent to each of the three subordinates)
3) YES (returned by each of the three subordinates)
60
2PC: Phase 2 Illustrated
Coordinator Subordinate 1
Subordinate 2
Subordinate 3
Transaction is now committed!
![Page 61: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/61.jpg)
61
2PC: Phase 2 Illustrated
Coordinator Subordinate 1
Subordinate 2
Subordinate 3
2) COMMIT
Transaction is now committed!
![Page 62: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/62.jpg)
62
2PC: Phase 2 Illustrated
Coordinator Subordinate 1
Subordinate 2
Subordinate 3
2) COMMIT
2) COMMIT Transaction is now committed!
![Page 63: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/63.jpg)
63
2PC: Phase 2 Illustrated
Coordinator Subordinate 1
Subordinate 2
Subordinate 3
2) COMMIT
2) COMMIT
2) COMMIT
Transaction is now committed!
![Page 64: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/64.jpg)
64
2PC: Phase 2 Illustrated
Coordinator Subordinate 1
Subordinate 2
Subordinate 3
2) COMMIT
2) COMMIT
2) COMMIT
3) ACK
Transaction is now committed!
![Page 65: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/65.jpg)
65
2PC: Phase 2 Illustrated
Coordinator Subordinate 1
Subordinate 2
Subordinate 3
2) COMMIT
2) COMMIT
2) COMMIT
3) ACK
3) ACK
Transaction is now committed!
![Page 66: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/66.jpg)
2PC: Phase 2 Illustrated
[Figure: the coordinator sends 2) COMMIT to Subordinates 1, 2, and 3; each subordinate replies with 3) ACK]
Transaction is now committed!
![Page 67: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/67.jpg)
Two Phase Commit
• Multiple servers run parts of the same transaction
• They all must commit, or none should commit
• Two-phase commit (2PC) is a complicated protocol that ensures this all-or-nothing outcome
• 2PC can also be used for WRITE with replication: commit the write at all replicas before declaring success
CSEP544 - Fall 2015 67
![Page 68: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/68.jpg)
Two Phase Commit
Assumptions:
• Each site logs actions at that site, but there is no global log
• There is a special site, called the coordinator, which plays a special role
• 2PC involves sending certain messages: as each message is sent, it is logged at the sending site, to aid in case of recovery
CSEP544 - Fall 2015 68
![Page 69: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/69.jpg)
Two Phase Commit
Book, Sec. 22.14.1
1. Coordinator sends prepare message
2. Subordinates receive the prepare message, force-write a <prepare> log entry, and answer yes or no
3. If the coordinator receives only yes votes, it force-writes <commit> and sends commit messages; if it receives at least one no, or times out, it force-writes <abort> and sends abort messages
4. If a subordinate receives abort, it force-writes <abort>, sends an ack message, and aborts; if it receives commit, it force-writes <commit>, sends an ack, and commits
5. When coordinator receives all ack, writes <end log>
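The five steps above can be sketched as a small in-memory simulation. This is only an illustration under simplifying assumptions: the class names `Coordinator` and `Subordinate`, the `vote_yes` flag, and the use of a Python list as a stand-in for the forced log are all hypothetical, and timeouts and network failures are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Subordinate:
    name: str
    vote_yes: bool = True                      # hypothetical knob: how this site votes
    log: list = field(default_factory=list)    # stand-in for the site's forced log

    def on_prepare(self) -> bool:
        # Step 2: force-write <prepare>, then answer yes or no.
        self.log.append("prepare")
        return self.vote_yes

    def on_decision(self, decision: str) -> str:
        # Step 4: force-write <commit> or <abort>, then send an ack.
        self.log.append(decision)
        return "ack"


@dataclass
class Coordinator:
    subordinates: list
    log: list = field(default_factory=list)

    def run(self) -> str:
        # Steps 1-2: send prepare to every subordinate and collect the votes.
        votes = [s.on_prepare() for s in self.subordinates]
        # Step 3: commit only if every vote is yes; log before sending.
        decision = "commit" if all(votes) else "abort"
        self.log.append(decision)
        acks = [s.on_decision(decision) for s in self.subordinates]
        # Step 5: once all acks are in, write <end log>.
        if all(a == "ack" for a in acks):
            self.log.append("end")
        return decision


subs = [Subordinate("s1"), Subordinate("s2"), Subordinate("s3")]
coordinator = Coordinator(subs)
outcome = coordinator.run()
```

A single no vote (for example `Subordinate("s2", vote_yes=False)`) makes `run()` return `"abort"`, and every subordinate then logs `abort` after its `prepare` entry.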
![Page 70: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/70.jpg)
Two Phase Commit
Restart after failure: each server recovers locally
1. If it finds a <commit> or <abort> log entry, then: redo or undo; if the server is the coordinator, then re-request all ack messages, then write <end log>
2. If it finds a <prepare> entry, then re-contact the coordinator to ask for commit/abort
3. If no <prepare>, <commit> or <abort>, presume abort
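The restart rules can be read as a decision over the local log. The sketch below is a hedged illustration: the function name `recovery_action`, the string log entries, and the returned descriptions are hypothetical, and it assumes "redo" pairs with a logged commit and "undo" with a logged abort. Skipping the ack round when an `end` entry is already present is also an assumption, not something the slide states.

```python
def recovery_action(log: list, is_coordinator: bool = False) -> str:
    """Return a description of what a restarting server should do."""
    # Rule 1: the outcome is already known locally.
    if "commit" in log or "abort" in log:
        local = "redo" if "commit" in log else "undo"
        if is_coordinator and "end" not in log:
            return f"{local}; re-request acks; write <end log>"
        return local
    # Rule 2: prepared but undecided, so the coordinator must be asked.
    if "prepare" in log:
        return "ask coordinator for commit/abort"
    # Rule 3: nothing was logged for this transaction.
    return "presume abort"
```

For instance, a subordinate whose log holds only `prepare` cannot decide on its own and has to wait for the coordinator's answer.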
![Page 71: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/71.jpg)
Two Phase Commit
• ACID properties, but expensive
• Relies on central coordinator: both performance bottleneck, and single-point-of-failure
• Solution: Paxos = distributed protocol
– Complex: will not discuss at all
CSEP544 - Fall 2015 71
![Page 72: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/72.jpg)
Vector Clocks
• An extension of Multiversion Concurrency Control (MVCC) to multiple servers
• Standard MVCC: each data item X has a timestamp t: X4, X9, X10, X14, …, Xt
• Vector Clocks: X has a set of [server, timestamp] pairs X([s1,t1], [s2,t2],…)
CSEP544 - Fall 2015 72
![Page 73: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/73.jpg)
Vector Clocks Dynamo:2007
![Page 78: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/78.jpg)
Vector Clocks: Example
• A client writes D1 at server SX: D1 ([SX,1])
• Another client reads D1, writes back D2; also handled by server SX: D2 ([SX,2]) (D1 garbage collected)
• Another client reads D2, writes back D3; handled by server SY: D3 ([SX,2], [SY,1])
• Another client reads D2, writes back D4; handled by server SZ: D4 ([SX,2], [SZ,1])
• Another client reads D3, D4: CONFLICT!
CSEP544 - Fall 2015 78
![Page 79: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/79.jpg)
Vector Clocks: Meaning
• A data item D[(S1,v1),(S2,v2),…] means a value that represents version v1 for S1, version v2 for S2, etc.
• If server Si updates D, then:
– It must increment vi, if (Si, vi) exists
– Otherwise, it must create a new entry (Si,1)
CSEP544 - Fall 2015 79
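The update rule can be sketched in a few lines. This is a minimal illustration, assuming a vector clock is represented as a Python dict from server name to version number; the function name `update` is hypothetical. The calls replay the D1 to D4 writes from the example slide.

```python
def update(clock: dict, server: str) -> dict:
    """Return the clock after `server` handles a write based on `clock`."""
    new_clock = dict(clock)                              # keep the old version intact
    new_clock[server] = new_clock.get(server, 0) + 1     # increment, or create (server, 1)
    return new_clock


d1 = update({}, "SX")    # first write, handled by SX
d2 = update(d1, "SX")    # read D1, write back at SX
d3 = update(d2, "SY")    # read D2, write back at SY
d4 = update(d2, "SZ")    # read D2, write back at SZ
```

Both `d3` and `d4` descend from `d2`, but neither descends from the other, which is why a client reading both sees a conflict.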
![Page 80: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/80.jpg)
Vector Clocks: Conflicts
• A data item D is an ancestor of D’ if for all (S,v)∈D there exists (S,v’)∈D’ s.t. v ≤ v’
• Otherwise, D and D’ are on parallel branches, and it means that they have a conflict that needs to be reconciled semantically
CSEP544 - Fall 2015 80
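The ancestor test and the resulting conflict check translate directly into code. This sketch assumes the same dict representation of a clock (server name to version); `is_ancestor` and `in_conflict` are hypothetical names.

```python
def is_ancestor(d: dict, d_prime: dict) -> bool:
    """D is an ancestor of D' if every (S, v) in D has (S, v') in D' with v <= v'."""
    return all(s in d_prime and v <= d_prime[s] for s, v in d.items())


def in_conflict(a: dict, b: dict) -> bool:
    """Two versions conflict when neither is an ancestor of the other."""
    return not is_ancestor(a, b) and not is_ancestor(b, a)
```

Applied to the rows of the "Conflict or not?" table, `in_conflict({"SX": 3, "SY": 6}, {"SX": 3, "SZ": 2})` is true, while `in_conflict({"SX": 3}, {"SX": 5})` is false because the first version is an ancestor of the second.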
![Page 90: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/90.jpg)
Vector Clocks: Conflict or not?

| Data 1 | Data 2 | Conflict? |
|---|---|---|
| ([SX,3],[SY,6]) | ([SX,3],[SZ,2]) | Yes |
| ([SX,3]) | ([SX,5]) | No |
| ([SX,3],[SY,6]) | ([SX,3],[SY,6],[SZ,2]) | No |
| ([SX,3],[SY,10]) | ([SX,3],[SY,6],[SZ,2]) | Yes |
| ([SX,3],[SY,10]) | ([SX,3],[SY,20],[SZ,2]) | No |

CSEP544 - Fall 2015 90
![Page 91: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/91.jpg)
CAP Theorem
Brewer 2000: You can only have two of the following three:
• Consistency
• Availability
• Tolerance to Partitions
CSEP544 - Fall 2015 91
![Page 92: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/92.jpg)
CAP Theorem: No Partitions
• CA = Consistency + Availability
• Single site database
• Cluster database
• Need 2 phase commit
• Need cache validation protocol
CSEP544 - Fall 2015 92
Brewer 2000
![Page 93: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/93.jpg)
CAP Theorem: No Availability
• CP = Consistency + tolerance to Partitions
• Distributed databases
• Majority protocols
• Make minority partitions unavailable
CSEP544 - Fall 2015 93 Brewer 2000
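A majority protocol can be reduced to one check: a partition keeps serving requests only if it holds a strict majority of the replicas. The sketch below is illustrative only; the function name `can_serve` and its parameters are hypothetical.

```python
def can_serve(partition_size: int, total_replicas: int) -> bool:
    """True if this side of a network partition holds a strict majority."""
    return 2 * partition_size > total_replicas
```

With 5 replicas split 3/2, only the side with 3 stays available. With 4 replicas split 2/2, neither side has a majority, so both become unavailable: consistency is kept at the cost of availability.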
![Page 94: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/94.jpg)
CAP Theorem: No Consistency
• AP = Availability + tolerance to Partitions
• DNS
• Web caching
CSEP544 - Fall 2015 94 Brewer 2000
![Page 95: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/95.jpg)
CAP Theorem: Criticism
• Not really a “theorem”, since the definitions are imprecise: a real theorem was proven a few years later, but under more limiting assumptions
• Many tradeoffs possible
• D. Abadi: “CP makes no sense” because it suggests the system is never available. A and C are asymmetric!
– No “C” = all the time
– No “A” = only when the network is partitioned
CSEP544 - Fall 2015 95
![Page 96: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/96.jpg)
Overview of No-SQL systems
CSEP544 - Fall 2015 96
Cattell, SIGMOD Record 2010
![Page 97: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/97.jpg)
Early “Proof of Concepts”
• Memcached: demonstrated that in-memory indexes (DHT) can be highly scalable
• Dynamo: pioneered eventual consistency for higher availability and scalability
• BigTable: demonstrated that persistent record storage can be scaled to thousands of nodes
CSEP544 - Fall 2015 97
Cattell, SIGMOD Record 2010
![Page 98: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/98.jpg)
ACID vs. BASE
• ACID = Atomicity, Consistency, Isolation, and Durability
• BASE = Basically Available, Soft state, Eventually consistent
CSEP544 - Fall 2015 98
Cattell, SIGMOD Record 2010
![Page 99: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/99.jpg)
Terminology
• Simple operations = key lookups, read/writes of one record, or a small number of records
• Sharding = horizontal partitioning by some key, and storing records on different servers in order to improve performance.
• Horizontal scalability = distribute both data and load over many servers
• Vertical scaling = when a DBMS uses multiple cores and/or CPUs
CSEP544 - Fall 2015 99
Cattell, SIGMOD Record 2010
Not exactly the same as horizontal partitioning
Definitely different from vertical partitioning
![Page 100: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/100.jpg)
Data Model
• Tuple = row in a relational db
• Document = nested values, extensible records (think XML or JSON)
• Extensible record = families of attributes have a schema, but new attributes may be added
• Object = like in a programming language, but without methods
CSEP544 - Fall 2015 100
Cattell, SIGMOD Record 2010
![Page 101: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/101.jpg)
1. Key-value Stores
Think “file system” more than “database”
• Persistence
• Replication
• Versioning
• Locking
• Transactions
• Sorting
CSEP544 - Fall 2015 101
Cattell, SIGMOD Record 2010
![Page 102: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/102.jpg)
1. Key-value Stores
• Voldemort, Riak, Redis, Scalaris, Tokyo Cabinet, Memcached/Membrain/Membase
• Consistent hashing (DHT)
• Only primary index: lookup by key
• No secondary indexes
• Transactions: single- or multi-update TXNs
– locks, or MVCC
CSEP544 - Fall 2015 102
Cattell, SIGMOD Record 2010
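Consistent hashing, which these stores use to map keys to servers, can be sketched as a sorted ring of hash points. This is a generic illustration, not the implementation of any system listed above: the class name `HashRing`, the `vnodes` parameter, and the choice of MD5 as the hash are all assumptions.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # Any stable hash works; MD5 is used here only for its even spread.
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, servers, vnodes: int = 100):
        # Each server owns `vnodes` points on the ring to even out the load.
        self._ring = sorted(
            (_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def lookup(self, key: str) -> str:
        # A key belongs to the first point clockwise from its hash (wrapping around).
        i = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[i][1]


ring = HashRing(["A", "B", "C"])
owner = ring.lookup("user42")
```

The design point is that adding a server only moves the keys that now fall just before one of the new server's points; every other key keeps its owner, unlike `hash(key) % n`, where changing `n` reshuffles almost everything.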
![Page 103: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/103.jpg)
2. Document Stores
• A "document" = a pointerless object = e.g. JSON = nested or not = schema-less
• In addition to KV stores, may have secondary indexes
CSEP544 - Fall 2015 103
Cattell, SIGMOD Record 2010
![Page 104: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/104.jpg)
2. Document Stores
• SimpleDB, CouchDB, MongoDB, Terrastore
• Scalability:
– Replication (e.g. SimpleDB, CouchDB – means entire db is replicated),
– Sharding (MongoDB);
– Both
CSEP544 - Fall 2015 104
Cattell, SIGMOD Record 2010
![Page 105: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/105.jpg)
3. Extensible Record Stores
• Based on Google’s BigTable
• Data model is rows and columns
• Scalability by splitting rows and columns over nodes
– Rows partitioned through sharding on primary key
– Columns of a table are distributed over multiple nodes by using “column groups”
• HBase is an open source implementation of BigTable
105
![Page 106: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/106.jpg)
Bigtable
• Distributed storage system
• Designed to
– Hold structured data
– Scale to thousands of servers
– Store up to PB
– Perform backend bulk processing
– Perform real-time data serving
• To scale, Bigtable has a limited set of features
106
![Page 107: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/107.jpg)
Bigtable Data Model
• Sparse, multidimensional sorted map: (row:string, column:string, time:int64) → string
Notice how everything but time is a string
• Example from Fig 1:
107
Columns are grouped into families
Chang, OSDI 2006
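The map signature above can be mimicked with a plain dictionary keyed by (row, column, time). This is a toy sketch, not Bigtable's API: the helper names `put`, `get_latest`, and `scan_rows` are hypothetical, and the sample cells are illustrative values in the style of a web-crawl table with a `contents:` column and an `anchor:` column family.

```python
table = {}  # (row: str, column: str, time: int) -> str


def put(row: str, column: str, time: int, value: str) -> None:
    table[(row, column, time)] = value


def get_latest(row: str, column: str):
    """Return the most recent version of a cell, or None if the cell is absent."""
    versions = [(t, v) for (r, c, t), v in table.items() if r == row and c == column]
    return max(versions, key=lambda tv: tv[0])[1] if versions else None


def scan_rows(prefix: str) -> list:
    """Row keys are kept in sorted order, so a prefix scan touches nearby rows."""
    return sorted({r for (r, _, _) in table if r.startswith(prefix)})


# Column names have the form family:qualifier, so columns group into families.
put("com.cnn.www", "contents:", 3, "<html>v3")
put("com.cnn.www", "contents:", 5, "<html>v5")
put("com.cnn.www", "contents:", 6, "<html>v6")
put("com.cnn.www", "anchor:cnnsi.com", 9, "CNN")
put("com.cnn.www", "anchor:my.look.ca", 8, "CNN.com")
```

The map is sparse because a row stores only the cells it actually has: most rows would have no `anchor:my.look.ca` column at all.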
![Page 108: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/108.jpg)
BigTable Key Features
• Reads/writes of data under a single row key are atomic
– Only single-row transactions!
• Data is stored in lexicographical order
– Improves data access locality
• Column families are the unit of access control
• Data is versioned (old versions garbage collected)
– Ex: most recent three crawls of each page, with times
108
Chang, OSDI 2006
![Page 109: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/109.jpg)
BigTable API
• Data definition
– Creating/deleting tables or column families
– Changing access control rights
• Data manipulation
– Writing or deleting values
– Supports single-row transactions
– Looking up values from individual rows
– Iterating over subset of data in the table
• Can select on rows, columns, and timestamps
CSE 344 - Fall 2013 109
Chang, OSDI 2006
![Page 110: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/110.jpg)
Megastore
• BigTable is implemented, used within Google
• Megastore is a layer on top of BigTable
– Transactions that span nodes
– A database schema defined in a SQL-like language
– Hierarchical paths that allow some limited joins
• Megastore is made available through the Google App Engine Datastore
110
Cattell, SIGMOD Record 2010
![Page 111: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/111.jpg)
4. Scalable Relational Systems
• Means RDBMSs that offer sharding
• Key difference: NoSQL systems make it difficult or impossible to perform large-scope operations and transactions (to ensure performance), while scalable RDBMSs do not *preclude* these operations, but users pay a price only when they need them.
• MySQL Cluster, VoltDB, Clustrix, ScaleDB, Megastore (the new BigTable)
CSEP544 - Fall 2015 111
Cattell, SIGMOD Record 2010
![Page 112: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/112.jpg)
Application 1
• Web application that needs to display lots of customer information; the user data is rarely updated, and when it is, you know when it changes because updates go through the same interface. Store this information persistently using a KV store.
CSEP544 - Fall 2015 112
Key-value store
Cattell, SIGMOD Record 2010
![Page 113: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/113.jpg)
Application 2
• Department of Motor Vehicles: look up objects by multiple fields (driver's name, license number, birth date, etc); "eventual consistency" is ok, since updates are usually performed at a single location.
CSEP544 - Fall 2015 113
Document Store
Cattell, SIGMOD Record 2010
![Page 114: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/114.jpg)
Application 3
• eBay-style application. Cluster customers by country; separate the rarely changed "core" customer information (address, email) from frequently-updated info (current bids).
CSEP544 - Fall 2015 114
Extensible Record Store
Cattell, SIGMOD Record 2010
![Page 115: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/115.jpg)
Application 4
• Everything else (e.g. a serious DMV application)
CSEP544 - Fall 2015 115
Scalable RDBMS
Cattell, SIGMOD Record 2010
![Page 116: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/116.jpg)
Criticism
CSEP544 - Fall 2015 116
![Page 117: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/117.jpg)
Criticism
• Two ways to improve OLTP performance:
– Sharding over shared-nothing
– Improve per-server OLTP performance
• Recent RDBMSs do provide sharding: Greenplum, Aster Data, Vertica, ParAccel
• Hence, the discussion is about single-node performance
CSEP544 - Fall 2015 117
Stonebraker, CACM’2010 (blog 1)
![Page 118: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/118.jpg)
Criticism (cont’d)
• Single-node performance:
• Major performance bottleneck: communication with DBMS using ODBC or JDBC
– Solution: stored procedures, OR embedded databases
• Server-side performance (next slide)
CSEP544 - Fall 2015 118
Stonebraker, CACM’2010 (blog 1)
![Page 119: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/119.jpg)
Criticism (cont’d)
Server-side performance: about 25% each
• Logging
– Everything written twice; log must be forced
• Locking
– Needed for ACID semantics
• Latching
– This is when the DBMS itself is multithreaded; e.g. latch for the lock table
• Buffer management
CSEP544 - Fall 2015 119
Stonebraker, CACM’2010 (blog 1)
![Page 120: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/120.jpg)
Criticism (cont’d)
Main take-away:
• NoSQL databases give up 1, or 2, or 3 of those features
• Thus, performance improvement can only be modest
• Need to give up all 4 features for significantly higher performance
• On the downside, NoSQL systems give up ACID
CSEP544 - Fall 2015 120
Stonebraker, CACM’2010 (blog 1)
![Page 121: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/121.jpg)
Criticism (cont’d)
Who are the customers of NoSQL?
• Lots of startups
• Very few enterprises. Why? Most applications are traditional OLTP on structured data; a few other applications around the “edges”, but considered less important
CSEP544 - Fall 2015 121
Stonebraker, CACM’2011 (blog 2)
![Page 122: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/122.jpg)
Criticism (cont’d)
• No ACID Equals No Interest
– Screwing up mission-critical data is no-no-no
• Low-level Query Language is Death
– Remember CODASYL?
• NoSQL means NoStandards
– One (typical) large enterprise has 10,000 databases. These need accepted standards
CSEP544 - Fall 2015 122
Stonebraker, CACM’2011 (blog 2)
![Page 123: CSEP 544: Lecture 10 - courses.cs.washington.edu...Column-Oriented Databases and NoSQL CSEP544 - Fall 2015 1 . Announcement Take home final: 12/9-10 • Online Webquiz – Need your](https://reader035.vdocuments.us/reader035/viewer/2022081618/60aa61ae265fe13d4220d208/html5/thumbnails/123.jpg)
End of CSEP 544
• “Big data” is here to stay
• Requires unique techniques/abstractions
– Logic (SQL, Relational Calculus)
– Conceptual modeling (FD’s)
– Algorithms (query processing)
– Transactions
• Technology evolving rapidly, but
• Techniques/abstractions persist over many years, e.g. What goes around
CSEP544 - Fall 2015 123