![Page 1: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/1.jpg)
Large-scale Incremental ProcessingUsing Distributed Transactions and Notifications
Daniel Peng and Frank DabekGoogle, Inc.OSDI 2010
15 Feb 2012Presentation @ IDB Lab. Seminar
Presented by Jee-bum Park
![Page 2: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/2.jpg)
2
Outline Introduction Design
– Bigtable overview– Transactions– Notifications
Evaluation Conclusion Good and Not So Good Things
![Page 3: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/3.jpg)
3
Introduction How can Google find the documents on the web so
fast?
![Page 4: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/4.jpg)
4
Introduction Google uses an index, built by the indexing sys-
tem, that can be used to answer search queries
![Page 5: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/5.jpg)
5
Introduction What does the indexing system do?
– Crawling every page on the web– Parsing the documents– Extracting links– Clustering duplicates– Inverting links– Computing PageRank– ...
![Page 7: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/7.jpg)
7
Introduction Compute PageRank using MapReduce
Job 1: compute R(1) Job 2: compute R(2) Job 3: compute R(3) ...
□
□
□
□
R(t) =
![Page 8: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/8.jpg)
8
Introduction Now, consider how to update that index after re-
crawling some small portion of the web
![Page 9: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/9.jpg)
9
Introduction Now, consider how to update that index after re-
crawling some small portion of the web
Is it okay to run the MapReducesover just new pages?
![Page 10: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/10.jpg)
10
Introduction Now, consider how to update that index after re-
crawling some small portion of the web
Is it okay to run the MapReducesover just new pages?
Nope, there are links between thenew pages and the rest of the web
![Page 11: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/11.jpg)
11
Introduction Now, consider how to update that index after re-
crawling some small portion of the web
Is it okay to run the MapReducesover just new pages?
Nope, there are links between thenew pages and the rest of the web
Well, how about this?
![Page 12: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/12.jpg)
12
Introduction Now, consider how to update that index after re-
crawling some small portion of the web
Is it okay to run the MapReducesover just new pages?
Nope, there are links between thenew pages and the rest of the web
Well, how about this?
MapReduces must be run again over the entire repository
![Page 13: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/13.jpg)
13
Introduction Google’s web search index was produced in this way
– Running over the entire pages
It was not a critical issue,– Because given enough computing resources, MapReduce’s
scalability makes this approach feasible
However, reprocessing the entire web– Discards the work done in earlier runs– Makes latency proportional to the size of the repository,
rather than the size of an update
![Page 14: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/14.jpg)
14
Introduction An ideal data processing system for the task of main-
taining the web search index would be optimized for incremental processing
Incremental processing system: Percolator
![Page 15: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/15.jpg)
15
Outline Introduction Design
– Bigtable overview– Transactions– Notifications
Evaluation Conclusion Good and Not So Good Things
![Page 16: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/16.jpg)
16
Design Percolator is built on top of the Bigtable distributed storage sys-
tem
A Percolator system consists of three binaries that run on every machine in the cluster– A Percolator worker– A Bigtable tablet server– A GFS chunkserver
All observers (user applications) are linked into the Percolator worker
![Page 17: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/17.jpg)
17
Design Dependencies
Observers
Percolator worker
Bigtable tablet server
GFS chunkserver
![Page 18: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/18.jpg)
Design System architecture
18
Node 1
Observers
Percolator worker
Bigtable tablet server
GFS chunkserver
Node 2
Observers
Percolator worker
Bigtable tablet server
GFS chunkserver
Node ...
Observers
Percolator worker
Bigtable tablet server
GFS chunkserver
Timestamp oracle ser-vice Lightweight lock service
![Page 19: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/19.jpg)
19
Design The Percolator worker
– Scans the Bigtable for changed columns– Invokes the corresponding observers as a function call in the
worker process
The observers– Perform transactions by sending read/write RPCs to Bigtable
tablet servers
Observers
Percolator worker
Bigtable tablet server
GFS chunkserver
![Page 20: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/20.jpg)
20
Design The Percolator worker
– Scans the Bigtable for changed columns– Invokes the corresponding observers as a function call in the
worker process
The observers– Perform transactions by sending read/write RPCs to Bigtable
tablet servers
Observers
Percolator worker
Bigtable tablet server
GFS chunkserver
1: scan
![Page 21: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/21.jpg)
21
Design The Percolator worker
– Scans the Bigtable for changed columns– Invokes the corresponding observers as a function call in the
worker process
The observers– Perform transactions by sending read/write RPCs to Bigtable
tablet servers
Observers
Percolator worker
Bigtable tablet server
GFS chunkserver
1: scan
2: in-voke
![Page 22: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/22.jpg)
22
Design The Percolator worker
– Scans the Bigtable for changed columns– Invokes the corresponding observers as a function call in the
worker process
The observers– Perform transactions by sending read/write RPCs to Bigtable
tablet servers
Observers
Percolator worker
Bigtable tablet server
GFS chunkserver
1: scan
2: in-voke3:
RPC
![Page 23: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/23.jpg)
Design The timestamp oracle service
– Provides strictly increasing timestamps A property required for correct operation of the snapshot isola-
tion protocol
The lightweight lock service– Workers use it to make the search for dirty notifications
more efficient
23
Timestamp oracle ser-vice Lightweight lock service
![Page 24: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/24.jpg)
24
Design Percolator provides two main abstractions
– Transactions Cross-row, cross-table with ACID snapshot-isolation semantics
– Observers Similar to database triggers or events
Transactions Observers Percolator
![Page 25: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/25.jpg)
25
Design – Bigtable overview Percolator is built on top of the Bigtable distributed
storage system
Bigtable presents a multi-dimensional sorted map to users– Keys are (row, column, timestamp) tuples
Bigtable provides lookup, update operations, and transactions on individual rows
Bigtable does not provide multi-row transactions
Observers
Percolator worker
Bigtable tablet server
GFS chunkserver
![Page 26: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/26.jpg)
26
Design – Transactions Percolator provides cross-row, cross-table transac-
tions with ACID snapshot-isolation semantics
![Page 27: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/27.jpg)
27
Design – Transactions Percolator stores multiple versions of each data
item using Bigtable’s timestamp dimension– Multiple versions are required to provide snapshot isola-
tion
Snapshot isolation
13
2
![Page 28: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/28.jpg)
28
Design – Transactions Case 1: use exclusive locks
1
![Page 29: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/29.jpg)
29
Design – Transactions Case 1: use exclusive locks
1
![Page 30: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/30.jpg)
30
Design – Transactions Case 1: use exclusive locks
1
2
![Page 31: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/31.jpg)
31
Design – Transactions Case 1: use exclusive locks
2
![Page 32: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/32.jpg)
32
Design – Transactions Case 1: use exclusive locks
2
![Page 33: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/33.jpg)
33
Design – Transactions Case 1: use exclusive locks
2
![Page 34: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/34.jpg)
34
Design – Transactions Case 2: do not use any locks
1
![Page 35: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/35.jpg)
35
Design – Transactions Case 2: do not use any locks
1
![Page 36: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/36.jpg)
36
Design – Transactions Case 2: do not use any locks
1
2
![Page 37: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/37.jpg)
37
Design – Transactions Case 2: do not use any locks
1
2
![Page 38: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/38.jpg)
38
Design – Transactions Case 2: do not use any locks
2
![Page 39: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/39.jpg)
39
Design – Transactions Case 2: do not use any locks
2
![Page 40: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/40.jpg)
40
Design – Transactions Case 2: do not use any locks
2
![Page 41: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/41.jpg)
41
Design – Transactions Case 3: use multiple versioning & timestamp
1
![Page 42: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/42.jpg)
42
Design – Transactions Case 3: use multiple versioning & timestamp
1
![Page 43: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/43.jpg)
43
Design – Transactions Case 3: use multiple versioning & timestamp
1
![Page 44: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/44.jpg)
44
Design – Transactions Case 3: use multiple versioning & timestamp
1
2
![Page 45: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/45.jpg)
45
Design – Transactions Case 3: use multiple versioning & timestamp
1
2
![Page 46: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/46.jpg)
46
Design – Transactions Case 3: use multiple versioning & timestamp
1
2
![Page 47: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/47.jpg)
47
Design – Transactions Case 3: use multiple versioning & timestamp
1
2
![Page 48: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/48.jpg)
48
Design – Transactions Case 3: use multiple versioning & timestamp
2
![Page 49: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/49.jpg)
49
Design – Transactions Case 3: use multiple versioning & timestamp
2
![Page 50: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/50.jpg)
50
Design – Transactions Case 3: use multiple versioning & timestamp
2
![Page 51: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/51.jpg)
51
Design – Transactions Case 3: use multiple versioning & timestamp
2
![Page 52: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/52.jpg)
52
Design – Transactions Percolator stores its locks in special in-memory col-
umns in the same Bigtable
![Page 53: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/53.jpg)
53
Design – Transactions Percolator transaction demo
![Page 54: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/54.jpg)
54
Design – Transactions Percolator transaction demo
![Page 55: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/55.jpg)
55
Design – Transactions Percolator transaction demo
![Page 56: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/56.jpg)
56
Design – Transactions Percolator transaction demo
![Page 57: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/57.jpg)
57
Design – Transactions Percolator transaction demo
![Page 58: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/58.jpg)
58
Design – Notifications In Percolator, the user writes code (“observers”) to
be triggered by changes to the table
Each observer registers a function and a set of col-umns
Percolator invokes the functions after data is written to one of those columns in any row
Observers
Percolator worker
Bigtable tablet server
GFS chunkserver
1: scan
2: in-voke3:
RPC
![Page 59: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/59.jpg)
59
A Percolator application
Design – Notifications Percolator applications are structured as a series of
observers– Each observer completes a task and creates more work for
“downstream” observers by writing to the table
Observer 1
Observer 2
Observer 4
Observer 5
Observer 3
Observer 6
![Page 60: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/60.jpg)
60
Google’s new indexing system
Design – Notifications
Document Processor (parse, extract links,
etc.)Clustering Exporter
Observers
Percolator worker
Bigtable tablet server
GFS chunkserver
1: scan
2: in-voke3:
RPC
![Page 61: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/61.jpg)
61
Design – Notifications To implement notifications, Percolator needs to effi-
ciently find dirty cells with observers that need to be run
To identify dirty cells, Percolator maintains a special “notify” Bigtable column, containing an entry for each dirty cell– When a transaction writes an observed cell, it also sets the
corresponding notify cell
![Page 62: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/62.jpg)
Design – Notifications Each Percolator worker chooses a portion of the table
to scan by picking a region of the table randomly– To avoid running observers on the same row concurrently,
each worker acquires a lock from a lightweight lock ser-vice before scanning the row
62
Timestamp oracle ser-vice Lightweight lock service
![Page 63: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/63.jpg)
63
Outline Introduction Design
– Bigtable overview– Transactions– Notifications
Evaluation Conclusion Good and Not So Good Things
![Page 64: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/64.jpg)
64
Evaluation Experiences with converting a MapReduce-based index-
ing pipeline to use Percolator
Latency– 100x faster than the previous system
Simplification– The number of observers in the new system: 10– The number of MapReduces in the previous system: 100
Easier to operate– Far fewer moving parts: tablet servers, Percolator workers,
chunkservers– In the old system, each of a hundred different MapReduces
needed to be individually configured and could independently fail
![Page 65: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/65.jpg)
65
Evaluation Crawl rate benchmark on 240 machines
![Page 66: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/66.jpg)
66
Evaluation Versus Bigtable
![Page 67: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/67.jpg)
67
Evaluation Fault-tolerance
![Page 68: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/68.jpg)
68
Outline Introduction Design
– Bigtable overview– Transactions– Notifications
Evaluation Conclusion Good and Not So Good Things
![Page 69: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/69.jpg)
69
Conclusion Percolator provides two main abstractions
– Transactions Cross-row, cross-table with ACID snapshot-isolation semantics
– Observers Similar to database triggers or events
Transactions Observers Percolator
![Page 70: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/70.jpg)
70
Outline Introduction Design
– Bigtable overview– Transactions– Notifications
Evaluation Conclusion Good and Not So Good Things
![Page 71: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/71.jpg)
71
Good and Not So Good Things Good things
– Simple and neat design– Purpose of use is clear– Detailed description based on real example: Google’s index-
ing system
Not so good things– Lack of observer examples (Google’s indexing system in par-
ticular)
![Page 72: Large-scale Incremental Processing Using Distributed Transactions and Notifications Daniel Peng and Frank Dabek Google, Inc. OSDI 2010 15 Feb 2012 Presentation](https://reader030.vdocuments.us/reader030/viewer/2022032414/56649ee15503460f94bf1d17/html5/thumbnails/72.jpg)
Thank You!
Any Questions or Comments?