![Page 1: Bigtable : A Distributed Storage System for Structured Data Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach Mike Burrows,](https://reader035.vdocuments.us/reader035/viewer/2022062311/5a4d1b8d7f8b9ab0599bf985/html5/thumbnails/1.jpg)
Bigtable: A Distributed Storage System for Structured Data
Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber
Google, Inc.
OSDI '06
August 24, 2011
Hye Chan Bae
Contents
– Introduction
– Data Model
– Implementation
– Refinements
– Evaluation
– Conclusions
Managing structured data
– GFS (Google File System) for tremendous amounts of data
– Structured data
Managing structured data
We need a database-like storage system in a distributed environment!!
→ Bigtable
Bigtable
A distributed storage system
– Wide applicability
– Scalability
– High performance
– High availability
Used by more than 60 Google products and projects
– Google Analytics
– Google Finance
– Orkut
– Personalized Search
– Writely
– Google Earth
– …
Bigtable & Database
Bigtable ≒ Database?
– Shares many implementation strategies with databases
[Figure: a table with columns Column1 to Column3 and rows Row1 to Row3; layer labels: Table, Database, File System, Bigtable, GFS]
Bigtable & Database (cont.)
Bigtable ≠ Database!!
– Does not support a full relational data model
– But provides clients with a simple data model
Table Structure
Extends the table concept of relational databases
– Table, Row, Column
[Figure: an RDB table (Row, Column, Data) next to a Bigtable table (Row Key, Column Family, Column, Timestamp, S.D)]
Multi-Dimensional Sorted Map
[Figure: a Bigtable table indexed by Row Key, Column Family, Column, and Timestamp, holding S.D]
(row:string, column:string, time:int64) → string
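The map signature above can be sketched in Python as follows. This is only an illustration, not Bigtable's API: the `SortedMap` class and its method names are hypothetical, and an in-memory dict stands in for the real storage. Versions of a cell are listed newest first.

```python
class SortedMap:
    """Toy (row, column, timestamp) -> value map; all names are hypothetical."""

    def __init__(self):
        self._cells = {}  # (row, column, timestamp) -> value

    def put(self, row: str, column: str, timestamp: int, value: str) -> None:
        self._cells[(row, column, timestamp)] = value

    def get(self, row: str, column: str):
        """Return the most recent value of a cell, or None if absent."""
        versions = [(ts, v) for (r, c, ts), v in self._cells.items()
                    if r == row and c == column]
        return max(versions)[1] if versions else None

    def scan(self):
        """Yield cells sorted by row, then column, then newest timestamp first."""
        for key in sorted(self._cells, key=lambda k: (k[0], k[1], -k[2])):
            yield key, self._cells[key]


table = SortedMap()
table.put("com.google.www", "contents:", 1, "<html>v1</html>")
table.put("com.google.www", "contents:", 2, "<html>v2</html>")
table.put("com.google.maps", "contents:", 1, "<html>maps</html>")
```

Here `table.get("com.google.www", "contents:")` returns the value written with timestamp 2, and `scan()` visits `com.google.maps` before `com.google.www`.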
Example: Webtable
A kind of Bigtable
– Want to keep a copy of a large collection of web pages
– Could be used by many different projects
[Figure: a Webtable row, labeled with Row Key, Column Family, Column Key (family:qualifier), and Timestamp]
Row keys are reversed hostnames
– images.google.com → com.google.images
– maps.google.com → com.google.maps
– www.google.com → com.google.www
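The hostname-to-row-key mapping shown on this slide can be reproduced with a few lines of Python. The function name `row_key` is an assumption made for this sketch.

```python
def row_key(hostname: str) -> str:
    """Reverse the dot-separated components of a hostname."""
    return ".".join(reversed(hostname.split(".")))


hosts = ["images.google.com", "maps.google.com", "www.google.com"]
keys = sorted(row_key(h) for h in hosts)
```

Because rows are kept in sorted order, reversing the hostname places pages from the same domain next to each other.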
Tablet
Data is distributed across a number of commodity servers
A table can be split into row ranges
– Each row range is called a "tablet"
Tablets are distributed and managed
[Figure: a Table split by Row into Tablets, each assigned to a Server]
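A minimal sketch of splitting a table into row ranges and finding the tablet that covers a given row. It makes several assumptions: tablets are cut by row count rather than by size, the helper names are hypothetical, the round-robin server assignment is invented for illustration, and `org.example.www` is a made-up row key.

```python
import bisect


def split_into_tablets(sorted_row_keys, rows_per_tablet):
    """Cut a sorted list of row keys into contiguous row ranges."""
    return [sorted_row_keys[i:i + rows_per_tablet]
            for i in range(0, len(sorted_row_keys), rows_per_tablet)]


def find_tablet(tablets, row_key):
    """Return the index of the tablet whose row range covers row_key."""
    end_keys = [t[-1] for t in tablets]  # last row key of each tablet
    i = bisect.bisect_left(end_keys, row_key)
    return min(i, len(tablets) - 1)      # keys past the end go to the last tablet


rows = sorted(["com.cnn.www", "com.google.images", "com.google.maps",
               "com.google.www", "org.example.www"])
tablets = split_into_tablets(rows, 2)
# Hypothetical round-robin assignment of tablets to two servers.
servers = {i: f"server-{i % 2}" for i in range(len(tablets))}
```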
3 components
– Master
– Tablet Server
– Client
[Figure: Client, Master, and Tablet Servers, together with GFS and Chubby]
Tablet Structure
SSTable
– A read-only table for lookups, stored in GFS
– A tablet is composed of SSTables
– Data & Index
The index is loaded into memory when the SSTable is opened
[Figure: an SSTable holding Data and an Index; the Index lists Key1, Key2, Key3, …, KeyN]
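The data-plus-index layout can be sketched as below. This is a simplified model, not the real file format: the `SSTable` class, its `block_size` parameter, and the use of in-memory lists in place of a GFS file are all assumptions.

```python
import bisect


class SSTable:
    """Immutable sorted key-value table, modeled with in-memory lists."""

    def __init__(self, items, block_size=2):
        data = sorted(items.items())
        # Data: sorted entries grouped into fixed-size blocks.
        self._blocks = [data[i:i + block_size]
                        for i in range(0, len(data), block_size)]
        # Index: last key of every block, held in memory once "opened".
        self._index = [block[-1][0] for block in self._blocks]

    def get(self, key):
        """Binary-search the index, then look inside a single block."""
        i = bisect.bisect_left(self._index, key)
        if i == len(self._blocks):
            return None  # key is larger than every stored key
        return dict(self._blocks[i]).get(key)


sst = SSTable({"Key1": "a", "Key2": "b", "Key3": "c", "Key4": "d", "Key5": "e"})
```

With the index in memory, a lookup touches only one data block, which is the point of loading the index when the SSTable is opened.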
Tablet Structure (cont.)
Memtable
– An SSTable can't be updated (read-only table)
– A small writable table in memory, one per tablet
– The commit log file is created and updated before a write is applied
[Figure: a Tablet Server holding the memtable in memory; the Commit Log and SSTables reside in GFS]
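The ordering described above (commit log first, memtable second) can be sketched like this. The `Tablet` class is hypothetical, and the flush threshold is an assumption added so the example also shows a full memtable being frozen into a new read-only SSTable.

```python
class Tablet:
    """Toy tablet: a commit log, a memtable, and read-only SSTables."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []  # stands in for the log file in GFS
        self.memtable = {}    # small writable table in memory
        self.sstables = []    # read-only tables, newest last
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.commit_log.append((key, value))  # 1. log the mutation first
        self.memtable[key] = value            # 2. then apply it in memory
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        """Freeze the memtable into a new read-only SSTable."""
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}


tablet = Tablet()
for k, v in [("row1", "a"), ("row2", "b"), ("row3", "c")]:
    tablet.write(k, v)
```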
Tablet Serving
Write operation
[Figure: the write path, in three steps ① to ③]
Tablet Serving (cont.)
Read operation
[Figure: the read path, steps ① and ②]
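The slide shows the read path only as a figure. In the Bigtable paper, a read is served from a merged view of the memtable and the SSTables. A minimal sketch of that idea follows, assuming plain dicts for both and a hypothetical `read` helper:

```python
def read(key, memtable, sstables):
    """Look up key in the memtable first, then in SSTables from newest to oldest."""
    if key in memtable:
        return memtable[key]
    for sstable in reversed(sstables):  # the newest SSTable is last in the list
        if key in sstable:
            return sstable[key]
    return None


memtable = {"row3": "c2"}
sstables = [{"row1": "a", "row3": "c0"}, {"row1": "a1", "row2": "b"}]
```

The most recent write wins: `row3` comes from the memtable, and `row1` comes from the newer of the two SSTables.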
Accessing Tablet from Client
METADATA
– Holds information about tablets
– Is itself a table, and is split into tablets
Searching for a tablet location
– Basically, a three-level hierarchy analogous to a B+ tree is used
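A rough sketch of the three-level lookup, assuming (as in the Bigtable paper) that a file in Chubby points to a root tablet, the root tablet points to METADATA tablets, and METADATA tablets point to user tablets. The data layout, the server names, and the `lookup` and `locate` helpers are hypothetical simplifications.

```python
import bisect


def lookup(level, row_key):
    """level is a sorted list of (end_row_key, target); return the covering target."""
    end_keys = [end for end, _ in level]
    i = bisect.bisect_left(end_keys, row_key)
    return level[i][1]


# Level 3: METADATA tablets map row ranges to user tablet locations.
meta_a = [("com.google.maps", "server-1"), ("com.google.www", "server-2")]
meta_b = [("org.example.www", "server-3"), ("\uffff", "server-4")]
# Level 2: the root tablet maps row ranges to METADATA tablets.
root_tablet = [("com.google.www", meta_a), ("\uffff", meta_b)]
# Level 1: a file in Chubby stores the location of the root tablet.
chubby_file = root_tablet


def locate(row_key):
    root = chubby_file                       # 1. find the root tablet
    metadata_tablet = lookup(root, row_key)  # 2. root tablet -> METADATA tablet
    return lookup(metadata_tablet, row_key)  # 3. METADATA tablet -> tablet server
```

Each level narrows the search by row range, which is what makes the structure resemble a B+ tree.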
Locality Groups
Some applications use only specific column families
Clients can group multiple column families together
– A separate SSTable stores each locality group
– More efficient reading
[Figure: locality groups in Webtable]
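A small sketch of the idea, assuming a Webtable-like row whose page metadata (language, checksum) and page contents are placed in different locality groups. The group names and the `split_by_locality_group` helper are hypothetical.

```python
# Hypothetical assignment of Webtable column families to locality groups.
LOCALITY_GROUPS = {
    "metadata": ["language", "checksum"],
    "content": ["contents"],
}


def split_by_locality_group(row):
    """Split one row's {family: value} cells into one dict per locality group."""
    return {group: {fam: row[fam] for fam in families if fam in row}
            for group, families in LOCALITY_GROUPS.items()}


row = {"language": "EN", "checksum": "abc123", "contents": "<html>...</html>"}
stores = split_by_locality_group(row)
```

An application that needs only the page metadata can then read `stores["metadata"]` without touching the much larger page contents.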
Caching for read performance
Scan Cache
– Higher-level cache
– Caches key-value pairs
– Most useful for applications that tend to read the same data repeatedly
Block Cache
– Lower-level cache
– Caches SSTable blocks that were read from GFS
– Useful for applications that tend to read sequential data
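The two cache levels can be sketched with a small LRU cache. Everything here is an assumption for illustration: the `LRUCache` class, the capacities, the two-block `BLOCKS` table standing in for SSTable blocks in GFS, and the toy rule that maps a key to its block.

```python
from collections import OrderedDict


class LRUCache:
    """Small least-recently-used cache."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used


BLOCKS = {0: {"a": 1, "b": 2}, 1: {"c": 3, "d": 4}}  # stands in for blocks in GFS
gfs_reads = []             # records which blocks were fetched from "GFS"
scan_cache = LRUCache(2)   # higher level: key -> value
block_cache = LRUCache(1)  # lower level: block id -> block


def read(key):
    value = scan_cache.get(key)
    if value is not None:
        return value                  # the same key was read recently
    block_id = 0 if key < "c" else 1  # toy index lookup
    block = block_cache.get(block_id)
    if block is None:
        gfs_reads.append(block_id)    # block cache miss: fetch from GFS
        block = BLOCKS[block_id]
        block_cache.put(block_id, block)
    scan_cache.put(key, block[key])
    return block[key]
```

Reading `a` and then `b` fetches block 0 only once (a block cache hit for the neighboring key), and reading `a` again is answered by the scan cache.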
Commit-log implementation
A separate log file per tablet
– Would mean a large number of log files
– Could cause a large number of disk seeks
So the tablets on a tablet server share a single commit log
For recovery,
– The commit log entries are sorted by the key <table, row name, log sequence number>
[Figure: a Tablet Server with one Commit Log per Tablet, compared with a Tablet Server whose Tablets share a single Commit Log]
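A sketch of the recovery-time sort, assuming log entries are tuples of (table, row name, log sequence number, mutation). The table names, row names, and mutation strings are made up for illustration.

```python
from itertools import groupby

# Entries of one shared commit log, in the order they were appended.
shared_log = [
    ("webtable", "com.google.www", 3, "set contents:"),
    ("users", "alice", 1, "set name:"),
    ("webtable", "com.cnn.www", 2, "set anchor:"),
    ("webtable", "com.google.www", 5, "delete contents:"),
    ("users", "alice", 4, "set email:"),
]

# Sort by <table, row name, log sequence number>.
sorted_log = sorted(shared_log, key=lambda e: (e[0], e[1], e[2]))

# After sorting, each table's mutations are contiguous, and each row's
# mutations appear in log sequence order.
per_table = {table: [e[2] for e in entries]
             for table, entries in groupby(sorted_log, key=lambda e: e[0])}
```

After the sort, a server recovering one tablet can read its mutations as one contiguous run instead of scanning the whole interleaved log.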
Cluster setup
1,786 machines
– 2 × 400 GB IDE hard drives
– 2 × 2 GHz dual-core Opteron chips
– A single gigabit Ethernet link
Used the same number of clients as tablet servers
Read and wrote 1000-byte values to Bigtable
Values read/written per second
Conclusions
As of August 2006, more than 60 projects are using Bigtable
– Users like the performance and high availability
– Cluster capacity can be scaled by simply adding machines
Unusual interface
– An open question is how difficult it has been for users to adapt to it
– Still, many Google products use Bigtable successfully in practice
Future work
– Support for secondary indices
– Building cross-data-center infrastructure