![Page 1: Bigtable A Distributed Storage System for Structured Data ...eecs.csuohio.edu/~sschung/...BigTable_Updated.pdf · Google Bigtable A distributed storage system for managing structured](https://reader035.vdocuments.us/reader035/viewer/2022071219/60552276cc7dae494779b434/html5/thumbnails/1.jpg)
Bigtable: A Distributed Storage System for Structured Data by Google
SUNNIE CHUNG
CIS 612
Google Bigtable
• A distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers.
• Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance.
• These applications place very different demands on Bigtable, both in terms of:
  • Data size (from URLs to web pages to satellite imagery), and
  • Latency requirements (from backend bulk processing to real-time data serving).
Google Bigtable
• Wide Applicability
• Scalability
• High Performance
• High Availability
• Bigtable is used by:
  • Google Analytics
  • Google Finance
  • Google Earth
  • Google Search / Personalized Search
  • Orkut, etc.
Google Bigtable is
• a sparse,
• distributed,
• persistent,
• multi-dimensional sorted map, which is indexed by
  • a Row Key,
  • a Column Key, and
  • a Timestamp,
• where each value in the map is an uninterpreted array of bytes.
Data Model
(row key : string, column : string, timestamp : int64) → string
Fig: A slice of an example table (call it Webtable) storing web pages. The row key is "com.cnn.www". The "contents:" column holds three versions of the page (<html>…) at timestamps t3, t5, and t6. The "anchor:cnnsi.com" column holds "CNN" at t9, and the "anchor:my.look.ca" column holds "CNN.com" at t8.
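The map signature above can be tried out with a plain Python dictionary. This is only a minimal sketch, assuming an in-memory dict stands in for the table; the names `webtable`, `latest`, and `sorted_keys`, and the placeholder page contents, are hypothetical and not part of Bigtable.

```python
from typing import Dict, List, Optional, Tuple

# Key shape from the slide: (row key, column key, timestamp) -> bytes.
Key = Tuple[str, str, int]

# Hypothetical contents mirroring the Webtable figure.
webtable: Dict[Key, bytes] = {
    ("com.cnn.www", "contents:", 3): b"<html>...v3",
    ("com.cnn.www", "contents:", 5): b"<html>...v5",
    ("com.cnn.www", "contents:", 6): b"<html>...v6",
    ("com.cnn.www", "anchor:cnnsi.com", 9): b"CNN",
    ("com.cnn.www", "anchor:my.look.ca", 8): b"CNN.com",
}


def latest(table: Dict[Key, bytes], row: str, column: str) -> Optional[bytes]:
    """Return the value with the highest timestamp for one cell, if any."""
    versions = [(ts, value) for (r, c, ts), value in table.items()
                if r == row and c == column]
    if not versions:
        return None
    return max(versions, key=lambda pair: pair[0])[1]


def sorted_keys(table: Dict[Key, bytes]) -> List[Key]:
    """Order keys by row, then column, then newest timestamp first."""
    return sorted(table, key=lambda k: (k[0], k[1], -k[2]))
```

Calling `latest(webtable, "com.cnn.www", "contents:")` returns the t6 version, since it is the newest one stored for that cell.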
Bigtable Rows and Column Families
Rows:
• Row keys are arbitrary strings, currently up to 64KB in size; 10-100 bytes is a typical size
• Every read or write of data under a single row key is atomic
• Data is kept in lexicographic order by row key
• The row range of a table is dynamically partitioned; each row range is called a tablet
Column Families:
• Column keys are grouped into sets called column families
• Syntax: family:qualifier
• E.g., "anchor:my.look.ca" in the figure from the previous slide
• language:LanguageID stores the language of a web page
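A short sketch may help with the two ideas on this slide: the family:qualifier syntax and the partitioning of lexicographically ordered row keys into tablets. The function names and the split keys below are assumptions made for illustration, and the half-open row ranges are a simplification, not a statement about how Bigtable encodes tablet boundaries.

```python
import bisect
from typing import List, Tuple


def split_column_key(column_key: str) -> Tuple[str, str]:
    """Split 'family:qualifier' into its two parts (qualifier may be empty)."""
    family, sep, qualifier = column_key.partition(":")
    if not sep or not family:
        raise ValueError("expected family:qualifier, got %r" % column_key)
    return family, qualifier


def tablet_index(row_key: str, split_keys: List[str]) -> int:
    """Return which row range a key falls in.

    split_keys must be sorted. Range i covers keys from split_keys[i-1]
    (inclusive) up to split_keys[i] (exclusive); this layout is hypothetical.
    """
    return bisect.bisect_right(split_keys, row_key)


# Hypothetical split points dividing one table into three tablets.
splits = ["com.d", "org."]
```

Because row keys are compared as plain strings, `"com.cnn.www"` sorts before `"com.d"` and lands in the first range, while a key starting with `"org."` lands in the last one.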
Timestamps
• Multiple versions of the data are allowed in a single cell, indexed by timestamp
• Timestamps are 64-bit integer values
• Either Bigtable or the client can generate timestamps for the data
• Versions of a cell are stored in decreasing timestamp order, so the most recent version is read first
• Each column family can define its own garbage-collection setting, for example to keep only the most recent n versions
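The version handling described on this slide can be sketched as a small class that keeps a cell's versions newest first and discards all but the most recent n. The class name `VersionedCell` and its methods are hypothetical; this illustrates the keep-last-n policy only, not Bigtable's storage code.

```python
from typing import List, Tuple


class VersionedCell:
    """One cell holding (timestamp, value) pairs, newest first."""

    def __init__(self, max_versions: int) -> None:
        if max_versions < 1:
            raise ValueError("max_versions must be at least 1")
        self.max_versions = max_versions
        self._versions: List[Tuple[int, bytes]] = []

    def put(self, timestamp: int, value: bytes) -> None:
        # A write with an existing timestamp replaces that version.
        self._versions = [(ts, v) for ts, v in self._versions if ts != timestamp]
        self._versions.append((timestamp, value))
        # Keep decreasing timestamp order, then drop the oldest extras.
        self._versions.sort(key=lambda pair: pair[0], reverse=True)
        del self._versions[self.max_versions:]

    def read(self, count: int = 1) -> List[Tuple[int, bytes]]:
        """Return up to `count` versions, most recent first."""
        return self._versions[:count]
```

With `max_versions=2`, writing timestamps 3, 6, and 5 leaves only the versions at 6 and 5, in that order.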
Bigtable API
The Bigtable API provides functions for:
• Creating and deleting tables
• Creating and deleting column families
• Changing cluster, table, and column family metadata
• Writing and deleting values in rows
• Scanning data for particular column families, rows, and values, with filters applied
• Batched writes across row keys and atomic single-row updates
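To make the list of operations concrete, here is a toy in-memory table that supports a few of them. Everything in it is an assumption for illustration: `ToyTable` and its method names are invented here and are not the Bigtable client API, and timestamps are left out to keep the sketch short.

```python
from typing import Dict, Iterator, Set, Tuple


class ToyTable:
    """Hypothetical stand-in for a table: families, writes, deletes, scans."""

    def __init__(self) -> None:
        self.families: Set[str] = set()
        self.cells: Dict[Tuple[str, str], bytes] = {}

    @staticmethod
    def _family(column: str) -> str:
        return column.split(":", 1)[0]

    def create_family(self, family: str) -> None:
        self.families.add(family)

    def delete_family(self, family: str) -> None:
        # Dropping a family also drops every cell stored under it.
        self.families.discard(family)
        self.cells = {k: v for k, v in self.cells.items()
                      if self._family(k[1]) != family}

    def set(self, row: str, column: str, value: bytes) -> None:
        if self._family(column) not in self.families:
            raise KeyError("unknown column family for %r" % column)
        self.cells[(row, column)] = value

    def delete(self, row: str, column: str) -> None:
        self.cells.pop((row, column), None)

    def scan(self, family: str, row_prefix: str = "") -> Iterator[Tuple[str, str, bytes]]:
        """Yield (row, column, value) in sorted order, filtered by family and row prefix."""
        for row, column in sorted(self.cells):
            if row.startswith(row_prefix) and self._family(column) == family:
                yield row, column, self.cells[(row, column)]
```

A typical use is to create the `anchor` family, write a couple of anchors for `com.cnn.www`, delete one, and scan the family to see what remains.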
Bigtable Building Blocks
• Google File System
• Google SSTable
• Chubby
Bigtable Building Blocks
Google File System:
Stores log and data files
Google SSTable:
File format used internally to store Bigtable data
Chubby:
A highly available, persistent distributed lock service that uses Paxos to keep its replicas consistent in the face of failure
Provides a namespace of directories and small files. Each directory or file can be used as a lock, and reads and writes to a file are atomic
If Chubby becomes unavailable for an extended period, Bigtable becomes unavailable
Implementation
Three major components:
A library that is linked into every client
One master server: responsible for assigning tablets to tablet servers
Many tablet servers: dynamically added to and removed from a cluster to accommodate changes in workload
Recall: every table in Bigtable is dynamically partitioned by row key range. Each row range is called a tablet.
Implementation (Contd.)
One master server:
• Responsible for assigning tablets to tablet servers
• Detects the addition and expiration of tablet servers
• Balances tablet server load
• Garbage-collects files in GFS
Note: Most clients never communicate with the master, because clients do not rely on it for tablet location information
Many tablet servers:
• Dynamically added to and removed from a cluster to accommodate changes in workload
• Each tablet server manages a set of tablets
• Handles read and write requests to the tablets it has loaded
• Splits tablets that have grown too large
Tablets
Fig: Tablet Location Hierarchy. A Chubby file points to the root tablet (the first metadata tablet), the root tablet points to the other metadata tablets, and those point to the tablets of User Table 1 through User Table N.
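The three-level lookup in the figure can be sketched with sorted indexes and a binary search. This is an illustration under stated assumptions: the tablet names, the `"table,row"` key encoding, the `LAST` sentinel, and the tablet server names are all made up here, and each index entry is taken to cover keys up to and including its end key.

```python
import bisect
from typing import Dict, List, Tuple

# An index is a sorted list of (end_key, target) pairs.
Index = List[Tuple[str, str]]

LAST = "\U0010ffff"  # sentinel that sorts after every real key

# Level 0: the Chubby file names the root tablet.
chubby_file = "root-tablet"

# Levels 1 and 2: the root tablet indexes metadata tablets,
# and each metadata tablet indexes user tablets.
tablets: Dict[str, Index] = {
    "root-tablet": [("users,m", "meta-1"), (LAST, "meta-2")],
    "meta-1": [("users,g", "tabletserver-a"), ("users,m", "tabletserver-b")],
    "meta-2": [(LAST, "tabletserver-c")],
}


def lookup(index: Index, key: str) -> str:
    """Return the target of the first entry whose end key is >= key."""
    end_keys = [end for end, _ in index]
    return index[bisect.bisect_left(end_keys, key)][1]


def locate(table: str, row: str) -> str:
    """Walk Chubby file -> root tablet -> metadata tablet -> tablet server."""
    key = "%s,%s" % (table, row)
    metadata_tablet = lookup(tablets[chubby_file], key)
    return lookup(tablets[metadata_tablet], key)
```

A client library would normally cache the results of such lookups; the sketch recomputes them on every call for simplicity.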
Tablet Assignment
Master Server and Tablet Assignment:
• Each tablet is assigned to only one tablet server at a time
• The master keeps track of the set of live tablet servers and of which tablets are assigned or unassigned
• The master is responsible for:
  • Detecting when a tablet server is no longer serving its tablets
  • Keeping track of its lock / session with Chubby
Steps Taken by Master at Startup:
• Grab a unique master lock in Chubby
• Scan the servers directory in Chubby to find live servers
• Communicate with every live tablet server to learn which tablets are already assigned
• Scan the metadata table to learn the full set of tablets; any tablet not already assigned is added to the set of unassigned tablets
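The startup steps can be walked through with plain Python collections standing in for Chubby, the servers directory, and the metadata table. This is a hedged sketch: `master_startup`, the lock name `"master-lock"`, and the shape of the arguments are assumptions for illustration, and no real RPCs or Chubby calls are involved.

```python
from typing import Dict, Set, Tuple


def master_startup(
    chubby_locks: Set[str],
    servers_dir: Dict[str, Set[str]],
    metadata_tablets: Set[str],
) -> Tuple[Dict[str, str], Set[str]]:
    """Return (tablet -> server assignments, unassigned tablets)."""
    # Step 1: grab the unique master lock; fail if another master holds it.
    if "master-lock" in chubby_locks:
        raise RuntimeError("another master already holds the lock")
    chubby_locks.add("master-lock")

    # Step 2: scan the servers directory to find live tablet servers.
    live_servers = sorted(servers_dir)

    # Step 3: ask every live server which tablets it is already serving.
    assigned: Dict[str, str] = {}
    for server in live_servers:
        for tablet in servers_dir[server]:
            assigned[tablet] = server

    # Step 4: scan the metadata table; tablets nobody serves are unassigned.
    unassigned = metadata_tablets - set(assigned)
    return assigned, unassigned
```

In this toy model `servers_dir` maps each live server to the tablets it reports, which merges the directory scan and the per-server query into one data structure.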
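Tablet Representation
Fig: Tablet representation. Write operations go to the tablet log in GFS and then to the memtable in memory. Read operations see the memtable together with the SSTable files stored in GFS.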
Tablet Serving
Read Operation on Tablet
• The request is checked for well-formedness and proper authorization
• A valid read operation is executed on a merged view of the sequence of SSTables and the memtable
Write Operation on Tablet
When a write operation arrives at a tablet server:
• The server checks it for well-formedness and proper authorization
• A valid mutation is written to the commit log
• Group commit is used to improve throughput
• After the write is committed, its contents are inserted into the memtable
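The read and write paths above can be sketched with a list for the commit log and dictionaries for the memtable and SSTables. This is a simplified model under stated assumptions: `ToyTablet` is a hypothetical name, authorization and group commit are omitted, and the only well-formedness check is that the key is non-empty.

```python
from typing import Dict, List, Optional, Tuple


class ToyTablet:
    """In-memory model: commit log, memtable, and immutable SSTables."""

    def __init__(self, sstables: List[Dict[str, bytes]]) -> None:
        self.commit_log: List[Tuple[str, bytes]] = []
        self.memtable: Dict[str, bytes] = {}
        self.sstables = sstables  # ordered oldest first

    def write(self, key: str, value: bytes) -> None:
        if not key:
            raise ValueError("malformed mutation: empty key")
        # The mutation is logged first, then applied to the memtable.
        self.commit_log.append((key, value))
        self.memtable[key] = value

    def read(self, key: str) -> Optional[bytes]:
        # Merged view: the memtable holds the newest data, then newer SSTables.
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):
            if key in sstable:
                return sstable[key]
        return None
```

Checking the memtable before the SSTables, and newer SSTables before older ones, is what makes the most recent write win in the merged view.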
Tablet Compactions
Everything that happens with data here is atomic!!!
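Minor Compactions:
• When the memtable reaches a size threshold, it is frozen and written to GFS as a new SSTable
• Shrinks the memory usage of the tablet server
• Reduces the amount of data that has to be read from the commit log during recovery after a failure
Merging Compactions:
• Performed periodically in the background to bound the number of SSTables created by minor compactions
Major Compactions:
• A merging compaction that rewrites all SSTables into exactly one SSTable
• Bigtable performs this operation regularly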
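A minimal sketch of the minor and major compactions, assuming dictionaries stand in for the memtable and for SSTables. The function names are hypothetical, and details such as size thresholds, deletion markers, and writing files to GFS are deliberately left out.

```python
from typing import Dict, List


def minor_compaction(memtable: Dict[str, bytes],
                     sstables: List[Dict[str, bytes]]) -> None:
    """Turn the current memtable into a new SSTable and empty the memtable."""
    sstables.append(dict(sorted(memtable.items())))
    memtable.clear()


def major_compaction(sstables: List[Dict[str, bytes]]) -> None:
    """Rewrite all SSTables into exactly one."""
    merged: Dict[str, bytes] = {}
    for sstable in sstables:  # oldest first, so newer values overwrite older ones
        merged.update(sstable)
    sstables[:] = [dict(sorted(merged.items()))]
```

After a minor compaction the memtable is empty, which is why less of the commit log needs replaying on recovery; after a major compaction exactly one SSTable remains.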
Refinements
• Locality Groups
• Compression
• Caching for Read performance
• Bloom Filters
• Commit Log Implementation
• Speeding up Tablet recovery
• Exploiting immutability
Summary
Bigtable is a distributed storage system for storing structured data at Google.
In production use since 2005; as of August 2006, more than 60 projects were using Bigtable.
Performance, high availability, and scalability are the key features for most of its clients.
Control over the architecture allows Google to customize the product as needed.
Use by old and new clients demonstrates that the Bigtable architecture works.
These lecture notes are based on the paper published by Google in OSDI 2006:
http://grail.csuohio.edu/~sschung/cis612/googlebigtable-osdi06.pdf