bigtable
DESCRIPTION
TRANSCRIPT
A Distributed Storage System for Structured Data
Authors: Fay Chang et. al.
Presenter: Zafar Gilani
Bigtable
1
Bigtable
Outline
• Introduction
• Data model
• Implementation
• Performance evaluation
• Conclusions
2
Bigtable
A distributed storage system ..
• .. for managing structured data.
• Used for demanding workloads, such as:
– Throughput oriented batch processing.
– Serving latency-sensitive data to the client.
• Dynamic control instead of relational model.
• Data locality properties (revisit later briefly).
3
Bigtable
Bigtable has achieved several goals
• Wide applicability: used for 60+ Google products, including:
– Google Analytics, Google Code, Google Earth, Google Maps and Gmail.
• Scalability (explain later under evaluation).
• High performance.
• High availability.
4
Bigtable
Outline
• Introduction
• Data model
• Implementation
• Performance evaluation
• Conclusions
5
Bigtable
Data model
• Essentially a sparse, distributed, persistent multi-dimensional sorted map.
• The map is indexed by a row key, column key and a timestamp.
• Atomic reads and writes over a single row.
6
Columns
Rows
Bigtable
Row and column range
• Row range dynamically partitioned into tablets.
• Data in lexicographic order.
• Allows data locality.
• Column keys grouped into column families.
• Each family has the same type.
• Allows access control and disk or memory accounting.
7
Bigtable
Row and column range
• Row range dynamically partitioned into tablets.
• Data in lexicographic order.
• Allows data locality.
• Column keys grouped into column families.
• Each family has the same type.
• Allows access control and disk or memory accounting.
8
Enables reasoning about data locality
9
Columns
Rows
10
Columns
Rows
Anchor is a column family
11
Columns
“com.bbc.www”
“anchor:bbcworld.com “anchor:weather
“BBC” “BBC.com”
Tablets
Bigtable
Outline
• Introduction
• Data model
• Implementation
• Performance evaluation
• Conclusions
12
Bigtable
Bigtable uses several other technologies
• Google File System to store log and data files.
• SSTable file format to store BigTable data.
• Chubby, a distributed lock service.
For more details on these technologies, refer to section 4 of the paper.
13
Bigtable
Implementation
14
Master responsibilities: -Assign tablets to tablet servers -Add/delete tablet servers -Balance tablet server load -GC -Schema changes
MASTER
TABLET SERVERS
INTERNET
CLIENT
Communicate directly to tablet servers
Bigtable
How data is stored?
15
A three-level hierarchy, similar to B+ trees.
Bigtable
Location hierarchy
16
Chubby file contains location of the root tablet.
Bigtable
Location hierarchy
17
Root tablet contains all tablet locations in Metadata
table.
Bigtable
Location hierarchy
18
Metadata table stores locations of actual tablets.
Bigtable
Location hierarchy
19
Client moves up the hierarchy (Metadata -> Root -> Chubby), if location of tablet is unknown or incorrect.
Bigtable
How data is served?
20
Bigtable
Tablet serving
21
Persistent
Bigtable
Tablet serving
22
Compactions occur regularly, advantages: -Shrinks memory usage. -Reduces amount of data read from log during recovery.
Compactions
Bigtable
Outline
• Introduction
• Data model
• Implementation
• Performance evaluation
• Conclusions
23
Bigtable
Benchmarks for perf evaluation
• Scan:
– Scans over values in a row range.
• Random reads from memory.
• Random reads/writes:
– R keys to be read/written spread over N clients.
• Sequential reads/writes:
– 0 to R-1 keys to be read/written spread over N clients.
24
Bigtable
Performance evaluation
25
Scan uses single RPC call and shows best performance.
Bigtable
Performance evaluation
26
Sequential reads are better than random reads, since
each fetched block is used to serve next requests.
Bigtable
Performance evaluation
27
Random read shows the worst performance. Fetching 64KB
every 1000 bytes is expensive.
Bigtable
Performance evaluation
28
Not linear, but scales well.
Bigtable
Outline
• Introduction
• Data model
• Implementation
• Performance evaluation
• Conclusions
29
Bigtable
Conclusions
• Bigtable: highly scalable and available, without compromising performance.
• Flexibility for Google – designed using their own data model.
• Custom design gives Google the ability to remove or minimize bottlenecks.
• Related work: – Apache Hbase (open source)
– Boxwood (though targeted at a lower/FS level)
30
A Distributed Storage System for Structured Data
Authors: Fay Chang et. al.
Presenter: Zafar Gilani
Bigtable
31
Bigtable
B+ Trees
• A tree with sorted data for:
– Efficient insertion, retrieval and removal of records.
• All records are stored at the leaf level, only keys stored in interior nodes.
32