hypertable berlin buzzwords
DESCRIPTION
This presentation was given by Doug Judd at Berlin Buzzwords 2010.
TRANSCRIPT
Hypertable
Doug Judd
CEO, Hypertable, Inc.
High Performance, Open Source Scalable Database
Modeled after Bigtable
High Performance Implementation (C++)
Project Started in March 2007
Thrift Interface for all popular languages: Java, PHP, Ruby, Python, Perl, etc.
Bigtable: the infrastructure that Google is built on
YouTube, Blogger, Google Earth, Google Maps, Orkut (social network), Gmail, Google Analytics, Google Book Search, Google Code, Crawl Database, … plus 90 other Google services …
Functionality
Massive sparse tables of information
Single primary key index
Cells can have multiple timestamped versions
Not Relational
  No joins (not yet)
  No secondary indexes (not yet)
  Not a transaction system (not yet)
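The data model listed above (a sparse table, one primary key index, several timestamped versions per cell) can be pictured with a small in-memory structure. This is only a sketch under assumptions: the class name SparseTable and its methods are hypothetical and are not Hypertable's API.

```python
from collections import defaultdict


class SparseTable:
    """Toy sparse table; illustrative only, not Hypertable's API."""

    def __init__(self):
        # (row key, column) -> list of (timestamp, value), newest first.
        # Cells that were never written take no space: the table is sparse.
        self._cells = defaultdict(list)

    def put(self, row, column, value, timestamp):
        versions = self._cells[(row, column)]
        versions.append((timestamp, value))
        versions.sort(key=lambda version: version[0], reverse=True)

    def get(self, row, column, max_versions=1):
        """Return up to max_versions (timestamp, value) pairs, newest first."""
        return self._cells.get((row, column), [])[:max_versions]

    def scan(self, start_row, end_row):
        """Yield the newest value of each cell with start_row <= row < end_row."""
        for row, column in sorted(self._cells):
            if start_row <= row < end_row:
                yield row, column, self._cells[(row, column)][0][1]
```

Because the row key is the only index, every lookup is either a point read on a row key or a scan over a contiguous range of row keys.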
Hypertable Deployments
Other Architectures
Auto-Sharding
MongoDB
AsterData
Greenplum
MongoDB
Dynamo-based Hash Table Architectures
Cassandra
Project Voldemort
Riak
Eventual Consistency
Consistent Hashing
Order Preserving Partitioner (Cassandra)
www.recipezaar.com 1091721999…629750272
+ www.ribbonprinters.com 1091721999…965293103
/ 2 = www.rgb????i?pQdp?.??? 1091721999…297521687
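The arithmetic on this slide (add two keys as numbers, divide by two, read the result back as a key) can be reproduced in a few lines. This is a minimal sketch, assuming the keys are read as big-endian unsigned integers after right-padding the shorter one with zero bytes. The function name midpoint_key is hypothetical and this is not Cassandra's actual code.

```python
def midpoint_key(low: bytes, high: bytes) -> bytes:
    """Return the byte string halfway between two keys (illustrative only)."""
    width = max(len(low), len(high))
    # Right-pad with zero bytes so integer order matches byte-wise key order.
    low_int = int.from_bytes(low.ljust(width, b"\x00"), "big")
    high_int = int.from_bytes(high.ljust(width, b"\x00"), "big")
    middle = (low_int + high_int) // 2
    return middle.to_bytes(width, "big")
```

Calling midpoint_key(b"www.recipezaar.com", b"www.ribbonprinters.com") returns a key that sorts between the two inputs and keeps their shared prefix, but the remaining bytes are arbitrary and mostly unprintable, which is what the question marks on the slide stand for.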
Order Preserving Partitioner Balance Problem
Hypertable Architecture
Conceptual Table Layout
Table: Actual Representation
Range Distribution
Google Stack
Google File System
Google File System
System Overview
Log Structured Merge (LSM) Tree
Eliminates random I/O on writes
Converts random I/O to sequential I/O
Write path
  1. Commit log on disk (DFS)
  2. In-memory map
In-memory map gets “compacted” to disk
Disk files periodically get merged
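The write path above can be sketched as a toy store: every update is appended to a log and applied to an in-memory map, the map is written out as a sorted run when it fills up, and the runs are merged from time to time. All names here (TinyLSM, memtable_limit, and so on) are hypothetical, Python lists stand in for files on the DFS, and the code is an illustration of the general technique, not Hypertable's implementation.

```python
import bisect


class TinyLSM:
    """Toy log-structured merge store; illustrative only."""

    def __init__(self, memtable_limit=4):
        self.commit_log = []   # stands in for the append-only commit log
        self.memtable = {}     # in-memory map of recent updates
        self.runs = []         # immutable sorted runs, oldest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.commit_log.append((key, value))  # 1. sequential append to the log
        self.memtable[key] = value            # 2. update the in-memory map
        if len(self.memtable) >= self.memtable_limit:
            self.compact()

    def compact(self):
        """Write the in-memory map out as one sorted run (sequential I/O)."""
        if self.memtable:
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}
            self.commit_log.clear()  # logged updates are now durable in a run

    def merge_runs(self):
        """Merge all runs into one; newer values win over older ones."""
        merged = {}
        for run in self.runs:  # oldest first, so later runs overwrite
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):  # newest run first
            index = bisect.bisect_left(run, (key,))
            if index < len(run) and run[index][0] == key:
                return run[index][1]
        return None
```

No write ever modifies an existing run in place, which is how random writes turn into sequential ones. The cost moves to reads, which may have to consult several runs until a merge reduces their number.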
Range Server
Manages ranges of table data
CellCache: In-memory map containing recent updates
CellStore: On-disk (DFS) file containing “compacted” cell cache
Range Server: CellStore
Sequence of 65K blocks of compressed key/value pairs
Compression
Cell Store blocks are compressed
Commit Log updates are compressed
Supported Compression Schemes
  zlib (--best and --fast)
  lzo
  quicklz
  bmz
  none
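To make the idea of a file built from compressed blocks of key/value pairs concrete, here is a minimal sketch using Python's standard zlib module. The record framing (4-byte length prefixes), the exact block size of 65536 bytes, and the function names are assumptions made for illustration and do not describe the real CellStore format. Mapping the slide's --best and --fast options to zlib levels 9 and 1 is likewise an assumption.

```python
import zlib

BLOCK_SIZE = 65536  # assumed reading of the slide's "65K"


def build_blocks(pairs, block_size=BLOCK_SIZE, level=6):
    """Pack sorted (key, value) byte pairs into independently compressed blocks."""
    blocks, buffer = [], bytearray()
    for key, value in pairs:
        record = (len(key).to_bytes(4, "big") + key
                  + len(value).to_bytes(4, "big") + value)
        # Start a new block when this record would overflow the current one.
        if buffer and len(buffer) + len(record) > block_size:
            blocks.append(zlib.compress(bytes(buffer), level))
            buffer = bytearray()
        buffer += record
    if buffer:
        blocks.append(zlib.compress(bytes(buffer), level))
    return blocks


def read_block(block):
    """Decompress one block and return its (key, value) pairs in order."""
    data = zlib.decompress(block)
    pairs, offset = [], 0
    while offset < len(data):
        key_len = int.from_bytes(data[offset:offset + 4], "big")
        offset += 4
        key = data[offset:offset + key_len]
        offset += key_len
        value_len = int.from_bytes(data[offset:offset + 4], "big")
        offset += 4
        value = data[offset:offset + value_len]
        offset += value_len
        pairs.append((key, value))
    return pairs
```

Compressing each block on its own means a reader only has to decompress the block that can contain the key it is looking for. Passing level=9 or level=1 to build_blocks trades compression ratio against speed.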
Bloom Filter
Probabilistic data structure associated with every CellStore
Indicates if key is not present
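A Bloom filter can answer "definitely not present" but only "possibly present", which is why it is useful for skipping files that cannot contain a key. The sketch below shows the general technique. The class name, the parameters, and the use of SHA-256 to derive bit positions are choices made for illustration, not details taken from Hypertable.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter; illustrative only."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: bytes):
        # Derive num_hashes bit positions by salting the key with a seed byte.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: bytes):
        for position in self._positions(key):
            self.bits[position // 8] |= 1 << (position % 8)

    def might_contain(self, key: bytes) -> bool:
        """False means the key was never added; True means it may have been."""
        return all(self.bits[position // 8] & (1 << (position % 8))
                   for position in self._positions(key))
```

A reader would check might_contain before touching a file on disk: a False result lets it skip the file, while a True result still requires a real lookup because false positives are possible.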
Caching
Block Cache
  Caches CellStore blocks
  Blocks are cached uncompressed
  Dynamically adjusted size based on workload
Query Cache
  Caches query results
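A block cache with an adjustable size can be sketched as a byte-budgeted map that evicts entries when it runs over budget. The slides do not say which eviction policy is used, so least-recently-used eviction is an assumption here, as are the class and method names.

```python
from collections import OrderedDict


class BlockCache:
    """Byte-budgeted cache of uncompressed blocks; LRU eviction is assumed."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._blocks = OrderedDict()  # block id -> bytes, least recent first

    def get(self, block_id):
        block = self._blocks.get(block_id)
        if block is not None:
            self._blocks.move_to_end(block_id)  # mark as most recently used
        return block

    def put(self, block_id, data):
        if block_id in self._blocks:
            self.used_bytes -= len(self._blocks.pop(block_id))
        self._blocks[block_id] = data
        self.used_bytes += len(data)
        self._evict()

    def resize(self, capacity_bytes):
        """Grow or shrink the budget, e.g. when the workload changes."""
        self.capacity_bytes = capacity_bytes
        self._evict()

    def _evict(self):
        while self.used_bytes > self.capacity_bytes and self._blocks:
            _, evicted = self._blocks.popitem(last=False)
            self.used_bytes -= len(evicted)
```

Storing blocks uncompressed costs more memory per block but saves a decompression on every cache hit. The resize method is where a workload-driven policy would plug in, for example shrinking the cache under write-heavy load to leave more memory for the in-memory map.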
Dynamic Memory Adjustment
Performance Evaluation
Hypertable vs. HBase
Test Setup
Hypertable v0.9.3.2 (not yet released)
HBase 0.20.3
HDFS 0.20.2
10 machines
  3 Hyperspace / Zookeeper replicas
  1 Master / 4 Tablet Servers (5GB RAM)
  1 Test Dispatcher / 4 Test Clients
Machine profile
  1 X 1.8 GHz Dual-core Opteron
  10 GB RAM
  3 X 250 GB SATA drives
Random Write / Sequential Read
Random Read
Project Resources
Twitter: hypertable
www.hypertable.org
Professional Support