nosql series-part-3-hypertable
TRANSCRIPT
Hypertable
Doug Judd
CEO, Hypertable, Inc.
High Performance, Open Source Scalable Database
  Modeled after Bigtable
  High Performance Implementation (C++)
  Project Started in March 2007
  Runs on top of HDFS
  Thrift Interface for all popular languages
    Java
    PHP
    Ruby
    Python
    Perl, etc.
Hypertable Deployments
Architecture
Underlying Data Representation
Scaling (part I)
Scaling (part II)
Scaling (part III)
Request Routing
Query Handling
Features
  Load data from HT to Hive and vice-versa
  Use Hive types
  Use Hive QL (joins, aggregations)
  Low latency data warehousing
  Uses Hypertable’s native MapReduce Input/Output format
Namespaces
  /development
    user
    tweet
  /testing
    user
    tweet
  /production
    /v1
      user
      tweet
    /v2
      user
      tweet
Column Family Options
  TTL=<t>
    “time to live”
    Remove cells that are older than <t>
  MAX_VERSIONS=<n>
    Keep only most recent <n> cell versions
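The effect of the two options can be sketched in a few lines of Python. This is only a model of the visible result, not Hypertable's implementation: the function name `prune_versions`, the `(timestamp, value)` cell shape, and the use of seconds for `<t>` are assumptions made for illustration, and the slide does not say when the pruning actually happens.

```python
from typing import List, Optional, Tuple

Cell = Tuple[float, str]  # (timestamp in seconds, value); shape assumed for illustration


def prune_versions(
    cells: List[Cell],
    now: float,
    ttl: Optional[float] = None,
    max_versions: Optional[int] = None,
) -> List[Cell]:
    """Return the versions of one cell that survive TTL and MAX_VERSIONS."""
    # Order versions newest first so "most recent <n>" is a simple slice.
    survivors = sorted(cells, key=lambda cell: cell[0], reverse=True)
    if ttl is not None:
        # TTL=<t>: drop versions whose age exceeds t.
        survivors = [cell for cell in survivors if now - cell[0] <= ttl]
    if max_versions is not None:
        # MAX_VERSIONS=<n>: keep only the n newest versions.
        survivors = survivors[:max_versions]
    return survivors


versions = [(100.0, "a"), (200.0, "b"), (300.0, "c")]
print(prune_versions(versions, now=310.0, ttl=150.0))
print(prune_versions(versions, now=310.0, max_versions=1))
```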
Access Groups
  Provides control over physical layout
    Row oriented
    Column oriented
    Hybrid
  Reduces I/O
CREATE TABLE MyTable (
  a,
  b,
  c,
  d,
  ACCESS GROUP first (a),
  ACCESS GROUP second (b, c, d)
);
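Why grouping columns reduces I/O can be shown with a deliberately simplified cost model: a read touches every access group that holds a requested column and pays for all columns stored in that group. The per-row byte sizes and the helper `bytes_read` are hypothetical; real costs depend on block layout, caching, and compression.

```python
from typing import Dict, List, Set


def bytes_read(
    access_groups: Dict[str, List[str]],
    column_sizes: Dict[str, int],
    wanted: Set[str],
) -> int:
    """Sum the sizes of every access group holding at least one wanted column."""
    total = 0
    for columns in access_groups.values():
        if wanted.intersection(columns):
            # The whole group is stored together, so all of it is read.
            total += sum(column_sizes[name] for name in columns)
    return total


sizes = {"a": 10, "b": 500, "c": 500, "d": 500}  # hypothetical bytes per row
row_oriented = {"default": ["a", "b", "c", "d"]}  # one group for everything
hybrid = {"first": ["a"], "second": ["b", "c", "d"]}  # layout from the slide

print(bytes_read(row_oriented, sizes, {"a"}))  # 1510
print(bytes_read(hybrid, sizes, {"a"}))        # 10
```

Under this model a query that only needs column a reads 10 bytes per row with the two access groups from the slide, against 1510 bytes when all four columns share one group.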
Regular Expression Filtering
  Google’s RE2 regular expression engine
    Extremely fast (up to 50X Java regex)
    Searches run in time linear in the size of the input
    Searches constrained to a fixed amount of memory
  Supported Searches:
    Row key
    Column qualifier
    Value
SELECT CELLS tag:/(?i)(nosql|bigtable)/ FROM MyTable WHERE ROW REGEXP "^\D+" AND VALUE REGEXP "(?i)hypertable";
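The three filters in the query above can be mimicked in plain Python to show which cells would pass. Note the substitutions: this uses Python's backtracking `re` module rather than RE2, so it does not have RE2's linear-time guarantee, and it assumes each pattern may match anywhere in the string (partial match). The cell tuple shape and `select_cells` are hypothetical.

```python
import re
from typing import Iterable, List, Tuple

Cell = Tuple[str, str, str, str]  # (row key, column family, qualifier, value)

ROW_RE = re.compile(r"^\D+")                     # ROW REGEXP "^\D+"
QUALIFIER_RE = re.compile(r"(?i)(nosql|bigtable)")  # tag:/(?i)(nosql|bigtable)/
VALUE_RE = re.compile(r"(?i)hypertable")         # VALUE REGEXP "(?i)hypertable"


def select_cells(cells: Iterable[Cell]) -> List[Cell]:
    """Keep cells in family 'tag' whose row, qualifier, and value all match."""
    return [
        cell
        for cell in cells
        if cell[1] == "tag"
        and ROW_RE.search(cell[0])
        and QUALIFIER_RE.search(cell[2])
        and VALUE_RE.search(cell[3])
    ]


sample = [
    ("alpha", "tag", "NoSQL", "Hypertable rocks"),  # passes all three filters
    ("123", "tag", "nosql", "hypertable"),          # row key starts with a digit
    ("beta", "tag", "sql", "hypertable"),           # qualifier does not match
    ("gamma", "tag", "bigtable", "hbase"),          # value does not match
]
print(select_cells(sample))
```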
Atomic Counters
  New column option:
CREATE TABLE counts (url COUNTER);
  Modified via existing API using specially formatted values:
    Value Format   Description
    [+]n           Increment counter by n
    -n             Decrement counter by n
    =n             Reset counter to n
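The value formats in the table can be read as a tiny grammar. The sketch below interprets them on the client side purely for illustration; in Hypertable the counter arithmetic is applied by the database, and the function name `apply_counter_update` and the rejection of anything outside the three listed formats are assumptions.

```python
import re

# Optional "+", "-" or "=" followed by digits, as in the table of value formats.
_COUNTER_UPDATE = re.compile(r"^([+\-=]?)(\d+)$")


def apply_counter_update(current: int, update: str) -> int:
    """Return the counter value after applying one specially formatted value."""
    match = _COUNTER_UPDATE.match(update.strip())
    if match is None:
        raise ValueError(f"not a counter update: {update!r}")
    operator, amount = match.group(1), int(match.group(2))
    if operator == "=":
        return amount            # =n resets the counter to n
    if operator == "-":
        return current - amount  # -n decrements by n
    return current + amount      # +n or bare n increments by n


value = 0
for update in ["+5", "3", "-2", "=10"]:
    value = apply_counter_update(value, update)
print(value)  # 10
```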
Group Commit
  Supports highly concurrent updates
  Trades minimum latency for better throughput
  Configurable commit interval per-table:
CREATE TABLE counts (
  url,
  domain
) GROUP_COMMIT_INTERVAL=100;
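The latency-for-throughput trade can be illustrated with a small simulation: updates wait for the next interval boundary and all updates in the same interval share one log sync. The arrival times are made up, treating the interval as milliseconds is an assumption (the slide gives no unit), and `group_commit` is a hypothetical helper, not Hypertable code.

```python
from typing import List, Tuple


def group_commit(arrivals_ms: List[int], interval_ms: int) -> Tuple[int, int]:
    """Simulate group commit.

    Returns (number of log syncs, longest wait in ms before an update commits).
    """
    commit_times = set()
    longest_wait = 0
    for arrival in arrivals_ms:
        # First interval boundary at or after the arrival (integer ceiling).
        commit_at = -(-arrival // interval_ms) * interval_ms
        commit_times.add(commit_at)
        longest_wait = max(longest_wait, commit_at - arrival)
    return len(commit_times), longest_wait


arrivals = [5, 20, 99, 100, 101, 250]  # hypothetical arrival times in ms
syncs, longest_wait = group_commit(arrivals, interval_ms=100)
print(f"{len(arrivals)} updates, {syncs} syncs, longest wait {longest_wait} ms")
```

With these inputs six updates need three syncs instead of six, while one update waits 99 ms for its commit.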
Compression
  Block compression
    Cell Store (SSTable) blocks
    Commit Log blocks
  Supported Compression Schemes:
    zlib
    lzo
    quicklz
    bmz
    none
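Block compression means each block is compressed on its own, so a reader can decompress a single block without touching the rest. A minimal sketch using Python's standard `zlib` module (one of the schemes listed above); the 64 KB block size and the helper names are assumptions, not Hypertable's settings or on-disk format.

```python
import zlib
from typing import List


def compress_blocks(data: bytes, block_size: int = 65536) -> List[bytes]:
    """Split data into fixed-size blocks and compress each block independently."""
    return [
        zlib.compress(data[offset:offset + block_size])
        for offset in range(0, len(data), block_size)
    ]


def read_block(blocks: List[bytes], index: int) -> bytes:
    """Decompress one block without reading any of the others."""
    return zlib.decompress(blocks[index])


payload = b"row-key:value;" * 10000  # repetitive sample data, 140000 bytes
blocks = compress_blocks(payload)
print(len(blocks), "blocks,", sum(len(block) for block in blocks), "compressed bytes")
```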
Bloom Filter
  Dramatically reduces disk access
  Associated with each Cell Store
  Tells you if key is definitively not present
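A generic Bloom filter fits in a few lines and shows why a negative answer is definitive while a positive one is only "maybe". The bit count, number of hashes, and SHA-256 based hashing below are illustrative choices, not Hypertable's parameters.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: str) -> Iterator[int]:
        # Derive num_hashes bit positions by salting the key with an index.
        for index in range(self.num_hashes):
            digest = hashlib.sha256(f"{index}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for position in self._positions(key):
            self.bits[position // 8] |= 1 << (position % 8)

    def might_contain(self, key: str) -> bool:
        # False means the key was never added; True may be a false positive.
        return all(
            self.bits[position // 8] & (1 << (position % 8))
            for position in self._positions(key)
        )


cell_store_filter = BloomFilter()
for row_key in ["row1", "row2", "row3"]:
    cell_store_filter.add(row_key)
print(cell_store_filter.might_contain("row2"))  # True
```

A store only needs to be read from disk when `might_contain` returns True, which is how the filter cuts disk access for keys that are absent.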
Performance Evaluation
Setup
  Modeled after test described in Bigtable paper
  1 Test Dispatcher, 4 Test Clients, 4 Tablet Servers
  Test was written entirely in Java
  Hardware
    1 X 1.8 GHz Dual-core Opteron
    10 GB RAM
    3X 250GB SATA drives
  Software
    HDFS 0.20.2 running on all 10 nodes, 3X replication
    HBase 0.20.4
    Hypertable 0.9.3.3
Latency
Throughput
  Test                               Hypertable Advantage Relative to HBase (%)
  Random Read Zipfian 80 GB          925
  Random Read Zipfian 20 GB          777
  Random Read Zipfian 2.5 GB         100
  Random Write 10KB values           51
  Random Write 1KB values            102
  Random Write 100 byte values       427
  Random Write 10 byte values        931
  Sequential Read 10KB values        1060
  Sequential Read 1KB values         68
  Sequential Read 100 byte values    129
  Scan 10KB values                   2
  Scan 1KB values                    58
  Scan 100 byte values               75
  Scan 10 byte values                220
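To read the percentages as speed-up factors, one plausible interpretation is advantage = (Hypertable throughput / HBase throughput - 1) x 100. That definition is an assumption; the slide does not state how the figure was computed. Under it, the conversion is:

```python
def throughput_ratio(advantage_percent: float) -> float:
    """Convert a percent advantage into a throughput ratio.

    Assumes advantage = (hypertable / hbase - 1) * 100, which the slide
    does not confirm.
    """
    return 1.0 + advantage_percent / 100.0


for test, advantage in [
    ("Random Read Zipfian 80 GB", 925),
    ("Random Read Zipfian 2.5 GB", 100),
    ("Scan 10KB values", 2),
]:
    print(f"{test}: {throughput_ratio(advantage):.2f}x")
```

Under that reading, 100% means twice the throughput and 925% means roughly ten times the throughput.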
Why does Performance Matter?
$$$
Upcoming Release (0.9.5)
  Last “alpha” release
  Release Date: February 15th 2011
  Features
    Automatic range balancing
    Asynchronous API
    Improved Monitoring System
Resources
Twitter: hypertable
Project Site: www.hypertable.org
Blog: blog.hypertable.com
Professional Support
Q&A