scaling vividcortex's big data systems on mysql



Page 1: Scaling VividCortex's Big Data Systems on MySQL

SCALING VIVIDCORTEX'S BIG DATA SYSTEMS ON MYSQL

BARON SCHWARTZ

SCALE FEBRUARY 2015

Page 2: Scaling VividCortex's Big Data Systems on MySQL

ABOUT VIVIDCORTEX

VIVIDCORTEX IS THE BEST WAY TO SEE WHAT YOUR PRODUCTION MYSQL SERVERS ARE DOING

CAPTURES THOUSANDS OF METRICS IN ONE-SECOND RESOLUTION FROM YOUR PRODUCTION SYSTEMS

NO MORE SLOW-QUERY-LOG ANALYSIS AND PAINFUL MANUAL CONFIGURATION — GET INSIGHT IN SECONDS, NOT HOURS

AWESOME USER INTERFACE

FREE TRIAL, NO-RISK: VIVIDCORTEX.COM/

Page 3: Scaling VividCortex's Big Data Systems on MySQL
Page 4: Scaling VividCortex's Big Data Systems on MySQL

WHAT IS TIME-SERIES DATA?

ANY MEASUREMENTS TAKEN AT A SPECIFIC POINT IN TIME

STOCK TICKERS, WEATHER DATA, TWEETS (?)

FOR TODAY'S PURPOSES, LOTS AND LOTS OF:

A MEASUREMENT (VALUE)

OF A SPECIFIC METRIC OF INTEREST

FROM A PARTICULAR HOST/SOURCE

AT A SPECIFIC MOMENT IN TIME

Page 5: Scaling VividCortex's Big Data Systems on MySQL

POPULAR TIME-SERIES DATABASES

RRDTOOL

GRAPHITE (WHISPER)

HBASE, CASSANDRA, OPENTSDB, ETC

INFLUXDB

HOMEGROWN

Page 6: Scaling VividCortex's Big Data Systems on MySQL

VIVIDCORTEX’S TIME-SERIES DATA

METRICS: {HOST, METRIC, TIMESTAMP, VALUE}

E.G. {83, “OS.CPU.UTILIZATION”, 1418143666, 18.2%}

QUERY METRICS

DITTO, BUT THE METRIC NAME IS RELATED TO THE QUERY FAMILY

E.G. “HOST.QUERIES.C.1374C6821EAD6F47.TPUT”

METRICS PER-USER, PER-PROCESS, PER-DATABASE, ETC

QUERY SAMPLES, EVENTS, FAULTS, SYSTEM VARIABLE CHANGES, ETC

OUT OF SCOPE TODAY; SEE HERE
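
The {HOST, METRIC, TIMESTAMP, VALUE} tuple above could be modeled roughly as follows. This is a minimal sketch: the type name, field names, field types, and the `QueryMetricName` helper are assumptions made for illustration, and the meaning of the individual dot-separated segments in the query metric name is not stated on the slide.

```go
package main

import "fmt"

// Observation is one time-series point: a value of one metric,
// from one host, at one moment. Names and types are hypothetical.
type Observation struct {
	Host      uint32  // numeric host/source ID, e.g. 83
	Metric    string  // metric name, e.g. "os.cpu.utilization"
	Timestamp int64   // Unix time in seconds, e.g. 1418143666
	Value     float64 // measured value, e.g. 18.2 (percent)
}

// QueryMetricName builds a per-query-family metric name shaped like
// the slide's example. The fixed "host.queries.c." prefix is copied
// from that example; what each segment means is an assumption.
func QueryMetricName(familyID, measure string) string {
	return "host.queries.c." + familyID + "." + measure
}

func main() {
	o := Observation{Host: 83, Metric: "os.cpu.utilization", Timestamp: 1418143666, Value: 18.2}
	fmt.Printf("%+v\n", o)
	fmt.Println(QueryMetricName("1374c6821ead6f47", "tput"))
}
```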

Page 7: Scaling VividCortex's Big Data Systems on MySQL

DENSE AND SPARSE METRICS

DENSE METRICS

ALWAYS EXIST AT EVERY POINT IN TIME

EXAMPLE: SYSTEM FREE MEMORY

EXAMPLE: CPU UTILIZATION

SPARSE METRICS

MAY ONLY OCCUR OCCASIONALLY

EXAMPLE: METRICS RELATED TO A SPECIFIC QUERY

Page 8: Scaling VividCortex's Big Data Systems on MySQL

WHAT’S UNUSUAL AT VIVIDCORTEX

HIGH RESOLUTION: EVERYTHING IN 1-SECOND GRANULARITY

LARGE NUMBER OF METRICS (CARDINALITY, AND RATE)

MANY METRICS ARE HIGHLY SPARSE

Page 9: Scaling VividCortex's Big Data Systems on MySQL

QUESTIONS WE ASK

RETRIEVE METRIC A FROM TIMESTAMP B TO C AT RESOLUTION D

RANK ALL METRICS MATCHING PATTERN X FROM B TO C, LIMIT N
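
The two questions above can be sketched against an in-memory stand-in for the storage layer. Everything here is an assumption made for illustration: the `Store` type, averaging as the way to reduce to resolution D, a simple name prefix standing in for "pattern X", and ranking by summed value. The slides do not say how either query is actually implemented.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Point is one value of a metric at a Unix timestamp (seconds).
type Point struct {
	Timestamp int64
	Value     float64
}

// Store maps a metric name to its points (hypothetical stand-in).
type Store map[string][]Point

// Range answers "retrieve metric A from B to C at resolution D":
// it averages the points in [from, to) into buckets of `resolution`
// seconds and returns one point per non-empty bucket, in time order.
func (s Store) Range(metric string, from, to, resolution int64) []Point {
	sums := map[int64]float64{}
	counts := map[int64]int{}
	for _, p := range s[metric] {
		if p.Timestamp < from || p.Timestamp >= to {
			continue
		}
		bucket := from + (p.Timestamp-from)/resolution*resolution
		sums[bucket] += p.Value
		counts[bucket]++
	}
	out := make([]Point, 0, len(sums))
	for bucket, sum := range sums {
		out = append(out, Point{bucket, sum / float64(counts[bucket])})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Timestamp < out[j].Timestamp })
	return out
}

// Ranked is one metric with its summed value over the queried range.
type Ranked struct {
	Metric string
	Total  float64
}

// TopN answers "rank all metrics matching pattern X from B to C,
// limit N", with a name prefix standing in for the pattern.
// Metrics with no data in the range are skipped.
func (s Store) TopN(prefix string, from, to int64, n int) []Ranked {
	var out []Ranked
	for name, points := range s {
		if !strings.HasPrefix(name, prefix) {
			continue
		}
		total, seen := 0.0, false
		for _, p := range points {
			if p.Timestamp >= from && p.Timestamp < to {
				total += p.Value
				seen = true
			}
		}
		if seen {
			out = append(out, Ranked{name, total})
		}
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Total != out[j].Total {
			return out[i].Total > out[j].Total
		}
		return out[i].Metric < out[j].Metric
	})
	if len(out) > n {
		out = out[:n]
	}
	return out
}

func main() {
	s := Store{
		"q.x":    {{0, 10}, {1, 30}, {2, 5}},
		"q.y":    {{1, 50}},
		"os.cpu": {{0, 99}},
	}
	fmt.Println(s.Range("q.x", 0, 4, 2))
	fmt.Println(s.TopN("q.", 0, 4, 1))
}
```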

Page 10: Scaling VividCortex's Big Data Systems on MySQL

SCHEMA DESIGN + INDEXING

MULTI-TENANT, SHARDED ARCHITECTURE

EACH CUSTOMER’S DATA STORED IN A SEPARATE DATABASE

STRONG ENCRYPTION IN-FLIGHT AND AT-REST (SEE BLOG POST)

DATA IS PARTITIONED BY TIME

WE USE INNODB STORAGE ENGINE (TRANSACTIONAL, CRASH AND CORRUPTION RESISTANT, CLUSTERED INDEXES)

Page 11: Scaling VividCortex's Big Data Systems on MySQL

SCHEMA DESIGN + INDEXING

METRIC-FIRST OR TIMESTAMP-FIRST, THAT IS THE QUESTION.

FOR THIS PURPOSE, A HOST/SOURCE IS ESSENTIALLY A METRIC PREFIX.

Page 12: Scaling VividCortex's Big Data Systems on MySQL

METRIC-FIRST

ADVANTAGES:

OPTIMIZED FOR FAST READS OF DENSE METRICS

DRAWBACKS:

ENUMERATING / READING LARGE CATEGORIES OF METRICS

Page 13: Scaling VividCortex's Big Data Systems on MySQL

TIMESTAMP-FIRST

ADVANTAGES:

OPTIMIZED FOR WRITING METRICS

OPTIMIZED FOR READING ALL METRICS FOR A TIME RANGE

DRAWBACKS:

PENALIZES READING A DENSE METRIC FOR A TIME RANGE

NOT OPTIMAL FOR STREAMING BY METRIC BY TIMESTAMP
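
The metric-first versus timestamp-first trade-off comes down to which rows end up next to each other in a clustered index. The toy sketch below sorts the same rows both ways and counts how many contiguous runs each read pattern touches. The `Row` type and the "runs" measure are assumptions made for illustration, not a model of InnoDB internals.

```go
package main

import (
	"fmt"
	"sort"
)

// Row is the key part of one stored observation (hypothetical).
type Row struct {
	Metric    uint32
	Timestamp int64
}

// metricFirst orders rows as a (metric, timestamp) primary key would.
func metricFirst(a, b Row) bool {
	if a.Metric != b.Metric {
		return a.Metric < b.Metric
	}
	return a.Timestamp < b.Timestamp
}

// timestampFirst orders rows as a (timestamp, metric) primary key would.
func timestampFirst(a, b Row) bool {
	if a.Timestamp != b.Timestamp {
		return a.Timestamp < b.Timestamp
	}
	return a.Metric < b.Metric
}

// ordered returns a sorted copy of rows, leaving the input untouched.
func ordered(rows []Row, less func(a, b Row) bool) []Row {
	out := append([]Row(nil), rows...)
	sort.Slice(out, func(i, j int) bool { return less(out[i], out[j]) })
	return out
}

// runs counts the contiguous stretches of rows that satisfy match.
// One run means one sequential scan; many runs mean many seeks.
func runs(rows []Row, match func(Row) bool) int {
	n, inRun := 0, false
	for _, r := range rows {
		m := match(r)
		if m && !inRun {
			n++
		}
		inRun = m
	}
	return n
}

// sampleRows builds 3 dense metrics over 3 consecutive seconds.
func sampleRows() []Row {
	var rows []Row
	for m := uint32(1); m <= 3; m++ {
		for ts := int64(100); ts <= 102; ts++ {
			rows = append(rows, Row{m, ts})
		}
	}
	return rows
}

func main() {
	rows := sampleRows()
	oneMetric := func(r Row) bool { return r.Metric == 1 }
	oneSecond := func(r Row) bool { return r.Timestamp == 101 }

	mf := ordered(rows, metricFirst)
	tf := ordered(rows, timestampFirst)
	fmt.Println("metric-first:    one metric over time =", runs(mf, oneMetric), "run(s); all metrics at one second =", runs(mf, oneSecond), "run(s)")
	fmt.Println("timestamp-first: one metric over time =", runs(tf, oneMetric), "run(s); all metrics at one second =", runs(tf, oneSecond), "run(s)")
}
```

With metric-first ordering a single dense metric is one contiguous run, while reading every metric for one second touches one run per metric; timestamp-first ordering reverses that.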

Page 14: Scaling VividCortex's Big Data Systems on MySQL

SECONDARY INDEXING?

BENEFITS:

OPTIMIZED FOR BOTH USE CASES, THEORETICALLY

HOWEVER, NO SIGNIFICANT DIFFERENCE IN OUR TESTS

DRAWBACKS:

WRITE AMPLIFICATION, SPACE AMPLIFICATION

STILL DOESN’T COVER ALL NEEDED SCENARIOS (WE’D NEED AT LEAST SIX INDEXES)

CREATES RANDOM ACCESS LOOKUPS IN THE PRIMARY KEY

HMMMM…. TOKUDB? SOME OPERATIONAL CHALLENGES.

Page 15: Scaling VividCortex's Big Data Systems on MySQL

PARTITIONING

ADVANTAGES:

COARSE-GRAINED TIMESTAMP-FIRST INDEXING

EASY PURGE OF OLD DATA

TRANSPARENT TO THE APPLICATION

DRAWBACKS:

PARTITION MAINTENANCE CAN BE A DRAG

OPERATIONAL HASSLES FOR ALTER TABLE AND SO FORTH

IMPROVEMENTS IN MYSQL 5.6 ARE VERY HELPFUL THOUGH

Page 16: Scaling VividCortex's Big Data Systems on MySQL

PARTITIONING, NO MORE

DEATH BY ALTER TABLE

DEATH BY WHOLE-TABLE LOCKING

NOW, TABLE-PER-TIME-RANGE

TABLE’S NAME ENCODES TIME RANGE, SCHEMA VERSION

OBSERVATION_1_S_1424444400_1424448000

THE GO PROGRAMMING LANGUAGE MAKES IT EASY FOR US TO PARALLELIZE QUERIES ACROSS SERVERS AND TABLES
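
A table-per-time-range layout with parallel queries might look something like the sketch below. Several things are assumptions made for illustration: that each table spans one hour (inferred from the example name, whose two timestamps are 3600 seconds apart), that the `1` in the name is the schema version, that the `s` segment is a fixed literal, and that names are lowercase. The `query` callback stands in for a real database call, which is not shown.

```go
package main

import (
	"fmt"
	"sync"
)

// tableSpan is the number of seconds covered by one table.
// Assumption: inferred from the slide's example table name.
const tableSpan = 3600

// tableName builds a name shaped like
// observation_1_s_1424444400_1424448000 for the table starting at
// `start`. The meaning of the "s" segment is not stated in the slides.
func tableName(schemaVersion int, start int64) string {
	return fmt.Sprintf("observation_%d_s_%d_%d", schemaVersion, start, start+tableSpan)
}

// tablesFor lists, in time order, the tables that overlap [from, to).
func tablesFor(schemaVersion int, from, to int64) []string {
	var names []string
	for start := from - from%tableSpan; start < to; start += tableSpan {
		names = append(names, tableName(schemaVersion, start))
	}
	return names
}

// queryAll runs query against every table concurrently, one goroutine
// per table, and returns the results in table order. Each goroutine
// writes only to its own slice index, so no extra locking is needed.
func queryAll(tables []string, query func(table string) (int, error)) ([]int, error) {
	results := make([]int, len(tables))
	errs := make([]error, len(tables))
	var wg sync.WaitGroup
	for i, t := range tables {
		wg.Add(1)
		go func(i int, t string) {
			defer wg.Done()
			results[i], errs[i] = query(t)
		}(i, t)
	}
	wg.Wait()
	for _, err := range errs {
		if err != nil {
			return nil, err
		}
	}
	return results, nil
}

func main() {
	tables := tablesFor(1, 1424444400+10, 1424448000+10)
	fmt.Println(tables)

	// Stub query: a real one would run SQL against the named table.
	counts, err := queryAll(tables, func(table string) (int, error) {
		return len(table), nil
	})
	fmt.Println(counts, err)
}
```

Purging old data in this layout becomes a matter of dropping whole tables, and a schema change only needs to apply to newly created tables.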

Page 17: Scaling VividCortex's Big Data Systems on MySQL

BY THE WAY, YOU SHOULD USE GO :-)

VIVIDCORTEX.COM/RESOURCES/BUILDING-DATABASE-DRIVEN-APPS-WITH-GO/

Page 18: Scaling VividCortex's Big Data Systems on MySQL

CHALLENGE #1: HIGH INGEST RATE

LARGE NUMBER OF METRICS/SEC ARRIVING AT OUR SYSTEMS

CURRENTLY ABOUT 100K METRICS/SEC PER SHARD

WRITE WORKLOAD, SPACE USAGE

Page 19: Scaling VividCortex's Big Data Systems on MySQL

CHALLENGE #1: HIGH INGEST RATE

SOLUTION: BATCH METRICS INTO VECTORS

DRAWBACK: LOSE ABILITY TO QUERY WITH SQL

COMPROMISE: AGGREGATE METADATA PER VECTOR

SOLUTION: STORE METRIC IDS, NOT NAMES, WITH VECTORS

DRAWBACK: MUST “JOIN” TO METRIC DICTIONARY FOR PATTERN-MATCHING ETC

SOLUTION (IN PROGRESS): CATEGORIZE METRIC PATTERNS
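
Batching metrics into vectors with per-vector aggregate metadata could be sketched as follows. The blob layout (little-endian float64 values), the `Vector` and `Summary` types, and the choice of count/min/max/sum as metadata are all assumptions made for illustration; the slides do not describe the actual encoding. The idea is that many one-second values become one row, and the summary columns remain queryable with SQL even though the blob is opaque.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// Vector is a batch of consecutive values for one metric, stored as
// one row instead of one row per value (hypothetical layout).
type Vector struct {
	MetricID uint32 // dictionary ID rather than the metric name
	Start    int64  // Unix timestamp of the first value
	Values   []float64
}

// Summary is metadata kept alongside the blob so that some questions
// can be answered without decoding it.
type Summary struct {
	Count         int
	Min, Max, Sum float64
}

// Encode packs values into a blob, 8 bytes per value.
func Encode(values []float64) []byte {
	blob := make([]byte, 8*len(values))
	for i, v := range values {
		binary.LittleEndian.PutUint64(blob[8*i:], math.Float64bits(v))
	}
	return blob
}

// Decode reverses Encode. Trailing bytes short of 8 are ignored.
func Decode(blob []byte) []float64 {
	values := make([]float64, len(blob)/8)
	for i := range values {
		values[i] = math.Float64frombits(binary.LittleEndian.Uint64(blob[8*i:]))
	}
	return values
}

// Summarize computes the per-vector metadata. For an empty vector it
// returns the zero Summary.
func Summarize(values []float64) Summary {
	if len(values) == 0 {
		return Summary{}
	}
	s := Summary{Count: len(values), Min: math.Inf(1), Max: math.Inf(-1)}
	for _, v := range values {
		s.Min = math.Min(s.Min, v)
		s.Max = math.Max(s.Max, v)
		s.Sum += v
	}
	return s
}

func main() {
	v := Vector{MetricID: 42, Start: 1418143666, Values: []float64{1.5, 2.5, 4}}
	blob := Encode(v.Values)
	fmt.Println(len(blob), "bytes for", len(v.Values), "values")
	fmt.Println(Decode(blob))
	fmt.Printf("%+v\n", Summarize(v.Values))
}
```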

Page 20: Scaling VividCortex's Big Data Systems on MySQL

CHALLENGE #2: SPARSE METRICS

HUGE CARDINALITY OF METRICS X HOSTS

CAN BE TENS OF MILLIONS OF METRICS PER HOST

MOST OF THEM INACTIVE DURING ANY GIVEN TIME RANGE

QUERYING FOR ALL IS INEFFICIENT; MUST FILTER OUT INACTIVE

NEED: TIMESTAMP-BASED INDEX OF “METRIC HAS DATA”

INEFFICIENT IN MYSQL, WORKS WELL IN REDIS

FUTURE GOAL: ERADICATE REDIS THROUGH CLEVER DESIGN
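
A timestamp-based "metric has data" index can be pictured as a set of active metric IDs per time bucket. The sketch below keeps those sets in Go maps purely for illustration; the `ActivityIndex` type, the bucket size, and the method names are assumptions, and nothing here reflects how the Redis-backed index mentioned above is actually structured.

```go
package main

import (
	"fmt"
	"sort"
)

// ActivityIndex records which metric IDs have data in each time
// bucket, so queries can skip the inactive majority (hypothetical).
type ActivityIndex struct {
	bucketSeconds int64
	buckets       map[int64]map[uint32]struct{}
}

// NewActivityIndex creates an index with the given bucket width.
func NewActivityIndex(bucketSeconds int64) *ActivityIndex {
	return &ActivityIndex{
		bucketSeconds: bucketSeconds,
		buckets:       map[int64]map[uint32]struct{}{},
	}
}

// Mark notes that metricID has at least one value at timestamp ts.
func (a *ActivityIndex) Mark(metricID uint32, ts int64) {
	bucket := ts - ts%a.bucketSeconds
	if a.buckets[bucket] == nil {
		a.buckets[bucket] = map[uint32]struct{}{}
	}
	a.buckets[bucket][metricID] = struct{}{}
}

// Active returns the sorted, de-duplicated IDs of metrics with data
// in any bucket overlapping [from, to). The answer is coarse: it is
// accurate only to bucket granularity.
func (a *ActivityIndex) Active(from, to int64) []uint32 {
	seen := map[uint32]struct{}{}
	for bucket := from - from%a.bucketSeconds; bucket < to; bucket += a.bucketSeconds {
		for id := range a.buckets[bucket] {
			seen[id] = struct{}{}
		}
	}
	ids := make([]uint32, 0, len(seen))
	for id := range seen {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(i, j int) bool { return ids[i] < ids[j] })
	return ids
}

func main() {
	idx := NewActivityIndex(60)
	idx.Mark(7, 100)
	idx.Mark(7, 130)
	idx.Mark(9, 500)
	fmt.Println(idx.Active(0, 200))
	fmt.Println(idx.Active(0, 600))
	fmt.Println(idx.Active(200, 400))
}
```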

Page 21: Scaling VividCortex's Big Data Systems on MySQL

HOW WELL DOES IT WORK?

DATA IS REASONABLY COMPACT, EVEN THOUGH NOT COMPRESSED

FOR VIVIDCORTEX’S 50 PRODUCTION HOSTS:

FOR 10 DAYS OF 1-SECOND DATA AND 90 DAYS OF 1-MIN

80GB OF TOTAL DATA

MOST DATA IS IN QUERY SAMPLES, EVENT DATA, ETC (BLOBS)

Page 22: Scaling VividCortex's Big Data Systems on MySQL

HOW IS PERFORMANCE?

WE USE “WEAK” AWS EC2 SERVERS: 8 CPUS, 26GB MEMORY

WE INGEST ~28 BILLION METRICS PER DAY (332K/SEC)

THESE ARE ESSENTIALLY HANDLED 100% BY 3 SERVERS

(WE HAVE PASSIVE STANDBY SERVERS IN-REGION, CROSS-REGION, BACKUPS, ETC).
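
As a quick cross-check of the figures above, the per-second rate can be converted to a daily total and divided across the three servers. The even split across servers is an assumption made for the arithmetic; the slides do not say how load is distributed.

```go
package main

import "fmt"

// Figures taken from the slide; the even split is an assumption.
const (
	metricsPerSecond int64 = 332000
	secondsPerDay    int64 = 86400
	servers          int64 = 3
)

// perDay converts a per-second ingest rate to a daily total.
func perDay(perSecond int64) int64 {
	return perSecond * secondsPerDay
}

// perServer splits a rate evenly across n servers (integer division).
func perServer(perSecond, n int64) int64 {
	return perSecond / n
}

func main() {
	fmt.Println(perDay(metricsPerSecond), "metrics/day")
	fmt.Println(perServer(metricsPerSecond, servers), "metrics/sec per server")
}
```

This gives 28,684,800,000 metrics per day, in line with the quoted "~28 billion", and roughly 110K metrics/sec per server, in line with the earlier "about 100K metrics/sec per shard".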

Page 23: Scaling VividCortex's Big Data Systems on MySQL

WHAT’S GOOD?

RAW EFFICIENCY PER SERVER IS REASONABLY HIGH

OUR INFRASTRUCTURE IS FAIRLY HOMOGENEOUS

WE’RE RUNNING PRETTY LEAN

Page 24: Scaling VividCortex's Big Data Systems on MySQL

WHAT’S NOT SO GOOD?

PROGRAMMER EFFICIENCY COULD BE BETTER

CAN’T AD-HOC QUERY THE TIMESERIES DATA

MUST USE INTERNAL TIMESERIES SERVICE INSTEAD

MYSQL IS STILL NOT AS EFFICIENT AS I WANT

INNODB OVERHEAD

BLOB STORAGE

Page 25: Scaling VividCortex's Big Data Systems on MySQL

ALTERNATIVES?

CASSANDRA, CASSANDRA+SPARK, ELASTICSEARCH, INFLUXDB, HBASE, OPENTSDB, DRUID…?

PROBLEMS: COMPLEXITY, PERFORMANCE, IMMATURITY, INEFFICIENCY, UNRELIABILITY...

VENDOR PITCHES ARE OFTEN FAIRLY ABSURD

RIGHT NOW, MYSQL’S RAW EFFICIENCY IS ENOUGH TO COMPENSATE FOR SOME OTHER SHORTCOMINGS. BETTER THE DEVIL YOU KNOW THAN THE DEVIL YOU DON’T?

Page 26: Scaling VividCortex's Big Data Systems on MySQL

QUESTIONS?

CONTACT INFO

- @VIVIDCORTEX AND @XAPRB

- VIVIDCORTEX.COM

- [email protected]

- VISIT OUR BOOTH

HTTPS://WWW.FLICKR.COM/PHOTOS/OREGONDOT/14721613997/