Cassandra at Teads
![Page 1: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/1.jpg)
Cassandra @ Teads
Lyon Cassandra Users
Romain Hardouin - Cassandra architect @ Teads
2017-02-16
![Page 2: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/2.jpg)
Cassandra @ Teads
I. About Teads
II. Architecture
III. Provisioning
IV. Monitoring & alerting
V. Tuning
VI. Tools
VII. C’est la vie
VIII. A light fork
![Page 3: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/3.jpg)
I. About Teads
![Page 4: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/4.jpg)
Teads is the inventor of native video advertising
With inRead, an award-winning format*
*IPA Media owner survey 2014, IAB recognized format
![Page 5: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/5.jpg)
27 offices in 21 countries
500+ global employees
1.2B users, global reach
90+ R&D employees
![Page 6: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/6.jpg)
Teads growth
Tracking events
![Page 7: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/7.jpg)
Advertisers (to name a few)
![Page 8: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/8.jpg)
Publishers (to name a few)
![Page 9: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/9.jpg)
II. Architecture
![Page 10: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/10.jpg)
Apache Cassandra version
Custom C* 2.1.16
jvm.options from C* 3.0
logback from C* 2.2
Backports
Patches
![Page 11: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/11.jpg)
Usage
Up to 940K qps: Writes vs Reads
![Page 12: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/12.jpg)
Topology
2 regions: EU & US
3rd region, APAC, coming soon
4 clusters, 7 DCs, 110 nodes
Up to 150 with temporary DCs
HP server blades: 1 cluster, 18 nodes
![Page 13: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/13.jpg)
AWS nodes
![Page 14: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/14.jpg)
AWS instance types
i2.2xlarge: 8 vCPU, 61 GB RAM, 2 x 800 GB attached SSD in RAID0
c3.4xlarge: 16 vCPU, 30 GB RAM, 2 x 160 GB attached SSD in RAID0
c4.4xlarge: 16 vCPU, 30 GB RAM, EBS 3.4 TB + 1 TB
Workloads:
Tons of counters
Big Data, wide rows
Many billions of keys, LCS with TTL
![Page 15: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/15.jpg)
More on EBS nodes
20 x c4.4xlarge with SSD GP2
3.4 TB data: 10,000 IOPS ⇒ 16KB
1 TB commitlog: 3,000 IOPS ⇒ 16KB
25 tables: batch + real time
Temporary DC

Cheap storage, great for STCS
Snapshots (S3 backup)
No coupling between disks and CPU/RAM

High latency => high I/O wait
Throughput: 160 MB/s
Unsteady performance
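The IOPS and throughput figures on this slide are consistent with simple arithmetic. The sketch below assumes the gp2 sizing rule of that period (3 IOPS per GB, capped at 10,000 IOPS per volume) and counts 1 TB as 1,000 GB. Those constants and the function names are assumptions made for illustration; they do not come from the slides.

```python
# Assumptions (not from the slides): gp2 baseline of 3 IOPS per GB,
# capped at 10,000 IOPS per volume, and 1 TB counted as 1,000 GB.
GP2_IOPS_PER_GB = 3
GP2_MAX_IOPS = 10_000
IO_SIZE_KB = 16  # I/O size quoted on the slide


def gp2_baseline_iops(size_gb: int) -> int:
    """Baseline IOPS for a gp2 volume of the given size."""
    return min(size_gb * GP2_IOPS_PER_GB, GP2_MAX_IOPS)


def throughput_mb_per_s(iops: int, io_size_kb: int = IO_SIZE_KB) -> float:
    """Throughput reached when every I/O is io_size_kb large."""
    return iops * io_size_kb / 1000


data_iops = gp2_baseline_iops(3400)       # 3.4 TB data volume
commitlog_iops = gp2_baseline_iops(1000)  # 1 TB commitlog volume

print(data_iops, throughput_mb_per_s(data_iops))
print(commitlog_iops, throughput_mb_per_s(commitlog_iops))
```

Under these assumptions the data volume reaches the per-volume IOPS cap, and 10,000 IOPS at 16 KB works out to the 160 MB/s quoted on the slide.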
![Page 16: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/16.jpg)
Physical nodes
![Page 17: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/17.jpg)
Hardware nodes
HP Apollo XL170R Gen9
12 CPU Xeon @ 2.60GHz
128 GB RAM
3 x 1.5 TB high-end SSD in RAID0
For Big Data, supersedes EBS DC
![Page 18: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/18.jpg)
DC/Cluster split
![Page 19: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/19.jpg)
Instance type change
DC X: 20 x i2.2xlarge, Counters
DC Y: 20 x c3.4xlarge, Counters
Rebuild: DC X → DC Y
Cheaper and more CPUs
![Page 20: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/20.jpg)
Workload isolation
Step 1: DC split
DC A: 20 x i2.2xlarge, Counters + Big Data
DC B: 20 x c3.4xlarge, Counters
DC C: 20 x c4.4xlarge (EBS), Big Data
Rebuild +
![Page 21: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/21.jpg)
Workload isolation
Step 2: Cluster split
20 x c4.4xlarge (EBS), Big Data
Big Data
AWS Direct Connect
![Page 22: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/22.jpg)
Data model
![Page 23: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/23.jpg)
“KISS” principle
No fancy stuff
No secondary index
No list/set/tuple
No UDT
![Page 24: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/24.jpg)
III. Provisioning
![Page 25: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/25.jpg)
Now
Capistrano → Chef
Custom cookbooks: C*, C* tools, C* reaper, Datadog wrapper
Chef provisioning to spawn a cluster
![Page 26: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/26.jpg)
Future
C* cookbook: michaelklishin/cassandra-chef-cookbook + Teads custom wrapper
Terraform + Chef provisioner
![Page 27: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/27.jpg)
IV. Monitoring & alerting
![Page 28: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/28.jpg)
Past
OpsCenter (Free)
![Page 29: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/29.jpg)
Turnkey dashboards
Support is reactive
Main metrics only
Per host graphs: impossible with many hosts
![Page 30: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/30.jpg)
DataStax OpsCenter, free version (v5)
Ring view
More than monitoring
Lots of metrics
Still lacks some metrics
Dashboard creation: no templates
Agent is heavy
Free version limitations:
  Data stored in production cluster
  Apache C* <= 2.1 only
![Page 31: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/31.jpg)
All metrics you want
Dashboard creation: templating, TimeBoard vs ScreenBoard
Graph creation: aggregation, trend, rate, anomaly detection
No turnkey dashboards yet (may change: TLP templates)
Additional fees if >350 metrics (we need to increase this limit for our use case)
![Page 32: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/32.jpg)
Now we can easily:
Find outliers
Compare a node to average
Compare two DCs
Explore a node’s metrics
Create overview dashboards
Create advanced dashboards for troubleshooting
![Page 33: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/33.jpg)
Datadog’s cassandra.yaml

    - include:
        bean_regex: org.apache.cassandra.metrics:type=ReadRepair,name=*
        attribute:
          - Count
    - include:
        bean_regex: org.apache.cassandra.metrics:type=CommitLog,name=(WaitingOnCommit|WaitingOnSegmentAllocation)
        attribute:
          - Count
          - 99thPercentile
    - include:
        bean: org.apache.cassandra.metrics:type=CommitLog,name=TotalCommitLogSize
    - include:
        bean: org.apache.cassandra.metrics:type=ThreadPools,path=transport,scope=Native-Transport-Requests,name=MaxTasksQueued
        attribute:
          Value:
            alias: cassandra.ntr.MaxTasksQueued
![Page 34: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/34.jpg)
ScreenBoard
![Page 35: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/35.jpg)
TimeBoard (graph legend: alpha, beta, gamma, delta, epsilon, zeta, eta)
![Page 36: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/36.jpg)
Example
Hints monitoring during maintenance on physical nodes
Storage
Streaming
![Page 37: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/37.jpg)
Datadog Alerting
Down node
Exceptions
Commitlog size
High latency
High GC
High IO Wait
High Pendings
Many hints
Long thrift connections
Clock out of sync
Disk space
…
Don’t miss this one
Don’t forget /
![Page 38: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/38.jpg)
V. Tuning
![Page 39: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/39.jpg)
Java 8
CMS → G1
cassandra-env.sh
    -Dcassandra.max_queued_native_transport_requests=4096
    -Dcassandra.fd_initial_value_ms=4000
    -Dcassandra.fd_max_interval_ms=4000
![Page 40: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/40.jpg)
jvm.options
Backport from C* 3.0
GC logs enabled
    -XX:MaxGCPauseMillis=200
    -XX:G1RSetUpdatingPauseTimePercent=5
    -XX:G1HeapRegionSize=32m
    -XX:G1HeapWastePercent=25
    -XX:InitiatingHeapOccupancyPercent=?
    -XX:ParallelGCThreads=#CPU
    -XX:ConcGCThreads=#CPU
    -XX:+ExplicitGCInvokesConcurrent
    -XX:+ParallelRefProcEnabled
    -XX:+UseCompressedOops
    -XX:HeapDumpPath=<dir with enough free space>
    -XX:ErrorFile=<custom dir>
    -Djava.io.tmpdir=<custom dir>
    -XX:-UseBiasedLocking
    -XX:+UseTLAB
    -XX:+ResizeTLAB
    -XX:+PerfDisableSharedMem
    -XX:+AlwaysPreTouch
    ...
![Page 41: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/41.jpg)
AWS nodes
Heap: c3.4xlarge: 15 GB, i2.2xlarge: 24 GB
    num_tokens: 256
    native_transport_max_threads: 256 or 128
    compaction_throughput_mb_per_sec: 64
    concurrent_compactors: 4 or 2
    concurrent_reads: 64
    concurrent_writes: 128 or 64
    concurrent_counter_writes: 128
    hinted_handoff_throttle_in_kb: 10240
    max_hints_delivery_threads: 6 or 4
    memtable_cleanup_threshold: 0.6, 0.5 or 0.4
    memtable_flush_writers: 4 or 2
    trickle_fsync: true
    trickle_fsync_interval_in_kb: 10240
    dynamic_snitch_badness_threshold: 2.0
    internode_compression: dc
![Page 42: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/42.jpg)
AWS nodes
EBS volume != disk
Heap: c4.4xlarge: 15 GB
    compaction_throughput_mb_per_sec: 32
    concurrent_compactors: 4
    concurrent_reads: 32
    concurrent_writes: 64
    concurrent_counter_writes: 64
    trickle_fsync_interval_in_kb: 1024
![Page 43: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/43.jpg)
Hardware nodes
Heap: 24 GB
    num_tokens: 8
    initial_token: ...
    native_transport_max_threads: 512
    compaction_throughput_mb_per_sec: 128
    concurrent_compactors: 4
    concurrent_reads: 64
    concurrent_writes: 128
    concurrent_counter_writes: 128
    hinted_handoff_throttle_in_kb: 10240
    max_hints_delivery_threads: 6
    memtable_cleanup_threshold: 0.4
    memtable_flush_writers: 8
    trickle_fsync: true
    trickle_fsync_interval_in_kb: 10240
num_tokens and initial_token: more on this later
![Page 44: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/44.jpg)
Hardware nodes
Why 8 tokens?
Better repair performance, important for Big Data
Evenly distributed tokens, stored in a Chef data bag

    ./vnodes_token_generator.py --json --indent 2 --servers hosts_interleaved_racks.txt 4
    {
      "192.168.1.1": "-9223372036854775808,-4611686018427387905,-2,4611686018427387901",
      "192.168.2.1": "-7686143364045646507,-3074457345618258604,1537228672809129299,6148914691236517202",
      "192.168.3.1": "-6148914691236517206,-1537228672809129303,3074457345618258600,7686143364045646503"
    }

https://github.com/rhardouin/cassandra-scripts
Watch out! Know the drawbacks
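The sample output above can be reproduced with a few lines of arithmetic. What follows is a minimal sketch, not the code of vnodes_token_generator.py: it assumes the Murmur3Partitioner ring (2^64 tokens starting at -2^63), a fixed step of `ring size // total tokens`, and round-robin assignment over the servers in file order. The function name `even_tokens` is hypothetical.

```python
import json

RING_SIZE = 2**64      # number of Murmur3Partitioner tokens
MIN_TOKEN = -(2**63)   # smallest Murmur3Partitioner token


def even_tokens(servers, vnodes):
    """Spread len(servers) * vnodes evenly spaced tokens round-robin over servers."""
    total = len(servers) * vnodes
    step = RING_SIZE // total
    assignment = {server: [] for server in servers}
    for i in range(total):
        # Consecutive tokens go to consecutive servers, so an input file that
        # interleaves racks also interleaves racks around the ring.
        assignment[servers[i % len(servers)]].append(MIN_TOKEN + i * step)
    return assignment


servers = ["192.168.1.1", "192.168.2.1", "192.168.3.1"]
tokens = even_tokens(servers, 4)
# Render each token list as the comma-separated string used for initial_token.
print(json.dumps({s: ",".join(map(str, t)) for s, t in tokens.items()}, indent=2))
```

With three servers and 4 tokens each, this yields the same twelve tokens as the slide, which suggests the real script follows a similar scheme.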
![Page 45: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/45.jpg)
Compression
Small entries, lots of reads

    compression = {'chunk_length_kb': '4', 'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}

+ nodetool scrub (few GB)
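A rough way to see why a small chunk suits small entries: a point read has to decompress at least one whole chunk, however small the row is. The sketch below compares the 4 KB chunk from the slide with a 64 KB chunk, which is assumed here to be the C* 2.1 default. The 256-byte row size and the function names are hypothetical, and the row is assumed to sit inside a single chunk.

```python
DEFAULT_CHUNK_KB = 64  # assumption: default chunk_length_kb in C* 2.1
TUNED_CHUNK_KB = 4     # value shown on the slide
ROW_BYTES = 256        # hypothetical "small entry" size


def bytes_decompressed_per_read(chunk_kb: int) -> int:
    """A point read decompresses at least one whole chunk."""
    return chunk_kb * 1024


def read_amplification(row_bytes: int, chunk_kb: int) -> float:
    """Bytes decompressed per byte of useful data, for a row inside one chunk."""
    return bytes_decompressed_per_read(chunk_kb) / row_bytes


def chunk_count(uncompressed_bytes: int, chunk_kb: int) -> int:
    """Number of chunks (and thus chunk offsets to track) for the given data size."""
    return -(-uncompressed_bytes // (chunk_kb * 1024))  # ceiling division


print(read_amplification(ROW_BYTES, DEFAULT_CHUNK_KB))
print(read_amplification(ROW_BYTES, TUNED_CHUNK_KB))
print(chunk_count(1024**3, DEFAULT_CHUNK_KB), chunk_count(1024**3, TUNED_CHUNK_KB))
```

The trade-off is visible in the last line: smaller chunks mean less work per read but many more chunks to index, and usually a lower compression ratio.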
![Page 46: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/46.jpg)
Dynamic Snitch
Disabled on 2 small clusters
dynamic_snitch: false
Less hop count
![Page 47: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/47.jpg)
Dynamic Snitch
Client side latency: P95, P75, Mean
![Page 48: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/48.jpg)
Downscale
Which node to decommission?
![Page 49: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/49.jpg)
Clients
![Page 50: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/50.jpg)
Scala apps: DataStax driver wrapper
Spark & Spark streaming: DataStax Spark Cassandra Connector
![Page 51: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/51.jpg)
DataStax driver policy
LatencyAwarePolicy → TokenAwarePolicy
LatencyAwarePolicy: hotspots due to premature node eviction
Needs thorough tuning and a steady workload
We dropped it
TokenAwarePolicy: shuffle replicas depending on CL
![Page 52: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/52.jpg)
For cross-region scheduled jobs
VPN between AWS regions
20 executors with 6GB RAM
output.consistency.level = (LOCAL_)ONE
output.concurrent.writes = 50
connection.compression = LZ4
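As a sketch of how these settings might be passed to a job, the block below expands them into `spark-submit --conf` arguments. It assumes the abbreviated names on the slide are Spark Cassandra Connector properties under the `spark.cassandra.` prefix, and that the executor count and memory map to the standard `spark.executor.*` properties. `LOCAL_ONE` is picked as one of the two values the slide allows, and the helper name `spark_submit_args` is hypothetical.

```python
# Settings from the slide, written with full property names (an assumption:
# the slide abbreviates them). LOCAL_ONE is one of the two options implied
# by "(LOCAL_)ONE".
SPARK_CONF = {
    "spark.executor.instances": "20",
    "spark.executor.memory": "6g",
    "spark.cassandra.output.consistency.level": "LOCAL_ONE",
    "spark.cassandra.output.concurrent.writes": "50",
    "spark.cassandra.connection.compression": "LZ4",
}


def spark_submit_args(conf):
    """Turn a property dict into a flat list of spark-submit --conf arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


args = spark_submit_args(SPARK_CONF)
print(" ".join(["spark-submit"] + args))
```

Keeping the settings in one dictionary makes it easy to use different values per job, for example `ONE` instead of `LOCAL_ONE`.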
![Page 53: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/53.jpg)
Useless writes
99% of empty unlogged batches on one DC
What an optimization!
![Page 54: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/54.jpg)
VI. Tools
![Page 55: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/55.jpg)
“Job Scheduler & Runbook Automation”
{Parallel SSH + cron} on steroids
Security
History: who/what/when/why
Output is kept
We added a “comment” field

CQL migration
Rolling restart
Nodetool or JMX commands
Backup and snapshot jobs
![Page 56: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/56.jpg)
Scheduled range repair
Segments: up to 20,000 for TB tables
Hosted fork for C* 2.1
We will probably switch to TLP’s fork
We do not use incremental repairs
See fix in C* 4.0
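To give an idea of what "segments" means for range repair, here is a minimal sketch that cuts the whole Murmur3Partitioner ring into contiguous token sub-ranges of near-equal width. It is not the repair tool's actual algorithm, which works per node token range; the function name `split_ring` is hypothetical.

```python
MIN_TOKEN = -(2**63)     # smallest Murmur3Partitioner token
MAX_TOKEN = 2**63 - 1    # largest Murmur3Partitioner token


def split_ring(segments, start=MIN_TOKEN, end=MAX_TOKEN):
    """Split [start, end] into `segments` contiguous (start, end) token pairs."""
    width = end - start
    # Integer arithmetic keeps the bounds exact even for 64-bit token values.
    bounds = [start + width * i // segments for i in range(segments + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


segments = split_ring(20_000)
print(len(segments))
print(segments[0])
print(segments[-1])
```

Each pair can then be repaired on its own, so a failed or slow segment only has to be retried for a small slice of the data.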
![Page 57: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/57.jpg)
cassandra_snapshotter
Backup on S3
Scheduled with Rundeck
We created and use a fork
Some PRs merged upstream
Restore PR still to be merged
![Page 58: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/58.jpg)
Logs management
"C* " and "out of sync"
"C* " and "new session: will sync" | count
...
Alerts on pattern
"C* " and "[ERROR]"
"C* " and "[WARN]" and not ( … )
...
![Page 59: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/59.jpg)
VII. C’est la vie
![Page 60: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/60.jpg)
Cassandra issues & failures
![Page 61: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/61.jpg)
OS reboot… seems harmless, right?
Cassandra service enabled
Want a clue?
C* 2.0 + counters
Upgrade to C* 2.1 was a relief
Without any obvious reason
![Page 62: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/62.jpg)
Upgrade 2.0 → 2.1
LCS cluster suffered: high load, pending compactions were growing
Switch to off-heap memtables: less GC => less load
Reduce clients load
Better after sstables upgrade
Took days
![Page 63: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/63.jpg)
Upgrade 2.0 → 2.1
Lots of NTR “All time blocked”
NTR queue undersized for our workload: 128 (hard-coded)
We added a property to test CASSANDRA-11363 and set the value higher and higher… up to 4096
NTR pool needs to be sized accordingly
![Page 64: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/64.jpg)
After replacing nodes
DELETE FROM system.peers WHERE peer = '<replaced node>';
Used by DataStax Driver for auto discovery
![Page 65: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/65.jpg)
AWS issues & failures
![Page 66: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/66.jpg)
When you have to shoot, shoot, don’t talk!
The Good, the Bad and the Ugly
![Page 67: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/67.jpg)
We’ll shoot your instance
We shot your instance
EBS volume latency spike
EBS volume unreachable
![Page 68: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/68.jpg)
SSD
![Page 69: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/69.jpg)
SSD
dstat -tnvlr
![Page 70: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/70.jpg)
Hardware issues & failures
![Page 71: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/71.jpg)
One SSD failed
CPUs suddenly became slow on one server
Smart Array Battery BIOS bug
Yup, not an SSD...
![Page 72: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/72.jpg)
VIII. A light fork
![Page 73: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/73.jpg)
Why a fork?
1. Need to add a patch ASAP
   High blocked NTR: CASSANDRA-11363
   Requires deploying from source
2. Why not backport interesting tickets?
3. Why not add small features/fixes?
   Expose tasks queue length via JMX: CASSANDRA-12758
You betcha!
![Page 74: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/74.jpg)
A tiny fork
We keep it as small as possible to fit our needs
Even smaller once we upgrade to C* 3.0: backports will be obsolete
![Page 75: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/75.jpg)
« Hey, if your fork is tiny, is it that useful? »
![Page 76: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/76.jpg)
One example: repair
![Page 77: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/77.jpg)
Backport of CASSANDRA-12580
Paulo Motta: log2 vs ln
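The slide only says "log2 vs ln". A plausible reading, taken here as an assumption, is that the Merkle tree depth used during repair was derived from the natural logarithm of the partition count instead of the base-2 logarithm, which makes the tree far too shallow. The sketch below shows the size of that effect. The depth cap of 20, the partition count and the function names are hypothetical, and the code is an illustration rather than Cassandra's own.

```python
import math

MAX_DEPTH = 20  # assumption: upper bound on Merkle tree depth


def depth_with_ln(partitions: int) -> int:
    """Depth obtained when the natural logarithm is used."""
    return min(math.floor(math.log(partitions)), MAX_DEPTH)


def depth_with_log2(partitions: int) -> int:
    """Depth obtained with the base-2 logarithm, as a binary tree requires."""
    return min(math.ceil(math.log2(partitions)), MAX_DEPTH)


partitions = 1_000_000  # hypothetical partition count in the repaired range
for depth in (depth_with_ln(partitions), depth_with_log2(partitions)):
    leaves = 2**depth
    # Every partition under a mismatching leaf is streamed, so fewer leaves
    # means more data sent for the same single difference.
    print(depth, leaves, round(partitions / leaves, 2))
```

With a shallow tree each leaf covers many partitions, so a single out-of-sync partition causes all of its neighbours to be streamed as well.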
![Page 78: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/78.jpg)
Get these info without DEBUG burden
Original fix
One-liner fix
![Page 79: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/79.jpg)
The most impressive result for a set of tables:
Before: 23 days
With CASSANDRA-12580: 16 hours

Longest repair for a table: 2.5 days
Impossible to repair this table before the patch
Fits in gc_grace_seconds
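These durations can be put in perspective with a little arithmetic. The sketch below assumes Cassandra's default gc_grace_seconds of 864,000 (10 days); the value actually set on these tables is not given on the slide, and the function names are hypothetical.

```python
DEFAULT_GC_GRACE_SECONDS = 864_000  # assumption: Cassandra default (10 days)
SECONDS_PER_DAY = 86_400


def fits_in_gc_grace(repair_days: float,
                     gc_grace_seconds: int = DEFAULT_GC_GRACE_SECONDS) -> bool:
    """True if a repair of this duration completes within gc_grace_seconds."""
    return repair_days * SECONDS_PER_DAY <= gc_grace_seconds


def speedup(before_hours: float, after_hours: float) -> float:
    """How many times faster the repair became."""
    return before_hours / after_hours


print(speedup(23 * 24, 16))    # set of tables: 23 days -> 16 hours
print(fits_in_gc_grace(2.5))   # longest single-table repair after the patch
print(fits_in_gc_grace(23))    # a 23-day repair would not fit in 10 days
```

Completing repair within gc_grace_seconds matters because tombstones can be purged after that delay, and data deleted on some replicas could otherwise reappear.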
![Page 80: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/80.jpg)
It was a critical fix for us
It should have landed in 2.1.17 IMHO*
Repair is a mandatory operation in many use cases
Paulo already made the patch for 2.1
C* 2.1 is widely used
[*] Full post: http://www.mail-archive.com/[email protected]/msg49344.html
![Page 81: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/81.jpg)
« Why bother with backports? Why not upgrade to 3.0? »
![Page 82: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/82.jpg)
Because C* is critical for our business
We don’t need fancy stuff (SASI, MV, UDF, ...)
We just want a rock solid scalable DB
C* 2.1 does the job for the time being
We plan to upgrade to C* 3.0 in 2017
We will do thorough tests ;-)
![Page 83: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/83.jpg)
« What about C* 2.2? »
![Page 84: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/84.jpg)
C* 2.2 has some nice improvements:
Bootstrapping with LCS: send source sstable level [1]
Range movement causes CPU & performance impact [2]
Resumable bootstrap/rebuild streaming [3]

[1] CASSANDRA-7460
[2] CASSANDRA-9258
[3] CASSANDRA-8838, CASSANDRA-8942, CASSANDRA-10810

But migration path 2.2 → 3.0 is risky
Just my opinion based on users mailing list
DSE never used 2.2
![Page 85: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/85.jpg)
Questions?
![Page 86: Cassandra at teads](https://reader033.vdocuments.us/reader033/viewer/2022052312/58abeed71a28ab504e8b5ffb/html5/thumbnails/86.jpg)
Thanks!
C* devs for their awesome work
Zenika for hosting this meetup
You for being here!