ViennaDB 2014-09: Redis Operations
One of the talks at the ViennaDB Redis Meetup: http://www.meetup.com/ViennaDB-The-Austrian-Database-Meetup-Group/events/200395652/
Redis Ops
Michael Renner - @terrorobe
ViennaDB
Tuesday 23 September 14
Basics
• In-Memory Key-Value store
• Single-threaded
• Single-process
• except for a few housekeeping tasks
• optional persistency
Performance
or
It needs to go faster!
Performance - CPU & Memory
• Different commands have different runtime behavior
• CPU is the most important factor for Redis performance
• few & fast cores if possible (unless you need many)
• Redis can benefit from Intel's Turbo Mode
• Large caches
• Use fast memory & interconnects
Performance - Disk
• Redis can and will block clients when the storage doesn't respond
• At the very least, use a controller with a write cache - or even better: SSDs
• especially when using AOF & fsync
Performance - Scaling
• Low-hanging fruit: persistent connections
• No TCP handshake, no AUTH, no Redis SELECT
• Always consider scaling up ("larger server") first
• and only if that fails, scale out ("more servers")
Scaling up
• Regardless of what you use at the moment - check what the market offers
• Hetzner in Sep 2014:
• Six-core Xeon, 128GB ECC RAM, 2x 240GB SSD
• ~117 EUR/mo.
• only then, consider...
Scaling out
• Spin up multiple instances on a single server
• Partition keyspace by hierarchy
• tickets:* -> instance #1
• dlstats:* -> instance #2
• * -> instance #3
• Only if that fails consider sharding
• Partitioning by entity ID
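The hierarchy split above can be done entirely on the client side. A minimal sketch, assuming three local instances; the port numbers and function names are hypothetical, and the modulo rule for sharding is just one simple choice, not something the talk prescribes:

```shell
# Hypothetical layout: one Redis instance per key hierarchy, told apart by port.
# port_for_key KEY -> prints the port of the instance responsible for KEY.
port_for_key() {
    case "$1" in
        tickets:*) echo 6379 ;;   # instance #1
        dlstats:*) echo 6380 ;;   # instance #2
        *)         echo 6381 ;;   # instance #3: everything else
    esac
}

# shard_for_id ID NUM_SHARDS -> prints the shard index for a numeric entity ID.
# Plain modulo; changing NUM_SHARDS later means moving keys between shards.
shard_for_id() {
    echo $(( $1 % $2 ))
}
```

Usage would look like `redis-cli -p "$(port_for_key tickets:42)" get tickets:42`. Keeping the mapping in one function means the application only changes in one place when a hierarchy moves to its own instance.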
Persisting things
Redis persistence
• By design an in-memory database, reads always come from memory
• Offers best-effort and synchronous persistence - RDB and AOF
• See also http://redis.io/topics/persistence
Redis RDB
• Redis database snapshot
• Contains complete database content, usually compressed
• e.g. 7GB of data in memory produces a ~900MB .rdb file
• Done periodically by Redis or triggered by user
• On crash/restart the RDB file is read; Redis only answers queries once loading has finished
• Can also be used as backup source - immutable file
Redis AOF
• Append-only file of all write operations
• Can be configured to fsync after every operation or every second
• Guarantees SQL-like durability at high cost
• Lots of small writes
• much longer crash/restart recovery time compared to RDB
• Periodic rewrites necessary to shrink file size
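The fsync choices above correspond to Redis' `appendfsync` setting (`always`, `everysec`, `no`). A minimal sketch of switching a running instance to AOF with `redis-cli`, assuming a local instance on the default port; the `REDIS_CLI` variable and the function name are illustrative, not part of Redis:

```shell
# Command used to reach Redis; overridable, e.g. REDIS_CLI=echo for a dry run.
REDIS_CLI=${REDIS_CLI:-redis-cli}

# enable_aof [POLICY]
# POLICY is one of Redis' appendfsync values: always, everysec (default), no.
enable_aof() {
    policy=${1:-everysec}
    case "$policy" in
        always|everysec|no) ;;
        *) echo "unknown appendfsync policy: $policy" >&2; return 1 ;;
    esac
    # Set the fsync policy first, then turn the append-only file on.
    $REDIS_CLI config set appendfsync "$policy" &&
    $REDIS_CLI config set appendonly yes
}
```

`CONFIG SET` only changes the running instance; the same two settings would also have to go into redis.conf to survive a restart. A rewrite to shrink the file can be triggered by hand with `redis-cli bgrewriteaof`.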
...and stay up!
Looking at availability
Replication
• Asynchronous Master/Slave
• Slave read-only by default
• Slave does a complete state transfer from Master on connect
• Can be used for load distribution, as building block for HA, for backup slaves, etc.
High availability
• TMTOWTDI - no silver bullet
• "Database approaches": cluster software combined with
  • Shared block device (DRBD, SAN)
  • Replication
• Proxy setups
  • haproxy
  • Twitter's twemproxy
What about Sentinel/Redis Cluster?
• Attempts to turn Redis into a
• self-contained
• highly available
• (distributed) key-value store
• Lots of criticism of the design; I would steer clear until it has been sufficiently addressed
• http://aphyr.com/tags/Redis
Monitoring things
Stats!
• Clients
• requests per second
• key hit/miss ratio
• connected clients
• latency (new in 2.8.13)
• Data
• number of keys (with expiry!)
• used memory
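Most of these numbers come out of the `INFO` command. As one example, the key hit/miss ratio can be derived from the `keyspace_hits` and `keyspace_misses` counters. A small sketch; the function name is made up, and it only parses text on stdin so it can be tried without a server:

```shell
# hit_ratio: reads INFO output on stdin, prints the key hit ratio in percent.
hit_ratio() {
    # INFO lines end in CRLF; drop the carriage returns before parsing.
    tr -d '\r' | awk -F: '
        $1 == "keyspace_hits"   { hits = $2 }
        $1 == "keyspace_misses" { misses = $2 }
        END {
            total = hits + misses
            if (total == 0) { print "n/a"; exit }
            printf "%.2f\n", 100 * hits / total
        }'
}
```

Typical use would be `redis-cli info stats | hit_ratio`. Both counters are cumulative since server start, so for a graph it makes more sense to store the raw values and let the monitoring system compute the rate.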
The important things
• Redis CPU
• Kernel CPU (IRQ handlers)
• Network capacity
• Free RAM
Live Monitoring
• Possible with Redis' MONITOR command
• will copy all commands sent by clients
• Can be parsed and aggregated, e.g.
• https://github.com/Instagram/redis-faina
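For a quick look without extra tooling, the same kind of aggregation can be approximated with awk. This is a rough sketch and not how redis-faina works internally; `top_prefixes` is a made-up name, it assumes ":" as the hierarchy separator, and it will miscount keys that contain escaped double quotes:

```shell
# top_prefixes: reads MONITOR output on stdin and prints "count prefix",
# most frequent first. The prefix is the part of the first argument
# (usually the key) before the first ":".
top_prefixes() {
    # Splitting on double quotes: field 2 is the command, field 4 its first
    # argument. Lines without an argument (e.g. "ping") have fewer fields.
    awk -F'"' '
        NF >= 5 { split($4, p, ":"); count[p[1]]++ }
        END     { for (k in count) print count[k], k }
    ' | sort -rn
}
```

A sample of 10000 commands could be taken with `redis-cli monitor | head -n 10000 | top_prefixes`. MONITOR itself costs throughput on a busy instance, so keep the sample short.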
Overall Stats
========================================
Lines Processed    10000
Commands/Sec       13833.96

Top Prefixes
========================================
monitor            2301    (23.01%)
BRAMCOMPATIBLE     1416    (14.16%)
reserved           738     (7.38%)
sysstat            644     (6.44%)
dltraffic          472     (4.72%)
trafficlog         221     (2.21%)
ccblock            76      (0.76%)

Top Keys
========================================
1                                       1615    (16.15%)
<PASSWORD>                              1615    (16.15%)
sysstat:2014-09-22:stat_dl_prem_size    463     (4.63%)
dltraffic:premium                       452     (4.52%)
trafficlog:2014-09-22                   221     (2.21%)
login-log-disabled                      168     (1.68%)
[..]
Tales from Production
Largeish One-Click-Hoster
• Think Megadownload
• Started with MySQL and memcached
• Growth made it infeasible to keep heavily updated data in SQL
• Slowly migrated things away from SQL
• Download quota, last seen IPs, login rate limiting, etc.
Stats for posterity
• ~10GB of data
• ~43,000 qps
• 3 instances
Hitting the CPU limit
• Fairly early on we overwhelmed a single Redis instance
• User CPU of Redis at 100% causing latency spikes
• Fixed it by migrating heavily-used hierarchies into separate instances
Hitting the CPU-Limit, again!
• This time it was the kernel
• All interrupt queues of the NIC pinned to CPU0
• Core overloaded by handling soft interrupts (softirqs)
• leading to packet loss
• leading to latency
• resulting in unhappy ops
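On Linux, whether all NIC interrupts land on one core can be read from `/proc/interrupts`. A small sketch that sums the per-CPU counters for every IRQ line mentioning a given interface; the function name and the `eth0` interface name are assumptions for illustration:

```shell
# irq_per_cpu NIC: reads /proc/interrupts-style text on stdin and prints
# the interrupt count per CPU, summed over all lines that mention NIC.
irq_per_cpu() {
    awk -v nic="$1" '
        NR == 1 { ncpu = NF; next }      # header row: CPU0 CPU1 ...
        index($0, nic) {                 # IRQ rows belonging to the NIC
            for (i = 1; i <= ncpu; i++) sum[i] += $(i + 1)
        }
        END {
            for (i = 1; i <= ncpu; i++) printf "CPU%d %d\n", i - 1, sum[i]
        }'
}
```

Run it as `irq_per_cpu eth0 < /proc/interrupts`. If nearly all of the count sits on CPU0 while the other cores show zero, the queues are not being spread across cores.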
Going for 1Gbit
• With Redis it's easy to saturate the NIC
• 1Gbps == ~15,000 qps when fetching 8k objects
• Packet loss -> Latency -> Unhappiness
• Cache on application servers
• alternative: add slaves or move to 10GbE
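The ~15,000 qps figure follows from simple arithmetic: 1 Gbit/s is 125,000,000 bytes/s, divided by 8192 bytes per object. A sketch of that calculation, ignoring protocol and TCP/IP overhead (so the real ceiling is somewhat lower); the function name is made up:

```shell
# max_qps LINK_BITS_PER_SEC OBJECT_BYTES
# Upper bound on responses per second if the link carried payload only.
max_qps() {
    echo $(( $1 / 8 / $2 ))
}
```

`max_qps 1000000000 8192` gives 15258, which is where the rounded slide figure comes from.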
I have to set a TTL for keys?!
• Sometimes, people forget to think about the memory footprint of the data in Redis
• If you collect lots of data you need to have an expiration strategy
• Expiring keys is a tedious process when done as an afterthought
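Finding non-expiring keys
# Note: KEYS walks the whole keyspace and blocks the server while it runs
redis-cli keys "*" | while read -r LINE; do
    # TTL returns -1 for a key that has no expiry set
    TTL=$(redis-cli ttl "$LINE")
    if [ "$TTL" -eq -1 ]; then
        echo "$LINE"
    fi
done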
See also: Twitter
• 40 TB RAM
• 30 Million qps
• 6000 nodes
• http://highscalability.com/blog/2014/9/8/how-twitter-uses-redis-to-scale-105tb-ram-39mm-qps-10000-ins.html
Thanks!
Questions?