Scaling Your Cache
DESCRIPTION
Caching has been an essential strategy for greater performance in computing since the beginning of the field. Nearly all applications have data access patterns that make caching an attractive technique, but caching also has hidden trade-offs related to concurrency, memory usage, and latency. As we build larger distributed systems, caching continues to be a critical technique for building scalable, high-throughput, low-latency applications. Large systems tend to magnify the caching trade-offs and have created new approaches to distributed caching. There are unique challenges in testing systems like these as well. Ehcache and Terracotta provide a unique way to start with simple caching for a small system and grow that system over time with a consistent API while maintaining low-latency, high-throughput caching.

TRANSCRIPT
Scaling Your Cache
Alex Miller (@puredanger)
Mission
• Why does caching work?
• What’s hard about caching?
• How do we make choices as we design a caching architecture?
What is caching?
Lots of data
Memory Hierarchy

Clock cycles to access:
• Register: 1
• L1 cache: 3
• L2 cache: 15
• RAM: 200
• Disk: 10,000,000
• Remote disk: 1,000,000,000
Facts of Life

Register → L1 Cache → L2 Cache → Main Memory → Local Disk → Remote Disk: fast, small, and expensive at the top; slow, big, and cheap at the bottom.
Caching to the rescue!
Temporal Locality

(Figure: a stream of requests filling a cache; the hit rate climbs from 0% on a cold cache to 65% once the cache is warm.)
Non-uniform Distribution

(Figure: web page hits ordered by rank, 0–3200 pageviews per rank against 0–100% of total hits; a small fraction of pages accounts for most of the traffic.)
Temporal locality + non-uniform distribution
17,000 pageviews; assume avg load = 250 ms
Cache 17 pages / 80% of views; cached page load = 10 ms
New avg load = 58 ms

Trade memory for latency reduction.
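The slide's arithmetic is just a weighted average of cached and uncached load times; a minimal sketch (the class and method names are ours, not from the talk):

```java
public class CacheLatency {
    // Expected latency = hitRate * hitLatency + (1 - hitRate) * missLatency
    public static double avgLoadMillis(double hitRate, double hitMs, double missMs) {
        return hitRate * hitMs + (1.0 - hitRate) * missMs;
    }

    public static void main(String[] args) {
        // The slide's numbers: 80% of views served from cache at 10 ms,
        // the remaining 20% from the origin at 250 ms.
        System.out.println(avgLoadMillis(0.80, 10.0, 250.0) + " ms"); // ~58 ms
    }
}
```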
The hidden benefit: reduces database load

(Figure: database load versus cache memory, with the line of over-provisioning.)
A brief aside...
• What is Ehcache?
• What is Terracotta?
Ehcache Example

CacheManager manager = new CacheManager();
Ehcache cache = manager.getEhcache("employees");
cache.put(new Element(employee.getId(), employee));
Element element = cache.get(employee.getId());

<cache name="employees"
       maxElementsInMemory="1000"
       memoryStoreEvictionPolicy="LRU"
       eternal="false"
       timeToIdleSeconds="600"
       timeToLiveSeconds="3600"
       overflowToDisk="false"/>
Terracotta

(Figure: an array of app nodes connected to a pair of Terracotta servers.)
But things are not always so simple...
Pain of Large Data Sets
• How do I choose which elements stay in memory and which go to disk?
• How do I choose which elements to evict when I have too many?
• How do I balance cache size against other memory uses?
Eviction
When cache memory is full, what do I do?
• Delete - Evict elements
• Overflow to disk - Move to slower, bigger storage
• Delete local - But keep remote data
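The "evict" option is commonly implemented with a least-recently-used policy; a minimal in-memory sketch (ours, not Ehcache's implementation) using `LinkedHashMap`'s access order:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A tiny LRU cache: when capacity is exceeded, the least recently
// used entry is evicted automatically by removeEldestEntry.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true); // accessOrder = true: gets reorder entries
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}
```

With capacity 2, putting "a" and "b", touching "a", then putting "c" evicts "b", the least recently used key.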
Eviction in Ehcache

Evict with a "Least Recently Used" policy:

<cache name="employees"
       maxElementsInMemory="1000"
       memoryStoreEvictionPolicy="LRU"
       eternal="false"
       timeToIdleSeconds="600"
       timeToLiveSeconds="3600"
       overflowToDisk="false"/>
Spill to Disk in Ehcache

Spill to disk:

<diskStore path="java.io.tmpdir"/>

<cache name="employees"
       maxElementsInMemory="1000"
       memoryStoreEvictionPolicy="LRU"
       eternal="false"
       timeToIdleSeconds="600"
       timeToLiveSeconds="3600"
       overflowToDisk="true"
       maxElementsOnDisk="1000000"
       diskExpiryThreadIntervalSeconds="120"
       diskSpoolBufferSizeMB="30"/>
Terracotta Clustering

Terracotta configuration:

<terracottaConfig url="server1:9510,server2:9510"/>

<cache name="employees"
       maxElementsInMemory="1000"
       memoryStoreEvictionPolicy="LRU"
       eternal="false"
       timeToIdleSeconds="600"
       timeToLiveSeconds="3600"
       overflowToDisk="false">
  <terracotta/>
</cache>
Pain of Stale Data
• How tolerant am I of seeing values changed on the underlying data source?
• How tolerant am I of seeing values changed by another node?
Expiration

(Figure: a timeline from 0 to 9 showing an element touched over time, with TTI=4 and TTL=4 markers.)
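TTI (time-to-idle) counts from the last access and TTL (time-to-live) from creation; an element expires when either budget runs out. A sketch of that check (ours, not Ehcache internals):

```java
public class Expiry {
    // Expired when either time-to-idle (since last access) or
    // time-to-live (since creation) has been exceeded.
    public static boolean isExpired(long now, long created, long lastAccessed,
                                    long tti, long ttl) {
        return (now - lastAccessed > tti) || (now - created > ttl);
    }

    public static void main(String[] args) {
        // TTI=4, TTL=4 as on the slide: created at t=0, last touched at t=3.
        // At t=6 the element is 6 ticks old, so TTL=4 has been exceeded.
        System.out.println(isExpired(6, 0, 3, 4, 4)); // true
    }
}
```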
TTI and TTL in Ehcache

<cache name="employees"
       maxElementsInMemory="1000"
       memoryStoreEvictionPolicy="LRU"
       eternal="false"
       timeToIdleSeconds="600"
       timeToLiveSeconds="3600"
       overflowToDisk="false"/>
Replication in Ehcache

<cacheManagerPeerProviderFactory
    class="net.sf.ehcache.distribution.RMICacheManagerPeerProviderFactory"
    properties="hostName=fully_qualified_hostname_or_ip,
                peerDiscovery=automatic,
                multicastGroupAddress=230.0.0.1,
                multicastGroupPort=4446,
                timeToLive=32"/>

<cache name="employees" ...>
  <cacheEventListenerFactory
      class="net.sf.ehcache.distribution.RMICacheReplicatorFactory"
      properties="replicateAsynchronously=true,
                  replicatePuts=true,
                  replicatePutsViaCopy=false,
                  replicateUpdates=true,
                  replicateUpdatesViaCopy=true,
                  replicateRemovals=true,
                  asynchronousReplicationIntervalMillis=1000"/>
</cache>
Terracotta Clustering

Still use TTI and TTL to manage stale data between the cache and the data source.

Coherent by default, but this can be relaxed with coherentReads="false".
Write-Through Caching (Ehcache 2.0)

• Keep the database in sync with the cache
• Cache write --> database write
• Ehcache 2.0 adds a new API:
  • Cache.putWithWriter(...)
  • CacheWriter
    • write(Element)
    • writeAll(Collection<Element>)
    • delete(Object key)
    • deleteAll(Collection<Object> keys)
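The mechanism can be sketched in a few lines: the store write happens synchronously inside the put. This is a minimal illustration under our own names (only `CacheWriter` and `putWithWriter` come from the slide; the generic signatures and the map-backed store are assumptions):

```java
import java.util.HashMap;
import java.util.Map;

// Write-through sketch: every cache put is synchronously written to
// the backing store before the in-memory cache is updated.
interface CacheWriter<K, V> {
    void write(K key, V value);
}

class WriteThroughCache<K, V> {
    private final Map<K, V> memory = new HashMap<>();
    private final CacheWriter<K, V> writer;

    WriteThroughCache(CacheWriter<K, V> writer) {
        this.writer = writer;
    }

    void putWithWriter(K key, V value) {
        writer.write(key, value); // database write first...
        memory.put(key, value);   // ...then the cache update
    }

    V get(K key) {
        return memory.get(key);
    }
}
```

If the writer throws, the cache is never updated, which is the property that keeps cache and database in sync.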
Write-Behind Caching (Ehcache 2.0)
• Allow database writes to lag cache updates
• Improves write latency
• Possibly reduces overall writes
• Use with read-through cache
• API same as write-through
• Other features:
• Batching, coalescing, rate limiting, retry
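The core of write-behind is a queue between the cache and the database: puts return immediately and a background drain flushes them in batches. A minimal sketch of that idea (ours; real Ehcache layers coalescing, rate limiting, and retry on top):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Write-behind sketch: cache writes are enqueued (fast path) and a
// background worker drains them in batches for the database.
class WriteBehindQueue<T> {
    private final BlockingQueue<T> queue = new LinkedBlockingQueue<>();

    // Called on the cache-write path; returns without touching the database.
    void offer(T write) {
        queue.offer(write);
    }

    // Called by the background worker: take up to maxBatch queued
    // writes so they can be applied in one database round-trip.
    List<T> drainBatch(int maxBatch) {
        List<T> batch = new ArrayList<>();
        queue.drainTo(batch, maxBatch);
        return batch;
    }
}
```

Batching is also where "possibly reduces overall writes" comes from: repeated writes to the same key can be coalesced before they reach the store.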
Pain of Loading
• How do I pre-load the cache on startup?
• How do I avoid re-loading the data on every node?
Persistent Disk Store

<diskStore path="java.io.tmpdir"/>

<cache name="employees"
       maxElementsInMemory="1000"
       memoryStoreEvictionPolicy="LRU"
       eternal="false"
       timeToIdleSeconds="600"
       timeToLiveSeconds="3600"
       overflowToDisk="true"
       maxElementsOnDisk="1000000"
       diskExpiryThreadIntervalSeconds="120"
       diskSpoolBufferSizeMB="30"
       diskPersistent="true"/>
Bootstrap Cache Loader

Bootstrap a new cache node from a peer: on startup, a background thread pulls the existing cache data from another peer.

<bootstrapCacheLoaderFactory
    class="net.sf.ehcache.distribution.RMIBootstrapCacheLoaderFactory"
    properties="bootstrapAsynchronously=true,
                maximumChunkSizeBytes=5000000"
    propertySeparator=","/>
Terracotta Persistence

Nothing needed beyond setting up Terracotta clustering.

Terracotta will automatically bootstrap:
• the cache key set on startup
• cache values on demand
Cluster Bulk Load (Ehcache 2.0)

• Terracotta clustered caches only
• Performance
  • 6 nodes, 8 threads, 3M entries
  • Coherent: 211.9 sec (1416 TPS)
  • Bulk Load: 19.0 sec (15790 TPS) - 11.2x
Cluster Bulk Load (Ehcache 2.0)

CacheManager manager = new CacheManager();
Ehcache cache = manager.getEhcache("hotItems");

// Enter bulk loading mode for this node
cache.setCoherent(false);

// Bulk load
for (Item item : getItemHotList()) {
  cache.put(new Element(item.getId(), item));
}

// End bulk loading mode
cache.setCoherent(true);
Pain of Concurrency

• Locking
• Transactions
Thread safety
• Cache-level operations are thread-safe
• No public API for multi-key or multi-cache composite operations
• Provide external locking
BlockingCache

CacheManager manager = new CacheManager();
Ehcache cache = manager.getEhcache("items");
manager.replaceCacheWithDecoratedCache(
    cache, new BlockingCache(cache));
• Cache decorator
• On cache miss, get()s block until someone writes the key to the cache
• Optional timeout
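The blocking-on-miss behavior can be sketched with a latch per in-flight key: the first thread to miss claims the key and computes; later readers wait until a value is put (or until a timeout). This is our miniature of the idea, not BlockingCache's actual implementation:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// BlockingCache idea in miniature: the first thread to miss a key
// "owns" it (gets null back and must compute + put); concurrent
// readers of the same key block on a latch until put() releases them.
class BlockingMap<K, V> {
    private final Map<K, V> values = new ConcurrentHashMap<>();
    private final Map<K, CountDownLatch> pending = new ConcurrentHashMap<>();

    // Returns the cached value, or null if this caller now owns the
    // miss and is expected to compute the value and call put().
    V get(K key, long timeoutMs) {
        V v = values.get(key);
        if (v != null) return v;
        CountDownLatch mine = new CountDownLatch(1);
        CountDownLatch prior = pending.putIfAbsent(key, mine);
        if (prior == null) return null; // we own the miss
        try {
            prior.await(timeoutMs, TimeUnit.MILLISECONDS); // optional timeout
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return values.get(key);
    }

    void put(K key, V value) {
        values.put(key, value);
        CountDownLatch latch = pending.remove(key);
        if (latch != null) latch.countDown(); // release blocked readers
    }
}
```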
SelfPopulatingCache

• Cache decorator, extends BlockingCache
• get() of an unknown key will construct the entry using a supplied factory
• "Read through" caching

CacheManager manager = new CacheManager();
Ehcache cache = manager.getEhcache("items");
manager.replaceCacheWithDecoratedCache(
    cache, new SelfPopulatingCache(cache, cacheEntryFactory));
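Read-through caching reduces to "on miss, call the factory"; a minimal sketch (ours, using `ConcurrentHashMap.computeIfAbsent` as a stand-in for the decorator) shows the contract:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Read-through in miniature: a miss invokes the supplied factory to
// build the entry; computeIfAbsent blocks concurrent readers of the
// same key so the factory runs once per key, echoing the
// BlockingCache behavior SelfPopulatingCache builds on.
class ReadThroughCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> factory;

    ReadThroughCache(Function<K, V> factory) {
        this.factory = factory;
    }

    V get(K key) {
        return cache.computeIfAbsent(key, factory);
    }
}
```

Callers never see a miss: `get()` always returns a value, and the factory's cost is paid once per key.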
JTA (Ehcache 2.0)
• Cache acts as XAResource
• Works with any JTA Transaction Manager
• Autodetects JBossTM, Bitronix, Atomikos
• Transactionally move data between database, cache, queue, etc
Pain of Duplication
• How do I get failover capability while avoiding excessive duplication of data?
Partitioning + Terracotta Virtual Memory

• Each node (mostly) holds data it has seen
• Use a load balancer to get app-level partitioning
• Use fine-grained locking to get concurrency
• Use memory flush/fault to handle memory overflow and availability
• Use causal ordering to guarantee coherency
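App-level partitioning via a load balancer usually means a sticky routing rule: the same key always lands on the same node, so each node's cache warms only with the data it serves. A hypothetical routing sketch (node count and hash rule are our assumptions, not Terracotta specifics):

```java
// Sticky routing sketch: hash each key to one of N app nodes so the
// same key is always served (and cached) by the same node.
class KeyRouter {
    private final int nodes;

    KeyRouter(int nodes) {
        this.nodes = nodes;
    }

    // floorMod keeps the result in [0, nodes) even for negative hashes.
    int nodeFor(Object key) {
        return Math.floorMod(key.hashCode(), nodes);
    }
}
```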
Pain of Ignorance

• Is my cache being used? How much?
• Is caching improving latency?
• How much memory is my cache using?
Dev Console
• Ehcache console
• See caches, configuration, statistics
• Dynamic configuration changes
• Hibernate console
• Hibernate-specific view of caches
• Hibernate stats
Scaling Your Cache
Scalability Continuum

|                 | Ehcache | Ehcache RMI    | Ehcache disk store | Terracotta OSS | Terracotta FX / Ehcache FX |
|-----------------|---------|----------------|--------------------|----------------|----------------------------|
| # JVMs          | 1 JVM   | 2 or more JVMs | 2 or more big JVMs | 2 or more JVMs | lots of JVMs               |
| Causal ordering | NO      | NO             | YES                | YES            | YES                        |

More scale to the right. Management and control: Ehcache DX, and Ehcache EX and FX.
Thanks!
• Twitter - @puredanger
• Blog - http://tech.puredanger.com
• Terracotta - http://terracotta.org
• Ehcache - http://ehcache.org