s. narravula, p. balaji, k. vaidyanathan, h.-w. jin and d. k. panda the ohio state university
Post on 20-Feb-2016
Architecture for Caching Responses with Multiple Dynamic Dependencies in Multi-Tier Data-Centers over InfiniBand
S. Narravula, P. Balaji, K. Vaidyanathan, H.-W. Jin and D. K. Panda
The Ohio State University
Presentation Outline
Introduction/Motivation
Design and Implementation
Experimental Results
Conclusions
Introduction
Fast Internet Growth
  Number of Users
  Amount of data
  Types of services
Several uses
  E-Commerce, Online Banking, Online Auctions, etc.
Web Server Scalability
  Multi-Tier Data-Centers
  Caching – An Important Technique
Presentation Outline
Introduction/Motivation
  Multi-Tier Data-Centers
  Active Caches
Design and Implementation
Experimental Results
Conclusions
A Typical Multi-Tier Data-Center
[Figure: multi-tier data-center diagram. Labels: Clients, WAN, Proxy Nodes, Web Servers, Application Servers, Database Servers, SAN, Apache, PHP; tiers numbered Tier 0, Tier 1, Tier 2.]
InfiniBand
High Performance
  Low latency
  High Bandwidth
Open Industry Standard
Provides rich features
  RDMA, Remote Atomic operations, etc.
Targeted for Data-Centers
Transport Layers
  VAPI
  IPoIB
Caching
Can avoid re-fetching of content
Beneficial if requests repeat
Important for scalability
Static content caching
  Well studied in the past
  Widely used
[Figure: the number of requests decreases from the front-end tiers to the back-end tiers.]
Active Caching
Dynamic Data
  Stock Quotes, Scores, Personalized Content, etc.
Complexity of content
  Simple caching methods not suited
Issues
  Consistency
  Coherency
[Figure labels: Proxy Node Cache, Back-End Data, User Request, Update]
Cache Coherency
Refers to the average staleness of the document served from cache
Strong or immediate (Strong Coherency)
  Required for certain kinds of data
  Cache Disabling
  Client Polling
Basic Client Polling *
[Figure: request flow between the front-end and the back-end. Labels: Request, Cache Hit, Cache Miss, Version Read, Response.]
* SAN04: Supporting Strong Cache Coherency for Active Caches in Multi-Tier Data-Centers over InfiniBand. Narravula et al.
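The polling flow on this slide can be illustrated with a minimal, single-process sketch. It is not the authors' implementation: the class names `BackEnd` and `PollingCache` and all method names are hypothetical, and the "Version Read" is modeled as an ordinary method call rather than an RDMA read.

```python
class BackEnd:
    """Hypothetical back-end: stores values and one version number per key."""

    def __init__(self):
        self.data = {}
        self.version = {}

    def update(self, key, value):
        # Every update bumps the version so stale cache entries can be detected.
        self.data[key] = value
        self.version[key] = self.version.get(key, 0) + 1

    def read_version(self, key):
        # Stands in for the "Version Read" step (assumed, not an RDMA call).
        return self.version.get(key, 0)

    def fetch(self, key):
        return self.data[key], self.version[key]


class PollingCache:
    """Hypothetical front-end cache that validates every hit with the back-end."""

    def __init__(self, backend):
        self.backend = backend
        self.entries = {}  # key -> (value, version seen when cached)
        self.hits = 0
        self.misses = 0

    def get(self, key):
        entry = self.entries.get(key)
        if entry is not None:
            value, cached_version = entry
            if cached_version == self.backend.read_version(key):
                self.hits += 1          # version current: serve from cache
                return value
            del self.entries[key]       # failed version check: invalidate
        self.misses += 1                # miss: fetch and record the version
        value, version = self.backend.fetch(key)
        self.entries[key] = (value, version)
        return value
```

A usage example: after `backend.update("quote", "10.0")`, two calls to `cache.get("quote")` give one miss followed by one hit; a later `backend.update("quote", "10.5")` makes the next `get` fail the version check and re-fetch.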
Multiple Object Dependencies
Cache documents contain multiple objects
A Many-to-Many mapping
  A single cache document can contain multiple objects
  A single object can be a part of multiple documents
Complexity!!
[Figure: many-to-many mapping between cache documents and objects.]
Client Polling
[Figure: request flow between the front-end and the back-end. Labels: Request, Cache Hit, Cache Miss, Version Read, Response; annotation: Single Check Possible.]
A single lookup counter is essential for a correct and efficient design
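One way to picture a single lookup counter per cached document is sketched here. This is an illustration under assumptions, not the design from the slides: the class `LookupCounters` and its methods are hypothetical names, and the counters live in a plain dictionary rather than in memory exposed for remote reads.

```python
from collections import defaultdict


class LookupCounters:
    """Hypothetical back-end table: one counter per cached document."""

    def __init__(self):
        self.docs_of_object = defaultdict(set)  # object -> documents using it
        self.counter = defaultdict(int)         # document -> lookup counter

    def register(self, doc, objects):
        # Record the many-to-many mapping between documents and objects.
        for obj in objects:
            self.docs_of_object[obj].add(doc)

    def object_updated(self, obj):
        # An update to one object bumps the counter of every dependent document.
        for doc in self.docs_of_object[obj]:
            self.counter[doc] += 1

    def read_counter(self, doc):
        # A single read per document is enough to detect any dependency change.
        return self.counter[doc]
```

The point of the sketch is that the front-end compares one number per document, however many objects the document depends on; the fan-out cost is paid on the update path instead.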
Objective
To design an architecture that very efficiently supports strong cache coherency with multiple dynamic dependencies on InfiniBand
Basic System Architecture
[Figure: four server nodes, each with a module (Mod), grouped as Proxy Servers and Application Servers; label: Cooperation.]
Cache Lookup Counter maintained on the Application Servers
Basic Design
Home Node based Client Polling
  Cache Documents assigned home nodes
Proxy Server Modules
  Client polling functionality
Application Server Modules
  Support “Version Reads” for client polling
  Handle updates
Many-to-Many Mappings
Mapping of updates to dynamic objects
Mapping of dynamic objects with Lookup counters
Efficiency
  Factor of dependency
[Figure: Updates map to Objects, which map to Lookup counters.]
Mapping of updates
Non-Trivial solution
Three possibilities
  Database schema, constraints and dependencies are known
  Per query dependencies are known
  No dependency information known
Mapping Schemes
Dependency Lists
  Home node based
  Complete dependency lists
Invalidate All
  Single Lookup Counter for a given class of queries
  Low application server overheads
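The difference between the two schemes can be sketched by computing which cached documents a single update invalidates under each. The function names, the document names and the notion of a "query class" label per document are assumptions made for illustration only.

```python
def invalidated_by_dependency_lists(dependencies, updated_object):
    """dependencies: dict mapping document -> set of objects it depends on.

    Only documents that actually contain the updated object are invalidated.
    """
    return {doc for doc, objs in dependencies.items() if updated_object in objs}


def invalidated_by_invalidate_all(query_class_of, updated_class):
    """query_class_of: dict mapping document -> query class label.

    One counter covers the whole class, so every document in it is invalidated.
    """
    return {doc for doc, cls in query_class_of.items() if cls == updated_class}


# Hypothetical example data.
dependencies = {
    "portfolio": {"stock_x", "stock_y"},
    "ticker": {"stock_y"},
    "scores": {"score_z"},
}
query_class_of = {"portfolio": "stocks", "ticker": "stocks", "scores": "sports"}

precise = invalidated_by_dependency_lists(dependencies, "stock_x")
coarse = invalidated_by_invalidate_all(query_class_of, "stocks")
```

In this example an update to `stock_x` invalidates only `portfolio` with dependency lists, while Invalidate All also drops `ticker`, trading extra cache misses for less bookkeeping on the application server.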
Handling Updates
[Figure: message flow among the database server and three application servers. Labels: HTTP Request, DB Query (TCP), DB Response, Update Notification, VAPI Send, Local Search and Coherent Invalidate, Ack (Atomic), HTTP Response.]
Experimental Test-bed
Cluster 1: Eight Dual 3.0 GHz Xeon processor nodes with 64-bit 133MHz PCI-X interface, 512KB L2-Cache and 533 MHz Front Side Bus
Cluster 2: Eight Dual 2.4 GHz Xeon processor nodes with 64-bit 133MHz PCI-X interface, 512KB L2-Cache and 400 MHz Front Side Bus
Mellanox InfiniHost MT23108 Dual Port 4x HCAs
MT43132 eight 4x port Switch
Mellanox Golden CD 0.5.0
Experimental Outline
Basic Data-Center Performance
Cache Misses in Active Caching
Impact of Cache Size
Impact of Varying Dependencies
Impact of Load in Backend Servers
Traces Used
  Traces 1-5 with increasing update rate
  Trace 6: Zipf-like trace
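For readers unfamiliar with Zipf-like traces, a minimal generator is sketched here. It does not reproduce the trace used in the experiments: the exponent `alpha`, the document count, the trace length and the function names are all assumed values chosen for illustration.

```python
import random


def zipf_weights(num_docs, alpha=1.0):
    """Popularity of the document at rank r is proportional to 1 / r**alpha."""
    raw = [1.0 / (rank ** alpha) for rank in range(1, num_docs + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def zipf_trace(num_docs, num_requests, alpha=1.0, seed=0):
    """Return a list of document indices (0 = most popular) drawn Zipf-like."""
    rng = random.Random(seed)  # seeded so the trace is repeatable
    weights = zipf_weights(num_docs, alpha)
    return rng.choices(range(num_docs), weights=weights, k=num_requests)


trace = zipf_trace(num_docs=50, num_requests=5000)
```

With such a distribution a small set of documents receives most of the requests, which is why caching only a few popular documents can still be effective.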
Basic Data-Center Performance
[Figure: Data-Center Throughput. TPS (axis 0 to 20000) for Trace 2 through Trace 5 (traces with increasing update rate); series: No Cache, Invalidate All, Dependency Lists.]
• Maintaining Dependency Lists performs very well for all traces
[Figure: Data-Center Response Time. Response time in ms (axis 0 to 6) for Trace 2 through Trace 5 (traces with increasing update rate); series: No Cache, Invalidate All, Dependency Lists.]
Cache Misses in Active Caching
• Cache misses for Invalidate All increase drastically with increasing update rates
[Figure: Cache Misses. Cache miss % (axis 0 to 120) for Trace 2 through Trace 5 (traces with increasing update rate); series: No Cache, Invalidate All, Dependency Lists.]
Impact of Cache Size
• Maintaining Dependency Lists performs very well for all traces
• Possible to cache a select few documents and still extract performance
[Figure: Throughput vs. Cache Size. TPS (axis 0 to 25000) against relative cache size (100%, 50%, 10%, 5%, 1%, 0.10%, 0%).]
Impact of Varying Dependencies
• Throughput drops significantly as the average number of dependencies per cache file increases
[Figure: Effect of Varying Dependencies. TPS (axis 0 to 16000) against factor of dependencies (x, 2x, 4x, 8x, 16x, 32x, 64x).]
Impact of Load in Backend Servers
• Our design can sustain high performance even under heavily loaded conditions, with a factor of improvement close to 22
[Figure: Effect of Load. TPS (axis 0 to 16000) against number of compute threads (0, 1, 2, 4, 8, 16, 32, 64); series: No Cache, Dependency List.]
Conclusions
An architecture for supporting Strong Cache Coherence with multiple dynamic dependencies
  Efficiently handle multiple dynamic dependencies
  Supporting RDMA-based Client polling
Resilient to load on back-end servers
Web Pointers
http://nowlab.cis.ohio-state.edu/
E-mail: {narravul, balaji, vaidyana, jinhy, panda}@cse.ohio-state.edu
NBC home page
Back-up Slides
Cache Consistency
Non-decreasing views of system state
Updates seen by all or none
[Figure labels: User Requests, Proxy Nodes, Back-End Nodes, Update]
Performance
• Receiver side CPU utilization is very low
• Leveraging the benefits of one-sided communication
[Figure: Throughput (RDMA Read). Throughput in Mbps (axis 0 to 7000) against message size (1 byte to 256K bytes), with a secondary axis (0 to 25) for the CPU series; series: Send CPU, Recv CPU, Throughput (Poll), Throughput (Event).]
RDMA based Client Polling *
• The VAPI module can sustain performance even with heavy load on the back-end servers
[Figure: DataCenter: Throughput. Transactions per second (TPS, axis 0 to 2500) against number of compute threads (0 to 200); series: No Cache, IPoIB, VAPI.]
* SAN04: Supporting Strong Cache Coherency for Active Caches in Multi-Tier Data-Centers over InfiniBand. Narravula et al.
Mechanism
Cache Hit:
  Back-end Version Check
  If version current, use cache
  Invalidate data for failed version check
  Use of RDMA-Read
Cache Miss
  Get data to cache
  Initialize local versions
Other Implementation Details
Requests to read and update are mutually excluded at the back-end module to avoid simultaneous readers and writers accessing the same data.
Minimal changes to existing application software
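The mutual exclusion between reads and updates can be illustrated with a small thread-based sketch. It is an assumption-laden stand-in: the class `VersionedStore` is a hypothetical name, and a single `threading.Lock` replaces whatever mechanism the back-end module actually uses.

```python
import threading


class VersionedStore:
    """Hypothetical back-end record guarded by one lock for reads and updates."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = None
        self._version = 0

    def update(self, value):
        # Writers hold the lock so value and version change together.
        with self._lock:
            self._value = value
            self._version += 1

    def read(self):
        # Readers take the same lock, so they never see a half-applied update.
        with self._lock:
            return self._value, self._version


def run_writers(store, num_threads=4, updates_per_thread=250):
    """Apply updates from several threads and wait for all of them to finish."""
    def work(tid):
        for i in range(updates_per_thread):
            store.update((tid, i))

    threads = [threading.Thread(target=work, args=(t,)) for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

After `run_writers(store)` with the default arguments, the version equals the total number of updates (1000), since no increment can be lost while the lock is held.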