replication and caching in multi-tier...
TRANSCRIPT
Slide 1 (UPM, May 2010)
Replication and caching in multi-tier architectures
Bettina Kemme, McGill University, Montreal, Canada

Slide 2
Information Systems today
Slide 3
What is a client/server system?
• From the client's perspective, there is one server
• The server provides a service that can be called
[Figure: Client → Server]
Slide 4
Multi-tier Architecture
[Figure: clients → application server (database cache: x=0; Items, Preferences) → database (X:0→X:1, Y:2, Z:3); client A buys X]
Slide 5
Replication
• What is replication?
  – Process replication:
    • Several instances of the AS
    • Several instances of the DBMS
  – Data replication:
    • Several copies of a given data item
• Replication is good for adaptability
  – Adjusting to dynamic environments
[Figure: three copies of data item X, each with value X:1]
Slide 6
Without Replication
• Bottleneck
• Single point of failure
[Figure: four clients accessing a single server]
Slide 7
With Replication
• Load sharing
• In case of a crash, redirect clients to another replica
[Figure: four clients spread over two replicas]
Slide 8
Challenge: Replica Control
• Replica control
  – Many physical copies appear as one logical copy
    • The client sees the logical copy
    • Execution is on the physical copies
  – In case of updates, keep the copies consistent
  – What does consistency mean?
  – Read-one-write-all-(available):
    • An update has to be (eventually) executed at all replicas to keep them consistent
    • A read can be performed at one replica
Slide 9
Peer-to-peer
[Figure: client/distributed-server architecture (replica managers RM1, RM2, RM3; coordination among the servers) vs. P2P (each node is both client and server)]
Slide 10
• Definitions:
  – Each node/peer is a client and a server at the same time
  – Each peer provides content and/or resources
  – Direct exchange between peers
  – Autonomy of peers (they can join and leave at will)
Slide 11
Challenges
• Finding the content you are looking for
• Multicasting a message to all nodes or a subset
Slide 12
Games
Peer-to-Peer Architectures for Games, February 9th, 2010
Single player: Assassin's Creed
Multiplayer: Unreal Tournament
Massively multiplayer: World of Warcraft
Slide 13
• Why is this interesting research?
• Why is this interesting course material?
Slide 14
Practical Importance
• Multi-tier
  – Multi-billion $ market
    • Much hype in the AS market: Java EE 5, .NET, CORBA
    • Well-established database systems gain new customers daily
  – Stringent adaptability requirements
    • High throughput
    • 24/7 availability
    • Storage of critical data
• Peer-to-peer
  – Big market
  – Fun and cool
Slide 15
Interesting Challenges require…
• Knowledge in
  – data management
  – distributed architectures
  – communication mechanisms
  – distributed coordination algorithms
• Both theoretical and systems-oriented work
Slide 16
Challenge: Failover
• Transparent failover
• Correctness
  – "Behave as a failure-free, non-replicated system"
    • state
    • response to the client
    • coordination with other tiers
Slide 17
Challenge: Recovery
• Add replicas
  – State transfer
  – Synchronization with ongoing processing
Slide 18
Challenge: Load Balancing
Slide 19
Challenge: Provisioning
Slide 20
Object Replication: Fault-Tolerance
• Crash failure: a server/process works correctly until it stops completely
• Correctness:
  – The replicated system should behave as a non-replicated system that has no failures
  – Each request has exactly one "successful" execution
  – The client receives exactly one response (failure transparency)
  – Strong data consistency: the data copies are consistent at the end of request execution
• Most well-known approaches
  – Passive (primary-backup) replication
  – Active replication
Slide 21
Understanding failure behavior
[Figure: client A buys X; the AS loses its volatile state; the client receives a failure exception]
Slide 22
Passive Replication
[Figure: client stub → primary → backup]
Slide 23
Execution Diagram
[Figure: client request goes via the client stub to the primary; the primary executes, propagates state changes and the response to the backup, waits for its ok, and returns the response to the client stub]
Slide 24
Failure Cases
[Figure: the same execution diagram with five possible failure points (1–5) marked along the primary's timeline]
Slide 25
Case 2
[Figure: the primary fails during execution; the client stub receives a failure exception and resubmits to the new primary (the former backup), which performs a new execution]
Slide 26
Case 2
[Figure: as an execution diagram: client request, exception at the client stub, resubmission to the backup (now the new primary), new execution, result returned]
Slide 27
Case 3 (and 4)
[Figure: the primary fails after propagating state to the backup; the client stub resubmits to the new primary, which immediately returns the stored response]
Slide 28
Algorithms: normal processing

Client stub:
  Upon client request: add a unique request-id; forward to the primary
  Upon response: return it to the client

Primary:
  Upon client request:
    Execute the request
    Propagate the update (request-id, state changes and response) to the backups
    Wait to receive ok from all
    Return the response to the client

Backup:
  Upon reception of an update from the primary:
    Apply the changes
    Store the request-id/response
    Return ok
Slide 29
Algorithms: failover

Client stub:
  Upon failure exception for a request: resubmit it to the new primary

New primary:
  Upon client request with request-id:
    If the request-id/response is stored (state already received): return the response immediately
    Else:
      Execute the request
      (Propagate the request-id, state changes and response to the remaining backups)
      (Wait to receive ok from all)
      Return the response to the client
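The normal-processing and failover rules above can be sketched as a toy single-process simulation; the class and method names are illustrative, not from the lecture, and real systems would add messaging, failure detection and leader election:

```python
# Toy sketch of passive (primary-backup) replication with
# exactly-once semantics via request-ids. Illustrative names only.

class Backup:
    def __init__(self):
        self.state = {}
        self.completed = {}          # request-id -> stored response

    def on_update(self, req_id, changes, response):
        self.state.update(changes)   # apply propagated state changes
        self.completed[req_id] = response
        return "ok"                  # ack back to the primary

class Primary:
    def __init__(self, backups):
        self.state = {}
        self.completed = {}
        self.backups = backups

    def handle(self, req_id, op):
        # Failover path: if this request-id was already processed
        # (state received while a backup), return the stored response.
        if req_id in self.completed:
            return self.completed[req_id]
        changes, response = op(self.state)       # execute the request
        self.state.update(changes)
        acks = [b.on_update(req_id, changes, response) for b in self.backups]
        assert all(a == "ok" for a in acks)      # wait for ok from all
        self.completed[req_id] = response
        return response

# Normal processing: the client stub tags the request with a unique id.
backup = Backup()
primary = Primary([backup])
buy_x = lambda s: ({"x": s.get("x", 0) + 1}, "bought X")
assert primary.handle("req-1", buy_x) == "bought X"

# Failover: the backup becomes the new primary; the stub resubmits the
# same request-id and gets the stored response, not a second execution.
new_primary = Primary([])
new_primary.state, new_primary.completed = backup.state, backup.completed
assert new_primary.handle("req-1", buy_x) == "bought X"
assert new_primary.state["x"] == 1   # executed exactly once
```

The request-id is what turns a resubmission into a duplicate-detection lookup instead of a second execution.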
Slide 30
Not discussed
• Failover
  – Client stub
    • How does it know who the new primary is?
  – Backups
    • How do they detect the failure of the primary?
    • How do they decide on a new primary?
    • How do they guarantee that the last state changes are received by all or none of the backups?
    • How do they decide when to stop checking whether a request is a resubmission?
• Recovery
  – How do the old primary or new backups join?
Slide 31
Active Replication: normal processing
[Figure: the client stub sends the request to all replicas; each replica executes it]
Slide 32
Active Replication: failure
[Figure: one replica fails; the client stub still obtains a response from the surviving replica]
Slide 33
Algorithm: failover
• No extra failover algorithm
Slide 34
Concurrent Clients
[Figure: two client stubs send requests concurrently to both replicas]
Slide 35
Concurrent Clients
• Total order of requests at all replicas
  – All replicas must receive all requests in the same order
  – Use of group communication systems
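One simple way to obtain such a total order, a sequencer (one of several techniques group communication systems use), can be sketched as follows; this is a single-process toy with illustrative names, not a real group communication system:

```python
# Toy sequencer-based total order: every request passes through one
# sequencer, which assigns a global sequence number. Replicas execute
# requests strictly in sequence-number order, so (with deterministic
# execution) all replicas end up in the same state.
import itertools

class Sequencer:
    def __init__(self):
        self._next = itertools.count()

    def order(self, request):
        return (next(self._next), request)   # stamp with a global number

class Replica:
    def __init__(self):
        self.log = []

    def deliver(self, numbered_requests):
        # deliver in total order, regardless of network arrival order
        for _, req in sorted(numbered_requests):
            self.log.append(req)

seq = Sequencer()
msgs = [seq.order(r) for r in ["A: buy x", "B: buy y"]]
r1, r2 = Replica(), Replica()
r1.deliver(msgs)
r2.deliver(list(reversed(msgs)))   # messages arrive in a different order...
assert r1.log == r2.log            # ...but are executed in the same order
```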
Slide 36
Total Order Execution
[Figure: both replicas deliver the requests of the two client stubs in the same total order]
Slide 37
Active vs. Passive Replication
• Determinism
• Execution during normal processing
  – Communication overhead
  – CPU overhead
  – Complexity
• Write / read
• Termination protocol
Slide 38
Considering cross-tier execution
[Figure: the AS loses its volatile state and the client receives a failure exception, while the database already holds X:1 (previously X:0), Y:2, Z:3]
Slide 39
Transactions
• DB transaction
  – Sequence of read and write operations
  – Final commit or abort operation
  – Example: r_i(x) r_i(y) w_i(x) c_i
• All-or-nothing property (atomicity)
  – After a successful commit: all changes are guaranteed
  – If something goes wrong before commit: the changes so far are rolled back
• Transactions across tiers
  – In this lecture: 1 client request = 1 transaction
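The all-or-nothing property can be illustrated with a minimal transaction sketch (my own toy, not a real DBMS): writes go to a scratch copy, commit installs them all at once, abort discards them all:

```python
# Toy illustration of atomicity: buffered writes, then commit or abort.

class MiniTxn:
    def __init__(self, store):
        self.store, self.writes = store, {}

    def read(self, k):
        return self.writes.get(k, self.store.get(k))  # read your own writes

    def write(self, k, v):
        self.writes[k] = v                            # buffered, not visible

    def commit(self):
        self.store.update(self.writes)                # all changes take effect

    def abort(self):
        self.writes.clear()                           # nothing takes effect

db = {"x": 0, "y": 2}

t = MiniTxn(db)                # the slide's example: r(x) r(y) w(x) c
_ = t.read("x"); _ = t.read("y")
t.write("x", 1)
t.commit()
assert db["x"] == 1            # after commit: all changes guaranteed

t2 = MiniTxn(db)
t2.write("x", 99)
t2.abort()                     # something went wrong before commit
assert db["x"] == 1            # the change was rolled back
```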
Slide 40
Failure before commit
[Figure: the AS writes x, then fails before commit; the database rolls the write back (X:0 / X:1)]
Slide 41
Failure after commit
[Figure: the AS writes x and commits; the failure happens afterwards, so the update to X is durable]
Slide 42
Execution
[Figure: the passive-replication execution diagram extended with the database tier; the primary starts a transaction, executes, commits, and propagates state changes and the response to the backup]
Slide 43
Propagation after commit
[Figure: the primary commits the DB transaction first and only then propagates the state changes and response to the backup]
Slide 44
Failure after commit, before propagation
[Figure: the primary fails after the DB commit but before propagating to the backup; the client sees an exception, and the resubmission executes the request a second time: inconsistency]
Slide 45
Propagation before commit
[Figure: the primary propagates the state changes and response to the backup before committing the DB transaction]
Slide 46
Failure after propagation, before commit
[Figure: the backup already has the state changes (X:1), but the DB transaction was not committed (X:0); a plain resubmission leads to inconsistency]
Slide 47
Solution Outline
• Propagate before the DB commit
• At failure the following is possible:
  – The new primary has the state changes
  – The DB transaction is not committed
  – In this case:
    • The new primary must detect this
    • Then discard the state changes
    • And perform a complete re-execution
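One way the new primary can detect this case, which I sketch here as an assumption and which is not necessarily the exact mechanism of the lecture's system, is to record the request-id in the database within the same transaction: the marker is then present if and only if the transaction committed.

```python
# Sketch: the primary writes a "processed" marker for the request-id
# inside the DB transaction itself. After failover, the new primary
# checks the DB: marker present  -> txn committed, keep propagated state;
#                marker absent   -> txn rolled back, discard and re-execute.

db = {"x": 0, "processed": set()}

def run_txn(req_id, commit):
    staged = {"x": db["x"] + 1}           # the transaction's write
    if commit:                            # data and marker commit atomically
        db.update(staged)
        db["processed"].add(req_id)
    return staged                         # state changes propagated to backup

propagated = run_txn("req-1", commit=False)   # crash before commit

# New primary during failover:
if "req-1" in db["processed"]:
    state = propagated                    # txn committed: reuse the state
else:
    state = run_txn("req-1", commit=True) # discard; complete re-execution

assert db["x"] == 1 and "req-1" in db["processed"]
```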
Slide 48
Failure after propagation, before commit
[Figure: upon resubmission the new primary detects the uncommitted transaction, discards the propagated state, and performs a new execution]
Slide 49
Performance Evaluation
• Comparison:
  – Non-replicated JBoss
  – JBoss + our algorithm (Replicated JBoss)
  – JBoss clustering (does not provide transactional exactly-once semantics)
• Sun ECperf benchmark
  – Ordering / manufacturing / supply-chain application
Slide 50
Response Time
[Figure: response time (ms, 0–600) vs. injection rate (1–29) for the 1-1 algorithm, JBoss Clustering, and Non-Replicated JBoss]
Slide 51
Throughput
[Figure: throughput in business operations per minute (0–2000) vs. injection rate (0–20) for Replicated JBoss, Clustered JBoss, and Non-Replicated JBoss]
Slide 52
Further Issues
• Reaction of the AS to failures of other tiers
  – Client / DBS
• Recovery
  – Failed or new replicas rejoin as backups
  – They receive the necessary backup information
Slide 53
Combining Load-Balancing and Fault-Tolerance
[Figure: each replica acts as primary (P) for some clients and as backup (B) for others; the clients (C) are distributed across the primaries]
Slide 54
Scalability
Slide 55
So far
[Figure: clients → replicated application server → single database holding X, Y, Z]
Slide 56
Database Replication
[Figure: clients → replicated database; each replica holds copies of X, Y, Z]
Slide 57
Why?
• Fault-tolerance: take-over
• Performance: fast local access (e.g., replicas in Madrid, Montreal, Rome)
• Application driven: mobile users, data warehouses (OLAP vs. OLTP), ...
• Scale-up: a cluster instead of a bigger mainframe
Slide 58
Transactions repeated
• DB transaction
  – Sequence of read and write operations
  – Final commit or abort operation
  – Example: r_i(x) r_i(y) w_i(x) c_i
• All-or-nothing property (atomicity)
  – After a successful commit: all changes are guaranteed
  – If something goes wrong before commit: the changes so far are rolled back
• Isolation
  – The execution of concurrent transactions is controlled such that the result is the same as if they had executed serially
  – Enforced by a concurrency control protocol:
    • It decides when submitted operations may be executed on the data
Slide 59
Replica Control
• Keep copies consistent: replica control
• Difference to AS replication:
  – Several clients access the same data items
• Combine replica control with concurrency control
[Figure: w(x) applied to both copies of x; w(y) applied to the copies of y]
Slide 60
Global Serializability
• Global serializability
  – The execution in the replicated system is equivalent to a serial execution in a single-copy database
Slide 61
How to replicate data?
• Depending on when the updates are propagated:
  – eager
  – lazy
• Depending on where the updates can take place:
  – primary copy (master)
  – update anywhere
[Table: 2×2 design space: {eager, lazy} × {primary copy, update anywhere}]
Slide 62
Where can updates be submitted?
• Update anywhere:
  – Update transactions can be submitted to any site
  – The site forwards the updates to the other sites
  [Figure: T1 w1(x), T2 w2(y), T3 w3(x) submitted at different sites]
• Primary copy:
  – Update transactions can only execute at the primary copy (master)
  – The primary forwards the updates to the secondaries
  [Figure: T1, T2, T3 execute at the primary; read-only T4 r(y) runs at a secondary]
Slide 63
When to propagate
• Eager:
  – within the boundaries of the transaction
  – transactions usually terminate with a 2-phase-commit protocol
    • 2PC: an agreement protocol to guarantee atomicity
[Figure: eager; BOT, r(x), w(x) sent as a request to all copies and acknowledged, r(y), then 2PC at the end]
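The 2PC termination step can be sketched in a few lines; this is a toy in-process version with illustrative names, ignoring logging, timeouts and coordinator failure:

```python
# Toy two-phase commit: phase 1 collects votes from all participants,
# phase 2 sends the decision. Commit only if everyone voted yes, so
# either all sites apply the transaction or none does (atomicity).

class Participant:
    def __init__(self, vote_yes=True):
        self.vote_yes, self.decision = vote_yes, None

    def prepare(self):                   # phase 1: vote request
        return "yes" if self.vote_yes else "no"

    def decide(self, outcome):           # phase 2: decision
        self.decision = outcome

def two_phase_commit(participants):
    votes = [p.prepare() for p in participants]
    outcome = "commit" if all(v == "yes" for v in votes) else "abort"
    for p in participants:
        p.decide(outcome)
    return outcome

ok = [Participant(), Participant()]
assert two_phase_commit(ok) == "commit"

mixed = [Participant(), Participant(vote_yes=False)]
assert two_phase_commit(mixed) == "abort"          # one "no" aborts all
assert {p.decision for p in mixed} == {"abort"}    # all or none
```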
Slide 64
When to propagate
• Lazy:
  – after the commit of the transaction
[Figure: lazy; BOT, r(x), w(x), r(y), commit; the writes are propagated to the other copies afterwards]
Slide 65
Eager Primary Copy
[Figure: the update transaction runs at the primary; r(x), then w(x) propagated to the secondaries with request/ack, r(y), and 2PC at the end; reads r(x), r(y) may run at any copy]
Slide 66
Lazy primary copy
[Figure: the transaction runs and commits at the primary; r(x), w(x), r(y), commit; w(x) is then propagated to the secondaries; reads may run at any copy]
Slide 67
Eager vs. Lazy Primary Copy
• Replication transparency
  – Do users need to be aware of replication or not?
• Consistency level
• Complexity
• Response time for users
• Fault-tolerance
• Load-balancing
Slide 68
Eager vs. Lazy Primary Copy
• Transparency
  – Both: no
  – Update transactions must be submitted to a specific primary
• Consistency
  – Eager: high consistency
  – Lazy: stale reads
• Concurrency control:
  – Both: relatively easy
• Message overhead:
  – Eager: one round per write operation + 2PC
    • Can be reduced by sending all write operations (the writeset) within the vote-request message of 2PC
  – Lazy: low
Slide 69
Eager vs. Lazy Primary Copy
• Fault-tolerance:
  – Eager: yes; widely used for fault-tolerance
  – Lazy: no
• Load-balancing:
  – Both: ok, but restricted by the update ratio
Slide 70
Eager update anywhere
[Figure: w(x) is submitted at one site and w(y) at another; each write is propagated to all copies within its transaction, and each transaction terminates with 2PC]
Slide 71
Lazy update anywhere
[Figure: transactions commit locally (r(x), w(x), r(y), commit at one site; w(y) at another) and propagate their writes only after commit; concurrent updates at different sites can conflict]
Slide 72
Lazy / Update Anywhere
• Resolve conflicts
  – for numeric types (or types with comparison):
    • average
    • minimum/maximum
    • additive
  – discard the new value / overwrite the old value
  – site priority
  – value priority
  – earliest/latest timestamp
• Eventual consistency
  – Eventually all copies have the same value
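The numeric resolution rules above can be sketched directly; the point is that each rule is a deterministic function of the conflicting values, so every replica applying it converges to the same result (my own toy encoding, not from the lecture):

```python
# Sketch of conflict-resolution rules for numeric values. Given the
# old value and the values written by conflicting concurrent updates,
# each rule deterministically picks (or combines) one result.

def resolve(rule, old, conflicting):
    if rule == "average":
        return sum(conflicting) / len(conflicting)
    if rule == "maximum":
        return max(conflicting)
    if rule == "minimum":
        return min(conflicting)
    if rule == "additive":
        # interpret each write as a delta and apply all deltas
        return old + sum(v - old for v in conflicting)
    if rule == "discard-new":
        return old                       # keep the old value
    raise ValueError("unknown rule: %s" % rule)

# Two sites concurrently updated x (old value 10) to 12 and to 16:
assert resolve("average", 10, [12, 16]) == 14
assert resolve("maximum", 10, [12, 16]) == 16
assert resolve("additive", 10, [12, 16]) == 18   # +2 and +6 both applied
assert resolve("discard-new", 10, [12, 16]) == 10
```

The additive rule suits counter-like data (e.g., account balances), where losing either delta would be wrong; the others trade precision for simplicity.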
Slide 73
Eager vs. lazy update anywhere
• Transparency
• Consistency
• Concurrency control
• Response time
• Load-balancing
• Fault-tolerance
Slide 74
Eager vs. Lazy Update Anywhere
• Transparency:
  – Both: yes
• Consistency:
  – Eager: strong
  – Lazy: bad; stale reads and inconsistencies
• Concurrency control and coordination complexity
  – Eager: complex
  – Lazy: conflict resolution is complex
Slide 75
Eager vs. Lazy Update Anywhere
• Response time:
  – Eager: bad
  – Lazy: good
• Load-balancing:
  – Both: good
• Basically no database system supports eager update anywhere
  – but many middleware-based solutions do!
Slide 76
Primary Copy vs. Update Anywhere
– Simpler concurrency control
– Less coordination necessary / optimizations are easier
– Inflexible model:
  • Clients must know the primary to submit update transactions
  • They have to distinguish update from read-only transactions
– The primary is a single point of failure and a potential bottleneck
Slide 77
Lazy vs. Eager
– Lazy primary copy: stale reads
– Lazy update anywhere: inconsistencies and reconciliation
– No communication within the transaction → better response time
– Possible transaction loss in case of a crash
Slide 78
Recent Replica Control Approaches
• So far: kernel-based approach
• New: middleware-based approach
  – Advantages
    • Modular
    • No need for access to the DB code
    • Reusability
  – Disadvantages
    • No access to the concurrency-control information in the kernel
Slide 79
Middleware Primary Copy (e.g., Ganymed)
[Figure: update transactions: 1. submit to the scheduler, 2. forward to the primary, 3. execute, 4. propagate, 5. apply the changes at the secondary]
[Figure: read-only transactions: 1. submit to the scheduler, 2. forward to a secondary, 3. execute]
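The scheduler's routing rule can be sketched as follows; this is a toy in the style of the slide (illustrative names, not Ganymed's actual code): update transactions go to the primary and their changes are then applied at the secondaries, while read-only transactions are load-balanced over the secondaries:

```python
# Toy middleware scheduler for lazy primary copy: updates at the
# primary, propagated to the secondaries; reads at a secondary.

class Scheduler:
    def __init__(self, primary, secondaries):
        self.primary, self.secondaries = primary, secondaries
        self.rr = 0                      # round-robin index for reads

    def submit(self, txn, update):
        if update:
            changes = txn(self.primary)  # 2. forward, 3. execute at primary
            for s in self.secondaries:   # 4./5. propagate and apply changes
                s.update(changes)
            return changes
        s = self.secondaries[self.rr % len(self.secondaries)]
        self.rr += 1
        return txn(s)                    # 2. forward, 3. execute at secondary

def inc_x(db):                           # an update transaction
    db["x"] += 1
    return {"x": db["x"]}                # the writeset to propagate

primary, s1, s2 = {"x": 0}, {"x": 0}, {"x": 0}
sched = Scheduler(primary, [s1, s2])
sched.submit(inc_x, update=True)
assert primary == s1 == s2 == {"x": 1}   # copies converge

read = sched.submit(lambda db: db["x"], update=False)
assert read == 1                          # served by a secondary
```

Because only writesets cross the middleware, the database kernels need no modification, which is exactly the modularity advantage named on the previous slide.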
Slide 80
So far
[Figure: clients → replicated application server (Items, Preferences) → replicated database holding X, Y, Z]
Slide 81
Middle-tier caching
[Figure: clients → replicated application server, each replica with its own database cache (x=0; Items, Preferences) → single database holding X, Y, Z]
Slide 82
Advantages of Caching
• Avoid the DB becoming a bottleneck
• Reduced response time even if the DB is not the bottleneck
• Simpler than DB replication?
• Data is cached in the format of the application
Slide 83
An example with No Caching
[Figure: application server → DB; three requests (Display Customer ID=1, Display Customer ID=2, Display Customer ID=1 again); every request issues "select * from customer where id=…" against the DB: no caching!]
Slide 84
Caching
[Figure: application server with a cache in front of the DB]
Slide 85
An example with Caching
[Figure: the first "Display Customer ID=1" issues "select * from customer where id=1" and fills the cache; "Display Customer ID=2" also goes to the DB; the repeated "Display Customer ID=1" is answered from the cache]
Slide 86
Two-layered Caching
[Figure: application server with a first-level cache and a second-level cache in front of the DB]
Slide 87
Hibernate Caching
• First-level cache (session cache)
  – Caches objects within the current session
• Second-level cache
  – Responsible for caching objects across sessions
• Query cache
  – Responsible for caching queries and their results
Slide 88
Query Cache
• A query cache caches query results instead of objects
• The cache key is based on the query name and its parameters
• Always used in conjunction with the second-level cache
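The interplay between the two caches can be sketched as follows. This is my own toy model of the idea, not Hibernate's API: the query cache stores only result keys under (query name, parameters), while the objects themselves live in the object (second-level) cache:

```python
# Sketch of a query cache layered on an object cache. The query cache
# maps (query name, params) -> list of object ids; the objects are
# looked up in the object cache. Names are illustrative.

object_cache = {}    # id -> customer object (plays the second-level cache)
query_cache = {}     # (query name, params) -> list of ids

def run_query(name, params, execute_sql):
    key = (name, tuple(sorted(params.items())))   # name + parameters
    if key not in query_cache:
        rows = execute_sql(params)                # miss: hit the DB
        query_cache[key] = [r["id"] for r in rows]
        for r in rows:
            object_cache[r["id"]] = r             # fill the object cache
    return [object_cache[i] for i in query_cache[key]]

def fake_db(params):                              # stand-in for the real DB
    return [{"id": params["id"], "name": "Customer %d" % params["id"]}]

first = run_query("customer.byId", {"id": 1}, fake_db)
again = run_query("customer.byId", {"id": 1}, fake_db)  # no DB access
assert first == again
assert ("customer.byId", (("id", 1),)) in query_cache
```

Storing ids rather than objects is why the query cache only makes sense together with the second-level cache: a query-cache hit would otherwise still need the DB to materialize the objects.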
Slide 89
Hibernate Caching
[Figure: the Session acts as the first-level cache; the second-level cache consists of a CacheConcurrencyStrategy and a QueryCache on top of the cache implementation (physical cache regions), supplied by a CacheProvider]
Slide 90
An example with Caching
[Figure: "Display Customer ID=1" loads Customer #1 through the first- and second-level caches via "select * from customer where id=1"; "Display Customer ID=2" loads Customer #2; the repeated "Display Customer ID=1" is served from the cache]
Slide 91
An example with Distributed ASs
[Figure: three application servers, each with its own dedicated cache; Customer ID=1 is cached at one AS and Customer ID=2 at another, so a request arriving at a different AS misses its cache]
Slide 92
Approach #1
• A separate "shared" cache instead of dedicated caches for each AS
[Figure: three ASs sharing one cache in front of the DB]
Slide 93
Approach #2
• Two-level cache: dedicated + shared caches
[Figure: each AS has a dedicated cache, plus one shared cache in front of the DB]
Slide 94
Approach #3
• Distributed cooperative cache
[Figure: each AS has its own cache; the caches cooperate with each other in front of the DB]
Slide 95
Flowchart: Example Cache Get
• Get(key)
• isRemotelyCached()?
  – Yes: getFromRemoteCache()
• Return the object or null
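The flowchart can be sketched as a toy cooperative cache with a directory store mapping object keys to owning caches (a single-process illustration; the method names follow the flowchart, the rest is my own):

```python
# Sketch of a cooperative cache get: local lookup first; on a local
# miss, consult the directory (key -> owning cache) and fetch from the
# remote peer; otherwise return None (a real miss would go to the DB).

class CoopCache:
    def __init__(self, name, directory, peers):
        self.name, self.local = name, {}
        self.directory, self.peers = directory, peers  # shared structures

    def put(self, key, obj):
        self.local[key] = obj
        self.directory[key] = self.name    # register ownership

    def is_remotely_cached(self, key):
        return key in self.directory and self.directory[key] != self.name

    def get_from_remote_cache(self, key):
        owner = self.peers[self.directory[key]]
        return owner.local[key]

    def get(self, key):
        if key in self.local:              # local hit
            return self.local[key]
        if self.is_remotely_cached(key):   # remote hit via the directory
            return self.get_from_remote_cache(key)
        return None                        # miss everywhere

peers, directory = {}, {}
a = CoopCache("A", directory, peers); peers["A"] = a
c = CoopCache("C", directory, peers); peers["C"] = c
a.put("Customer#1", {"id": 1})
assert c.get("Customer#1") == {"id": 1}    # fetched from location A
assert c.get("Customer#9") is None         # cached nowhere
```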
Slide 96
An Example
[Figure: three ASs with cooperative caches A, B, and C; each keeps a directory store mapping Customer#1 → A and Customer#2 → B; cache A holds the Customer#1 object and cache B the Customer#2 object; a "Display Customer ID=1" arriving at cache C consults the directory, fetches the remote object Customer#1 from location A, and returns the object with customer key 1]
Slide 97
WAN
[Figure: client → AppSrv → DBMS]
Slide 98
Edge Computing
[Figure: clients talk to edge application servers; the edge AppSrv reaches the backend server via EJB, and the backend server reaches the DBMS via JDBC]
Slide 99
Advantages
• Load reduction on the server side
• Network bandwidth
• Latency
Slide 100
Details
[Figure: edge AppSrv: application logic on top of a replicated EJB home (transient home (TH), SLI home) with cache-miss/OCC logic; backend server: persistent home (PH), JDBC, transaction]
Slide 104
Last note
• Issues
  – Fault-tolerance
  – Scalability
  – Load-balancing
  – Performance
    • Throughput --> scalability
    • Response time
  – Correctness