e-commerce architectures and technologies
![Page 1: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/1.jpg)
E-Commerce Architectures and Technologies
Rob Oshana
Southern Methodist University
![Page 2: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/2.jpg)
Modeling Contention for Software Servers
![Page 3: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/3.jpg)
Review of overhead factors
• Processors
• I/O devices
• Routers
• LAN segments
• Also threads of a server
• Database locks
• Semaphores
![Page 4: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/4.jpg)
A Simple Example
• Web server with m threads
• Requests handled directly by available thread or queued
• Executing threads need to use the CPU and I/O and may also be queued
![Page 5: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/5.jpg)
Example of Contention for Software Threads
[Figure: incoming requests wait in a queue for one of the m HTTP server threads (1, 2, …, m); executing threads then use the CPU and disk.]
![Page 6: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/6.jpg)
Total response time for a Web request
• Software contention; time spent by a request waiting for a software resource (semaphore, DB lock)
• Hardware contention; time spent by a request waiting for a hardware resource (CPU, I/O device)
• Use of hardware resources; time spent using hardware resources
![Page 7: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/7.jpg)
Example
• HTTP server with five threads
• Request requires 0.050 sec CPU
• Request requires 0.065 sec I/O time
• No limit on size of queue
• What is the impact of contention for threads as arrival rate increases?
![Page 8: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/8.jpg)
Response time and Waiting time for Threads (Unlimited Queue)
![Page 9: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/9.jpg)
Example
• For arrival rate = 12 requests/sec, thread waiting time is 0.194 sec
• Average # requests waiting for a thread (Little's Law) = 12 X 0.194 = 2.33
• Request execution time = response time – thread waiting time; for 12 requests/sec = 0.487 – 0.194 = 0.293 sec
• Time spent waiting for resources (CPU, I/O) = execution time – service demand = 0.293 – (0.050 + 0.065) = 0.293 – 0.115 = 0.178 sec
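The decomposition on this slide can be checked with a few lines of arithmetic. The sketch below only reuses the figures quoted for the five-thread server at 12 requests/sec; the variable names are illustrative and not part of the original material.

```python
# Figures quoted in the example (five threads, 12 requests/sec).
arrival_rate = 12.0     # requests per second
thread_wait = 0.194     # sec spent waiting for a free thread
response_time = 0.487   # sec, total response time
cpu_demand = 0.050      # sec of CPU per request
io_demand = 0.065       # sec of I/O per request

# Little's Law: average number waiting = arrival rate x average waiting time.
avg_waiting_for_thread = arrival_rate * thread_wait

# Execution time is what remains once thread waiting is removed.
execution_time = response_time - thread_wait

# Execution time = time queued for hardware + pure service demand.
service_demand = cpu_demand + io_demand
hardware_wait = execution_time - service_demand

print(f"requests waiting for a thread: {avg_waiting_for_thread:.2f}")
print(f"execution time: {execution_time:.3f} sec")
print(f"time waiting for CPU and I/O: {hardware_wait:.3f} sec")
```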
![Page 10: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/10.jpg)
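Contention for Server Threads with Finite Queue
[Figure: requests arrive at rate Λ. If the current queue size k equals the maximum queue size J, the request is rejected; otherwise it joins the queue for the m HTTP server threads (1, 2, …, m). Accepted traffic is Λ (1 – P reject).]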
![Page 11: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/11.jpg)
Response time and Waiting time for Threads (Limited Queue)
![Page 12: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/12.jpg)
Rejection Probability
• Throughput = Λ X ( 1 – P reject)
• P reject = probability that a request is rejected
• Rejection probability with Λ = 12
• Decreases very fast with increase in queue length
![Page 13: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/13.jpg)
Rejection Probability
| Maximum queue size | Reject probability |
|---|---|
| 1 | 0.1027 |
| 3 | 0.0605 |
| 5 | 0.0345 |
| 7 | 0.0227 |
| 9 | 0.0159 |
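As a rough illustration, the throughput formula Λ X (1 – P reject) can be applied to each row of the rejection table for Λ = 12. The function name below is illustrative only.

```python
arrival_rate = 12.0  # requests/sec, the rate used for the rejection table

# Rejection probabilities from the table, keyed by maximum queue size.
reject_probability = {1: 0.1027, 3: 0.0605, 5: 0.0345, 7: 0.0227, 9: 0.0159}

def throughput(rate, p_reject):
    """Accepted requests per second: arrival rate x (1 - P_reject)."""
    return rate * (1.0 - p_reject)

throughputs = {q: throughput(arrival_rate, p) for q, p in reject_probability.items()}

for queue_size, x in throughputs.items():
    print(f"max queue {queue_size}: {x:.2f} requests/sec accepted")
```

A longer queue rejects fewer requests, so throughput approaches the arrival rate, at the cost of longer waits for the requests that are accepted.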
![Page 14: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/14.jpg)
Contention for Software in E-Business Sites
• Web server (WS) is multithreaded (m threads)
• Application server (AS) has n threads
• Database server (DS) has p threads
• Queue for WS limited (requests may be rejected)
• Requests sent to AS and/or DS and are queued there
![Page 15: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/15.jpg)
S/W and H/W Queues
[Figure: software queues for the WS, AS, and DS thread pools, each backed by hardware queues for its CPU and disk; requests that cannot enter the WS queue are rejected.]
![Page 16: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/16.jpg)
Contention for Software in E-Business Sites
• ResponseTime = SoftwareContention + ExecutionTime
• SoftwareContention = Wait(WS) + Wait(AS) + Wait(DS)
• ExecutionTime = HardwareContention + TotalDemand
• HardwareContention = HdwWait(WS) + HdwWait(AS) + HdwWait(DS)
• TotalDemand = Demand(WS) + Demand(AS) + Demand(DS)
![Page 17: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/17.jpg)
Example
• E-business site with max queue size for WS = 50 requests
• Parameters given below
| Server type | Number of threads | D(CPU) in seconds | D(I/O) in seconds |
|---|---|---|---|
| WS | 15 | 0.010 | 0.015 |
| AS | 10 | 0.012 | 0.020 |
| DS | 5 | 0.020 | 0.030 |
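A minimal sketch of the TotalDemand term for these parameters, assuming that Demand(server) is the sum of the CPU and I/O demands listed for that server (an assumption; the slides do not spell this out). The dictionary layout is illustrative.

```python
# Parameters from the example: threads, CPU demand, and I/O demand per server.
servers = {
    "WS": {"threads": 15, "cpu": 0.010, "io": 0.015},
    "AS": {"threads": 10, "cpu": 0.012, "io": 0.020},
    "DS": {"threads": 5, "cpu": 0.020, "io": 0.030},
}

# Assumed: Demand(server) = D(CPU) + D(I/O) for that server.
demand = {name: s["cpu"] + s["io"] for name, s in servers.items()}

# TotalDemand = Demand(WS) + Demand(AS) + Demand(DS)
total_demand = sum(demand.values())

for name, d in demand.items():
    print(f"Demand({name}) = {d:.3f} sec")
print(f"TotalDemand = {total_demand:.3f} sec")
```

TotalDemand is the floor on response time; everything above it comes from hardware and software contention.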
![Page 18: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/18.jpg)
Example
• Simulation results next page
• Software contention, execution time, and hardware contention grow at the beginning with arrival rate and then saturate when queue is filled
• Hardware contention is largest component of execution time
![Page 19: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/19.jpg)
Example
• Nsite, average number of requests at the e-business site
• Model shows that for Λ =12, Nsite = 59.7 and response time = 5.92
• Nsite = Throughput X ResponseTime = Λ X (1 – Preject) X ResponseTime
• Preject = 1 – Nsite / (Λ X ResponseTime) = 1 – 59.7 / (12 X 5.92) = 0.16
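The same rearrangement of Little's Law can be written as a small helper. This is a sketch using only the numbers quoted on the slide; the function name is illustrative.

```python
def rejection_probability(n_site, arrival_rate, response_time):
    """Solve Nsite = arrival_rate * (1 - P_reject) * response_time for P_reject."""
    throughput = n_site / response_time      # Little's Law: X = N / R
    return 1.0 - throughput / arrival_rate   # fraction of arrivals not served

p_reject = rejection_probability(n_site=59.7, arrival_rate=12.0, response_time=5.92)
print(f"P_reject = {p_reject:.2f}")
```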
![Page 20: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/20.jpg)
Simultaneous Resource Contention
• Simultaneous resource possession; request to simultaneously hold more than one resource
• Can be modeled using hardware and software resources
![Page 21: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/21.jpg)
Simultaneous Resource Possession of S/W, H/W Resources
![Page 22: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/22.jpg)
Method of Layers
• The multi-tier architecture of e-business sites makes them suitable for modeling with multiple layers
  – Layered Queuing Networks (LQNs)
  – Good for representing the hardware and software hierarchy in e-business sites
• With an LQN, processes with similar behavior form a group or a class of processes
![Page 23: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/23.jpg)
Example of LQN
• WS running on a machine of its own
• AS and DS share another machine
• AS uses disk 2, DS uses disks 3, 4
• WS threads are at level 1 of the LQN and request service from CPU 1, disk 1, and the AS threads, which are at level 2
• AS server threads use disk 2 and the DS threads at level 3
• DS server threads use CPU 2 and disks 3 and 4, which are at level 4
![Page 24: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/24.jpg)
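LQN Model for an E-Business Site
[Figure: layered queuing network (levels 1 to 4) with Web server threads at level 1, app server threads at level 2, and DB server threads at level 3, together with the hardware resources CPU 1, disk 1, disk 2, CPU 2, disk 3, and disk 4.]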
![Page 25: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/25.jpg)
Analytic Techniques
• Based on Mean Value Analysis
• 1. Method of Layers (MOL)
  – Iterative technique, decompose LQN into sequence of 2 level QN submodels
• 2. Stochastic Rendezvous Networks (SRN)
  – Iterative algorithm that begins by assuming no H/W, S/W contention
![Page 26: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/26.jpg)
Characterizing E-Business Workloads
![Page 27: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/27.jpg)
Introduction
• Demonstrate how customer behavior model graphs (CBMGs) and customer visit models (CVMs) can be obtained from HTTP logs
• Describe methods based on clustering analysis to derive small groups of CBMGs or CVMs that accurately reflect the workload
• Show how parameters can be obtained from the customer behavior model
![Page 28: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/28.jpg)
Workload Characterization of Web Traffic
• Suppose a web site receives 1800 requests for files during a 5 minute period, spread over 12 unique files
• By Zipf's Law, the i-th most popular file receives about k / i accesses, so 1800 = k X (1/1 + 1/2 + … + 1/12) = k X 3.1032
• k = 1800 / 3.1032 = 580.05
• Estimated number of accesses to the most popular file is k/1 = 580; to the least popular file it is k/12 = 580.05/12 = 48
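The Zipf estimate above can be reproduced in a few lines. This is a sketch under the same assumption as the slide (accesses to the i-th most popular file are proportional to 1/i); the function name is illustrative.

```python
def zipf_estimates(total_requests, num_files):
    """Return (k, accesses) where accesses[i-1] = k / i and the estimates sum to total_requests."""
    harmonic = sum(1.0 / i for i in range(1, num_files + 1))
    k = total_requests / harmonic
    return k, [k / i for i in range(1, num_files + 1)]

k, accesses = zipf_estimates(total_requests=1800, num_files=12)

print(f"k = {k:.2f}")
print(f"most popular file: about {accesses[0]:.0f} accesses")
print(f"least popular file: about {accesses[-1]:.0f} accesses")
```

The value of k differs from the slide's 580.05 only in the second decimal place, because the slide rounds the harmonic sum to 3.1032 before dividing.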
![Page 29: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/29.jpg)
Example of Zipf’s Law
![Page 30: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/30.jpg)
Heavy-Tailed Distribution
• A heavy-tailed distribution implies that the probability that a large value occurs is small but non-negligible
• Web traffic features that are found to be heavy tailed
  – Size of files requested from Web servers
  – Number of pages requested per site
  – Reading time per page
![Page 31: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/31.jpg)
Characterizing Customer Behavior
• CBMG can be used to capture the navigational pattern of a customer through an e-commerce site
  – Transitional aspect
    • How a customer moves between states
    • Matrix of transition probabilities
  – Temporal aspect
    • The time it takes to move between states
    • “Server perceived” think time: average time elapsed since a server completes a request for a customer until it receives the next request from the same customer during the same session
![Page 32: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/32.jpg)
Browser side and Server side think times
[Figure: timeline of request i and request i+1 between browser and server, with times t1, t2, t3 and the intervals nt, Rs, Zs, and Zb marked.]
nt = network time, Zs = server side think time, Zb = browser side think time, Rs = server response time
![Page 33: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/33.jpg)
Characterizing Customer Behavior
• Server side think time Zs = t3 – t1 = 2 X nt + Zb
• A think time can be associated with each transition in the CBMG
• Describe a CBMG as a pair (P, Z): P = [Pi,j] is an n X n matrix of transition probabilities, and Z = [Zi,j] is an n X n matrix of average think times between CBMG states
![Page 34: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/34.jpg)
Example CBMG
[Figure: CBMG with the states entry, browse, search, select, add to cart, and pay (numbered 1 to 6), with arcs labeled by transition probabilities.]
![Page 35: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/35.jpg)
Example
• Vadd = Vselect X 0.2
• Vbrowse = Vsearch X 0.2 + Vselect X 0.3 + Vadd X 0.25 + Vbrowse X 0.3 + Ventry X 0.5
• In general: Vj = Σ Vk X pkj (k = 1..n-1), where Vj is the average number of visits to state j per session and pkj is the probability that a customer makes a transition from state k to state j
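The visit-ratio equations form a linear system that can be solved by simple fixed-point iteration. The sketch below uses a small hypothetical CBMG (entry, browse, search, exit) with made-up transition probabilities; it is not the CBMG from the slide, whose full transition matrix is not reproduced in this transcript.

```python
def visit_ratios(states, p, entry="entry", tol=1e-12, max_iter=10_000):
    """Iterate V_j = sum_k V_k * p[k][j] with V_entry fixed at 1 until it settles."""
    v = {s: 0.0 for s in states}
    v[entry] = 1.0
    for _ in range(max_iter):
        new_v = {}
        for j in states:
            if j == entry:
                new_v[j] = 1.0  # every session starts exactly once
            else:
                new_v[j] = sum(v[k] * p.get(k, {}).get(j, 0.0) for k in states)
        if max(abs(new_v[s] - v[s]) for s in states) < tol:
            return new_v
        v = new_v
    raise RuntimeError("visit ratios did not converge")

# Hypothetical transition probabilities p[k][j], k -> j (each row sums to 1).
states = ["entry", "browse", "search", "exit"]
p = {
    "entry": {"browse": 0.5, "search": 0.5},
    "browse": {"browse": 0.3, "search": 0.2, "exit": 0.5},
    "search": {"browse": 0.2, "search": 0.4, "exit": 0.4},
}

v = visit_ratios(states, p)
session_length = v["browse"] + v["search"]  # sum over states other than entry and exit
print({s: round(x, 3) for s, x in v.items()}, round(session_length, 3))
```

A useful sanity check is that the exit state is visited exactly once per session, which holds here.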
![Page 36: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/36.jpg)
Example
• AverageSessionLength = Σ Vj for j = 2..n-1
• For example, AverageSessionLength = Vbrowse + Vsearch + Vselect + Vadd + Vpay
• = 2.498 + 4.413 + 1.324 + 0.265 + 0.053 = 8.552
![Page 37: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/37.jpg)
From HTTP logs to CBMGs
• We can obtain CBMG data from HTTP logs
• Can group small clusters of CBMG to determine behavior (stratification)
• Logs can be merged and filtered using time stamps to help in the merge
![Page 38: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/38.jpg)
Data recorded in the log
• UserID; identification of the customer (using cookies, dynamic URLs and other authentication mechanisms)
• RequestType; GET on the home page, GET on another page, search request, etc
• RequestTime; time request arrived at the site
• ExecTime; not normally recorded, execution time of the request
![Page 39: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/39.jpg)
Customer Behavior Characterization Methodology
[Figure: HTTP logs → merge and filter → request log → get sessions → session log → get CBMGs → CBMGs.]
![Page 40: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/40.jpg)
GetSessions Algorithm
• For a given session, there are three transitions between states s and t
• Think times are 20, 45, 38 sec resp.
• Cs,t = 3, Ws,t = 20 + 45 + 38 = 103 sec
• Cs,t = nXn matrix of transition counts
• Ws,t = nXn matrix of think times
![Page 41: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/41.jpg)
Basics of GetSessions
• Sort request log by UserID in order of time
• Separate into sessions using a session threshold time (30 minutes)
• For each session form the C and W matrices (transitions and think times)
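The three steps above can be sketched as follows. This is not the original GetSessions algorithm, only an illustration under stated assumptions: log entries are hypothetical `(user_id, state, request_time, exec_time)` tuples with times in seconds, and think time is measured from the completion of one request (`request_time + exec_time`) to the arrival of the next.

```python
from collections import defaultdict

SESSION_THRESHOLD = 30 * 60  # seconds; a longer gap starts a new session

def get_sessions(request_log, threshold=SESSION_THRESHOLD):
    """Return a list of (C, W) pairs, one per session.

    C[(s, t)] counts transitions from state s to state t;
    W[(s, t)] accumulates the think times of those transitions.
    """
    by_user = defaultdict(list)
    for user_id, state, request_time, exec_time in request_log:
        by_user[user_id].append((request_time, state, exec_time))

    sessions = []
    for user_id in sorted(by_user):
        counts, think = defaultdict(int), defaultdict(float)
        prev = None
        for request_time, state, exec_time in sorted(by_user[user_id]):
            if prev is not None:
                prev_time, prev_state, prev_exec = prev
                if request_time - prev_time > threshold:
                    # Gap too long: close this session and start a new one.
                    sessions.append((dict(counts), dict(think)))
                    counts, think = defaultdict(int), defaultdict(float)
                else:
                    counts[(prev_state, state)] += 1
                    think[(prev_state, state)] += request_time - (prev_time + prev_exec)
            prev = (request_time, state, exec_time)
        sessions.append((dict(counts), dict(think)))
    return sessions

# Hypothetical log: u1 makes three s -> t transitions with think times 20, 45, 38 sec;
# u2 returns after more than 30 minutes, which splits the visit into two sessions.
log = [
    ("u1", "s", 0, 0), ("u1", "t", 20, 0), ("u1", "s", 30, 0),
    ("u1", "t", 75, 0), ("u1", "s", 80, 0), ("u1", "t", 118, 0),
    ("u2", "home", 0, 0), ("u2", "search", 10, 0), ("u2", "home", 4000, 0),
]
sessions = get_sessions(log)
C, W = sessions[0]
print(len(sessions), C[("s", "t")], W[("s", "t")])
```

Storing C and W as dictionaries keyed by state pairs is a design choice for sparse data; an n X n array works equally well when the number of states is small.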
![Page 42: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/42.jpg)
Basics of GetSessions
• Precision of time needs to be relevant to processor speed, etc
• May want to clean the log from crawler activity
![Page 43: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/43.jpg)
GetCBMGs algorithm
• Must perform a clustering analysis on the data
  – Creates a synthetic workload composed of a relatively small number of CBMGs
• Centroid of the cluster determines the CBMG characteristics
![Page 44: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/44.jpg)
Example
• HTTP log run through GetSessions produces 20,000 sessions out of 340,000 lines in the request log
• Six clusters identified
• Buy to visit ratio (BV) represents the % of customers who buy from the store
• Session length is the average # of shopper operations requested by a customer for each visit to the store
• Va is the Add to Shopping Cart visit ratio (avg # of times a customer adds an item to the shopping cart)
![Page 45: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/45.jpg)
Example
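| Cluster | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| % of the sessions | 44.28 | 28 | 10.6 | 9.29 | 6.2 | 1.5 |
| BV ratio (%) | 5.7 | 4.5 | 4.7 | 4 | 3.5 | 2 |
| Session length | 5.6 | 15 | 27 | 28 | 50 | 81 |
| Va | 11 | 15 | 21 | 20 | 32 | 50 |
| Vb + Vs | 3.6 | 11.4 | 20 | 23 | 39 | 70 |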
![Page 46: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/46.jpg)
Conclusions from example
• Cluster 1: represents the largest share of the sessions (44.28%)
  – Very short average session length (5.6)
  – Highest % of customers that buy from the store
• Cluster 6: represents a small percentage of customers
  – Longest session length
  – Smallest buying ratio
![Page 47: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/47.jpg)
Buy to Visit Ratio vs Session Length
![Page 48: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/48.jpg)
Conclusions from example
• Pattern; the longer the session, the less likely it is for a customer to buy an item from the Web store
• The buy to visit ratio decreases in a quadratic fashion with the session length
![Page 49: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/49.jpg)
How many clusters to choose?
• How many clusters accurately represent the workload?
• Examine the variation in two metrics:
  – Average distance between points of a cluster and its centroid (intracluster distance)
  – Average distance between clusters (intercluster distance)
  – CV: coefficient of variation
![Page 50: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/50.jpg)
How many clusters to choose?
• Goal of clustering is to minimize the intracluster CV while maximizing the intercluster CV
  – If the # of clusters is made equal to the # of points, this will be achieved
  – But we want a compact representation, so we need to select a small number
![Page 51: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/51.jpg)
Intercluster and Intracluster Coefficients of Variation
![Page 52: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/52.jpg)
From HTTP logs to CVMs
• Sessions represented by a CVM instead of a CBMG can be obtained from an HTTP log through the algorithm GetCVMSessions
  – Group sessions into representative groups
  – Apply clustering techniques
  – Distance metric represents distance between two visit ratio vectors
![Page 53: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/53.jpg)
CVM with 12 Sessions
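| Session | Vbrowse | Vsearch | Vadd | Vselect | Vpay |
|---|---|---|---|---|---|
| 1 | 4 | 10 | 2 | 4 | 1 |
| 2 | 15 | 20 | 1 | 18 | 0 |
| 3 | 5 | 8 | 3 | 5 | 1 |
| 4 | 16 | 18 | 3 | 16 | 1 |
| 5 | 10 | 8 | 0 | 5 | 0 |
| 6 | 3 | 10 | 2 | 8 | 1 |
| 7 | 5 | 11 | 3 | 8 | 1 |
| 8 | 10 | 15 | 0 | 12 | 0 |
| 9 | 8 | 6 | 3 | 4 | 1 |
| 10 | 7 | 10 | 1 | 8 | 1 |
| 11 | 10 | 20 | 0 | 15 | 0 |
| 12 | 5 | 4 | 1 | 2 | 1 |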
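To illustrate how the twelve visit-ratio vectors might be grouped, the sketch below runs a plain k-means with Euclidean distance. Both the choice of k-means and the choice of k = 2 are assumptions made for illustration; the slides only say that clustering techniques are applied with a distance between visit ratio vectors.

```python
def kmeans(points, k, max_iter=100):
    """Plain k-means with squared Euclidean distance; the first k points seed the centroids."""
    centroids = [tuple(float(x) for x in p) for p in points[:k]]
    assignments = None
    for _ in range(max_iter):
        new_assignments = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        if new_assignments == assignments:
            break  # no session changed cluster: converged
        assignments = new_assignments
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignments, centroids

# (Vbrowse, Vsearch, Vadd, Vselect, Vpay) for sessions 1..12, from the table.
sessions = [
    (4, 10, 2, 4, 1), (15, 20, 1, 18, 0), (5, 8, 3, 5, 1), (16, 18, 3, 16, 1),
    (10, 8, 0, 5, 0), (3, 10, 2, 8, 1), (5, 11, 3, 8, 1), (10, 15, 0, 12, 0),
    (8, 6, 3, 4, 1), (7, 10, 1, 8, 1), (10, 20, 0, 15, 0), (5, 4, 1, 2, 1),
]

assignments, centroids = kmeans(sessions, k=2)
for c, centroid in enumerate(centroids):
    members = [i + 1 for i, a in enumerate(assignments) if a == c]
    print(f"cluster {c}: sessions {members}, centroid {centroid}")
```

Each centroid is itself a visit ratio vector, so it can serve as the representative CVM for its cluster.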
![Page 54: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/54.jpg)
Characterizing the Workload at the Resource Level
• In order to perform capacity planning and sizing studies of an e-commerce site, the CBMG must be mapped from the workload characterization to the IT resources
• With each server in the client/server interaction diagram (CSID), we associate service demands at the various components (processors, disks) of the server
• To each arc of the CSID we associate service demands for the networks involved in the exchange of messages represented by the arc
![Page 55: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/55.jpg)
From CBMGs to IT Resources
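[Figure: the search node of a CBMG is mapped to the CSID for search, C → WS → AS → DS → AS → WS → C. Network service demands are attached to the arcs; CPU and disk service demands are attached to the servers.]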
![Page 56: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/56.jpg)
Example
• Assume characterization of an e-business site generates 2 CBMGs
  – Heavy buyers: customers who will buy from the site with higher probability
  – Occasional buyers: search more, buy less
• Look at the search function
• DS has one CPU and 2 disks
  – 0.006 sec service demand for CPU
  – 0.020 sec service demand for disk 1
  – 0.018 sec service demand for disk 2
![Page 57: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/57.jpg)
Example
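Database server: service demands (sec) per session for the search function

| CBMG type | Arrival rate (sessions/sec) | Vsearch | CPU | Disk 1 | Disk 2 |
|---|---|---|---|---|---|
| Heavy buyers | 0.2 | 2.71 | 0.0163 | 0.0542 | 0.0488 |
| Occasional buyers | 0.8 | 6.76 | 0.0406 | 0.1352 | 0.1217 |

| Utilizations | CPU | Disk 1 | Disk 2 |
|---|---|---|---|
| Heavy buyers | 0.0088 | 0.0294 | 0.0264 |
| Occasional buyers | 0.2193 | 0.7312 | 0.6580 |
| Total utilization | 0.2282 | 0.7605 | 0.6845 |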
![Page 58: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/58.jpg)
Example
• What is the service demand per session for Search functions at each component of the DS for each CBMG?
• What is the utilization of each resource of the DS due to the Search function?
![Page 59: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/59.jpg)
Example
• CBMG for occasional buyers.
• Each session of this type executes 6.76 searches on average
• Each search used 0.006 sec of CPU at the DS
• DCPU,OccasionalBuyers (Search) = 6.76 X 0.006 = 0.0406 sec
![Page 60: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/60.jpg)
Example
• In general, the service demand at a resource i due to sessions of type r (heavy or occasional buyers) is:
• Di,r(f) = Vf,r X Di(f)
• Vf,r = avg # of executions of function f per session of type r
• Di(f) = service demand of a single execution of function f at resource i
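The per-session demand formula Di,r(f) = Vf,r X Di(f) can be applied to every resource of the database server at once. A minimal sketch using the search demands and visit ratios given in the example; the function and variable names are illustrative.

```python
# Service demand of a single search at each DS resource (sec).
search_demand = {"CPU": 0.006, "Disk 1": 0.020, "Disk 2": 0.018}

# Average number of searches per session for each CBMG type (Vsearch).
searches_per_session = {"Heavy buyers": 2.71, "Occasional buyers": 6.76}

def session_demands(visits, per_execution):
    """D[r][i] = V[r] * D_i: demand per session of type r at resource i."""
    return {
        r: {i: v * d for i, d in per_execution.items()}
        for r, v in visits.items()
    }

demands = session_demands(searches_per_session, search_demand)
for session_type, per_resource in demands.items():
    formatted = ", ".join(f"{i}: {d:.4f}" for i, d in per_resource.items())
    print(f"{session_type}: {formatted}")
```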
![Page 61: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/61.jpg)
Example
• Compute the utilizations
• Ui,r(f) = Di,r(f) X Λ r(f)
• Ui,r(f) is the utilization of resource i due to the execution of function f for sessions of type r
![Page 62: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/62.jpg)
Example
• ΛOccasionalBuyers(Search) = 0.8 X 6.76 = 5.408 searches/sec
• UCPU,OccasionalBuyers(Search) = 0.0406 X 5.408 = 0.2193 = 21.93%
![Page 64: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/64.jpg)
E-Business Benchmarks: TPC-W
• Accurate workload characterizations can be used to build benchmark suites
  – Use to evaluate/compare competing systems
• Several workload generators exist
• Transaction Processing Performance Council (TPC) releases TPC-W
  – First benchmark aimed at evaluating sites that support e-business activities
![Page 65: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/65.jpg)
TPC-W Business Model
• B2C e-tailer that sells products and services over the internet
  – Browse through selected products
  – Search information
  – Place an order, etc
• DB of products as well as customers
  – Size of catalog is major scalability parameter
  – Choose between 1000, 10000, 100000, …
![Page 66: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/66.jpg)
TPC-W Customer Behavior Model
• Activity driven by emulated browsers (EBs)
• Generate web interactions that represent complete cycle
• EBs engage in user sessions
• Web interaction categories:
  – Browse
  – Order
  – Browsing mix
  – Shopping mix
  – Ordering mix
![Page 67: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/67.jpg)
CBMG for TPC-W
[Figure: CBMG for TPC-W with the states Entry, Home, Search, Select, Browse, Product Detail, Shopping Cart, Login, Buy Request, and Buy Confirm.]
![Page 68: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/68.jpg)
TPC-W Performance Metrics
• Throughput metric
  – WIPS (Web Interactions Per Second), where all sessions are of the shopping type
  – WIPSb: WIPS where all sessions are of the browsing type
  – WIPSo: WIPS where all sessions are of the ordering type
• Cost/throughput metric: ratio between the total cost of the system under test and the # of WIPS measured during a shopping interval
  – Includes purchase and maintenance costs for all hardware and software components of the system
![Page 69: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/69.jpg)
Preparing E-Business for Waves of Demand
![Page 70: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/70.jpg)
Customer Demand and Workload
• Companies must meet customer expectations
• Demand “always on” service
• Internet offers low switching costs!
• Must be able to forecast customer demand
• Must anticipate traffic bursts
![Page 71: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/71.jpg)
Customer Demand and Workload
• Customer demand generates workload to e-business sites
• Customer demands translate to system workload
• Must understand why demands change
![Page 72: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/72.jpg)
Revisiting the Reference Model
[Figure: reference model with four layers (business model, functional model, customer model, resource model), a business view and a technological view, and inputs from external events, internal events, special events, and logs and measurement data.]
![Page 73: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/73.jpg)
Customer Demand and Workload
• Examples of decisions or plans set at the business model layer
  – TV campaign
  – Launch of a new product
  – Low price offerings
  – New security policy
  – Special plans for events (Xmas)
![Page 74: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/74.jpg)
Customer Demand and Workload
• Changes in demand at the functional layer
  – New functionality
  – New features
• Navigational structure may change
• In general, new software systems demand additional resources from servers, disks, and networks
![Page 75: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/75.jpg)
Traffic Bursts
• Web traffic is bursty
• Reasons for traffic bursts
  – Unpredictable news events
  – Predictable news events
  – Product or service announcement
  – Special events
![Page 76: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/76.jpg)
Traffic Volume to an E-Tailer Site
![Page 77: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/77.jpg)
High Variability
• Bursts refer to the random arrival of requests
• Peak rates exceed the average rates by 8 to 10 times
• Peak traffic ratio; ratio between peak site and average site traffic
• Significant amount of short sessions and a small number of very long sessions
  – Heavy tailed distribution
![Page 78: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/78.jpg)
Traffic Patterns in E-Business
• Analysis of traffic behavior is very useful for predictive purposes
• Visual representation of traffic to an e-commerce site helps provide insight into the patterns of interaction between customers and the online business
  – Traffic pattern analysis
![Page 79: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/79.jpg)
Customer Sessions Over Time
![Page 80: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/80.jpg)
Forecasting Strategies
• Weak relationship between future and past experience
  – Access paradigms change constantly
• Forecasting methods help
  – Quantitative: must have historical data
  – Qualitative: subjective, based on market surveys, judgment, intuition, business plans, expert opinions
![Page 81: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/81.jpg)
A Forecast Strategy Model
[Figure: information is collected from historical data, logs, and measurements and from market surveys, judgment, and technology forecasting; quantitative and qualitative forecasting techniques combine it with business scenarios to forecast demand and workloads.]
![Page 82: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/82.jpg)
Historical Data Patterns
![Page 83: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/83.jpg)
Regression Methods
• Regression models are used to estimate the value of a variable as a function of other variables
  – Predicted variable is the dependent variable
  – Variables used to forecast the value are the independent variables
  – Relationship can be linear or quadratic
![Page 84: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/84.jpg)
Moving Averages
• Makes the value to be forecast for the next period equal to the average of a number of previous observations
• f_{t+1} = (y_t + y_{t-1} + y_{t-2} + … + y_{t-n+1}) / n
• f_{t+1} = forecast value for period t+1
• y_t = actual value (observation) at time t
• n is the number of observations used
![Page 85: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/85.jpg)
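Example
• IT staff of e-tailer monitors site traffic

| Week number | Peak traffic ratio |
|---|---|
| 1 | 13.5 |
| 2 | 16.3 |
| 3 | 19.9 |
| 4 | 14.8 |
| 5 | 12.6 |
| 6 | 13.2 |
| 7 | 17.1 |
| 8 | 15.7 |

• Three-week moving average forecast for week 9: F = (13.2 + 17.1 + 15.7) / 3 = 15.3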
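A minimal sketch of the moving-average forecast applied to the eight weekly peak traffic ratios, with a window of the three most recent weeks as in the example. The function name is illustrative.

```python
def moving_average_forecast(observations, n):
    """Forecast the next period as the mean of the last n observations."""
    if n <= 0 or n > len(observations):
        raise ValueError("n must be between 1 and the number of observations")
    return sum(observations[-n:]) / n

# Peak traffic ratio for weeks 1..8, from the table.
peak_traffic_ratio = [13.5, 16.3, 19.9, 14.8, 12.6, 13.2, 17.1, 15.7]

forecast = moving_average_forecast(peak_traffic_ratio, n=3)
print(f"forecast for week 9: {forecast:.1f}")
```

A larger window smooths out bursts more strongly but reacts more slowly to a real shift in traffic.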
![Page 86: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/86.jpg)
Exponential Smoothing
• Used for non-seasonal data showing no systematic trend
• Uses a weighted average of past observations to forecast a value for the next period
• Place more weight on more recent observations
  – Latest observations give a better indication of the future
![Page 87: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/87.jpg)
Exponential Smoothing
• f_{t+1} = f_t + α (y_t – f_t)
• f_{t+1} = forecast value for period t+1
• y_t = actual value (observation) at time t
• α = smoothing weight (0 < α < 1)
![Page 88: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/88.jpg)
Example
• Online toy store traffic monitored by research company
• Monthly avg # of visits of a customer to the store is 2.7
• Information from CBMG:
  – Avg BV is 1.87%
  – Avg customer session length is 5.91
  – Avg # of visits to home page is 1.21
  – Each visited page generates 1 transaction
  – Avg transactions/visit = 5.91 – 1.21 = 4.7
![Page 89: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/89.jpg)
Evolution of the Customer Base
| Month | Actual size of the customer base | Forecast (alpha = 0.6) |
|---|---|---|
| January | 354,000 | 354,000 |
| February | 327,000 | 354,000 |
| March | 318,000 | 337,800 |
| April | 356,000 | 325,920 |
| May | 304,000 | 343,968 |
| June | 352,000 | 319,987 |
![Page 90: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/90.jpg)
Example
• CIO wants to estimate total volume of transactions to be processed in July
• α = 0.6, estimated size of the customer base for July is then;
• f = 319,987 + 0.60 x (352,000 – 319,987) = 339,195
• Estimated # of monthly transactions:
• TotalNumberOfVisits = AvgVisitsPerCustomer X CustomerBase = 2.7 X 339,195 = 915,827
![Page 91: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/91.jpg)
Example
• TransactionsPerMonth = AvgTransactionsPerVisit X TotalNumberOfVisits = 4.7 X 915,827 = 4,304,387
• Estimated number of transactions for July is 4,304,387
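The whole July estimate can be reproduced end to end. The sketch below applies exponential smoothing with α = 0.6 to the January to June customer base and then multiplies through by visits per customer and transactions per visit. Seeding the first forecast with the January actual is an assumption that matches the January row of the table; the names are illustrative.

```python
def exponential_smoothing(observations, alpha, initial_forecast):
    """Return forecasts for every observed period plus one step ahead.

    forecasts[i] is the forecast for period i; forecasts[-1] is the next period.
    """
    forecasts = [initial_forecast]
    for y in observations:
        f = forecasts[-1]
        forecasts.append(f + alpha * (y - f))
    return forecasts

# Actual customer base, January..June, from the table.
customer_base = [354_000, 327_000, 318_000, 356_000, 304_000, 352_000]

forecasts = exponential_smoothing(customer_base, alpha=0.6, initial_forecast=customer_base[0])
july_customers = forecasts[-1]

visits_per_customer = 2.7       # monthly avg # of visits per customer
transactions_per_visit = 4.7    # 5.91 - 1.21

july_visits = visits_per_customer * july_customers
july_transactions = transactions_per_visit * july_visits

print(f"July customer base: {july_customers:,.0f}")
print(f"July visits: {july_visits:,.0f}")
print(f"July transactions: {july_transactions:,.0f}")
```

The slides round after each step, so their final figures can differ from this unrounded calculation by a few units.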
![Page 92: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/92.jpg)
Applying Forecasting Techniques
• Validate the selected technique on the data
  – Use part of the historical data to exercise the model
  – Compare the rest of the data to the forecast for accuracy
  – Test for Mean Square Error (MSE): look for the lowest
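One way to carry out this validation is a simple hold-out test. The sketch below reuses the customer base series from the earlier example, treats the first three months as the part used to exercise an exponential smoothing model, and scores one-step-ahead forecasts on the remaining months. The candidate α values and the three-month split are hypothetical choices for illustration.

```python
def one_step_forecasts(series, alpha):
    """Exponential-smoothing forecast for each period, seeded with the first value."""
    forecasts = [series[0]]
    for y in series[:-1]:
        forecasts.append(forecasts[-1] + alpha * (y - forecasts[-1]))
    return forecasts

def holdout_mse(series, alpha, train_size):
    """Mean square error of the forecasts on the periods after train_size."""
    forecasts = one_step_forecasts(series, alpha)
    errors = [
        (y - f) ** 2
        for y, f in zip(series[train_size:], forecasts[train_size:])
    ]
    return sum(errors) / len(errors)

customer_base = [354_000, 327_000, 318_000, 356_000, 304_000, 352_000]
candidates = [0.2, 0.4, 0.6, 0.8]  # hypothetical smoothing weights to compare

mse_by_alpha = {a: holdout_mse(customer_base, a, train_size=3) for a in candidates}
best_alpha = min(mse_by_alpha, key=mse_by_alpha.get)

for a, mse in mse_by_alpha.items():
    print(f"alpha = {a}: MSE = {mse:,.0f}")
print(f"lowest MSE at alpha = {best_alpha}")
```

With only six observations the comparison is noisy; in practice the same procedure would be run on a much longer history.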
![Page 93: E-Commerce Architectures and Technologies](https://reader036.vdocuments.us/reader036/viewer/2022062409/56814aee550346895db7fb21/html5/thumbnails/93.jpg)
Applying Forecasting Techniques
• Causal model: uses customer demands (arrival rate of the search function) as the independent variable and workload parameters (processor demand) as the dependent variable
  – A regression model can be used to estimate the future processor demand of a web-based catalog application as a function of the number of items existing in the catalog