Empirically-Based Staffing in Call Centers: Simple Models at the Service of Complex Realities
Sergey Zeltyn
Technion, Haifa, Israel
IBM Thomas J. Watson Research Center, October 3, 2006
Joint work with Professor Avi Mandelbaum
Outline
Subject/Flow of the Talk: Hierarchy of operational decisions in call centers, with emphasis on staffing.
Main message I: Simple Models can be excellent tools at the service of Complex Realities.
Supported by: Erlang-A Model and the QED operational regime applied to call center staffing.
Main message II: Empirically-Based Analysis is a Prerequisite for Research, Teaching and Practice of service operations.
Supported by: DataMOCCA – Data MOdel for Call Center Analysis.
Call Centers Industry
U.S. Statistics
• Over 60% of annual business volume via the telephone
• 70,000 – 200,000 call centers
• 3 – 6.5 million employees (3% – 6% of the workforce)
• 20% annual growth rate
• $100 – $300 billion annual expenditures
• 1000's of agents in a "single" call center.
Quality/Efficiency Tradeoff
• 65 – 80% personnel costs
• Over 90% of U.S. consumers form their image of a company via the call center experience.
Objective: Having the right number of appropriately skilled agents when needed.
Staffing Problem
Determination of load-dependent number of agents.
Prevailing method: SIPP (Stationary Independent Period-by-Period), a constant number of agents over each period (15, 30 or 60 min).
Agents and customers are assumed homogeneous. In particular, every agent can potentially serve every customer.
Main Approaches to Staffing
Constraint Satisfaction: find the minimal number of agents n∗ that satisfies some performance goal(s) (e.g. less than 3% abandonment). Prevalent in practice: Service-Level Agreements (SLA) in outsourcing.
Cost/Revenue Optimization: find n∗ that optimizes service revenues and costs of staffing, abandonment and waiting.
M/M/n+M (Erlang-A, Palm): Main Staffing Model
[Figure: Call Center, Schematic Representation: arrivals, ACD queue, agents 1, 2, …, n, abandonment, lost calls, retrials and returns.]
[Figure: Erlang-A Queue: arrivals at rate λ, queue with abandonment at rate θ, agents 1, 2, …, n serving at rate µ.]
Erlang-A Assumptions:
• λ – Poisson arrival rate
• µ – Exponential service rate
• n – number of service agents
• θ – Exponential individual abandonment rate
• No busy signals
• First Come First Served.
Erlang-A Model: Calculations
Performance measures can be calculated relatively easily.
4CallCenters - A Personal Tool for Workforce Management.
Based on the M.Sc. thesis of Ofer Garnett.
http://iew3.technion.ac.il/serveng2006S/4CallCenters/Downloads.htm
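To illustrate why the performance measures are relatively easy to calculate, here is a minimal sketch (not the 4CallCenters implementation) that solves the Erlang-A birth-death chain numerically. The function names erlang_a and min_agents, the truncation tolerance and the example rates are assumptions made for this illustration. In state j the death rate is min(j, n)·µ + max(j − n, 0)·θ, and the mean wait follows from Little's law.

```python
import math


def erlang_a(lam, mu, theta, n, tol=1e-12, max_states=100_000):
    """Steady-state measures of M/M/n+M from its birth-death chain.

    lam, mu, theta are rates per unit time (theta > 0); n is the number of agents.
    The state space is truncated once the stationary weights become negligible.
    """
    log_w = [0.0]          # log of unnormalised stationary weight of state 0
    peak = 0.0             # largest log-weight seen so far
    for j in range(1, max_states):
        death = min(j, n) * mu + max(j - n, 0) * theta
        log_w.append(log_w[-1] + math.log(lam / death))
        peak = max(peak, log_w[-1])
        # Weights are unimodal in j, so a negligible weight means a negligible tail.
        if j > n and log_w[-1] < peak + math.log(tol):
            break
    weights = [math.exp(x - peak) for x in log_w]
    total = sum(weights)
    pi = [w / total for w in weights]

    p_wait = sum(pi[n:])                                   # P{Wq > 0}, by PASTA
    mean_queue = sum((j - n) * p for j, p in enumerate(pi) if j > n)
    mean_wait = mean_queue / lam                           # E[Wq], Little's law
    p_abandon = theta * mean_wait                          # P{Ab} = theta * E[Wq]
    return {"p_wait": p_wait, "mean_wait": mean_wait, "p_abandon": p_abandon}


def min_agents(lam, mu, theta, max_abandon):
    """Constraint satisfaction: smallest n with P{Ab} <= max_abandon."""
    n = 1
    while erlang_a(lam, mu, theta, n)["p_abandon"] > max_abandon:
        n += 1
    return n


if __name__ == "__main__":
    # Hypothetical rates per minute: 10 calls, 1-minute services, 1-minute patience.
    print(min_agents(lam=10.0, mu=1.0, theta=1.0, max_abandon=0.03))
```

The search in min_agents relies on P{Ab} decreasing as agents are added; for cost/revenue optimization the same erlang_a output could be plugged into a cost function instead.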
Time Scale: Arrival to a Call Center in 1999
[Figure: Arrival Process, in 1999, at four time scales: Yearly (Strategic), Monthly (Tactical), Daily (Operational), Hourly (Stochastic).]
Queueing Science: Arrival to a Call Center in 1976
[Figure: Arrival Process, in 1976, at four time scales: Yearly, Monthly, Daily, Hourly. Source: E. S. Buffa, M. J. Cosgrove, and B. J. Luce, "An Integrated Work Shift Scheduling System".]
Service Times: Distribution and Psychology
Histogram of Service Times in an Israeli Call Center
[Figure: two histograms of service times (sec), January-October and November-December, titled "Beyond Data Averages: Short Service Times". Annotations: AVG 200, STD 249; AVG 185, STD 238; 7.2% short services; Log-Normal fit with AVG 200, STD 249.]
• Lognormal service times are prevalent in call centers
• 7.2% Short-Services: Agents' "Abandon" (improve bonus, rest)
• Distributions, not only Averages, must be measured.
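As a small illustration of why distributions matter beyond averages, the sketch below converts a reported average and standard deviation into the two parameters of a lognormal distribution by moment matching. The helper name lognormal_params is an assumption made for this example; the inputs AVG 200 and STD 249 are the figures quoted on the slide.

```python
import math


def lognormal_params(mean, std):
    """Moment matching: return (m, s) such that exp(N(m, s^2)) has the given mean and std."""
    cv_squared = (std / mean) ** 2          # squared coefficient of variation
    s_squared = math.log(1.0 + cv_squared)
    m = math.log(mean) - 0.5 * s_squared
    return m, math.sqrt(s_squared)


if __name__ == "__main__":
    m, s = lognormal_params(200.0, 249.0)
    median = math.exp(m)                    # median of a lognormal is exp(m)
    print(f"m = {m:.3f}, s = {s:.3f}, median = {median:.1f} sec")
```

With a coefficient of variation above 1, the median comes out well below the 200-second average, which is one reason an average alone describes such service times poorly.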
Erlang-A: Modelling (Im)Patience
• Patience Time: τ ∼ exp(θ). Time a customer is willing to wait for service.
• Offered Wait: V. Time a customer is required to wait for service (= waiting time of a customer with infinite patience).
• If τ ≤ V then the customer abandons, else the customer is served.
• Actual Wait: Wq = min(τ, V).
Patience data is censored: 2% abandoning implies 98% censored!
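The effect of censoring can be seen in a short simulation. The sketch below draws exponential patience τ and, purely as an assumption for illustration, an exponential offered wait V; only min(τ, V) and the abandonment flag are observed. For exponential patience the standard censored-data maximum-likelihood estimate of θ is the number of abandonments divided by the total waiting time, which coincides with the ratio P{Ab}/E[Wq]. All function names and parameter values here are hypothetical.

```python
import random


def simulate_calls(theta, mean_offered_wait, num_calls, seed=0):
    """Return observed (wait, abandoned) pairs with wait = min(patience, offered wait)."""
    rng = random.Random(seed)
    calls = []
    for _ in range(num_calls):
        patience = rng.expovariate(theta)                   # tau ~ exp(theta)
        offered = rng.expovariate(1.0 / mean_offered_wait)  # V, assumed exponential here
        calls.append((min(patience, offered), patience <= offered))
    return calls


def estimate_theta(calls):
    """Censored-data MLE for exponential patience: abandonments / total waiting time."""
    abandons = sum(1 for _, abandoned in calls if abandoned)
    total_wait = sum(wait for wait, _ in calls)
    return abandons / total_wait


def naive_mean_patience(calls):
    """Biased estimate: average wait of abandoning customers only."""
    waits = [wait for wait, abandoned in calls if abandoned]
    return sum(waits) / len(waits)


if __name__ == "__main__":
    calls = simulate_calls(theta=1 / 450, mean_offered_wait=10.0, num_calls=200_000)
    print("estimated mean patience:", 1 / estimate_theta(calls))
    print("naive mean patience:", naive_mean_patience(calls))
```

The naive average looks only at customers who gave up quickly and so badly underestimates patience, while the censored estimate also uses the waiting time of the served majority.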
Measuring Patience
[Figure: Hazard Rates of Patience in an Israeli Bank: Regular over VIP Customers.]
• VIP customers are more patient (needy).
• Why the peaks in abandonment? Announcements!
• Call-by-call data required to obtain this graph (+Uncensoring).
Estimating Patience: P{Ab} ∝ E[Wq] Relation
In queues with exp(θ) patience: P{Ab} = θ · E[Wq].
[Figure: Israeli Bank, Yearly Data. Two scatter plots, "Hourly Data" and "Aggregated": probability to abandon vs. average waiting time (sec).]
Graphs are based on 4158 hour intervals.
Estimate of mean patience: 250/0.55 ≈ 450 seconds.
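The slope reading above can be automated with a least-squares line through the origin. This is a minimal sketch; the function name fit_theta and the sample points are made up for illustration, and only the pair (250 sec, 0.55) is taken from the slide.

```python
def fit_theta(avg_waits, abandon_fractions):
    """Least-squares slope of P{Ab} against E[Wq] for a line through the origin."""
    numerator = sum(w * p for w, p in zip(avg_waits, abandon_fractions))
    denominator = sum(w * w for w in avg_waits)
    return numerator / denominator


if __name__ == "__main__":
    # Reading the slope off a single point of the aggregated plot, as on the slide.
    theta_from_plot = 0.55 / 250.0
    print("mean patience from the plot:", 1.0 / theta_from_plot, "sec")

    # Hypothetical (average wait in sec, fraction abandoning) pairs for several hours.
    waits = [20.0, 50.0, 100.0, 150.0, 250.0]
    fractions = [0.05, 0.10, 0.23, 0.32, 0.55]
    print("fitted theta:", fit_theta(waits, fractions), "per sec")
```

A fit through the origin matches the relation P{Ab} = θ · E[Wq], which has no intercept.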
Building block: Interrelation
Average Service Time over the Day – Israeli Bank
[Figure 12: Mean Service Time (Regular) vs. Time-of-day (95% CI) (n = 42613). Axes: Time of Day, Mean Service Time.]
Prevalent: Longest services at peak-loads (10:00, 15:00). Why?
Explanations:
• Prevalent: Service protocol different (longer) at congestion
• Operational: The needy abandon less during peak loads; hence the VIP remain on line, with their longer service times.
Erlang-A: Fitting a Simple Model to a Complex Reality
• Small Israeli bank (10 agents)
• Patience estimated via P{Ab} / E[Wq]
• Graphs: hourly performance vs. Erlang-A predictions, over 1 year, aggregating groups with 40 similar hours.
[Figure: three scatter plots of data vs. Erlang-A predictions: P{Ab} (probability to abandon), E[Wq] (waiting time, sec), P{Wq > 0} (probability of wait).]
Erlang-A: Fitting a Simple Model to a Complex Reality II
Large U.S. Bank
[Figure: two scatter plots of aggregated data vs. aggregated QED approximations: Retail, P{Wq > 0} (probability of wait); Telesales, E[Wq] (average wait, sec).]
Partial success: in some cases Erlang-A does not work well (Networking, SBR).
Large Israeli call center – underway.
Erlang-A: Fitting a Simple Model to a Complex Reality III
We have learned:
• Arrival process can be approximated by Poisson
• Service times are not exponential (typically close to lognormal)
• Patience times are not exponential (various patterns are observed).
Question: why does Erlang-A work with non-exponential patience and service times?
Answer: via study of operational regimes in call centers.
Operational Regimes: Motivating Example
Health Insurance. ACD Report.
Time    Calls   Answered  Abandoned%  ASA  AHT  Occ%    # of agents
Total   20,577  19,860    3.5%        30   307  95.1%
8:00    332     308       7.2%        27   302  87.1%   59.3
8:30    653     615       5.8%        58   293  96.1%   104.1
9:00    866     796       8.1%        63   308  97.1%   140.4
9:30    1,152   1,138     1.2%        28   303  90.8%   211.1
10:00   1,330   1,286     3.3%        22   307  98.4%   223.1
10:30   1,364   1,338     1.9%        33   296  99.0%   222.5
11:00   1,380   1,280     7.2%        34   306  98.2%   222.0
11:30   1,272   1,247     2.0%        44   298  94.6%   218.0
12:00   1,179   1,177     0.2%        1    306  91.6%   218.3
12:30   1,174   1,160     1.2%        10   302  95.5%   203.8
13:00   1,018   999       1.9%        9    314  95.4%   182.9
13:30   1,061   961       9.4%        67   306  100.0%  163.4
14:00   1,173   1,082     7.8%        78   313  99.5%   188.9
14:30   1,212   1,179     2.7%        23   304  96.6%   206.1
15:00   1,137   1,122     1.3%        15   320  96.9%   205.8
15:30   1,169   1,137     2.7%        17   311  97.1%   202.2
16:00   1,107   1,059     4.3%        46   315  99.2%   187.1
16:30   914     892       2.4%        22   307  95.2%   160.0
17:00   615     615       0.0%        2    328  83.0%   135.0
17:30   420     420       0.0%        0    328  73.8%   103.5
18:00   49      49        0.0%        14   180  84.2%   5.8
Efficiency-Driven (ED) Regime
Time Calls Answered Abandoned% ASA AHT Occ% # of agents
13:30 1,061 961 9.4% 67 306 100.0% 163.4
• 100% occupancy
• High P{Ab}
• Considerable waiting time
• P{Wq > 0} ≈ 1.
Offered load: $R_{ED} \triangleq \frac{\lambda}{\mu} = 1061 : \frac{1800}{306} = 180.37$.
Characterization: $n = R_{ED} \cdot (1 - \gamma)$, $\gamma > 0$.
Service grade: $\gamma = 1 - \frac{n}{R_{ED}} = 1 - \frac{163.4}{180.37} = 0.094 \approx P\{Ab\}$.
ED regime captured by Fluid-Model.
Quality-Driven (QD) Regime
Time Calls Answered Abandoned% ASA AHT Occ% # of agents
17:00 615 615 0.0% 2 328 83.0% 135.0
• Occupancy far below 100%
• Negligible P{Ab}
• Very short waiting time
• P{Wq > 0} ≈ 0.
Offered load: $R_{QD} = \frac{\lambda}{\mu} = 615 : \frac{1800}{328} = 112.07$.
Characterization: $n = R_{QD} \cdot (1 + \gamma)$, $\gamma > 0$.
Service grade: $\gamma = \frac{n}{R_{QD}} - 1 = \frac{135}{112.07} - 1 = 0.205$.
Quality and Efficiency-Driven (QED) Regime
Time Calls Answered Abandoned% ASA AHT Occ% # of agents
14:30 1,212 1,179 2.7% 23 304 96.6% 206.1
• High occupancy, but not 100%
• P{Wq > 0} ≈ α, 0 < α < 1.
• Small P{Ab} and waiting
Offered load: $R_{QED} = \frac{\lambda}{\mu} = 1212 : \frac{1800}{304} = 204.69$.
Characterization: $n = R_{QED} + \beta\sqrt{R_{QED}}$, $-\infty < \beta < \infty$.
Service grade: $\beta = \frac{n - R_{QED}}{\sqrt{R_{QED}}} = \frac{206.1 - 204.69}{\sqrt{204.69}} = 0.10$.
Square-Root Staffing Rule: Described by Erlang in 1924!
Awaited the seminal formulation of Halfin-Whitt in 1981.
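The three service grades computed on the ED, QD and QED slides can be reproduced from the ACD report rows with a few lines of code. This is a sketch; the function names are chosen for illustration, while the calls, AHT and agent figures are the 13:30, 17:00 and 14:30 rows of the report (half-hour intervals, AHT in seconds).

```python
import math

INTERVAL_SEC = 1800  # each ACD report row covers half an hour


def offered_load(calls, aht_sec, interval_sec=INTERVAL_SEC):
    """R = lambda / mu: seconds of work arriving per second."""
    return calls * aht_sec / interval_sec


def ed_grade(n, load):
    """gamma in n = R * (1 - gamma)."""
    return 1.0 - n / load


def qd_grade(n, load):
    """gamma in n = R * (1 + gamma)."""
    return n / load - 1.0


def qed_grade(n, load):
    """beta in n = R + beta * sqrt(R)."""
    return (n - load) / math.sqrt(load)


if __name__ == "__main__":
    r_ed = offered_load(1061, 306)    # 13:30 row
    r_qd = offered_load(615, 328)     # 17:00 row
    r_qed = offered_load(1212, 304)   # 14:30 row
    print(f"ED : R = {r_ed:.2f}, gamma = {ed_grade(163.4, r_ed):.3f}")
    print(f"QD : R = {r_qd:.2f}, gamma = {qd_grade(135.0, r_qd):.3f}")
    print(f"QED: R = {r_qed:.2f}, beta  = {qed_grade(206.1, r_qed):.2f}")
```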
Operational Regimes: Performance.
Assume that offered load R is not small (λ → ∞).
ED regime: $n \approx R - \delta R$, $0.1 \le \delta \le 0.25$.
• Essentially all customers are delayed
• %Abandoned ≈ δ (10-25%)
• Average wait ≈ 30 seconds - 2 minutes.
QD regime: $n \approx R + \gamma R$, $0.1 \le \gamma \le 0.25$. Essentially no delays.
QED regime: $n \approx R + \beta\sqrt{R}$, $-1 \le \beta \le 1$.
• %Delayed between 25% and 75%
• %Abandoned is 1-5%
• Average wait is one order of magnitude less than average service time (seconds vs. minutes).
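Read in the other direction, the three characterizations turn an offered load into a staffing level. The sketch below applies them period by period, in the spirit of SIPP. The function names and the chosen service grade β = 0.5 are assumptions for illustration; the call volumes and handling times are taken from three rows of the ACD report.

```python
import math


def offered_load(calls, aht_sec, interval_sec=1800):
    """R = lambda / mu for one period."""
    return calls * aht_sec / interval_sec


def ed_staffing(load, delta):
    """Efficiency-driven: n = R - delta * R."""
    return load * (1.0 - delta)


def qd_staffing(load, gamma):
    """Quality-driven: n = R + gamma * R."""
    return load * (1.0 + gamma)


def qed_staffing(load, beta):
    """Square-root rule: n = R + beta * sqrt(R)."""
    return load + beta * math.sqrt(load)


def sipp_plan(periods, beta):
    """Per-period QED staffing, rounded up to whole agents."""
    plan = []
    for label, calls, aht_sec in periods:
        load = offered_load(calls, aht_sec)
        plan.append((label, math.ceil(qed_staffing(load, beta))))
    return plan


if __name__ == "__main__":
    periods = [("13:30", 1061, 306), ("14:30", 1212, 304), ("17:00", 615, 328)]
    for label, agents in sipp_plan(periods, beta=0.5):
        print(label, agents)
```

Under the square-root rule the safety staffing β√R grows more slowly than the load itself, which is where the economies of scale of large call centers come from.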
Erlang-A Queue: QED Approximations
Assume that offered load R is not small (λ →∞).
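Let $\hat{\beta} = \beta\sqrt{\mu/\theta}$ and $h(\cdot) = \frac{\phi(\cdot)}{1 - \Phi(\cdot)}$ = hazard rate of N(0, 1).
• Delay probability:
$$P\{W_q > 0\} \sim \left[1 + \sqrt{\frac{\theta}{\mu}} \cdot \frac{h(\hat{\beta})}{h(-\beta)}\right]^{-1},$$
• Probability to abandon:
$$P\{Ab \mid W_q > 0\} = \frac{1}{\sqrt{n}} \cdot \sqrt{\frac{\theta}{\mu}} \cdot \left[h(\hat{\beta}) - \hat{\beta}\right] + o\left(\frac{1}{\sqrt{n}}\right).$$
• Linear relation between P{Ab} and E[Wq]:
$$\frac{P\{Ab\}}{E[W_q]} = \theta.$$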
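These approximations are simple enough to evaluate with the standard library alone. The following is a sketch under the slide's notation, with β̂ = β·√(µ/θ) and h the standard normal hazard rate; the function names are assumptions, and the o(1/√n) term is simply dropped.

```python
import math


def normal_hazard(x):
    """h(x) = phi(x) / (1 - Phi(x)) for the standard normal (moderate x only)."""
    pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    tail = 0.5 * math.erfc(x / math.sqrt(2.0))   # 1 - Phi(x)
    return pdf / tail


def qed_delay_probability(beta, mu, theta):
    """Approximate P{Wq > 0} in the QED regime."""
    beta_hat = beta * math.sqrt(mu / theta)
    ratio = normal_hazard(beta_hat) / normal_hazard(-beta)
    return 1.0 / (1.0 + math.sqrt(theta / mu) * ratio)


def qed_abandon_given_delay(beta, mu, theta, n):
    """Approximate P{Ab | Wq > 0}, leading term only."""
    beta_hat = beta * math.sqrt(mu / theta)
    return math.sqrt(theta / mu) * (normal_hazard(beta_hat) - beta_hat) / math.sqrt(n)


def qed_measures(beta, mu, theta, n):
    """Return approximate (P{Wq > 0}, P{Ab}, E[Wq])."""
    p_wait = qed_delay_probability(beta, mu, theta)
    p_abandon = p_wait * qed_abandon_given_delay(beta, mu, theta, n)
    mean_wait = p_abandon / theta                # from P{Ab} = theta * E[Wq]
    return p_wait, p_abandon, mean_wait


if __name__ == "__main__":
    # Hypothetical example: rates per minute, mean patience equal to mean service.
    print(qed_measures(beta=0.0, mu=1.0, theta=1.0, n=400))
```

The tail probability is computed with erfc rather than 1 − Φ(x) to limit cancellation error; for very large arguments the tail underflows, so the sketch is meant for the moderate β values of the QED regime.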
Generally Distributed Patience: M/M/n+G Model
[Figure: M/M/n+G queue: arrivals at rate λ, queue with abandonment governed by the patience distribution G, agents 1, 2, …, n serving at rate µ.]
Back to the puzzle of "Why Erlang-A works?" Assume patience times are generally distributed.
Density of patience time: $g = \{g(x), x \ge 0\}$, where $g(0) \triangleq g_0 > 0$.
QED regime: $n \approx R + \beta\sqrt{R}$.
QED approximations: use Erlang-A, with $g_0$ replacing θ.
Generally Distributed Patience: Fitting Erlang-A
[Figure: Israeli Bank, Yearly Data. Two scatter plots, "Hourly Data" and "Aggregated": probability to abandon vs. average waiting time (sec).]
Theory: Erlang-A: P{Ab} = θ · E[Wq]. M/M/n+G: P{Ab} ≈ g0 · E[Wq].
Recipe: In both cases, use Erlang-A, with θ = P{Ab}/E[Wq] (slope above).
Generally Distributed Service Times
Established: M/M/n+M ≈ M/M/n+G (θ = g0).
Next: M/M/n+G ≈ M/G/n+G (same mean service).
Numerical Experiments: Whitt (2004), Rosenshmidt (2006) demonstrate a good fit for typical call-center parameters.
Lognormal (CV=1) vs. Exponential Service Times, QED regime; 100 agents, mean patience = mean service
[Figure: two plots against service grade (−0.8 to 0.8), Exponential vs. Lognormal: Fraction Abandoning (0% to 9%) and Delay Probability (0% to 90%).]
Simple Model for Complex Realities:
More on Applications of the QED Regime
• Simple performance approximations that are robust also in the ED and QD regimes: Mandelbaum, Zeltyn.
• Optimal staffing for cost/revenue optimization problems (staffing, abandonment and waiting costs): Borst, Mandelbaum, Reiman, Zeltyn.
• General service times: Puhalskii, Reiman; Jelencovic, Mandelbaum, Momcilovic.
• Generalizations to time-varying queues: Jennings, Feldman, Mandelbaum, Massey, Whitt.
• Generalizations to systems with non-homogeneous customers and/or servers (Skills-Based Routing): Armony, Gurvich, Mandelbaum; Atar, Shaikhet.
• Load-balancing: Dai, Tezcan.
The QED Regime and Stochastic-Ignorant Staffing:
The Right Answer for the Wrong Reasons
If β = 0, QED staffing prescribes: n = R, where R = offered load (minutes of work that arrive per minute).
In words: Assign a number of agents that equals the offered load, which is common practice.
No abandonment: queue "explodes".
With abandonment, n = 400, reasonable (im)patience:
• %Delayed ≈ 50%
• %Abandoned ≈ 2%
• E[Wq] ≈ 2% · E[S], a few seconds.
Very good service level.
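These figures can be checked against the QED approximations for the Erlang-A queue given earlier in the talk. The worked computation below is a sketch that reads "reasonable (im)patience" as θ = µ (mean patience equal to mean service time); that reading is an assumption made here, not a statement from the slide.

```latex
% Assumption: \theta = \mu, so \hat{\beta} = \beta\sqrt{\mu/\theta} = 0 when \beta = 0.
\begin{align*}
h(0) &= \frac{\phi(0)}{1 - \Phi(0)} = \frac{1/\sqrt{2\pi}}{1/2} = \sqrt{2/\pi} \approx 0.798, \\
P\{W_q > 0\} &\approx \left[1 + \sqrt{\frac{\theta}{\mu}} \cdot \frac{h(0)}{h(0)}\right]^{-1} = \frac{1}{2}, \\
P\{Ab \mid W_q > 0\} &\approx \frac{1}{\sqrt{400}} \cdot \sqrt{\frac{\theta}{\mu}} \cdot \left[h(0) - 0\right] \approx \frac{0.798}{20} \approx 0.040, \\
P\{Ab\} &\approx \frac{1}{2} \cdot 0.040 = 0.020, \\
E[W_q] &= \frac{P\{Ab\}}{\theta} \approx 0.02 \cdot \frac{1}{\mu} = 0.02 \cdot E[S].
\end{align*}
```

Under this assumption the approximations give about 50% delayed, about 2% abandoning and an average wait of about 2% of a service time, in line with the values quoted above.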
Call Centers: Hierarchical Operational View
• Forecasting: Customers (Statistics), Agents (HRM)
• Staffing: Queueing Theory (Erlang-A based); Service Level, Costs; # FTE's (Seats) per unit of time
• Shifts: IP, Combinatorial Optimization; LP; Union constraints, Costs; Shift structure
• Rostering: Heuristics, AI (Complex); Individual constraints; Agents Assignments
• Skills-based Routing: Stochastic Control (of Q's)
DataMOCCA = Data MOdel for Call Center Analysis
Project Goal: Designing and Implementing a (universal) data-base/data-repository and interface for storing, retrieving, analyzing and displaying Call-by-Call-Data.
System Components:
• Clean Databases: operational-data of individual calls, agents and operations.
• Friendly yet powerful Online Interface: enables convenient fast access to (mostly) operational and (some) administrative data (but no marketing/business data).
Current Databases:
• Medium-sized U.S. Bank (2.5 years; 220M calls, 40M via agents; 800 agents at peaks) – Completed.
• Israeli Cell-Phone Company (2 years; 110M calls, 25M via agents; 400 agents at peaks) – Ongoing.
• Large Israeli Bank – Pilot.
DataMOCCA will now be used to illustrate the hierarchy of operational decisions, from forecasting to SBR.
Arrivals to Service: Predictable vs. Random
Arrival Rates on Tuesdays in a September – U.S. Bank
[Figure: arrival rate (0 to 3500) vs. time of day (resolution 30 min.) for 04.09.2001, 11.09.2001, 18.09.2001 and 25.09.2001.]
• Tuesday, September 4th: Heavy, following Labor Day
• Tuesdays, September 18 & 25: Normal
• Tuesday, September 11th, 2001.
Agent Status
Erlang-A Model ⇒ optimal Staffing Level n.
n = overall number-of-agents that come to work? No!
Israeli Bank, Agent Status: Monthly Averages
[Figure: number of agents (0 to 250) by time of day (7:00 to 23:00), stacked by status: Long Break, Medium Break, Short Break, Other Activities, Outgoing Call, Available, Incoming Call.]
Staffing Level (FTE) = Busy with "Incoming Calls" + "Available" for service.
Shifts Scheduling
Integer Programming given interval-based Staffing Levels.
Example of a shift scheduling problem: should we bring agents in early, given a predictable arrival peak?
U.S. Bank: Queue-length and Staffing on May 3, 2002
[Figure: Offered Load and Number of Agents (left axis, 0 to 400) and Queue length (right axis, 0 to 120) by time of day, 07:00 to 24:00.]
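To make the shift-scheduling step concrete, here is a toy covering formulation: choose how many agents work each shift so that every interval meets its staffing level at minimal cost. Exhaustive enumeration stands in for an integer-programming solver and is only practical for tiny instances; the function name, the three shifts and the staffing levels are hypothetical.

```python
from itertools import product


def schedule_shifts(required, shifts, costs):
    """Minimise total cost of integer shift counts covering every interval.

    required[t] is the staffing level of interval t, shifts[s] is the set of
    intervals covered by shift s, costs[s] is the (positive) cost per agent.
    Returns (plan, cost), or (None, None) if some interval cannot be covered.
    """
    upper = max(required)  # with positive costs, no shift ever needs more agents than this
    best_plan, best_cost = None, None
    for plan in product(range(upper + 1), repeat=len(shifts)):
        covered = [
            sum(count for count, shift in zip(plan, shifts) if t in shift)
            for t in range(len(required))
        ]
        if all(c >= r for c, r in zip(covered, required)):
            cost = sum(count * c for count, c in zip(plan, costs))
            if best_cost is None or cost < best_cost:
                best_plan, best_cost = plan, cost
    return best_plan, best_cost


if __name__ == "__main__":
    required = [2, 4, 4, 3]             # hypothetical staffing levels for 4 intervals
    shifts = [{0, 1}, {1, 2}, {2, 3}]   # early, middle and late shifts
    costs = [1, 1, 1]
    print(schedule_shifts(required, shifts, costs))
```

Real instances have many more intervals, shift types and side constraints, which is why integer programming is used in practice.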
Rostering
Assigning individual agents to schedules. Typically, heuristics are used to accommodate individual constraints.
Israeli Technical Support Call Center: Online Shift Bidding
Shift-bidding starts at 18:00.
• 60% of agents are registered till 18:00
• 80% till 18:24; 90% till 22:00; registration closed at 5:23am.
Introduction to Skills Based Routing
General Setup
Multi-queue parallel-server system = schematic depiction of a telephone call-center:
[Figure: four customer queues with arrival rates λ1 to λ4 and abandonment rates θ1 to θ4, routed to three server pools S1, S2, S3 with service rates µ1 to µ8.]
Here the λ's designate arrival rates, the µ's service rates, the θ's abandonment rates, and the S's are the number of servers in each server-pool.
Skills-Based Design:
- Queue: "customer-type" requiring a specific type of service;
- Server-Pool: "skills" defining the service-types it can perform;
- Arrow: leading into a server-pool, defines its skills / constituency.
For example, a server with skill 2 (S2) can serve customers of type 3 (C3) at rate µ6 customers/hour.
Customers of type 3 arrive randomly at rate λ3 customers/hour, equipped with an impatience rate of θ3.
Major Control Decisions
• Customer Routing: If an agent turns idle and there are queued customers, which customer (if any) should be routed to this agent.
• Agent Scheduling: If a customer arrives and there are idle agents, which agent (if any) should serve this customer.
• Load Balancing: Routing of customers to distributed call centers (e.g. nation-wide).
Customers Relationship/Revenue Management (CRM)
NationsBank CRM relationship groups:
• RG1: high-value customers
• RG2: marginally profitable customers (with potential)
• RG3: unprofitable customers.
What does it mean for a customer in each group to be profitable?
NationsBank's Design of the Service Encounter
Examples of Specifications: Assignable Grade Of Service (AGOS)
                      RG1                     RG2                     RG3
Speed of Answer       100% in 2 rings         80% in 20 seconds       50% in 20 seconds
Average Talk Time     no limit                4 min. average          2 min. average
Rep. Personalization  request rep / callback  FCFS                    FCFS
Trans. Confirmation   call / fax              call / mail             mail
Abandonment rate      < 1%                    < 5%                    < 9%
Rep. Training         universal               product experts         basic product
Problem Resolution    during call             within 2 business days  within 8 business days
VRU Target            70% of calls            85% of calls            90% of calls
Priorities and Economies-of-Scale
U.S. Bank. Regular vs. VIP Customers. December 2002
[Figure: two plots by time of day (7:00 to 23:00, resolution 30 min.), Retail vs. Premier: Average Wait (average waiting time, sec) and Staffing Level (number of agents).]
Premier customers do not get a better service level.
The number of agents assigned to Premier is small and they do not get enough help from regular agents.
Challenge: enable a better service level for Premier and still serve most of them by a small dedicated group of agents.
Priorities and Routing Protocols I
Israeli Bank. Regular vs. VIP Customers. October 2004
[Figure: two plots by time of day (7:00 to 23:00, resolution 30 min.), Private vs. Private Platinum: Delay Probability (%) and Average Wait (sec).]
More Platinum customers have to wait, but their average wait is shorter. How to explain?
Priorities and Routing Protocols II
Histograms of Waiting Times. October 2004
[Figure: histograms of waiting times (relative frequencies, %, resolution 1 sec.) for Private and Private Platinum customers.]
After 25 seconds of wait, Platinum customers are routed to Regular agents, getting high priority. Hence, almost no long waiting times for Platinum.
Dynamic Priority-Upgrade
[Figure: Large Israeli Bank, Histogram of Waiting Times (relative frequencies, %, resolution 1 sec., 20 to 380 sec.).]
Peaks every 60 seconds. Why?
• System: Priority-Upgrade (unrevealed) every 60 seconds
• Human: Voice-Announcement every 60 seconds.
[Figure: histograms of waiting times for Served Customers and Abandoning Customers (relative frequencies, %, resolution 1 sec.).]
Network Balancing via Interqueue in a U.S. Bank
Distributed Call Center (U.S. Bank)
Sites: NY 1, RI 3, PA 2, MA 4
Flow labels on the chart arrows: 179+5, 619+3, 11+1, 74+7, 8+1, 19+1, 20, 508+2, 101+2, 2
External arrivals: 2092 = 2063 (98.6% Served) + 29 (1.4% Aban)
Not Interqueued: 1209 (57.8%)
• Served: 1184 (97.9/56.6)
• Aban: 25 (2.1/1.2)
Interqueued: 883 (42.2)
• Served here: 174 (19.7/8.3)
• Served at 2: 438 (49.6/20.9)
• Served at 3: …
External arrivals: 1694 = 1687 (99.6% Served) + 7 (0.4% Aban)
Not Interqueued: 1665 (98.3)
• Served: 1659 (99.6/97.9)
• Aban: 6 (0.4/0.4)
Interqueued: 28+1 (1.7)
• Served here: 17 (58.6/1)
• Served at 1: 3 (10.3/0.2)
External arrivals: 122 = 112 (91.8% Served) + 10 (8.2% Aban)
Not Interqueued: 93 (76.2)
• Served: 85 (91.4/69.7)
• Aban: 8 (8.6/6.6)
Interqueued: 27+2 (23.8)
• Served here: 14 (48.3/11.5)
• Served at 1: 6 …
External arrivals: 1770 = 1755 (99.2% Served) + 15 (0.8% Aban)
Not Interqueued: 1503 (84.9)
• Served: 1497 (99.6/84.6)
• Aban: 6 (0.4/0.3)
Interqueued: 258+9 (15.1)
• Served here: 110 (41.2/6.2)
• Served at 1: 58 (21.7/3.3)
Internal arrivals: 224
• Served at 1: 67 (29.9)
• Served at 2: 41 (18.3)
• Served at 3: 87 (38.8)
• Served at 4: …
Internal arrivals: 643
• Served at 1: 157 (24.4)
• Served at 2: 195 (30.3)
• Served at 3: 282 (43.9)
• Served at 4: 4 (0.6)
• Aban at 1: 3
Internal arrivals: 81
• Served at 1: 17 (21)
• Served at 3: 42 (51.9)
• Served at 4: 15 (18.5)
Internal arrivals: 613
• Served at 1: 41 (6.7)
• Served at 2: 513 (83.7)
• Served at 3: 55 (9.0)
• Aban at 1: 2 (0.3)
10 AM – 11 AM (03/19/01): Interflow Chart Among the 4 Call Centers of Fleet Bank
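The paired percentages in the chart, such as 1184 (97.9/56.6), appear to give the share within the parent category followed by the share of all external arrivals at that site. This reading is inferred from the arithmetic and is not stated on the slide. A short check on the box with 2092 external arrivals, using only counts from the chart (the helper `pct` is introduced here for illustration):

```python
def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


# Counts copied from the box with 2092 external arrivals.
external = 2092
served, abandoned = 2063, 29
not_interqueued, interqueued = 1209, 883
served_not_iq, aban_not_iq = 1184, 25
served_here, served_at_2 = 174, 438

# The counts partition the external arrivals in two ways.
print(served + abandoned == external, not_interqueued + interqueued == external)

# (share of parent category, share of all external arrivals)
print(pct(served_not_iq, not_interqueued), pct(served_not_iq, external))
print(pct(aban_not_iq, not_interqueued), pct(aban_not_iq, external))
print(pct(served_here, interqueued), pct(served_here, external))
print(pct(served_at_2, interqueued), pct(served_at_2, external))
```

The computed pairs reproduce the values printed on the chart for this box, which supports the inferred reading.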
40
Network Balancing Protocols and Performance Level
U.S. Bank: Histograms of Waiting Times
Figure (two panels: Retail, Business): histograms of waiting times; x-axis: time; y-axis: relative frequencies, %.
Why do we observe a peak for Retail service (10 seconds)? After 10 seconds of waiting, Retail customers were sent into the interqueue.
Business customers – peak at 5 seconds, for the same reason. Second peak – unclear, maybe priority-upgrade.
41
Main Research and Practical Challenges
• Skills-Based Routing: Convergence of Practice and Theory
• Uncertainty: in Reality, Model Parameters, Forecasting
• Time-Varying Queues: Time-Stable Performance
• General Service-Times: Theory
• Economic Models: Operations (Dimensioning), Marketing
• Human Dimensions:
Measure, Model, Experiment, Validate, Refine, etc.
All of the above in a Network of distributed call centers.
See our Service Engineering site for downloads: http://iew3.technion.ac.il/serveng
42