

    Commissioned Paper

    Telephone Call Centers: Tutorial, Review, and Research Prospects

    Noah Gans • Ger Koole • Avishai Mandelbaum

    The Wharton School, University of Pennsylvania, Philadelphia, Pennsylvania 19104
    Vrije Universiteit, De Boelelaan 1081a, 1081 HV Amsterdam, The Netherlands
    Industrial Engineering and Management, Technion, Haifa 32000, Israel

    Telephone call centers are an integral part of many businesses, and their economic role is significant and growing. They are also fascinating sociotechnical systems in which the behavior of customers and employees is closely intertwined with physical performance measures. In these environments traditional operational models are of great value, and at the same time fundamentally limited, in their ability to characterize system performance.

    We review the state of research on telephone call centers. We begin with a tutorial on how call centers function and proceed to survey academic research devoted to the management of their operations. We then outline important problems that have not been addressed and identify promising directions for future research.

    (Telephone Call Center; Contact Center; Teleservices; Telequeues; Capacity Management; Staffing; Hiring; Workforce Management Systems; ACD Reports; Queueing; Abandonment; Erlang C; Erlang B; Erlang A; QED Regime; Time-Varying Queues; Call Routing; Skills-Based Routing; Forecasting; Data Mining)

    Contents

    1. Introduction
        1.1. Additional Resources
        1.2. Reading Guide
    2. Overview of Call-Center Operations
        2.1. Background
        2.2. How an Inbound Call Is Handled
        2.3. Data Generation and Reporting
        2.4. Call Centers as Queueing Systems
        2.5. Service Quality
    3. A Base Example: Homogeneous Customers and Agents
        3.1. Background on Capacity Management
        3.2. Capacity-Planning Hierarchy
        3.3. Forecasting
        3.4. The Forecasting and Planning Cycle
        3.5. Longer-Term Issues of System Design
    4. Research Within the Base-Example Framework
        4.1. Heavy-Traffic Limits for Erlang C
        4.2. Busy Signals and Abandonment
        4.3. Time-Varying Arrival Rates
        4.4. Uncertain Arrival Rates
        4.5. Staff Scheduling and Rostering
        4.6. Long-Term Hiring and Training
        4.7. Open Questions
    5. Routing, Multimedia, and Networks
        5.1. Skills-Based Routing
        5.2. Call Blending and Multimedia
        5.3. Networking
    6. Data Analysis and Forecasting
        6.1. Types of Call-Center Data
        6.2. Types of Data Analysis and Source of Model Uncertainty
        6.3. Models for Operational Parameters
        6.4. Future Work in Data Analysis and Forecasting
    7. Future Directions in Call-Center Research
        7.1. A Broader View of the Service Process
        7.2. An Exploration of Intertemporal Effects
        7.3. A Better Understanding of Customer and CSR Behavior
        7.4. CRM: Customer Relationship/Revenue Management
        7.5. A Call for Multidisciplinary Research
    8. Conclusion

    1523-4614/03/0502/0079$05.00
    1526-5498 electronic ISSN
    Manufacturing & Service Operations Management © 2003 INFORMS
    Vol. 5, No. 2, Spring 2003, pp. 79–141

    1. Introduction

    Call centers and their contemporary successors, contact centers, have become a preferred and prevalent means for companies to communicate with their customers. Most organizations with customer contact (private companies, as well as government and emergency services) have reengineered their infrastructure to include from one to many call centers, either internally managed or outsourced. For many companies, such as airlines, hotels, retail banks, and credit card companies, call centers provide a primary link between customer and service provider.

    The call center industry is thus vast and rapidly expanding, in terms of both workforce and economic scope. For example, a recent analyst’s report estimates the number of agents working in U.S. call centers to have been 1.55 million in 1999, more than 1.4% of private-sector employment, and to be growing at a rate of more than 8% per year (Datamonitor, U.S. Bureau of Labor Statistics, various years). In 1998, AT&T reported that on an average business day about 40% of the more than 260 million calls on its network were toll-free (AT&T). One presumes that the great majority of these 104 million daily “1–800” calls terminated at a telephone call center.

    The quality and operational efficiency of these telephone services can be extraordinary. In a large, best-practice call center, many hundreds of agents can cater to many thousands of phone callers per hour. Agent utilization levels can average between 90%–95%; no customer encounters a busy signal and, in fact, about half of the customers are answered immediately. The waiting time of those delayed is measured in seconds, and the fraction that abandon while waiting varies from the negligible to a mere 1%–2%.

    At the same time, these examples of best practice represent the exception, rather than the rule. Most call centers, even well-run ones, do not consistently achieve such simultaneously high levels of service quality and efficiency. In part, this fact may be due to a lack of understanding of the scientific principles underlying best practice.

    The performance gap is also likely due to the growing complexity of contact centers. Recent trends in networking, “skills-based routing,” and multimedia have fundamentally increased the challenges inherent in managing contact centers. While simple analytical models have historically performed an important role in the management of call centers, they leave much to be desired. More sophisticated approaches are needed to accurately describe the reality of contact-center operations, and models of this reality can improve contact-center performance significantly.

    In this article, our aim is twofold. We first provide a tutorial on call centers, which outlines important operational problems. We then review the academic literature that is related to the management of call-center operations, working from the current “state of the art” to open and emerging problems. Our focus is on mathematical models which potentially support call-center management, and we primarily address analytical models that support capacity management.

    Analytical models can be contrasted with simulation techniques, which have been growing in popularity (see §VIII in Mandelbaum 2002). This growth has occurred partly because of improved user-friendliness of simulation tools and partly in view of the scarcity of mathematical skills required for the analytical alternatives. Perhaps it is mostly due to the widening gap between the complexity of the modern call center and the analytical models available to accommodate this complexity.

    We will not dwell here on the virtues and vices of analytical versus simulation models. Our contention is that, ideally, one should blend the two: Analytical models for insight and calibration, simulation for fine tuning. In fact, our experience strongly suggests that having analytical models in one’s arsenal, even if limited in scope, dramatically improves one’s use of simulation.

    There are two related reasons for our focus on capacity management. First, in most call centers capacity costs in general, and human resource costs in particular, account for 60%–70% of operating expenses. Thus, from a cost perspective, capacity management is critical. Second, the majority of research to date has addressed capacity management. This no doubt reflects the traditional emphasis of operations management (OR/IE) research, and it also is due to researchers’ sensitivity to the economic importance of capacity costs.

    Nevertheless, these traditional operational models do not capture a number of critical aspects of call-center performance, and we also discuss what we believe to be important determinants that have not been adequately addressed. These topics include a better understanding of the role played by human factors, as well as the better use of new technologies, such as networking and “skills-based routing” tools. Indeed, these behavioral and technological issues are closely intertwined, and we believe that the ability to address these problems will often require multidisciplinary research.

    1.1. Additional Resources

    The continued growth in both the economic importance and complexity of call centers has prompted increasingly deep investigation of their operations. This is manifested by a growing body of academic work devoted to call centers, research ranging in discipline from mathematics and statistics, through operations research, industrial engineering, information technology and human resource management, all the way to psychology and sociology.

    While the focus of the current article is operational issues and models, a number of complementary research resources also exist. In particular, Mandelbaum (2002) provides a comprehensive bibliography of call-center-related work. It includes references and abstracts that cover well over 250 research papers in a wide range of disciplines. Indeed, given the speed at which call-center technology and research are evolving, advances are perhaps best followed through the Internet, either via sites of researchers active in the area or through industry sites. For a list of web sites, see §XI of Mandelbaum (2002).

    There also exist a number of academic review articles of which we are aware: Pinedo et al. (1999) provides the basics of call-center management, including some analytical models; Anupindi and Smythe (1997) describes the technology that enables current and plausibly future call centers; Grossman et al. (2001) and Mehrotra (1997) are both short overviews of some OR challenges in call-center research and practice; Anton (2000) provides a managerial survey of the past, present, and future of customer contact centers; and Koole and Mandelbaum (2002) is more narrowly focused on queueing models. One may view our survey as a supplement to these articles, one that is aimed at academic researchers who seek an entry to the subject, as well as at practitioners who develop call-center applications.

    Additional articles that we recommend as part of a quantitative introduction to call centers include the following. Buffa et al. (1976) is an early, comprehensive treatment of the hierarchical framework used by call centers to manage capacity. The series of four articles by Andrews et al. (1995); Andrews and Parsons (1989, 1993); and Quinn et al. (1991) constitutes an interesting record of this group’s work with the call center of the catalogue retailer, L. L. Bean. Similarly, Brigandi et al. (1994) present work by AT&T that demonstrates the monetary value of call-center modelling. Mandelbaum et al. (2001), parts of which have been adapted to the present text, provides a thorough descriptive analysis of operational data from a call center, and Brown et al. (2002a) is its complementary statistical analysis. Evenson et al. (1998) and Duxbury et al. (1999) discuss performance drivers and the state of the art of call-center operations. Finally, Cleveland and Mayben (1997) is a well-written overview by and for practitioners. However, we take exception to some of its views, notably its treatment of customer abandonment (Garnett et al. 2002) and its capacity-sizing recommendations (Srinivasan and Talim 2001).

    1.2. Reading Guide

    The headings within the Table of Contents provide some detail on the material covered in the various sections. Here we offer a complementary guide for readers with specific interests. After reading or skimming through §2, the sections a given reader would concentrate on may vary. What follows is a list of potential topic choices.

    • Queueing performance models for multiple-server systems: §§3.2, 4.1–4.4, 4.7, and 7.

    • Queueing control models for multiple-server, multiclass systems: §§5 and 7.

    • Human resources problems associated with personnel scheduling, hiring, and training: §§3.2, 3.5, 4.5–4.6, 4.7.2, and 7.

    • Service quality, and customer and agent behavior: §§2.5, 3.5, 6.3.2–6.3.4.

    • Statistical analysis of call-center data: §§2.3, 3.3, 6, and 7.

    Note that §§2.2–2.3 introduce and define commonly used call-center names and acronyms that we use throughout the paper. A summary of the abbreviations and their definitions can be found in Appendix A.

    2. Overview of Call-Center Operations

    This section offers a tutorial on call-center operations. In §2.1 we provide background on the scope of call-center operations; in §2.2 we describe how call centers work and define common call-center nomenclature. Next, §2.3 describes how call centers commonly monitor their operations and measure their operating performance. In §2.4 we highlight the relationship between call centers and queueing systems. Then, §2.5 discusses measures of service quality commonly used in call centers.

    2.1. Background

    At its core, a call center constitutes a set of resources (typically personnel, computers, and telecommunication equipment) which enable the delivery of services via the telephone. The working environment of a large call center (Figure 1) can be envisioned as an endless room with numerous open-space cubicles, in which people with earphones sit in front of computer terminals, providing teleservices to phantom customers.

    Call centers can be categorized along many dimensions. The functions that they provide are highly varied: From customer service, help desk, and emergency response services, to telemarketing and order taking. They vary greatly in size and geographic dispersion, from small sites with a few agents that take local calls (for example, at a medical practice) to large national or international centers in which hundreds or thousands of agents may be on the phone at any time.

    Figure 1 The Working Environment of a Call Center (right image of First Direct from Larréché et al. 1997)

    Furthermore, the latest telecommunications and information technology allow a call center to be the virtual embodiment of a few or many geographically dispersed operations. These range from small groups of very large centers that are connected over several continents (for example, in the U.S.A., Ireland, and India) to large collections of individual agents that work from their homes.

    The organization of work may also vary dramatically across call centers. When the skill level required to handle calls is low, a center may cross-train every employee to handle every type of call, and calls may be handled first-come, first-served (FCFS). In settings that require more highly skilled work, each agent may be trained to handle only a subset of the types of calls that the center serves, and “skills-based routing” may be used to route calls to appropriate agents. In turn, the organizational structure may vary from the very flat, in which essentially all agents are exposed to external calls, to the multilayered, in which a layer represents a level of expertise and customers may be transferred through several layers before being served to satisfaction.

    A central characteristic of a call center is whether it handles inbound or outbound traffic. Inbound call centers handle incoming calls that are initiated by outside callers calling in to a center. Typically, these types of centers provide customer support, help-desk services, reservation and sales support for airlines and hotels, and order-taking functions for catalog and Web-based merchants. Outbound call centers handle outgoing calls, calls that are initiated from within a center. These types of operations have traditionally been associated with telemarketing and survey businesses. A recent development in some inbound centers is to initiate outbound calls to high-value customers who have abandoned their calls before being served.

    Our focus in this article is on inbound call centers, with some attention given to mixed operations that blend incoming and outgoing calls. In fact, we are aware of almost no academic work devoted to pure outbound operations, the exception being Samuelson (1999). Within inbound centers, the agents that handle calls are often referred to as customer service representatives (CSRs) or “reps” for short. (Appendix A summarizes the call-center acronyms used in this review, and it displays the page numbers on which they are defined.)

    In addition to providing the services of CSRs, many inbound call centers use interactive voice response (IVR) units, also called voice response units (VRUs). These specialized computers allow customers to communicate their needs and to “self-serve.” Customers interacting with an IVR use their telephone key pads or voices to provide information, such as account numbers or indications of the type of service desired. (In fact, the latest generation of speech-recognition technology allows IVRs to interpret complex user commands.) In response, the IVR uses a synthesized voice to report information, such as bank balances or departure times of planes. IVRs can also be used to direct the center’s computers to provide simple services, such as the transfer of funds among bank accounts. For example, in many banking call centers, roughly 80% of customer calls are fully self-served using an IVR. (Interestingly, the process by which customers who wish to speak to a CSR identify themselves, using an IVR, can average 30 seconds, even though subsequent queueing delays often reach no more than a few seconds.)

    A current trend is the extension of the call center into a contact center. The latter is a call center in which agents and IVRs are complemented by services in other media, such as e-mail, fax, webpages, or chat (in that order of prevalence). The trend toward contact centers has been stimulated by societal hype surrounding the Internet and by customer demand for channel variety, as well as by the potential for efficiency gains. In particular, requests for e-mail and fax services can be “stored” for later response, and it is possible that, when standardized and well managed, they can be made significantly less costly than telephone services.

    Our survey deals almost exclusively with pure telephone services. To the best of our knowledge, no analytical model has yet been dedicated to truly multimedia contact centers, though a promising framework (skills-based routing) and a few models that accommodate IVRs, e-mails and their blending with telephone services, will be described in §5.

    2.2. How an Inbound Call Is Handled

    The large-scale emergence of call centers has been enabled by technological advances in information and communications systems. To describe these technologies, and to illustrate how they function, we will walk the reader through an example of the process by which a call center serves an incoming call. Figure 2 provides a schematic diagram of the equipment involved.

    Consider customers in the United States who wish to buy a ticket from a large airline using the telephone. They begin the process of buying the ticket by calling a toll-free “800” number. The long-distance or public switched telephone network (PSTN) company that provides the 800 service to the airline knows two vital pieces of information about each call: The number from which the call originates, often called the automatic number identification (ANI) number; and the number being dialed, named the call’s dialed-number identification service (DNIS) number. The PSTN provider uses the ANI and DNIS to connect callers with the center.

    The airline’s call center has its own, privately owned switch, called a private automatic branch exchange (PABX or PBX), and the caller’s DNIS locates the PABX on the PSTN’s network. If the airline has more than one call center on the network (both reachable via the same 800 number), then a combination of the ANI, which gives the caller’s location, and the DNIS may be used to route the call. For example, a caller from Atlanta may be routed to a Dallas call center, while another caller from Chicago, who calls the same 800 number, may be routed to a center in North Dakota. Conversely, more than one DNIS may be routed to the same PABX. For example, the airline may maintain different 800 numbers for domestic and international reservations and have both types of call terminate at the same PABX.

    Figure 2 Schematic Diagram of Call-Center Technology (components shown: customers, PSTN, trunk lines, PABX, IVR/VRU, ACD, CTI server, customer data server, agents)

    The PABX is connected to the PSTN through a number of telephone lines, often called trunk lines, that the airline owns. If there are one or more trunk lines free, then the call will be connected to the PABX. Otherwise, the caller will receive a busy signal. Once the call is connected it may be served in a number of phases.
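
    The trunk-line rule just described (connect if a line is free, otherwise return a busy signal) can be sketched in a few lines of Python. This is only an illustration: the class name TrunkGroup and its methods are hypothetical and do not correspond to any real PABX interface.

```python
class TrunkGroup:
    """Hypothetical model of the trunk lines between the PSTN and the PABX."""

    def __init__(self, num_lines: int) -> None:
        if num_lines < 0:
            raise ValueError("num_lines must be non-negative")
        self.num_lines = num_lines
        self.in_use = 0

    def offer_call(self) -> bool:
        """Return True if the call is connected, False for a busy signal."""
        if self.in_use < self.num_lines:
            self.in_use += 1  # the call occupies one trunk line
            return True
        return False  # all trunk lines are busy

    def release(self) -> None:
        """Free one trunk line when a connected call ends."""
        if self.in_use > 0:
            self.in_use -= 1
```

    In this sketch a blocked call is simply lost; the text later notes that blocked callers may try again.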

    At first, calls may be connected through the PABX to an IVR that queries customers on their needs. For example, in the case of the airline, callers may be told to “press 1” if they wish to find flight status information. If this is the case, then through continued interaction with the IVR customers may complete service without needing to speak with an agent.

    Customers may also communicate a need or desire to speak with a CSR, and in this case calls are handed from the IVR to an automatic call distributor (ACD). An ACD is a specialized switch, one that is designed to route calls, connected via the PABX, to individual CSRs within the call center. Modern ACDs are highly sophisticated, and they can be programmed to route calls based on many criteria.

    Some of the routing criteria may reflect callers’ status. For example, an airline may wish to specially route calls from Spanish-speaking customers. This identification can happen in a number of ways: Through the DNIS, because a special 1-800 number is reserved for Spanish-speaking customers; through the ANI, which allows the call-center’s computer system to identify the originating phone number as that of a Spanish-speaking customer; or through interaction with the IVR, which allows callers who press “3” to identify themselves as Spanish speakers.

    The capabilities of agents may also be used in the routing of calls. For example, when agents at our example airline’s call center begin working, they log into the center’s ACD. Their log-in IDs are then used to retrieve records that describe whether they are qualified to handle domestic and/or international reservations, as well as whether or not they are proficient in Spanish.

    Given its status, as well as that of the CSRs that are currently idle and available to take a call, the incoming call may be routed to the “best” available agent. If no suitable agent is free to take the call, the ACD may keep the call “on hold” and the customer waits until such an agent is available. While the decision of whether and to whom to route the call may be programmed in advance, the rules that are needed to solve this “skills-based routing” problem can turn out to be very complex.
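
    To make the routing decision above concrete, the following Python sketch matches a call's required skills against idle agents and places the call on hold when nobody suitable is free. All names (Agent, Call, Acd, route) are hypothetical, and the rule used to pick the "best" agent, namely the qualified agent with the fewest skills, is an assumption chosen for illustration, not a rule taken from the text or from any ACD product.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, List, Optional, Set


@dataclass
class Agent:
    agent_id: str
    skills: Set[str]      # e.g. {"domestic", "spanish"}, looked up at log-in
    idle: bool = True


@dataclass
class Call:
    call_id: str
    required_skills: Set[str]


@dataclass
class Acd:
    agents: List[Agent]
    on_hold: Deque[Call] = field(default_factory=deque)

    def route(self, call: Call) -> Optional[Agent]:
        """Assign the call to an idle, qualified agent, or put it on hold."""
        candidates = [
            a for a in self.agents
            if a.idle and call.required_skills <= a.skills
        ]
        if not candidates:
            self.on_hold.append(call)  # the customer waits in queue
            return None
        # Assumed tie-break: keep the most versatile agents free.
        best = min(candidates, key=lambda a: len(a.skills))
        best.idle = False
        return best
```

    Even this toy version shows why real rules become complex: a different tie-break, or a policy for serving held calls when an agent becomes free, changes who waits and for how long.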

    Customers that are put on hold are typically exposed to music, commercials, or other information. (A welcome, evolving trend is to provide delayed customers with predictions of their anticipated wait.) Delayed customers may judge that the service they seek is not “worth” the wait, become impatient, and hang up before they are served. In this case, they are said to abandon the queue or to renege. Customers that do not abandon are eventually connected to a CSR.

    Once connected with a customer, agents can speak on the telephone while, at the same time, they work via a PC or terminal with a corporate information system. In the case of our example airline, agents may discuss flight reservations with customers as they (simultaneously) query and enter data into the company’s reservation system. In large companies, such as airlines and retail banks, the information system is typically not dedicated to the call center. Rather, many call centers, as well as other company branches, may share access to a centralized corporate information system.

    Computer-telephone integration (CTI) “middleware” can be used to more closely integrate the telephone and information systems. For instance, CTI is the means by which a call’s ANI is used to identify a caller and route a call: It takes the ANI and uses it to query a customer database in the company’s information systems; if there exists a customer in the database with the same ANI, then routing information from that customer’s record is returned. In our airline example, the routing information would be the customer’s preferred language.
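
    The ANI lookup performed by CTI can be pictured as a keyed query against the customer database. The Python sketch below assumes a simple in-memory dictionary; the names CUSTOMER_DB and routing_info_for, the field names, and the sample phone numbers are all invented for illustration.

```python
from typing import Dict, Optional

# Hypothetical customer database keyed by ANI (originating phone number).
CUSTOMER_DB: Dict[str, Dict[str, str]] = {
    "4045550100": {"name": "A. Customer", "preferred_language": "es"},
}


def routing_info_for(
    ani: str, db: Optional[Dict[str, Dict[str, str]]] = None
) -> Optional[str]:
    """Return the caller's preferred language, or None if the ANI is unknown."""
    records = CUSTOMER_DB if db is None else db
    record = records.get(ani)
    if record is None:
        return None  # no customer with this ANI; routing falls back elsewhere
    return record.get("preferred_language")
```

    What the ACD does when no record is found is left open here, since the text does not specify a fallback.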

    Similarly, CTI can be used to automatically display a caller’s customer record on a CSR’s workstation screen. By eliminating the need for the CSR to ask the caller for an account number and to enter the number into the information system, this so-called “screen pop” saves the CSR time and reduces the call’s duration. If applied uniformly, it can also reduce variability among service times, thus improving the standardization of call-handling procedures.

    In more sophisticated settings, CTI is used to integrate a special information system, called a customer relationship management (CRM) system, into the call-center’s operations. CRM systems track customers’ records and allow them to be used in operating decisions. For example, a CRM system may record customer preferences, such as the desire for an aisle seat on an airplane, and allow CSRs (or IVRs) to automatically deliver more customized service. A CRM system may also enable a screen pop to include the history of the customer’s previous calls and, if relevant, dollar figures of past sales the customer has generated. It may even suggest cross-selling or up-selling opportunities, or it may be used to route the incoming call to an agent with special cross-selling skills.

    Once a call begins service, it can follow a number of paths. In the simplest case, the CSR handles the caller’s request, and the caller hangs up. Even here, the service need not end; instead, the CSR may spend some time on wrap-up activities, such as an updating of the customer’s history file or the processing of an order that the customer has requested. It may also be the case that the CSR cannot completely serve the customer and the call must be transferred to another CSR. Sometimes there are several such hand-offs.

    Finally, the service need not end with the call. Callers who are blocked or abandon the queue may try to call again, in which case they become retrials. Callers who speak with CSRs but are unable to resolve their problems may also call again, in which case they become returns. Satisfactory service can also lead to returns.

    2.3. Data Generation and Reporting

    As it operates, a large call center generates vast amounts of data. Its IVR(s) and ACD are special-purpose computers that use data to mediate the flow of calls. Each time one of these switches takes an action, it records the call’s identification number, the action taken, the elapsed time since the previous action, as well as other pieces of information. As a call winds its way through a call center, a large number of these records may be generated.

    From these records, a detailed history of each call that enters the system can, in theory, be reconstructed: When it arrived; who the caller was; what actions the caller took in the IVR and how long each action took; whether and how long the caller waited in queue; whether and for how long a CSR served the call; who the CSR was. If the call center uses CTI, then additional data from the company’s information systems may be included in the record: What the call was about; the types of actions taken by a CSR; related account information.
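
    As a rough illustration of how such a history might be rebuilt, the Python sketch below groups event records by call and accumulates the elapsed times into a per-call timeline. The record layout (call ID, action, elapsed seconds since the call's previous action), the action names, and the assumptions that the log is in chronological order and that each action occurs at most once per call are all hypothetical; real switch logs differ by vendor.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional, Tuple


class Event(NamedTuple):
    call_id: int
    action: str             # e.g. "arrive", "enter_queue", "answered"
    elapsed_seconds: float  # time since this call's previous action


def call_histories(events: List[Event]) -> Dict[int, List[Event]]:
    """Group a chronological event log into one ordered history per call."""
    histories: Dict[int, List[Event]] = defaultdict(list)
    for ev in events:
        histories[ev.call_id].append(ev)
    return dict(histories)


def timeline(history: List[Event]) -> List[Tuple[float, str]]:
    """Convert elapsed times into offsets (seconds) from the call's start."""
    offset = 0.0
    points = []
    for ev in history:
        offset += ev.elapsed_seconds
        points.append((offset, ev.action))
    return points


def seconds_between(history: List[Event], start: str, end: str) -> Optional[float]:
    """Time from action `start` to action `end`, or None if either is missing."""
    times = {action: t for t, action in timeline(history)}
    if start not in times or end not in times:
        return None
    return times[end] - times[start]
```

    For example, the wait in queue of an answered call is seconds_between(history, "enter_queue", "answered"), while a call that abandoned has no "answered" record and yields None.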

    In practice, call centers have not typically stored or analyzed records of individual calls, however. This may be due, in part, to the historically high cost of maintaining adequately large databases (a large call center generates many gigabytes of call-by-call data each month), but clearly these quantities of data are no longer prohibitively expensive to store. It is also likely due to the fact that the software used to manage call centers, itself developed at a time when data storage was expensive, often uses only simple models which require limited, summary statistics. Finally, we believe that it is due to lack of understanding of how and why more detailed analyses should be carried out. (Section 6 describes current work that analyzes call-by-call data. Sections 6 and 7 argue for the long-term value of this type of work.)

    Instead, call centers most often summarize call-by-call data from the ACD (and related systems) as averages that are calculated over short time intervals, most often 30 minutes in length. Figure 3 displays 21 half-hours’ worth of data from such a report.

    These ACD data are used both for planning purposes and to measure system performance. They are carefully and continuously watched by call-center managers. They will also be central to the discussion that continues through much of this article. Therefore, it is worth describing the columns of the report in some detail.

    Figure 3 Example Half-Hour Summary Report from an ACD (courtesy of a member of the Wharton Call Center Forum)

    The first four columns indicate the starting time of the half-hour interval, as well as counts of calls arriving to the ACD: (Recvd), sometimes called offered, is the total number of calls arriving during that half hour; (Answ), sometimes called handled, the number of arriving calls that were actually answered by a CSR; and (Abn %), the percentage of arriving calls that abandoned before being served (equals [1 − (Answ) ÷ (Recvd)] × 100%). Note that the number of calls offered to the ACD may be much smaller than the total number of calls arriving to the center. First, (Recvd) does not account for busy signals, which occur at the level of the PSTN and PABX. Furthermore, as already mentioned, in some industries it is not unusual for 80% of the calls arriving to a call center to be “self-service” and to terminate in the IVR.

    (Abn %) is an important measure of system congestion. The next column reports another one: (ASA) is the average speed of answer, the amount of time (Answ) calls spend “on hold” before being served by a CSR. (Because ASA does not include the time that abandoned calls spend waiting, a reasonably full picture of congestion requires, at a minimum, both ASA and Abn % statistics.) Call centers sometimes report additional measures of the delay in queue. For example, the service level, also called the telephone service factor (TSF), is the fraction of calls whose delay fell below a prespecified “service-level” target. Typically the target is 20 or 30 seconds. Some call centers also report the delay of the call that waited on hold the longest during the half hour.
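    As a rough illustration of how these congestion measures relate to call-by-call data, the following Python sketch computes them for one reporting interval. The record layout (a waiting time in seconds and an `answered` flag), the function name, and the sample calls are assumptions made for this example, not the format of any particular ACD. We also assume that the TSF is taken over answered calls only and that the longest delay is taken over all calls; a given center may define either differently.

```python
from dataclasses import dataclass


@dataclass
class Call:
    wait_sec: float  # time in queue before being answered or abandoning
    answered: bool   # True if a CSR eventually served the call


def congestion_summary(calls, target_sec=20.0):
    """Summarize one reporting interval of hypothetical call records."""
    recvd = len(calls)
    answered_waits = [c.wait_sec for c in calls if c.answered]
    answ = len(answered_waits)
    if recvd == 0 or answ == 0:
        raise ValueError("need at least one received and one answered call")
    return {
        "Recvd": recvd,
        "Answ": answ,
        # (Abn %) = [1 - (Answ) / (Recvd)] x 100%
        "Abn %": (1 - answ / recvd) * 100,
        # ASA averages the waits of answered calls only
        "ASA": sum(answered_waits) / answ,
        # assumed convention: fraction of answered calls within the target
        "TSF": sum(w <= target_sec for w in answered_waits) / answ,
        # assumed convention: longest wait among all calls, answered or not
        "Longest delay": max(c.wait_sec for c in calls),
    }


# Five made-up calls: four answered, one abandoned after 30 seconds.
calls = [Call(5, True), Call(12, True), Call(45, True),
         Call(30, False), Call(18, True)]
summary = congestion_summary(calls)
```

    Because the abandoned call's 30 seconds of waiting never enters the ASA, the sketch also shows why ASA and (Abn %) need to be read together.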

    To interpret the remaining statistics in Figure 3, it is helpful to define the following three states of CSRs who are logged into the ACD: (1) active, namely handling a call; (2) sitting idle, available to handle a call; and (3) not actively handling a call but not idle, unavailable to take calls. Over the course of each half-hour reporting interval, the ACD tracks the time that each CSR that is logged into the system spends in each of these states, and it aggregates (total active), (total available), and (total unavailable) time (across all logged-in CSRs) to calculate the figure’s statistics.

    The next column in Figure 3 reports the (AHT), the average handle time per call, another name for average service time (equals (total active) ÷ (Answ)). In some reports this total is broken down into component parts: “talk” time, the average amount of time a CSR spends talking to the customer during a call; “hold” time, the average time a CSR puts a customer “on hold” during a call, once service has begun; and “wrap” time, the average amount of time a CSR spends completing service after the caller has hung up.

    The remaining columns detail the productivity of the call-center’s CSRs. (On Prod FTE) is the average number of full-time equivalent (FTE) CSRs that were active or available during the half hour (equals ((total active) + (total available)) ÷ 30 minutes). (Occ %), the system occupancy, is a measure of system utilization that excludes the time that CSRs were unavailable to serve calls (equals (total active) ÷ ((total active) + (total available)) × 100%). (On Prod %) is the fraction of time that logged-in CSRs were actively handling or able to handle calls (equals ((total active) + (total available)) ÷ ((total active) + (total available) + (total unavailable)) × 100%). (Sch Open FTE) is the number of FTE CSRs that had been scheduled to be logged in during the half hour; it is the planned version of (On Prod FTE). Finally, (Sch Avail %) relates the actual time spent logged-in to the original plan (equals (On Prod FTE) ÷ (Sch Open FTE) × 100%).
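    The column definitions above can be collected into a few lines of Python. This is only a sketch: the function name, the argument names, and the sample totals are hypothetical, and the time totals are assumed to be expressed in CSR-minutes summed over all logged-in CSRs.

```python
def productivity_columns(total_active, total_available, total_unavailable,
                         answ, sch_open_fte, interval_min=30.0):
    """Compute the report's handle-time and productivity columns.

    Time totals are in CSR-minutes, summed over all logged-in CSRs.
    """
    on_prod_time = total_active + total_available
    logged_in_time = on_prod_time + total_unavailable
    on_prod_fte = on_prod_time / interval_min
    return {
        "AHT": total_active / answ,                        # minutes per call
        "On Prod FTE": on_prod_fte,
        "Occ %": total_active / on_prod_time * 100,
        "On Prod %": on_prod_time / logged_in_time * 100,
        "Sch Avail %": on_prod_fte / sch_open_fte * 100,
    }


# Made-up totals for a single half hour.
cols = productivity_columns(total_active=2700, total_available=300,
                            total_unavailable=200, answ=900,
                            sch_open_fte=110)
```

    With these made-up totals, 3,000 CSR-minutes of active plus available time over a 30-minute interval correspond to 100 (On Prod FTE).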

    Thus, the report records three sources of loss in CSR productivity. The first is idle time that is presumably induced by naturally occurring stochastic variability in arrival and service times and is captured by (100% − (Occ %)). The second is the fraction of time that CSRs were originally scheduled to be available to take calls but were not, which is calculated as (100% − (On Prod %)). This percentage can be tracked against an operating standard that the call center maintains to make sure that CSRs are not spending “too much” logged-in time unavailable. Similarly, the third source, (100% − (Sch Avail %)), allows call-center managers to track the fraction of time CSRs are not logged in, perhaps away from their work stations taking unplanned breaks. The latter two measures are often monitored to diagnose perceived disciplinary problems: CSRs’ lack of compliance with their assigned schedules.

    Note that the occupancies in Figure 3 are quite high, 97%–100% during much of the day. This does not mean, however, that every CSR spends 97%–100% of his or her work day speaking with customers. For example, suppose the arrival rate of calls to a center is a constant 2,850 per half hour over an eight-hour day. The (AHT) of a call is one minute, so the call center expects 2,850 minutes of calls to be served in every half hour, or 95 FTE CSRs’ worth of calls (95 CSRs × 30 minutes per CSR = 2,850 CSR minutes each half hour). The center does not allow CSRs to be unavailable, and in every half hour it makes sure that 100 CSRs are taking calls, so that (Sch Open FTE) = (On Prod FTE) = 100. Therefore, (Occ %) = 95% and (On Prod %) = (Sch Avail %) = 100% in every half hour. The call center has 200 CSRs on staff, however, and each CSR is scheduled to spend only half of the day on the phone. Indeed, as we will see in §3, CSRs are typically given breaks and off-phone work that lower their overall, daily utilization to more sustainable levels.
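    The arithmetic of this example can be checked in a few lines of Python. The variable names are ours, and the final quantity (the share of a CSR's whole work day spent handling calls) is our own extrapolation from the example's assumptions rather than a column of the report.

```python
calls_per_half_hour = 2850
aht_min = 1.0            # average handle time, minutes per call
interval_min = 30.0
csrs_on_phone = 100      # CSRs taking calls in every half hour
csrs_on_staff = 200      # each spends half of the day on the phone

workload_min = calls_per_half_hour * aht_min   # 2,850 CSR-minutes of calls
offered_fte = workload_min / interval_min      # 95 FTE CSRs' worth of calls
occ_pct = offered_fte / csrs_on_phone * 100    # (Occ %) while on the phone

# Extrapolation: a CSR on the phone for half the day at this occupancy
# handles calls for roughly this share of the whole work day.
phone_share_of_day = csrs_on_phone / csrs_on_staff
daily_call_pct = occ_pct * phone_share_of_day
```

    Under these assumptions a reported occupancy of 95% coexists with each CSR handling calls for a little under half of the work day.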

    It is also worth noting that, although the statistics described above are averaged over all agents working, many can be archived also at the individual-agent level. This practice is useful for monitoring individual compliance, and it can be used as a part of incentive compensation schemes.

    While the specifics of ACD reports may vary from one site to the next, the reports almost always (as far as we have seen) contain statistics on the four categories of data shown in Figure 3: numbers of arrivals and abandonment, average service times, CSR utilization, and the distribution of delay in queue. This is hardly surprising: it simply reflects the fact that call centers can be viewed, naturally and usefully, as queueing systems.

    2.4. Call Centers as Queueing Systems

    Figure 4 is an operational scheme of a simple call center. In it, the relationship between call centers and queueing systems is clearly seen.

    Figure 4 Operational Scheme of a Simple Call Center

    [Diagram labels: arrivals, busy, queue, abandon, retrials, lost calls, returns. Legend relating call-center hardware to queueing model parameters: k = 8 trunk lines (not visible); w = 5 work stations; N = 3 CSR-servers; 5 = (k − N) places in queue.]

    The call center depicted in the figure has the following setup. A set of k trunk lines connects calls to the center. There are w ≤ k work stations, often referred to as seats, at which a group of N ≤ w agents serve incoming calls. An arriving call that finds all k trunk lines occupied receives a busy signal and is blocked from entering the system. Otherwise, it is connected to the call center and occupies one of the free lines. If fewer than N agents are busy, the call is put immediately into service. If it finds at least N but fewer than k calls in the system, the arriving call waits in queue for an agent to become available. Customers who become impatient hang up, or abandon, before being served. For the callers that wait and are ultimately helped by a CSR, the service discipline is first-come, first-served.
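    The admission logic just described can be written as a small Python function. This is a sketch under one added assumption: agents never sit idle while calls wait, so that "fewer than N agents are busy" is the same as "fewer than N calls are in the system". The function name and the returned labels are ours.

```python
def route_arrival(calls_in_system, k, n_agents):
    """Classify an arriving call, given the number of calls already present.

    k is the number of trunk lines and n_agents the number of CSRs
    taking calls (N in the text), with n_agents <= k.
    """
    if not 0 <= calls_in_system <= k:
        raise ValueError("calls_in_system must lie between 0 and k")
    if calls_in_system >= k:
        return "blocked"   # all trunk lines occupied: busy signal
    if calls_in_system < n_agents:
        return "served"    # a free agent takes the call immediately
    return "queued"        # a line is free, but every agent is busy


# The setup of Figure 4: k = 8 trunk lines and N = 3 CSRs.
outcomes = [route_arrival(c, k=8, n_agents=3) for c in range(9)]
```

    With k = 8 and N = 3, arrivals that find 0 to 2 calls present are served at once, those that find 3 to 7 wait in one of the five queue places, and those that find 8 are blocked.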

    Once a call exits the system it releases the resources it used (trunk line, work station, agent), and these resources again become available to arriving calls. A fraction of calls that do not receive service become retrials that attempt to reenter service. The remaining blocked and abandoned calls are lost. Finally, served customers may also return to the system. Returns may be for additional services that generate new revenue, and as such may be regarded as good, or they may be in response to problems with the original service, in which case they may be viewed as bad.

    Thus, the number of trunk lines k acts as an upper bound on the number of calls that can be in the system, either waiting or being served, at one time. Similarly, the number of CSRs taking calls, N ≤ w, provides an upper bound on the number of calls that can be in service simultaneously. Over the course of the day, call-center managers can (and do) dynamically change the number of working CSRs to track the load of arriving calls.

    Less frequently, if equipped with the proper technology, managers also vary the number of active trunk lines k. For example, a smaller k in peak hours reduces abandonment rates and waiting (as well as the associated “1-800” costs, to be discussed later); this advantage can be traded off against the increase in busy signals.

    For any fixed N, one can construct an associated queueing model in which callers are customers, the N CSRs are servers, and the queue consists of callers that await service by CSRs. When N changes, (k − N), the number of spaces in queue, changes as well. As in Figure 3, model primitives for this system would include statistics for the arrival, abandonment, and service processes. Fundamental model outputs would include the long-run fraction of customers abandoning, the steady-state distribution of delay in queue, and the long-run fraction of time that servers are busy.

    In fact, these types of queueing models are used extensively in the management of call centers. The simplest and most widely used model is that of an M/M/N queue, also known in call-center circles as Erlang C, which we later describe in more detail. For many applications, however, the model is an oversimplification. Just looking at Figure 4, one sees that the Erlang C model ignores busy signals, customer impatience, and services that span multiple visits.

    In practice, the service process sketched above is often much more complicated. For example, the incorporation of an IVR, with which customers interact prior to joining the agents’ queue, creates two stations in tandem: An IVR followed by CSRs. The inclusion of a centralized information system adds a resource whose capacity is shared by the set of active CSRs, as well as by others who may not even be in the call center. The picture becomes far more complex if one considers multiple teams of specialized or cross-trained agents that are geographically dispersed over several interconnected call centers, and who are faced with time-varying loads of calls from multiple types of customers.

    2.5. Service Quality

    Service quality is a complex and important topic that is closely related to the understanding of CSR and customer behavior, and we return to these subjects in §7. Here, we briefly review three notions of service quality that are most commonly tracked and managed by call centers.

    The first view of quality regards the accessibility of agents. Typical questions are, “How long did customers have to wait to speak to an agent? How many abandoned the queue before being served?” This type of quality is measured via ACD (and related) reports, described above, and queueing models are used to manage it. In this article, we concern ourselves with problems associated with capacity management, and our emphasis will be on measures of accessibility.

    The second view is of the effectiveness of service encounters, and it parallels the notion of rework in the manufacturing literature. The question here is “Did the service encounter completely resolve the customer’s problem, or was additional work required?” Among call centers in the United States, a call without rework is sometimes referred to as “one and done.” This type of quality is typically measured by sampling inspection; agent calls are listened to at random, either live or on tape, and they are judged as requiring rework or not. To our knowledge, there do not exist widespread, formalized schemes for managing service effectiveness.

    The last type of quality that is consistently monitored is that of the content of the CSRs’ interactions with customers. Typical questions concern the CSR’s input to the encounter and include, “Did the CSR use the customer’s name? Did s/he speak to the customer with a ‘smile’ in his or her voice? Did the CSR manage the flow of the conversation in the prescribed manner?” As with the question of “one and done,” answers to these questions are found by listening to a random sample of each CSR’s calls. Sometimes the output of interactions is tracked, and the question “Was the customer satisfied?” is asked. Customer satisfaction data are typically collected via surveys.

    Of course, the notion of the quality of the customers’ experiences extends beyond their interaction with CSRs. For example, it critically includes the time spent waiting on hold, in queue.

    In particular, we note that the nature of the time customers spend waiting on hold, in a telequeue, is fundamentally different from that spent in a physical queue at a bank or a supermarket checkout line, for example. Customers do not see others waiting and need not be aware of their “progress” if the call center does not provide the information. As Cleveland and Mayben (1997) point out, customers that join a physical queue may start out unhappy, when they see the length of the queue which they have joined, and become progressively happier as they move up in line. (For experimental evidence of this effect, see Carmon and Kahneman 2002.) In contrast, customers that join a telequeue may be optimistic initially, because they do not realize how long they will be on hold, and become progressively more irritated as they wait. Indeed, call centers that inform on-hold callers of their expected delays can be thought of as trying to make the telequeueing experience more like that of a physical queue.

    3. A Base Example: Homogeneous Customers and Agents

    In this section we use a baseline example to describe the standard operational models that are used to manage capacity. We begin in §3.1 with background on capacity management in call centers. Then in §3.2 we define a hierarchy of capacity management problems, as well as the analytical models that are often used to solve them: Queueing performance models for low-level staffing decisions; mathematical programming models for intermediate-level personnel scheduling; and long-term planning models for hiring and training. In §§3.3–3.4 we describe standard practices in call-center forecasting. Finally, §3.5 offers a qualitative discussion of longer-term problems in system design.

    3.1. Background on Capacity Management

    Higher utilization rates imply longer delays in queue, and in managing capacity, call centers trade off resource utilization with accessibility. This trade-off is central to the day-to-day operations of call centers and to the workforce management (WFM) software tools that are used to support them. It is also the concern of much of the research that is discussed in later sections.

    In some cases, revenues or costs can be directly associated with system performance. One can then seek to maximize expected profits or to minimize expected costs. For example, call centers that use toll-free services pay out-of-pocket for the time their customers spend waiting, and these “1-800” costs grow roughly linearly with the average number in queue: A call center that is open 24 hours a day, 7 days a week, and averages 40 calls in queue will pay about $1 million per year in queueing expenses (when the cost per minute per call is $0.05). Similarly, order-taking businesses can sometimes estimate the opportunity cost of lost sales due to blocking (busy signals) or abandonment. For example, see Andrews and Parsons (1993) and Akşin and Harker (2003).
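    The $1 million figure follows from a one-line calculation, sketched here in Python with variable names of our own choosing and a 365-day year assumed.

```python
avg_calls_in_queue = 40
cost_per_call_minute = 0.05            # dollars per minute per waiting call
minutes_per_year = 60 * 24 * 365       # open 24 hours a day, 7 days a week

# Every waiting call accrues toll charges for each minute it is on hold.
annual_queueing_cost = (avg_calls_in_queue * cost_per_call_minute
                        * minutes_per_year)
```

    The result is $1,051,200 per year, and it scales in direct proportion to the average number of calls in queue.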

    More typically, however, call-center goals are formulated as the provision of a given level of accessibility, subject to a specified budget constraint. Common practice is that upper management decides on the desired service level and then call-center managers are called on to defend their budget. (See Borst et al. 2000 for a discussion of the constraint satisfaction and cost minimization approaches.)

    Furthermore, call-center managers’ view of system capacity most often focuses on agents. CSR salaries typically account for 60%–70% of the total operating costs, and managers presume that other resources, such as information systems, are not bottlenecks. While centers often do maintain extra hardware capacity, such as workstations, Akşin and Harker (2001, 2003) show that planning models that do not account for other bottlenecks, when they exist, can be problematic.

    We next introduce a “base case” example that reflects the capacity-planning approach used by most call centers. We note that the example does not represent the state of the art or, for that matter, best practice. It does, however, give a sense of the state of common practice. Furthermore, the description of the example, along with its inherent problems and limitations, will provide a framework by which we will organize our subsequent discussion of call-center research.


    The subsection is divided into three parts. We begin by describing, from the bottom up, a hierarchy of capacity-planning problems (already introduced formally in Buffa et al. 1976). We then describe forecasting and estimation procedures which are commonly used to determine inputs to the capacity-planning process. Finally, we sketch how the elements are put together within the context of the call center’s day-to-day operations.

    3.2. Capacity-Planning Hierarchy

    Consider the call center whose statistics are reported in Figure 3. One sees that the pattern of arrivals and service times the center experiences is changing over the course of the day. Offered calls (per half-hour) peak from 11:00 a.m.–11:30 a.m., dip over lunch, and then peak again from 2:30 p.m.–3:00 p.m. Average handle times also appear to change significantly from one half-hour to the next.

    Figure 5 A Hierarchical View of Arrival Rates (adapted from Mandelbaum et al. 2001, following Buffa et al. 1976)

    [Four panels, each plotting the number of calls arriving: each month of the year; each day of the month; each hour of the day; each minute of the hour.]

    Indeed, in most call centers, the arrival rate and mix of calls entering the system vary over time. Over short periods of time, minute-by-minute for example, there is significant stochastic variability in the number of arriving calls. Over longer periods of time (the course of the day, the days of the week or month, the months of the year) there also can be predictable variability, such as the seasonal patterns that arriving calls follow. (See Figure 5. For more on various types of uncertainty, see §6.2.)

    Because service capacity cannot be inventoried, managers vary the number of available CSRs to track the predictable variations in the arrival rates of calls. In this manner, they attempt to meet demand for service at a low cost, yet with an acceptable delay. In turn, capacity-planning naturally takes place from the bottom up: Queueing models determine how many CSRs must be available to serve calls over a given half-hour or hour; scheduling models determine when during the week or month each CSR will work; hiring models determine the number of CSRs to hire and train each month or quarter of the year.

    At the lowest level of the hierarchy, the arrival times of individual calls are not predictable (lower right panel of Figure 5). Here, common practice uses the M/M/N (Erlang C) queueing model to estimate stationary system performance over short (half-hour or hour) intervals. In doing so, the call center implicitly assumes constant arrival and service rates, as well as a system which achieves a steady state quickly within each interval. Furthermore, the arrival process is assumed to be Poisson, service times are assumed to be exponentially distributed and independent of each other (as well as everything else in the system), and the service discipline is assumed to be first-come, first-served. Blocking, abandonment, and retrials are ignored.

    Given these assumptions, the Erlang C formula (see (3) below) allows for straightforward calculation of the stationary distribution of the delay of a call arriving to the system. This and other steady-state performance measures are used to make the capacity-accessibility trade-off.

    The calculations begin as follows. Let \(\lambda_i\) be the arrival rate for 30-minute interval \(i\). Similarly, let \(E[S_i]\) and \(\mu_i = E[S_i]^{-1}\) be the expected service time and service rate for the interval. Then define

    \[ R_i \triangleq \lambda_i/\mu_i = \lambda_i E[S_i] \tag{1} \]

    to be the offered load and

    \[ \rho_i \triangleq \lambda_i/(N\mu_i) = R_i/N \tag{2} \]

    to be the associated average system utilization or occupancy (also called “traffic intensity”). Note that \(R_i\), often dubbed the number of offered Erlangs, is a unitless quantity. That is, over half-hour \(i\), an average of \(R_i\) units of service time is offered to the call center per unit of time, and CSRs are busy an average of \(\rho_i \times 100\%\) of the time.
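    As a small numerical illustration of (1) and (2), the following Python sketch converts a half-hour call count and a mean service time into an offered load and an occupancy. The function names and the sample figures (600 calls, 3-minute services, 64 CSRs) are hypothetical.

```python
def offered_load(arrivals_per_interval, mean_service_min, interval_min=30.0):
    """Offered load R = lambda * E[S], in Erlangs (a unitless quantity)."""
    arrival_rate = arrivals_per_interval / interval_min   # calls per minute
    return arrival_rate * mean_service_min


def occupancy(load, n_agents):
    """Average utilization rho = R / N."""
    return load / n_agents


# Hypothetical half hour: 600 calls with a 3-minute mean service time.
load = offered_load(600, 3.0)    # 20 calls per minute x 3 minutes
rho = occupancy(load, 64)
```

    Here 60 Erlangs are offered, so at least 61 CSRs would be needed for a steady state, and 64 CSRs would be busy about 94% of the time.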

    Given the Erlang C’s no-blocking and no-abandonment assumptions, at least \(R_i\) CSRs are required to work for a half-hour to serve this expected load. Furthermore, \(N\) must be strictly greater than \(R_i\), equivalently \(\rho_i < 1\), for the system to have a steady state. In this case, the Erlang C formula

    \[ C(N, R_i) \triangleq 1 - \frac{\sum_{m=0}^{N-1} R_i^m/m!}{\sum_{m=0}^{N-1} R_i^m/m! + \left(R_i^N/N!\right)\left(1/(1 - R_i/N)\right)} \tag{3} \]

    defines the steady-state probability that all \(N\) CSRs are busy.

    The application of the “Poisson arrivals see time averages” (PASTA) (Wolff 1982) property then allows us to obtain our first measure of system accessibility, the fraction of arriving customers that must wait to be served:

    \[ P\{\text{Wait} > 0\} = C(N, R_i). \tag{4} \]

    In turn, given the event that an arriving customer must wait, the conditional delay in queue is exponentially distributed with mean \((N\mu_i - \lambda_i)^{-1}\), and additional steady-state measures of accessibility are straightforward to calculate:

    \[ \text{ASA} \triangleq E[\text{Wait}] = P\{\text{Wait} > 0\} \cdot E[\text{Wait} \mid \text{Wait} > 0] = C(N, R_i) \cdot \left(\frac{1}{N}\right)\left(\frac{1}{\mu_i}\right)\left(\frac{1}{1 - \rho_i}\right), \tag{5} \]

    the average waiting time before being served, and

    \[ \text{TSF} \triangleq P\{\text{Wait} \le T\} = 1 - P\{\text{Wait} > 0\} \cdot P\{\text{Wait} > T \mid \text{Wait} > 0\} = 1 - C(N, R_i) \cdot e^{-N\mu_i(1 - \rho_i)T}, \tag{6} \]

    the fraction of customers that wait no more than \(T\) units of time, for some \(T\) that defines the desired telephone service factor. All three stationary measures are monotone in \(N\): \(P\{\text{Wait} > 0\}\) decreasing, ASA decreasing, and TSF increasing.
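    Formulas (3) through (6) translate directly into code. The Python sketch below is one possible implementation, not the routine used by any WFM package; the function names are ours. It builds the terms \(R^m/m!\) iteratively instead of calling a factorial, and it returns the busy-probability in the algebraically equivalent form tail/(sum + tail).

```python
import math


def erlang_c(n, r):
    """Probability that all n servers are busy, as in (3); needs 0 < r < n."""
    if not 0 < r < n:
        raise ValueError("a steady state requires 0 < r < n")
    term = 1.0      # r**m / m!, starting at m = 0
    partial = 0.0   # running sum of r**m / m! for m = 0, ..., n - 1
    for m in range(n):
        partial += term
        term *= r / (m + 1)
    # after the loop, term equals r**n / n!
    tail = term / (1 - r / n)
    # equal to 1 - partial / (partial + tail), the form written in (3)
    return tail / (partial + tail)


def asa(n, lam, mu):
    """Average speed of answer, as in (5): C(N, R) / (N*mu - lam)."""
    return erlang_c(n, lam / mu) / (n * mu - lam)


def tsf(n, lam, mu, t):
    """Telephone service factor, as in (6): P{Wait <= t}."""
    return 1 - erlang_c(n, lam / mu) * math.exp(-(n * mu - lam) * t)


# Hypothetical interval: 20 calls per minute, 3-minute mean service time,
# 64 CSRs, and a 20-second (1/3-minute) service-level target.
lam, mu, n = 20.0, 1.0 / 3.0, 64
p_wait, avg_wait, service_level = erlang_c(n, lam / mu), asa(n, lam, mu), tsf(n, lam, mu, 1.0 / 3.0)
```

    All times share one unit (minutes in the example), so the rates and the target \(T\) must be expressed consistently. For a single server the functions reduce to the familiar M/M/1 results, which gives a quick sanity check.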

    Figure 6 depicts the empirical relationship between ASA and system occupancy, \(\rho\), at a relatively small call center, analyzed in Brown et al. (2002a). The “cloud” of points in the figure’s left panel plots the result for each of 3,867 hourly intervals that the call center was open during 1999. The right panel highlights the relationship between ASA and \(\rho\) by further averaging the occupancies and ASAs of adjacent points. The data plotted in Figure 6 clearly parallel the theoretical relationship defined by (5).

    Figure 6 Congestion Curves Based on Raw and Aggregate Data (from Brown et al. 2002a)

    [Left panel: average waiting time, sec (0 to 400) versus occupancy (0 to 1). Right panel: average waiting time (aggregated) (0 to 200) versus occupancy (aggregated) (0 to 1).]

    It is interesting to note that \(P\{\text{Wait} > 0\}\) is a fundamental measure of accessibility from which ASA and TSF are derived, and it also plays an important part in asymptotic characterizations of accessibility. (See §4.) However, it is almost never tracked by call-center management.

    Rather, call centers typically choose ASA or TSF as the standard used for determining staffing levels. For example, a call center might define \(\text{ASA}^*\) to be an upper bound on the acceptable average delay of arriving calls. Then the monotonicity of ASA with respect to \(N\) is used to find the minimum number of agents required to meet the service-level standard:

    \[ N_i = \min\{N \le w \mid \text{ASA} \le \text{ASA}^*\}. \tag{7} \]
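    Because ASA decreases in \(N\), the minimization in (7) can be carried out by a simple upward search starting from the smallest stable staffing level. The Python sketch below illustrates the idea; the function names, the choice to return `None` when no level up to \(w\) meets the target, and the sample parameters are our assumptions.

```python
import math


def erlang_c(n, r):
    """Probability that all n servers are busy (Erlang C); needs 0 < r < n."""
    if not 0 < r < n:
        raise ValueError("a steady state requires 0 < r < n")
    term, partial = 1.0, 0.0
    for m in range(n):
        partial += term          # adds r**m / m!
        term *= r / (m + 1)
    tail = term / (1 - r / n)    # (r**n / n!) / (1 - r/n)
    return tail / (partial + tail)


def asa(n, lam, mu):
    """Average speed of answer under Erlang C."""
    return erlang_c(n, lam / mu) / (n * mu - lam)


def min_agents(lam, mu, asa_target, w):
    """Smallest N <= w with ASA <= asa_target, or None if none exists."""
    n = math.floor(lam / mu) + 1     # smallest integer strictly above R
    while n <= w:
        if asa(n, lam, mu) <= asa_target:
            return n
        n += 1
    return None


# Hypothetical interval: 20 calls per minute, 3-minute mean service time,
# a 20-second (1/3-minute) ASA target, and w = 100 work stations.
n_i = min_agents(lam=20.0, mu=1.0 / 3.0, asa_target=1.0 / 3.0, w=100)
```

    A linear search is adequate at this scale; the same monotonicity would also justify a bisection search when \(w\) is large.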

    Over relatively long time intervals, variations in arrival rates become more predictable. For example, the fluctuations shown in the lower-left and upper-right panels of Figure 5 are fairly typical patterns of arrivals over the course of the day and month. Common practice assumes that these fluctuations are completely predictable.

    Point forecasts for system parameters are then inputs to the next level up in the planning hierarchy, staff scheduling. More specifically, each half-hour interval’s forecasted \(\lambda_i\) and \(\mu_i\) give rise to a target staffing level for the period, \(N_i\). For a call center that is open 24 hours a day, 7 days a week, repeated use of the Erlang C model will produce 1,440 (48 per day × 30 days) \(N_i\)s in a 30-day month. The vector of \(N_i\)s becomes the input to the scheduling model.

    We distinguish between two elements of the scheduling process, shifts and schedules. A shift denotes a set of half-hour intervals during which a CSR works over the course of the day. A schedule is a set of daily shifts to which an employee is assigned over the course of a week or month. Both shifts and schedules are often restricted by union rules or other legal requirements and can be quite complex. For example, a feasible shift may start on the half-hour and last nine hours, including an hour total of break time. One half-hour of this break must be devoted to lunch, which must begin sometime between two and three hours after the shift begins, and the other to a morning or afternoon pause. A feasible schedule may require an employee to work five 9-hour shifts each week of the month, on Sunday, Monday, Tuesday, Friday, and Saturday. Another may require a CSR to work a different set of shifts each week of the month.

    Now, suppose there is a collection of \(j = 1, \ldots, J\) feasible schedules to which employees may be assigned and that the monthly cost of assigning an agent to schedule \(j\) equals \(c_j\). This cost includes wage differentials and overtime costs that are driven by schedule assignments; it need not include regular wage and benefit costs that do not change with the schedule. Then determination of an optimal set of schedules can be described as the solution to an integer program (IP). Given \(i = 1, \ldots, I\) half-hour intervals during the planning horizon, we define the \(I \times J\) matrix \(A = [a_{ij}]\), where

    \[ a_{ij} = \begin{cases} 1, & \text{if an agent working according to schedule } j \text{ is available to take calls during interval } i; \\ 0, & \text{otherwise.} \end{cases} \tag{8} \]

    Figure 7 An Example A-Matrix for the Scheduling Problem

    [A 22 × 10 zero-one matrix with rows i = 1, ..., 22 (half-hour intervals) and columns j = 1, ..., 10 (schedules). The column positions of the ones are not recoverable here; the number of ones in each row is listed instead.]

    time             i     ones in row i
    8:00-8:29am      1     2
    8:30-8:59        2     4
    9:00-9:29        3     6
    9:30-9:59        4     7
    10:00-10:29      5     8
    10:30-10:59      6     8
    11:00-11:29      7     8
    11:30-11:59      8     8
    12:00-12:29pm    9     8
    12:30-12:59      10    7
    1:00-1:29        11    6
    1:30-1:59        12    6
    2:00-2:29        13    6
    2:30-2:59        14    7
    3:00-3:29        15    7
    3:30-3:59        16    8
    4:00-4:29        17    8
    4:30-4:59        18    8
    5:00-5:29        19    6
    5:30-5:59        20    6
    6:00-6:29        21    4
    6:30-6:59        22    2

    Figure 7 shows the complete A-matrix for schedules that cover one 11-hour day (for simplicity, rather than 30 days). To enhance readability, only the matrix’s ones are shown, not the zeros. Each of the 22 rows represents a different half-hour interval, and each of the 10 columns represents a different schedule to which employees may be assigned. Inspection reveals that the first five columns all have the same structure; the only difference among them is the time that an employee assigned to the schedule would start. Similarly, the second set of five columns share the same structure. Every one of the 10 schedules has an employee take calls for seven hours of a nine-hour day.
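    To make the structure of such a matrix concrete, the following Python sketch builds a 22 × 10 A-matrix from two shift patterns, each started at five successive half hours. The break positions are our assumption: they were chosen so that every schedule takes calls for 14 of its 18 half hours and so that the row totals agree with the number of ones in each row of Figure 7, but the figure's actual columns may place the breaks differently. The function names are hypothetical.

```python
def shift_pattern(length, break_offsets):
    """Return 1 (on the phone) or 0 (on break) for each half hour of a shift.

    Offsets are counted from 1, the first half hour of the shift.
    """
    return [0 if offset in break_offsets else 1
            for offset in range(1, length + 1)]


def build_a_matrix(n_intervals, patterns, start_rows):
    """Build A with one column per (pattern, start) pair, as in (8)."""
    columns = []
    for pattern in patterns:
        for start in start_rows:          # 0-based first interval of shift
            column = [0] * n_intervals
            for offset, on_phone in enumerate(pattern):
                column[start + offset] = on_phone
            columns.append(column)
    # transpose: rows are intervals i, columns are schedules j
    return [[column[i] for column in columns] for i in range(n_intervals)]


# Assumed break positions (hypothetical): four half hours off in each
# 18-half-hour shift.
pattern_a = shift_pattern(18, {4, 9, 10, 15})
pattern_b = shift_pattern(18, {5, 10, 11, 15})
A = build_a_matrix(22, [pattern_a, pattern_b], start_rows=range(5))

row_totals = [sum(row) for row in A]        # schedules on the phone per row
column_totals = [sum(col) for col in zip(*A)]
```

    The row totals give the number of schedules that contribute an agent in each half hour, which is what the constraint \(Ax \ge N\) weighs against the staffing requirements.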

    Letting the decision variables \(x_j\), \(j = 1, \ldots, J\), represent the numbers of agents assigned to the various schedules, and letting \(N_i\), \(i = 1, \ldots, I\), denote the half-hourly staffing requirements determined via (7), one solves

    \[ \min\{c'x \mid Ax \ge N,\ x \ge 0,\ x \text{ integer}\} \tag{9} \]

    to find a least-cost set of schedules. That is, the optimal solution to (9) defines the number of CSRs to assign to each monthly schedule, \(j\), subject to the lower bounds on available CSRs imposed by the service-level constraint. This formulation can become quite large, with thousands of time slots (rows) and feasible schedules (columns), in which case it cannot be solved to optimality. For call centers in which \(\sum_i N_i\) is large, the rounded (up) solution of a linear programming relaxation may perform well, however (see Gans and Mandelbaum 2002).
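    For intuition only, a toy instance of (9) can be solved by exhaustive enumeration. The Python sketch below does so for three schedules and four intervals; the matrix, costs, and requirements are made up, and brute force is used here in place of the integer-programming or LP-relaxation methods that a realistic instance would require.

```python
from itertools import product


def solve_schedule_ip(a, cost, need, max_per_schedule):
    """Brute-force min{c'x | Ax >= N, x >= 0, x integer} on a toy instance.

    a[i][j] = 1 if schedule j covers interval i; each x_j is searched
    over 0, ..., max_per_schedule.
    """
    n_sched = len(cost)
    best_x, best_cost = None, None
    for x in product(range(max_per_schedule + 1), repeat=n_sched):
        covered = all(
            sum(a[i][j] * x[j] for j in range(n_sched)) >= need[i]
            for i in range(len(need))
        )
        if not covered:
            continue
        total = sum(c * xj for c, xj in zip(cost, x))
        if best_cost is None or total < best_cost:
            best_x, best_cost = x, total
    return best_x, best_cost


# Made-up instance: schedule 1 covers intervals 1-2, schedule 2 covers
# intervals 3-4, and schedule 3 (costlier) covers intervals 2-3.
a = [[1, 0, 0],
     [1, 0, 1],
     [0, 1, 1],
     [0, 1, 0]]
cost = [10, 10, 12]
need = [2, 4, 4, 3]
best_x, best_cost = solve_schedule_ip(a, cost, need, max_per_schedule=5)
```

    In this instance the cheapest cover assigns 3, 3, and 1 agents to the three schedules at a cost of 72: one agent on the overlapping schedule saves two agents elsewhere. Enumeration grows exponentially in the number of schedules, which is why it is useful only as an illustration.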

    In practice, the formulation of the scheduling problem may differ somewhat from (9). One alternative is to impose an aggregate service-level constraint for a longer period of time, such as a day, rather than one for each half-hour or hour (see Koole and van der Sluis, forthcoming). Another is to minimize the deviation between the recommended staffing levels, \(N_i\), and the actual staffing levels obtained from the assigned schedules (see Buffa et al. 1976). Both of these alternatives reduce overall staffing levels, in effect by relaxing the service-level constraints.


    Furthermore, a solution to (9) defines only how many agents are assigned to the various schedules, not necessarily which person works on what schedule. For large call centers, the final assignment of employees to schedules, often called rostering, is an even more complex problem for which even feasible solutions are difficult to construct. Here, heuristics are often used. One common method ranks employees by job tenure or seniority and allows higher-ranking employees to choose their schedules first.

    Figure 8 shows how the number of busy CSRs tracks the arrival of work at a fairly large (virtual) call center, under study by Brown et al. (2002b). Note that, although call centers' WFM systems typically schedule CSRs to start working every 15 or 30 minutes, the figure shows the number of busy CSRs closely tracking the offered load in the morning. This may be due to one of two factors, or perhaps a combination of the two: either additional CSRs log into the ACD every few minutes in the morning as the arrival rate grows or there exist additional (underutilized) CSRs who are available to take calls but do not show up in the chart because they never take calls. Note also the system overcapacity during the peak of the day, an interval over which the center operates at about 80% utilization. We believe that this relatively low occupancy is due to a skills-based routing scheme in which specialized CSRs are prohibited from taking regular calls and are, hence, underutilized.

    Figure 8 The Numbers of CSRs Working Tracks the Offered Load (from Brown et al. 2002b). [Plot of CSRs working and offered load, $R$, against time of day.]

    The solution to (9) also defines the total number of employees required to be assigned to monthly schedules, $1'x$. Typically this number is then "grossed up," say by a factor $\alpha \in (0,1)$, to account for unplanned breaks, time spent training and in meetings, absenteeism, and other factors that reduce employees' productive capacity. For example, statistics such as (On Prod %) and (Sch Avail %) in Figure 3 can be used in the estimation of $\alpha$. Thus, the number of agents needed in month $t$ becomes $n_t = 1'x/\alpha$.

    At the top of the planning hierarchy, a long-term hiring problem is solved to ensure that monthly staffing requirements are met. The horizon for the hiring problem, $\mathcal{T}$, may be on the order of six months to one year.

    The gross numbers of employees needed each month over the planning horizon, $\{n_t : t = 1, \dots, \mathcal{T}\}$, are found by solving the scheduling problem (9) (and the underlying staffing problems, (7)) for each month $t$. Other input data for the hiring problem include the following: an estimate of the monthly turnover rate, $\beta$; and an estimate of the lead time, $\tau$, that is required to recruit and train a new employee once the decision to hire has been made.

    It is worth noting that these latter two factors can be significant. For example, in many centers employee turnover exceeds 50% per year; hiring and training lead times can be two or three months, and sometimes significantly more.

    Given these data, a simple method of addressing the long-term problem that we have seen is to myopically hire enough new employees so that, by the time they are trained, the projected number of employees on hand meets or exceeds the projected requirements. More formally, suppose $y_t$ employees are on hand at the start of month $t$ and that, due to previous months' hiring decisions, $y_j$ employees will start working in months $j = t+1, \dots, t+\tau-1$. Then the number, $z_t$, to hire in month $t$ is

    $$z_t = \Big( n_{t+\tau} - \sum_{j=t}^{t+\tau-1} y_j (1-\beta)^{\tau-(j-t)} \Big)^{+}, \tag{10}$$

    so that $y_{t+\tau} = z_t + \sum_{j=t}^{t+\tau-1} y_j \ge n_{t+\tau}$. Here, the terms within the summation account for the after-turnover numbers of employees on hand at $t+\tau$, before the hiring decision at $t$ is made. By hiring the difference between $n_{t+\tau}$ and that total, (10) assumes that no turnover occurs among employees during recruitment and training.
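The gross-up step and the myopic rule (10) can be sketched in a few lines. This is an illustration under stated assumptions: the function names and all numerical inputs are hypothetical, and the final rounding up to a whole number of hires is an added convenience rather than part of (10).

```python
import math

def gross_up(scheduled_heads, productivity):
    """Agents needed in a month: scheduled head count divided by a factor in (0, 1)."""
    if not 0.0 < productivity < 1.0:
        raise ValueError("productivity factor must lie in (0, 1)")
    return scheduled_heads / productivity

def hires_needed(n_target, pipeline, turnover, lead_time):
    """Myopic rule (10): the positive part of the projected shortfall.

    pipeline[k] is the number of employees on hand (k = 0) or due to start
    (k >= 1) in month t + k, for k = 0, ..., lead_time - 1.
    """
    if len(pipeline) != lead_time:
        raise ValueError("pipeline must hold one entry per month of lead time")
    # Each cohort shrinks by the monthly turnover rate until month t + lead_time.
    survivors = sum(y * (1.0 - turnover) ** (lead_time - k)
                    for k, y in enumerate(pipeline))
    return max(n_target - survivors, 0.0)

# Hypothetical inputs: 102 scheduled agents, 85% productivity,
# 5% monthly turnover, and a three-month recruiting and training lead time.
n_target = gross_up(102, 0.85)
z = hires_needed(n_target, [100, 10, 5], turnover=0.05, lead_time=3)
hires = math.ceil(z)  # rounding up is an assumption, not part of (10)
```

The positive part matters in practice: when the projected head count already exceeds the requirement, the rule hires nobody rather than returning a negative number.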


    3.3. Forecasting

    The hierarchy of capacity-planning models, described above, requires the following inputs: arrival rates $\lambda_i$, service rates $\mu_i$, productivity factor $\alpha$, turnover rate $\beta$, and lead time $\tau$. Much of the data required to build estimates for these parameters come from ACD reports, such as that shown in Figure 3. For example, the (Rcvd) and (AHT) columns of the report state actual arrival rates and average service times for each half-hour of the day.

    The sources of the other data vary. WFM systems sometimes track productivity figures for employees, such as Figure 3's (On Prod %), through the ACD. Employee turnover rates, hiring lead times, and training requirements are (clearly) not captured by ACD systems, however. These data are collected from employee records by the call center's human resources (HR) department.

    Arrival rates are often forecast on a "top-down" basis. The process begins by aggregating the reported number of calls arriving each half-hour into monthly totals, such as those found in the upper left of Figure 5. These totals are the historical basis of forecasts that are to be built on a combination of simple time-series methods, such as exponential smoothing, and managerial opinion regarding what will happen to the business that the call center supports. (For an early book on exponential smoothing see Brown 1963; for a recent one see Makridakis et al. 1998.) The result is a month-by-month forecast of call volumes.
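As an illustration of the top-level step, the sketch below applies simple exponential smoothing to a short history of monthly totals and then scales the result by a judgmental factor. The history, the smoothing constant, and the adjustment factor are all hypothetical, and real WFM systems may use more elaborate time-series methods.

```python
def exponential_smoothing(history, smoothing):
    """Simple exponential smoothing; returns the one-step-ahead forecast."""
    if not history:
        raise ValueError("history must contain at least one observation")
    level = float(history[0])
    for observed in history[1:]:
        # New level = weighted average of the latest observation and old level.
        level = smoothing * observed + (1.0 - smoothing) * level
    return level

# Hypothetical monthly call totals, oldest first.
monthly_calls = [100_000, 110_000, 120_000]
baseline = exponential_smoothing(monthly_calls, smoothing=0.5)

# Hypothetical managerial adjustment, e.g. an expected 4% lift from a campaign.
judgment_factor = 1.04
forecast = baseline * judgment_factor
```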

    Once these top-level forecasts are set, the monthly totals are then allocated by day-of-week and day-of-month, as well as by time of day. (See the upper right and lower left of Figure 5.) For example, it may be assumed that 20% of July's calls are handled in the first week of the month and Mondays account for 27% of each week's total volume. Similarly, each half-hour may be allocated a fixed percentage of a day's total call volume.
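The allocation just described is a chain of multiplications, sketched below. The 20% and 27% shares come from the example above; the monthly total and the half-hourly profile are hypothetical and chosen only so that the shares sum to one over a 22-interval day.

```python
def allocate_day(monthly_total, week_share, weekday_share, interval_profile):
    """Split a monthly forecast down to per-interval call counts for one day."""
    if abs(sum(interval_profile) - 1.0) > 1e-9:
        raise ValueError("interval shares must sum to 1")
    daily_total = monthly_total * week_share * weekday_share
    return [daily_total * share for share in interval_profile]

def hourly_rates(interval_calls, interval_hours=0.5):
    """Convert per-interval counts into piecewise-constant hourly arrival rates."""
    return [calls / interval_hours for calls in interval_calls]

# Hypothetical profile for an 11-hour day: quiet edges, busy middle.
profile = [0.02] * 4 + [0.06] * 14 + [0.02] * 4

# Hypothetical July forecast of 100,000 calls; first week 20%, Monday 27%.
calls = allocate_day(100_000, week_share=0.20, weekday_share=0.27,
                     interval_profile=profile)
rates = hourly_rates(calls)
```

The resulting `rates` are the per-interval arrival rates that feed the staffing step.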

    Common call-center practice is then to assume constant arrival rates over individual half-hours or hours. Such an approximation, by a piecewise constant arrival-rate function, allows one to use standard, steady-state models. This is reasonable if steady state is achieved relatively quickly, in particular when the event rate ($\lambda + N\mu$ in an M/M/N queue) is large when compared to the duration of the interval, and when predictable factors that drive the rates are relatively stable over the interval.

    In addition to using day-of-week and day-of-month allocations, managers may flag certain days as special and increase or decrease anticipated call volumes accordingly. For example, suppose July 4th falls on a Tuesday. Then the anticipated volume for the 4th may be adjusted down, below normal. Conversely, the volume for the 5th may be adjusted upward, in anticipation of customers who put their calls off from the 4th to the 5th. Again, these adjustments are made using a combination of data analysis and experience-based judgment.

    In theory, the half-hourly ACD records of average service times could also be used to generate detailed forecasts of the $\mu_i$. In practice, however, many call centers do not forecast service times or other parameters in detail. Instead, grand averages for historical service rates, productivity rates, and turnover rates are calculated.

    For capacity-planning purposes, the parameters $\mu$, $\alpha$, and $\beta$ are often assumed to be objects of managerial control, and how they are set is the result of negotiation. For example, upper management may assign call-center managers an objective of reducing employee turnover rates by 3% or of reducing average handle times by five seconds.

    3.4. The Forecasting and Planning Cycle

    In most call centers there is a planner who is responsible for agent rosters. Every week or every few weeks, this person begins preparing a forecast for the specified period. Based on this forecast, required numbers of CSRs are determined and, together with agent and management input (concerning days off, meetings, etc.), a roster is determined. This process is very often supported by WFM software, whose core function is to forecast arrival rates and average service times and then solve (7) and (9). These WFM packages allow call-center planners and managers to refine and redefine their operating plans.

    The establishment of an initial agent roster is not yet the end of the story, however. Forecasts are continually updated and changes are made to the roster until the scheduled day itself. When the roster is executed, a supervisor is responsible for service levels and CSR productivity. He or she monitors abandonment rates and waiting times and changes agents' deployments, based on real-time operating conditions. During the day, data are fed back into the workforce management tool, forecasts are updated, and the process repeats itself.

    3.5. Longer-Term Issues of System Design

    Beyond workforce management, more strategic decisions concern the design of the service process and system. Often, HR planning and the use of technology are tied together through service process design.

    In the case of a single call center with universal (flexible) agents, these issues can be easily illustrated. For example, such a call center may attempt to reduce HR costs by having more calls resolved in its IVR. In this case, additional IVR resources may need to be purchased, and the IVRs must be programmed to handle the newly added service. At the same time, the expected change in CSR load must be estimated: the fraction of customers that decides to self-serve using the IVR causes arrival rates to decline; the elimination of these calls from the original mix causes the average service time to change as well. The changes then flow through the staffing models described above, and an estimated reduction in CSR head count may be made. Thus, investment in the IVR is traded off against HR savings.

    Newer technology expands the possibilities for call-center design, and it also makes the task of evaluating and implementing the options more complex. For example, consider how IVRs and skills-based routing make the use of part-time CSRs economically attractive. In general, "part-timers" are valuable because they may work only during the daily peak in arriving calls, thereby reducing the number of full-time agents that are (paid but) not well utilized at other times of the day. CSR training is expensive, however, and turnover among part-time employees is high. To make cost-effective use of the part-timers, their training may need to be reduced. This implies that they will be able to serve only a subset of the calls handled by the center. To identify which incoming calls can be handled by part-time CSRs, the IVR is programmed so that customers identify the type of service they desire. To make sure that only simple calls are routed to part-time CSRs, the center invests in skills-based routing.

    Interestingly, while skills-based routing allows for more efficient use of CSR resources, many call centers also see the technology as a means for reducing employee turnover. The idea expands on the use of part-time workers described above. A set of skills is designed to act as a career path; as agents learn new skills they move up the ladder. This, in turn, is hoped to improve employee motivation and morale and to reduce job burnout and turnover.

    4. Research Within the Base-Example Framework

    In this section we review research that bears directly on the capacity-planning problems described in §3. As such, it reflects the state of the art within a narrow context: a single type of call is handled by a homogeneous pool of CSRs at a single location. (We consider models of multiple call types, CSR skills, and locations in §5.) Even so, this special case provides a challenging set of problems, and its results offer essential insights into the nature of capacity management in all call centers, simple and complex.

    In §§4.1–4.4 we cover queueing models used to determine short-term staffing requirements. Then §4.5 reviews research devoted to the problem of scheduling CSRs. Next, §4.6 addresses models for long-term hiring and training. Finally, §4.7 discusses open problems in each of the three areas of research.

    4.1. Heavy-Traffic Limits for Erlang C

    The Erlang C model described in §3.2 has been widely adopted primarily because of its ease of use. In particular, there exist simple expressions such as (4)–(6) for most performance measures of interest. At the same time, the model has notable limitations.

    Although the Erlang C formula is easily implemented, it is not easy to obtain insight from its answers. For example, to find an approximate answer to questions such as "how many additional agents do I need if the arrival rate doubles?" we have to perform a calculation. An approximation of the Erlang C formula that gives structural insight into this type of question would be of use to better understand economies of scale in call-center operations.

    Erlang-C-based predictions can also turn out to be highly inaccurate because of violations of underlying assumptions, and these violations are not straightforward to model. For example, non-exponential service times lead one to the M/G/N queue which, in stark contrast to the M/M/N system, is analytically intractable.

    Thus, approximations are useful both to aid insight and to extend modelling scope, and when modelling call centers, the most useful approximations are typically those for heavy-traffic regimes, that is, those in which agent utilization is high. The heavy-traffic assumption naturally reflects the highly utilized nature of large call-center operations, particularly the peak-hour conditions that drive overall system scale.

    Consider the M/G/N queue. For small to moderate numbers (1s to 10s) of highly utilized agents, Kingman's classical "Law of Congestion" asserts that delay in queue is approximately exponential, with mean as given by

    $$E[\text{Wait for } M/G/N] \approx E[\text{Wait for } M/M/N] \times \frac{1 + c_s^2}{2} \tag{11}$$

    (see Whitt 1993). Here $c_s = \sigma(S)/E[S]$ denotes the coefficient of variation of the service time, a unitless quantity that naturally quantifies stochastic variability. (When $c_s = 1$ and $N = 1$, the approximation reduces to the well-known Pollaczek-Khintchine formula and is exact.) Furthermore, the heavy-traffic regime assumed by Kingman (1962) and, more broadly, by traditional heavy-traffic analyses implies that essentially all customers experience some delay before being served. (For recent texts on heavy traffic, see Chen and Yao 2001 and Whitt 2002a.)

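    Then, given $C(N,R) \approx 1$, (11) becomes

    $$E[\text{Wait for } M/G/N] \approx \left(\frac{1}{N}\right) E[S] \left(\frac{\rho}{1-\rho}\right) \left(\frac{1 + c_s^2}{2}\right). \tag{12}$$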

    From (12) we clearly see that the effect on congestion of both utilization, $\rho$, and stochastic variability, $c_s$, is nonlinear; in fact, it is increasing convex. Indeed, even small increases in load (utilization), $\rho$, can have an overwhelming, negative effect on highly utilized systems. Performance also deteriorates with longer and more variable service times, $E[S]$ and $c_s^2$, and it improves with increased parallelism, $N$.
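Approximations (11) and (12) are easy to evaluate numerically. The sketch below is a minimal illustration, assuming the standard Erlang C delay probability computed through the Erlang B recursion; the function names and the parameter values are hypothetical.

```python
def erlang_c(n_agents, offered_load):
    """Erlang C delay probability C(N, R), via the Erlang B recursion."""
    if offered_load >= n_agents:
        raise ValueError("offered load must be below the number of agents")
    blocking = 1.0
    for k in range(1, n_agents + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    return n_agents * blocking / (n_agents - offered_load * (1.0 - blocking))

def mean_wait_mmn(arrival_rate, mean_service, n_agents):
    """Exact mean wait in M/M/N: C(N, R) * E[S] / (N - R)."""
    load = arrival_rate * mean_service
    return erlang_c(n_agents, load) * mean_service / (n_agents - load)

def mean_wait_mgn(arrival_rate, mean_service, cv_service, n_agents):
    """Approximation (11): scale the M/M/N wait by (1 + c_s^2) / 2."""
    return (mean_wait_mmn(arrival_rate, mean_service, n_agents)
            * (1.0 + cv_service ** 2) / 2.0)

def mean_wait_heavy_traffic(arrival_rate, mean_service, cv_service, n_agents):
    """Approximation (12), which additionally takes C(N, R) close to 1."""
    rho = arrival_rate * mean_service / n_agents
    return ((1.0 / n_agents) * mean_service * (rho / (1.0 - rho))
            * (1.0 + cv_service ** 2) / 2.0)

# Hypothetical example: 10 agents, 9 calls per minute, 1-minute mean service,
# and service times with coefficient of variation 1.5.
approx_11 = mean_wait_mgn(9.0, 1.0, 1.5, 10)
approx_12 = mean_wait_heavy_traffic(9.0, 1.0, 1.5, 10)
```

For a single server the two expressions coincide, since the delay probability then equals the utilization; for several agents (12) exceeds (11) whenever the delay probability falls short of the utilization.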

    4.1.1. Square-Root Safety Staffing. The use of Kingman's Law for call centers was advocated by Sze (1984), where it was attributed to Lee and Longton (1959). Sze was motivated by a traffic mix problem in a call center with the following characteristics. We loosely quote from Sze (1984): "The problems faced in the Bell System's operator service differ from queueing models in the literature in several ways: (1) Server team sizes during the day are large, often 100–300 operators. (2) The target occupancies are high, but are not in the heavy traffic range. While approximations are available for heavy and light traffic systems, our region of interest falls between the two. Typically, 90%–95% of the operators are occupied during busy periods, but because of the large number of servers, only about half of the customers are delayed" (p. 229).

    Sze (1984) tests a number of asymptotic approximations for M/G/N systems and, interestingly, favors Approximation (11). This approximation, in particular, identifies exponential service times with any other service time for which $c_s = 1$. But, as will be seen later in Figure 14, this identification can be inaccurate in the case of many highly utilized servers. (Perhaps the conclusion in Sze (1984) is due to testing only phase-type service-time distributions, which allows (11) to be a reasonable approximation.)

    Indeed, for many call centers, $N$ is in the tens or hundreds, rather than ones, and larger $N$ gives rise to an asymptotic regime that differs from that of Kingman's Law in that significantly many customers do not wait and service quality is carefully balanced with server efficiency. For this reason, we call it a Quality and Efficiency Driven (QED) operational regime.

    Figure 9 Optimal $\beta$ for Linear Waiting and Staffing Costs (from Borst et al. 2000). [Two panels plotted against $r$: optimal $\beta$'s for small $r$'s ($r$ from 0 to 10) and optimal $\beta$'s for large $r$'s ($r$ from 0 to 500).]

    The QED regime for the M/M/N queue was first analyzed by Halfin and Whitt (1981). Formally, in this regime a service rate $\mu$ is fixed, as well as a target value $\alpha \in (0,1)$ for $P\{\text{Wait} > 0\}$. Thus, it is defined as one in which some, but not all, customers wait for service. Then scaling $\lambda \uparrow \infty$ and $N \uparrow \infty$, Halfin and Whitt demonstrate

    $$P\{\text{Wait} > 0\} \to \alpha \iff \sqrt{N}\,(1 - \rho_N) \to \beta \tag{13}$$

    for some fixed service grade $\beta \in (0,\infty)$, so that $\rho_N = \lambda/(N\mu) \uparrow 1$. They then derive the following asymptotic expression for the Erlang-C formula:

    $$P\{\text{Wait} > 0\} \approx P(\beta) = \left[ 1 + \frac{\beta\,\Phi(\beta)}{\phi(\beta)} \right]^{-1}, \tag{14}$$

    where $\alpha = P(\beta)$ in (13). Here $\Phi$ and $\phi$ are, respectively, the distribution and density functions of the standard normal distribution (mean = 0, variance = 1).

    For a fixed service grade, $\beta$, (13) suggests a square-root safety-staffing principle that recommends the number of servers $N$ to be

    $$N = R + \Delta = R + \beta\sqrt{R}, \qquad 0 < \beta < \infty. \tag{15}$$

    [...] $P\{\text{Wait} > 0\}$ decreases with $\beta$.

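    Recalling that $(\text{Wait} \mid \text{Wait} > 0)$ is exponentially distributed with mean $(N\mu - \lambda)^{-1}$, one deduces from Expressions (5), (6), and (15) that square-root safety staffing with $\Delta = \beta\sqrt{R}$ obtains

    $$E[\text{Wait}] = P\{\text{Wait} > 0\} \cdot E[\text{Wait} \mid \text{Wait} > 0] \approx P\{\text{Wait} > 0\} \cdot \frac{E[S]}{\Delta}, \tag{16}$$

    as well as the following simple expression for the distribution of delay:

    $$P\{\text{Wait} > T\} \approx P\{\text{Wait} > 0\} \cdot e^{-(T/E[S])\,\Delta}. \tag{17}$$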
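Expressions (14) through (17) translate directly into code. The sketch below is illustrative only: it uses the standard normal distribution from Python's `statistics` module, rounds the staffing level in (15) up to an integer (an added assumption), and takes a hypothetical offered load of 100 with a 3-minute mean service time.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, variance 1

def halfin_whitt_delay_prob(beta):
    """Approximation (14): [1 + beta * Phi(beta) / phi(beta)]^(-1)."""
    return 1.0 / (1.0 + beta * STD_NORMAL.cdf(beta) / STD_NORMAL.pdf(beta))

def square_root_staffing(offered_load, beta):
    """Rule (15), rounded up to a whole number of servers."""
    return math.ceil(offered_load + beta * math.sqrt(offered_load))

def mean_wait(delay_prob, mean_service, excess_capacity):
    """Expression (16): P{Wait > 0} * E[S] / Delta."""
    return delay_prob * mean_service / excess_capacity

def wait_tail(t, delay_prob, mean_service, excess_capacity):
    """Expression (17): P{Wait > 0} * exp(-(T / E[S]) * Delta)."""
    return delay_prob * math.exp(-(t / mean_service) * excess_capacity)

# Hypothetical center: offered load R = 100, service grade 0.5, E[S] = 3 minutes.
R, beta, mean_service = 100.0, 0.5, 3.0
n_agents = square_root_staffing(R, beta)
delta = n_agents - R
p_delay = halfin_whitt_delay_prob(beta)
asa_minutes = mean_wait(p_delay, mean_service, delta)
frac_waiting_over_1_min = wait_tail(1.0, p_delay, mean_service, delta)
```

In this example a safety staffing of five agents above the offered load leaves roughly half of the callers waiting, in line with the QED picture of high utilization with a delay probability strictly between zero and one.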

    While Halfin and Whitt's formal analysis did not appear until the early 1980s, "folk" versions of this square-root law have long been recognized. Erlang (1948) himself described the square-root relationship as early as 1924, and he reports that square-root rules had been in use at the Copenhagen Telephone Company since 1913.

    Related, infinite-server heuristics that generate square-root staffing rules also have been long recognized (see Whitt 1992 and the references in Borst et al. 2000). In infinite-server systems, the number of busy CSRs found by an arriving call has a Poisson distribution, and the heuristic assumes that in large finite systems, this number is nearly Poisson if delays are not prevalent. In turn, a Poisson random variable with mean $R$ is approximately a normally distributed random variable with mean $R$ and standard deviation $\sqrt{R}$. Then, given a target delay probability of $\alpha$, one chooses $\beta$ in (15) such that

    $$\alpha = 1 - \Phi(\beta) \equiv \bar{\Phi}(\beta).$$

    This is justified by

    $$P\{\text{Wait} > 0\} = P\{\text{Number of busy servers} > N\} \approx P\{R + Z\sqrt{R} > R + \beta\sqrt{R}\} = \bar{\Phi}(\beta). \tag{18}$$

    Here $Z$ denotes a standard normal random variable, and the PASTA property ensures that $P\{\text{Wait} > 0\} = P\{\text{Number of busy servers} > N\}$. For small $P\{\text{Wait} > 0\}$, $\bar{\Phi}^{-1}(\alpha) \approx P^{-1}(\alpha)$, and the heuristic's recommendation essentially matches that of Halfin and Whitt (1981).
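The two ways of choosing the service grade can be compared numerically. In the sketch below the normal-approximation heuristic uses the inverse normal distribution function, while the Halfin-Whitt function (14) is inverted by bisection; the bisection bounds and tolerance are arbitrary choices made for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, variance 1

def beta_normal_heuristic(alpha):
    """Service grade solving alpha = 1 - Phi(beta)."""
    return STD_NORMAL.inv_cdf(1.0 - alpha)

def halfin_whitt_delay_prob(beta):
    """Approximation (14): [1 + beta * Phi(beta) / phi(beta)]^(-1)."""
    return 1.0 / (1.0 + beta * STD_NORMAL.cdf(beta) / STD_NORMAL.pdf(beta))

def beta_halfin_whitt(alpha, low=1e-9, high=10.0, tol=1e-10):
    """Invert (14) by bisection, using that the delay function decreases in beta."""
    while high - low > tol:
        mid = 0.5 * (low + high)
        if halfin_whitt_delay_prob(mid) > alpha:
            low = mid    # delay probability still too high: need a larger grade
        else:
            high = mid
    return 0.5 * (low + high)

# Compare the two recommendations for a few target delay probabilities.
comparison = {alpha: (beta_normal_heuristic(alpha), beta_halfin_whitt(alpha))
              for alpha in (0.5, 0.2, 0.05, 0.01)}
```

Inspecting `comparison` shows the gap between the two service grades narrowing as the target delay probability shrinks, which is the sense in which the heuristic matches the formal result.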

    Borst et al. (2000) prove that, for a variety of natural delay-cost functions, staffing based on the square-root principle is, in fact, (asymptotically) optimal for large, heavily loaded systems. That is, the paper shows that to minimize cost, it is optimal to operate in the QED regime. The same conclusion applies when minimizing staffing levels subject to a constraint on performance measures, which is more common in practice.

    Square-root safety staffing turns out to be exceptionally accurate and robust: it is tested in Borst et al. (2000) over all regimes, from very light to very heavy traffic, and it rarely deviates by more than a single server from the exactly optimal staffing level. The introduction to Borst et al. (2000) offers further details through a set of staffing scenarios.

    Borst et al. (2000) also derive an explicit means of determining the optimal $\beta$, a problem which they term "dimensioning." Figure 9 graphs the optimal $\beta$ for the case in which delay costs and staffing costs are both linear functions of time. In this case, let $r$ denote the ratio of delay cost per hour to CSR cost per hour. Then, the optimal $\beta$ can be seen to be growing exceptionally slowly with $r$:

    $$\beta(r) \approx \sqrt{\frac{r}{1 + r\left(\sqrt{\pi/2} - 1\right)}}, \qquad 0 \le r < \dots$$

    [...]


    Figure 10 A Quality-Driven Call Center that Takes Sales Orders (from Koole and Mandelbaum 2002)

    Figure 11 Performance of 12 Call Centers in the QED Regime (courtesy of a member of the Wharton Call Center Forum)


    agent utilization, in fact, over 95% in a couple of the call centers. (Note also that 2.8% of calling customers abandoned. Customer impatience, however, is beyond the explanatory scope of Erlang C, and we address it in §4.2.2.)

    Recall from (13) that the QED regime is characterized by a fraction of delayed customers that is neither close to zero (quality-driven) nor to unity (efficiency-driven). Indeed, more refined data from the above-mentioned health insurance company show that, overall, only about 40% of the customers were delayed, while the other 60% accessed an agent immediately, without any delay. Thus, the call-center characteristics described by Sze (1984) identify the QED regime.

    Economies of scale are the enabler that allows the QED regime to circumvent the traditional trade-off between service level and resource efficiency. To sharpen this insight, we consider the following problem that is commonly addressed by call-center managers: the pooling of geographically dispersed call centers. This pooling may be achieved either physically (by closing some operations and expanding others) or "virtually" (through the use of networking technology that allows calls to be routed to various sites). For this problem we can compare how the different regimes affect the economies of scale enabled through pooling.

    As a first step, we use (16)–(17) to define the following analogues to (5)–(6):

    $$\widetilde{\mathrm{ASA}} = E\left[ \frac{\text{Wait}}{E[S]} \,\middle|\, \text{Wait} > 0 \right] \approx \frac{1}{\Delta}, \tag{20}$$

    and

    $$\widetilde{\mathrm{TSF}} = P\left\{ \frac{\text{Wait}}{E[S]} > T \,\middle|\, \text{Wait} > 0 \right\} \approx e^{-T\Delta}. \tag{21}$$

    Note that these definitions modify the standard versions of ASA and TSF in two ways: they are conditioned on the event that delay is nonzero, and waiting time is measured in units of expected service duration, $E[S]$. This gives rise to simple expressions that are straightforward to compare across regimes.

    We observe that in each of the three regimes a single measure of system performance is fixed, which then determines the other performance measures:

    • in the efficiency-driven regime, excess capacity $\Delta$ and, in turn, $\widetilde{\mathrm{ASA}}$ and $\widetilde{\mathrm{TSF}}$ are fixed;

    • in the quality-driven regime, system utilization $\rho = R/(R+\Delta)$ is held constant; and

    • in the QED regime, the service grade $\beta$ and, in turn, $P(\beta) \approx P\{\text{Wait} > 0\}$ are fixed.

    The above scalings have been formalized in Whitt (2001).

    Now consider the pooling of $m$ statistically identical call centers into a single operation. Each call center has the same $\lambda$ and $\mu$. The arrival rate to the pooled call center is $m \times \lambda$, and its $\mu$ is unaltered. Figure 12 summarizes the results. Note that, within each column, the boxed entries highlight the performance measures that are fixed under that regime's scaling.

    Under efficiency-driven staffing, the service grade decreases from $\beta$ to $\beta/\sqrt{m}$, and the delay probability increases from $P(\beta)$ to $P(\beta/\sqrt{m})$ (which can be significant even for small $m$). Note, however, that $\widetilde{\mathrm{ASA}}$ and $\widetilde{\mathrm{TSF}}$ are unchanged. As $m \uparrow \infty$, we observe fast convergence to a system in which servers are 100% utilized (so that the system behaves as a single server that processes $m$ times more quickly) and essentially all customers are delayed.

    For the quality-driven system, there is a significant overall improvement of the service level: $\widetilde{\mathrm{ASA}}$ decreases to $\widetilde{\mathrm{ASA}}/m$, $\widetilde{\mathrm{TSF}}$ decreases to $(\widetilde{\mathrm{TSF}})^{m}$, and the delay probability decreases from $P(\beta)$ to $P(\beta\sqrt{m})$. As $m \uparrow \infty$, essentially all customers are served immediately upon arrival.

    Finally, in the QED regime, the service grade and probability of wait remain constant (by definition). In contrast, $\widetilde{\mathrm{ASA}}$ decreases to $\widetilde{\mathrm{ASA}}/\sqrt{m}$, and $\widetilde{\mathrm{TSF}}$ decreases to $(\widetilde{\mathrm{TSF}})^{\sqrt{m}}$. Note that the pooled QED system is both efficiency driven (occupancy increases to 100%) and quality driven (a significant fraction, namely $1 - P(\beta)$, of the customers is served immediately).
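The pooling comparison can be checked against exact Erlang C values. The following sketch is a rough illustration: the base center (offered load 100, 110 agents) and the pool sizes are hypothetical, and the three staffing rules simply hold fixed the quantity that defines each regime.

```python
import math

def erlang_c(n_agents, offered_load):
    """Erlang C delay probability, via the Erlang B recursion."""
    blocking = 1.0
    for k in range(1, n_agents + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    return n_agents * blocking / (n_agents - offered_load * (1.0 - blocking))

def pooled_agents(regime, m, base_load, base_agents):
    """Staffing of m pooled centers under each regime's scaling."""
    load = m * base_load
    delta = base_agents - base_load
    if regime == "efficiency":   # excess capacity held fixed
        return load + delta
    if regime == "quality":      # utilization R / N held fixed
        return m * base_agents
    if regime == "qed":          # service grade held fixed
        beta = delta / math.sqrt(base_load)
        return math.ceil(load + beta * math.sqrt(load))
    raise ValueError(f"unknown regime: {regime}")

def pooling_table(base_load=100, base_agents=110, pool_sizes=(1, 4, 16)):
    """Delay probability and normalized conditional wait 1 / Delta per regime."""
    table = {}
    for regime in ("efficiency", "quality", "qed"):
        rows = []
        for m in pool_sizes:
            load = m * base_load
            agents = pooled_agents(regime, m, base_load, base_agents)
            rows.append((m, agents, erlang_c(agents, load), 1.0 / (agents - load)))
        table[regime] = rows
    return table

table = pooling_table()
```

Each row of `table` holds the pool size, the number of agents, the exact delay probability, and the normalized conditional wait of (20), so the three patterns described above can be read off directly.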

    4.2. Busy Signals and Abandonment

    The Erlang C model provides an exceedingly simple means of trading off capacity and accessibility. In turn, its heavy-traffic limits provide insight into these trade-offs that deepens our understanding of economies of scale in call centers and how they should be managed. There are, however, significant limitations to the Erlang C model.

    In particular, recall that arriving calls have three ways in which they may exit the system: A call that


    Figure 12 Erlang C in the Efficiency, Quality, and QED Regimes (homework exercise of Mandelbaum and Zeltyn 2001)

    Economies of ScaleBase case: M/M/N with parameters λ, µ, N

    Scenario: λ → mλ (R → m