go ahead, talk. the web is listening. - rhodes university references/nortel - go... · go ahead,...

White Paper Go ahead, talk. The Web is listening. How Voice eXtensible Markup Language (VXML) is revolutionizing the way humans interact with their telephones and the World Wide Web.

Upload: truongkhanh

Post on 10-Jun-2018




Executive summary

What if you could empower your telephony-based points of contact by giving access to a wealth of resources on your Web infrastructure? What if callers could access all the richness of information on your Web servers from their telephones? What if you could share your Web site with people who didn’t even have Internet access?

This blending would completely redefine the possibilities for interactive voice response (IVR) applications. The technology to achieve it is available today, in the form of Voice eXtensible Markup Language (VXML).

VXML is a markup language for programming Web-based applications, like HTML (Hyper-Text Markup Language) and XML (eXtensible Markup Language). But where HTML delivers visual applications in the form of Web pages, VXML delivers audio applications over the telephone.

Nortel Networks self-service solutions use VXML to evolve IVR and advanced speech applications into a logical browser, which can access millions of pages of information on the Internet. The Nortel Networks VoiceXML R2.0 client runs on the Nortel Networks Media Processing Server (MPS) self-service telephony platforms (MPS 1000 and MPS 500).

A key advantage of the Nortel Networks strategy is the choice of using VXML scripts that can stand alone or inter-work with applications created with Nortel Networks PeriProducer, the powerful, icon-based development environment for creating interactive voice or Web applications on MPS platforms. This preserves the customer’s investment and empowers the customer to use either of the two environments.

Read on to find out more about VXML: what it is, what it can do, what business advantages it offers, and what Nortel Networks is doing to bring voice access to the Web.

Contents

Executive summary
Blending communications channels
What is VXML?
Table 1. Comparison of traditional Web and voice-enabled Web
What can you do with VXML?
What does VXML buy you?
How do VXML applications work?
Figure 1. The VXML architecture separates application and presentation logic onto open, industry-standard components.
Figure 2. VXML evolves IVR systems into interactive clients that bridge telephony and Web environments.
A closer look at VXML features
Figure 3. This typical application shows how VXML can voice-enable a Web site that provides stock quotes.
Why is now the time for VXML?
Nortel Networks and VXML
Summary

Blending communications channels

Once upon a time, enterprises interacted with their external audiences through segregated, discrete communication channels. First, there was telephony, with its promise of anywhere, anytime voice and touch-tone access for anyone who had that most useful and ubiquitous access device: a telephone. “If you are calling to report a total service outage, press 1... if you are calling to request new service, press 2.”

Then came the Web, and the promise of anywhere, anytime text-and-image access for anyone who had an Internet-connected computer. At first, the corporate world regarded the Web as little more than a new way to publish an online brochure. In a few fast-paced years though, the Web has emerged as a new stage on which to interact with customers, not just present to them.

First there was a verbal world... then a visual one. First there was simplistic access from the always-on, always-available telephone... then multimedia-rich access from a plugged in, logged in, turned-on computer. First there was access to a human or automated system at the other end of the phone line... then access to millions of pages of content on the other side of the Internet connection. Customer service evolved from the human touch, to touch-tone, to keyboard and mouse. Speak and hear, point and click and see.

Yet today, the typical enterprise still uses relatively disconnected options for connecting with callers:

• Live agents respond to telephone calls. This option is costly and entails all the issues of staffing, yet live agents have been indispensable for their ability to understand spoken requests, especially those that contain multiple pieces of information (“Do you have any drive-through locations open after 6 p.m. in the San Fernando Valley, maybe near Sherman and Canoga?”)

• Live agents respond to e-mails and “click-to-call” requests from the Web. This option begins to blend the Web and telephony components of the enterprise’s public presence, but it is limited to giving contact center agents and systems access to Web communication channels, rather than truly integrating them, and still requires the availability of live agents.

• Automated systems respond to telephone calls. This option is cost-effective and available around the clock, but automated systems are only as good as their ability to (1) gather information from dial-pad responses to menu listings (or spoken words that match a predefined vocabulary), (2) correlate this limited input with pre-programmed responses, and (3) deliver output. (“The-date-of-your-last-payment-was-April-sixteen-two-thousand-and-three.”)

Callers love automated systems when they want (and get) speedy access to basic information. If you just want to know when the parcel shipped, if the flight is on time, or when the overdue payment cleared, it can be better to gain instant access through an automated system rather than wait on hold for a human’s help.

Yet if you’ve ever had to enter text strings in a telephone number pad... or punch in a long account number... or listen through 10 menu selections to find the right one... or endure multiple menu layers to reach your goal... you know firsthand that IVR applications can be frustrating and cumbersome, leaving you wondering just how much the organization cares about serving you.

Advanced speech processing removed the tedium of using IVR systems by enabling callers to speak their requests and have the system interpret and respond to their speech, making IVR systems more inviting and natural for a greater range of applications.

Customer-centric organizations can benefit from all these communication silos to interact with their customers and supply chain partners. After all, some channels lend themselves more readily to certain types of content delivery. For example:

• The telephone is ideal for communicating concise pieces of verbal information, such as airline flight times, stock prices, and movie listings.

• The desktop browser is ideal for communicating detailed visual and multimedia information, such as product catalogs and maps.

• Live agents are ideal for resolving complex inquiries during reasonable working hours.

• Automated systems are ideal for efficiently delivering repetitive information in response to predictable queries, 24 hours a day, 365 days a year.

Given the value of each communication channel, most enterprises today have both a telephony-based contact center for agent-assisted and self-service support, and a well-developed Web presence that provides multimedia information and e-mail contact channels into the organization.

What if you could blend those communication channels? What if you could empower your telephony-based points of contact by opening up access to a wealth of resources on your Web infrastructure? What if callers could access all the richness of information on your Web servers from their telephones? What if you could share your Web site with people who didn’t even have Internet access?


This blending would completely redefine the possibilities for interactive voice response (IVR) applications. The technology to achieve it is available today, in the form of Voice eXtensible Markup Language (VXML).

What is VXML?

A voice-based application development language designed primarily for phone transactions

XML (eXtensible Markup Language), a standard of the World Wide Web Consortium (W3C), is the most widely accepted, platform-independent standard for building structured documents for Web applications, and is becoming a de facto standard for data exchange between applications. VXML (Voice XML) is a dialect of XML developed to write voice dialogs for self-service solutions.

In the Web world, VXML is to speech what HTML is to visual displays. HTML applications are accessed via a graphical Web browser with display, keyboard, and mouse. In contrast, VXML applications are accessed via a voice-capable device that accepts audio and keypad input and delivers audio output... such as, well, a telephone.

In short, VXML empowers the Web to interact with users through voice, making Internet content available by voice and phone to anyone, whether or not they have a computer or Internet access. For users, VXML makes it possible to interact with the application in the most natural way possible: by speaking and listening.

VXML scripts use a combination of speech and touch-tone commands to exchange data between people and machines, independent of the vendor’s hardware. Developers can create audio dialogs that use speech and DTMF keypad tones as input, and deliver synthesized speech or digitized, pre-recorded audio as outputs. Version 2.0 of this open, standard markup language was published in January 2003.
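As a rough illustration of the input and output options described above, the sketch below holds a small VoiceXML 2.0 menu in a Python string and reads it back with the standard library. The menu wording and the example.com URLs are hypothetical. The menu, prompt, and choice elements, and the dtmf and next attributes, come from the VoiceXML 2.0 specification: each choice can be selected by speaking its phrase or by pressing its key, and the prompt would be rendered by text-to-speech.

```python
import xml.etree.ElementTree as ET

# Hypothetical menu: the wording and URLs are illustrative assumptions.
MENU_VXML = """<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <menu id="main">
    <prompt>For stock quotes, say quotes or press 1.
      For weather, say weather or press 2.</prompt>
    <choice dtmf="1" next="http://example.com/quotes.vxml">quotes</choice>
    <choice dtmf="2" next="http://example.com/weather.vxml">weather</choice>
  </menu>
</vxml>
"""

NS = {"v": "http://www.w3.org/2001/vxml"}


def list_choices(vxml_text):
    """Return (DTMF key, spoken phrase, target URL) for each menu choice."""
    root = ET.fromstring(vxml_text.encode("utf-8"))
    return [
        (choice.get("dtmf"), choice.text.strip(), choice.get("next"))
        for choice in root.findall("v:menu/v:choice", NS)
    ]


if __name__ == "__main__":
    for key, phrase, url in list_choices(MENU_VXML):
        print(f"say '{phrase}' or press {key} -> {url}")
```

Parsing the document this way only checks that it is well-formed XML; a real voice browser would also fetch the target documents and drive the speech and telephony resources.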

Instead of keyboard and mouse, VXML lets users access the Web using speech recognition and touch-tones for input, and pre-recorded audio and text-to-speech synthesis for output.

Table 1. Comparison of traditional Web and voice-enabled Web

Access device
  World Wide Web:
    • Internet-connected device, such as a PC, laptop, or personal digital assistant.
    • Must be turned on, with Internet browser software installed and running.
  VXML-empowered voice-enabled Web:
    • Any mobile, cordless, or landline telephone.
    • No need to "turn on" device.
    • No special features required; user says or dials the phone number or extension of the service.

Input/output
  World Wide Web:
    • Input: URL of a Web page, entered through keyboard or mouse.
    • Output: Visual. Text and graphics displayed on the device screen.
  VXML-empowered voice-enabled Web:
    • Input: Audio. Natural speech or touch-tone key presses (DTMF).
    • Output: Audio. Pre-recorded audio streams or text-to-speech (TTS) audio.

Request processing
  World Wide Web:
    • Based on the URL entered, the Web browser makes an HTTP (Hypertext Transfer Protocol) request to a server for an HTML page.
  VXML-empowered voice-enabled Web:
    • Based on the telephone number dialed, the voice browser on a telephony server sends an HTTP request to a server for a VXML page.

Data retrieval
  World Wide Web:
    • HTML pages are fetched using Common Gateway Interface (CGI) protocol.
    • Technologies such as Perl, PHP, ColdFusion*, ASP, and JSP can be used to write code that generates scripts dynamically.
  VXML-empowered voice-enabled Web:
    • Same as Web.

Information delivery
  World Wide Web:
    • Web browser parses HTML to render a visual page that interacts with the user through keyboard and mouse.
  VXML-empowered voice-enabled Web:
    • Voice browser renders the voice markup as a sequential dialog consisting of prompts using either text-to-speech (TTS) or prerecorded audio.


What can you do with VXML?

Voice-enable a Web site or build next-generation IVR services

VXML is best suited for applications that require relatively little input from the user and deliver highly-targeted output that generally is (or could be) available from an HTML Web interface.

A typical application is a service in which callers dial a phone number to retrieve information, such as stock quotes, airline flight information, or weather from a Web site. Early adopters tend to use the technology in this way, but VXML will gain ground for more diverse applications, such as voice-enabled intranets and contact centers, notification services, and other innovative telephony services. Some typical applications include the following:

Information retrieval. VXML is ideal for applications where input requires a few navigational commands and moderate data entry, such as “Dial or say ‘1’” for yesterday’s assembly line performance statistics, “Say the name of the product” for updated market-development notes, or “Say your department number and password” for company news from the intranet. Voice input can use quite a large vocabulary, such as free-form street addresses for a city, or stock quotes for a specified company and period. Natural language speech recognition makes this interaction easier than ever. “I need flight information from RDU to Baltimore on Friday.” “When is order 54362A for Jean Smith due to arrive in Columbus?”

Electronic commerce. VXML is naturally well suited for customer service applications (such as tracking parcel shipments, checking account updates, and using call center services), as well as financial applications, such as getting stock quotes or conducting online banking. If the customer has specific ordering information (from a catalog or direct mail flyer, for instance), VXML can be useful for order-taking applications.

Telephony services. Personal name dialing, one-number “follow-me” services, teleconferencing set-up, and other telephony features can be voice-enabled through VXML. For example, a company or service provider could place a phone directory of its employees or subscribers on its Web site, which callers could then use to voice-dial people just by speaking their names.

Directory assistance. Nortel Networks had already automated directory assistance service in the mid-1990s (“What city... what listing...”), but VXML gives this type of automation new power and flexibility through integration with the Web. Corporate name dialing as a packaged application makes it easy to reach colleagues whose numbers you don’t know simply by speaking their name or department.

Internal processes. Because security features that apply to the Web, such as firewalls and encryption, can be applied to voice applications as well, VXML can be used to create secure intranet applications that voice-enable internal processes, such as supply ordering, HR self-service, and corporate news.

Unified messaging. For mobile employees and those with multiple contact channels (which is just about everyone these days), VXML can unify voice and electronic channels, for example, by reading and recording e-mail over the phone, and originating and terminating pager messages on the phone.

With the advent of sophisticated speech recognition algorithms and large vocabularies, coupled with the ability to access vast resources on the Web, the application possibilities for VXML are limited only by imagination, opportunity, and market demand.

What does VXML buy you?

VXML brings the advantages of Web-based development and content delivery to telephony services, especially to IVR self-service applications.

Everyone can access the Web.

For all that the PC has been heralded as the multimedia communications portal of the future, the phone is and will continue to be important. Phones are available just about everywhere in the world, and there are many more of them than Internet-connected computers. Phones are always on; they don’t have to be booted up. Mobile phones are small enough to be carried everywhere, much more portable than the slimmest laptop, at a tiny fraction of the price. And their batteries last longer.

With VXML, any telephone, even the most primitive old-style phones, can become a voice portal into the Web. A voice browser running on a telephony server interprets the input (speech or dial-pad tones) and passes it to the application logic running on the Web server. There’s no need for a cumbersome PC with Web browser and special Internet connection. When voice-activated “universal remotes” take hold, they will parse VXML content from all devices in the vicinity.

Self-service applications can be much more sophisticated.

Blending the advanced speech processing capabilities of telephony servers with the virtually limitless information repositories on the Web, VXML makes it feasible to implement very powerful, flexible applications.


For example, a VXML script could recognize part numbers, manufacturing stages, plant locations, trouble tickets, and warranty claims information, and reconcile this information to deliver updated information to support rapid marketing, engineering, or recall decisions. Or, a VXML script could recognize street number, street name, and city for starting and ending destinations, and deliver driving directions between the two, an application that requires a very large vocabulary and interpretive processing capability. Using natural language speech recognition, a script could recognize free-style speech, spoken in a conversational way. "What is the interest rate for a 20-year fixed-rate jumbo mortgage?"

Call treatments can be customized on the fly.

Scripts written in VXML do not need to be pre-compiled; they are usually interpreted as they are executed. This makes it possible to write applications that dynamically generate VXML scripts, providing on-the-spot customization and personalization of the application tailored to each customer. A customer’s entry at one point can define how the script proceeds. For example, if a bank customer only has a checking account, then the options for savings account and money-market transactions don’t need to be offered. Or if the product manager selects the Dallas plant, the system won’t offer options that relate only to products manufactured in Tucson.
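The bank example above can be sketched as a small server-side generator. This is a minimal illustration under stated assumptions, not a Nortel Networks implementation: the OPTIONS catalogue, the build_menu function, and the /accounts/... URLs are hypothetical names, and the menu, prompt, and choice elements are taken from the VoiceXML 2.0 specification. The generator emits only the choices that apply to the accounts a customer actually holds.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalogue of account types and how each is spoken to the caller.
OPTIONS = {
    "checking": "checking account",
    "savings": "savings account",
    "money_market": "money market account",
}


def build_menu(account_types):
    """Generate a VXML menu offering only the accounts this customer holds."""
    offered = [name for name in OPTIONS if name in account_types]
    vxml = ET.Element("vxml", {"version": "2.0"})
    menu = ET.SubElement(vxml, "menu", {"id": "accounts"})
    prompt = ET.SubElement(menu, "prompt")
    prompt.text = " ".join(
        f"For your {OPTIONS[name]}, press {key}."
        for key, name in enumerate(offered, start=1)
    )
    for key, name in enumerate(offered, start=1):
        choice = ET.SubElement(
            menu, "choice", {"dtmf": str(key), "next": f"/accounts/{name}.vxml"}
        )
        choice.text = OPTIONS[name]
    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    # A customer with only a checking account hears a single option.
    print(build_menu({"checking"}))
```

Because the document is built per request, the same code path could personalize prompts further (for example by greeting the caller by name) without changing anything on the telephony server.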

Applications can be developed quickly.

VXML leverages the Web paradigm for development and deployment. Voice applications can now be constructed with plentiful, inexpensive, and powerful Web application development tools. The scripting language uses the familiar tag-based structure of XML and HTML. As a high-level, domain-specific markup language, VXML shields developers from low-level implementation details.

All of these features enable rapid creation of new applications. This ease of development is especially valuable for fast-changing organizations where products, teams, and programs are always in flux.

Applications are easy to deploy.

VXML supports the clean separation of application logic on the Web server and presentation logic in VXML pages delivered to the telephony server. With this modularity, the application logic on the Web server can serve up different types of presentation logic, depending on the user’s access device: one delivery style for a standard telephone, another for a phone with integral screen, perhaps.

Another advantage of VXML as an industry standard operating on Web servers is that applications don’t have to reside on a proprietary voice server. They can be deployed anywhere on the Internet and accessed from any VXML-compliant voice server.

Code can be reused among applications and platforms.

In a traditional self-service IVR environment, the complete speech application (Presentation, Business Logic, and Data Access) is written in a proprietary IVR development language unique to each supplier’s hardware. If an enterprise has hardware from multiple suppliers, then an application must be written in each platform’s proprietary language. Also, if an enterprise writes a traditional IVR voice-based application as well as a Web-based application that accomplishes the same task (such as Web banking and voice banking), code from one cannot be reused in the other.

Because VXML is an industry specification under governance of the W3C, applications that work on one standards-compliant VXML platform will work on others as well. Just as HTML and HTTP (Hypertext Transfer Protocol) hide much of the complexity of building interactive Web applications, VXML shields developers from the intricacies of telephony platforms. The code written for an advanced speech application can be reused in a parallel Web-based application; for example, order tracking by phone or on the Web. By having a common language, application developers, platform vendors, and tool providers all can benefit from code portability and reuse.
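One way to picture the reuse described above is a single piece of business logic with two thin presentation layers, one producing VXML for the phone and one producing HTML for the browser. The sketch below assumes a hypothetical in-memory ORDERS table and hypothetical function names; a real application would query a database or transaction server instead.

```python
from html import escape

# Hypothetical order data standing in for a real back-end lookup.
ORDERS = {
    "54362A": {"customer": "Jean Smith", "status": "in transit", "city": "Columbus"},
}


def track_order(order_id):
    """Business logic shared by the phone and Web channels."""
    order = ORDERS.get(order_id)
    if order is None:
        return f"Order {order_id} was not found."
    return (
        f"Order {order_id} for {order['customer']} "
        f"is {order['status']} to {order['city']}."
    )


def render_vxml(message):
    """Presentation for a voice browser: the message is spoken as a prompt."""
    return (
        '<vxml version="2.0"><form><block><prompt>'
        f"{escape(message)}</prompt></block></form></vxml>"
    )


def render_html(message):
    """Presentation for a graphical browser: the message is shown as text."""
    return f"<html><body><p>{escape(message)}</p></body></html>"


if __name__ == "__main__":
    message = track_order("54362A")
    print(render_vxml(message))
    print(render_html(message))
```

Only the two render functions differ by channel, so a change to the order-tracking rules would be made once in track_order.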

You can “engage” with customers, not just respond to them.

VXML supports the creation of “Engaged” Applications that deliver new personalized, anticipatory services. Engaged applications combine naturally with your contact centers and self-service applications to achieve a balance between your unique knowledge of your customers and your people. This dynamic, adaptive approach redefines your contact centers as the focus for understanding and managing customer interactions across the enterprise, not just by telephone.

Previously disparate applications and systems can now cooperate to deliver outstanding, personalized multimedia customer services, to appropriately prioritize and distribute the workload between people and automation, and to seamlessly, easily, and consistently manage the entire framework.


How do VXML applications work?

A VXML application runs on an architecture that includes the following components:

• A standard Web server runs application-development tool(s) and VXML application logic, which is stored in “documents” created with an XML-type markup. This Web server can be located anywhere, as long as secure connectivity is available. This server typically contains a database of information referenced by the application, or interfaces to external databases or transaction servers to perform the tasks that are invoked in the VXML script.

• A telephony server equipped with VXML capabilities, such as the Nortel Networks Media Processing Server platforms (MPS 1000 and MPS 500), runs a VXML interpreter that acts as a client to the Web application server. This specialized voice gateway node is connected both to the Internet (or managed IP network) and the public switched telephone network. The VXML interpreter on this server understands VXML dialogs and controls speech and telephony resources, such as automated speech recognition and speech synthesis systems, audio play, and record functions.

• A TCP/IP-based packet network, such as a corporate intranet or managed IP network from a service provider, transports HTTP requests and responses between the application server and telephony server.

• A voice network, such as the public switched telephone network (PSTN), private telephone network (such as a PBX system), or Voice-over-IP packet network, connects the user’s access device to the telephony server.

• Any telephone that can connect to the telephone network can call the VXML platform and invoke the functions of the VXML interpreter, speech recognition system, and text-to-speech engine.

In a typical VXML application interaction...

1. The caller dials a telephone number that is routed to a telephony server running an IVR system and VXML client, as shown in Figure 1.

2. The telephony server translates the telephone number into a URL, and the VXML client on the telephony server passes an HTTP request to the Web server and VXML document designated by that URL.

3. The Web server runs the application logic: a VXML document, a script that outlines how to interact with the caller, and how and when to invoke other VXML documents.

4. The Web server passes the VXML document to the telephony server, using standard HTTP over the packet network. Once downloaded to the telephony server, documents are cached if possible, so there’s no need to make repetitive fetch requests to the Web server just to get frequently-requested, static documents.

Figure 1. The VXML architecture separates application and presentation logic onto open, industry-standard components.

[Diagram: an IVR platform (voice and telephony, service logic, host access) serving end users over the PSTN network evolves into a voice gateway (voice and telephony, VXML interpreter) working with a Web server (service logic: CGI, servlet, GSP; common application; host access; multiple types of access). Labels: VXML, HTML. Sample prompt: “Welcome to our company. Please listen... our options have changed.”]


5. The VXML client on the telephony server receives and interprets the VXML document to interact with the caller. Under control of the VXML document, the telephony server may perform any of these functions:

   • Play audio prompts, messages, music, or other audio content for the caller.

   • Accept DTMF input entered by the user via the telephone keypad.

   • Accept and recognize spoken words, or simply accept and record the caller’s speech without recognizing it.

   • Send the user’s information to a Web server.

6. The telephony server passes information collected from the caller (if any) back to the Web server.

7. The Web server processes the caller’s input and invokes another VXML document (either a static document or a dynamic one created during the session) to continue the session.

8. Ultimately, the last document ends the session and disconnects from the phone call.
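The eight steps above can be sketched as a small simulation. Everything here is an assumption made for illustration: the dialed number, the example.com URLs, the dictionary-shaped "documents", and the web_server stand-in replace real VXML and real HTTP so the example runs without a network. What it tries to show is the control flow: number-to-URL translation, fetching with a cache for static documents, returning caller input to the server, and ending when no further document is named.

```python
# Hypothetical mapping from a dialed number to the application's start URL.
DIALED_NUMBER_TO_URL = {"5550100": "http://example.com/start.vxml"}


def web_server(url, caller_input=None):
    """Stand-in for the Web server. Returns (document, cacheable)."""
    if url.endswith("start.vxml"):
        doc = {"prompt": "Say a ticker symbol.", "submit_to": "http://example.com/quote"}
        return doc, True  # static document, safe to cache
    if url.endswith("/quote"):
        doc = {"prompt": f"{caller_input} quote requested. Goodbye.", "submit_to": None}
        return doc, False  # generated per request, never cached
    raise KeyError(url)


class VoiceBrowser:
    """Stand-in for the VXML client on the telephony server."""

    def __init__(self, fetch):
        self.fetch = fetch
        self.cache = {}
        self.fetch_count = 0

    def get(self, url, caller_input=None):
        # Step 4: reuse cached static documents instead of refetching them.
        if caller_input is None and url in self.cache:
            return self.cache[url]
        self.fetch_count += 1
        doc, cacheable = self.fetch(url, caller_input)
        if cacheable and caller_input is None:
            self.cache[url] = doc
        return doc

    def handle_call(self, dialed, answers):
        url = DIALED_NUMBER_TO_URL[dialed]  # step 2: number to URL
        answers = iter(answers)
        heard, caller_input = [], None
        while url is not None:
            doc = self.get(url, caller_input)  # steps 2-4 and 7
            heard.append(doc["prompt"])        # step 5: play the prompt
            url = doc["submit_to"]
            # Step 6: collect input only if another document will receive it.
            caller_input = next(answers, None) if url else None
        return heard  # step 8: no next document, so the session ends


if __name__ == "__main__":
    browser = VoiceBrowser(web_server)
    print(browser.handle_call("5550100", ["NT"]))
```

On a second call to the same number, the start document comes from the cache and only the dynamic quote document is fetched again.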

In summary, VXML enables the IVR platform to serve as an interaction client, while moving application logic to a Web server. By separating application logic (which runs on a standard Web server) from voice dialogs (which run on a telephony server and its peripherals), VXML makes self-service applications highly portable. Also, application logic doesn’t have to be duplicated. VXML documents prepared for one platform can be easily reused for other platforms. Furthermore, developers can create voice-enabled Web services without having to own telephony systems.

A closer look at VXML features

The current state of the VXML markup language includes features to control the following key elements of the interaction/dialog:

• Audio input, which can be DTMF touch-tone keypad entry, natural speech recognition, or simply recording the caller’s speech

• Audio output, which can be pre-recorded audio files or streams, or text-to-speech synthesized speech

• Presentation logic, including control of dialog flow, client-side scripting, and dynamic generation of scripts

• Event-handling, such as what to do if the caller makes no entry, or makes an incorrect entry

• Basic connection control, including call transfer (which terminates the VXML session when the call is successfully transferred), bridging (which temporarily suspends the VXML session while the caller is connected to the third party), and disconnecting the call.
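Several of the features listed above can be seen together in one short document. The sketch below keeps a hypothetical VoiceXML 2.0 form in a Python string and inspects it with the standard library. The element names (field, noinput, nomatch, reprompt, transfer, disconnect) and the bridge attribute are from the VoiceXML 2.0 specification; the prompt wording and the tel: number are made up, and the built-in "digits" field type is assumed to be available on the platform.

```python
import xml.etree.ElementTree as ET

NS = {"v": "http://www.w3.org/2001/vxml"}

# Hypothetical dialog: wording and destination number are illustrative only.
FEATURES_VXML = """<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="support">
    <field name="pin" type="digits">
      <prompt>Please say or key in your four digit PIN.</prompt>
      <noinput>Sorry, I did not hear anything. <reprompt/></noinput>
      <nomatch>Sorry, I did not understand. <reprompt/></nomatch>
    </field>
    <transfer name="agent" dest="tel:+15555550100" bridge="true">
      <prompt>Connecting you to an agent.</prompt>
    </transfer>
    <block><disconnect/></block>
  </form>
</vxml>
"""


def element_names(vxml_text):
    """Return the set of element names used, without the namespace part."""
    root = ET.fromstring(vxml_text)
    return {element.tag.split("}")[-1] for element in root.iter()}


def transfer_is_bridged(vxml_text):
    """True if the document's transfer keeps the VXML session (bridging)."""
    transfer = ET.fromstring(vxml_text).find(".//v:transfer", NS)
    return transfer is not None and transfer.get("bridge") == "true"


if __name__ == "__main__":
    print(sorted(element_names(FEATURES_VXML)))
    print("bridged transfer:", transfer_is_bridged(FEATURES_VXML))
```

Setting bridge to "false" would correspond to the call transfer case described above, where the VXML session ends once the transfer succeeds.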

VXML documents can perform a variety of useful functions, such as the following:

• User prompting options include menus (“For current interest rates, press 2”); directed dialogs, in which the system leads the user through a data collection process (“Enter the 6-digit number that appears in the address label on the back of your catalog.”); and mixed-initiative dialogs.

• Natural language speech recognition achieves more life-like interactions. Callers are no longer constrained to a narrow range of expressions that must be uttered one word at a time. Powerful statistical models enable the self-service application to recognize natural, free-style speech (including colloquialisms and regional dialects) and extract key concepts to determine the meaning.

• Mixed-initiative dialogs enable the caller to provide multiple pieces of information in a single utterance. “List flights to Dallas on Monday, April 28.” This capability reduces frustration and menu layers for the user, and minimizes server round-trips for the network.

Voice XML provides an application programming interface for speech and telephony systems.

Figure 2. VXML evolves IVR systems into interactive clients that bridge telephony and Web environments.

[Diagram: architecture components are a document server (Web server), a VoiceXML gateway (IVR or media server), and implementation platforms (telephones, mobile phones, PCs, personal digital assistants (PDA), and other devices). The implementation platform exchanges voice requests and voice responses with the gateway and its VoiceXML interpreter over a wired/wireless network; the gateway exchanges requests and responses with the document server over a wired network.]

• Event-handling features support custom actions in response to user-defined events, such as a request for help, or built-in events, such as a timeout or unrecognizable input. Depending on the event, the response could be to play some new output to the caller, continue with the same dialog, or switch to another dialog.

• Arithmetic calculations and text manipulation enable the application to validate the user’s input. For instance, if the system has asked the user for an account number, it can recognize whether the numeric string entered by the user contains out-of-range numbers or the wrong number of digits.

• Branching (“if-then-else”) and Boolean (“and, or, not”) logic enable the application to tailor the dialog based on previous user entries. For instance, the system could recognize that the caller’s account number represented a checking account, and thereafter not offer options that pertained only to savings accounts.

• Progressive prompting uses an incremental counter that records the number of invalid responses to a prompt, and enables the application to offer progressively more helpful prompts each time. For example, the system might first prompt the user, “What city?” If the response is a no-match, the system could re-prompt with “Please say the name of the city,” or “Say the name of the city where you are starting your travel.” The system can assign a confidence level to the user’s response (how close to an intelligible or correct entry it is) and use that information to choose the subsequent action.
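Two of the functions in the list above, input validation and progressive prompting, can be sketched in a few lines. This is an illustration of the logic only, written in Python rather than VXML; the city vocabulary, the confidence threshold, the account-number range, and all function names are hypothetical, and the recognizer is simulated by a list of (text, confidence) results.

```python
# Prompts taken from the example above, ordered from terse to most helpful.
PROMPTS = [
    "What city?",
    "Please say the name of the city.",
    "Say the name of the city where you are starting your travel.",
]
KNOWN_CITIES = {"dallas", "tucson", "columbus"}  # hypothetical vocabulary
MIN_CONFIDENCE = 0.5                             # hypothetical threshold


def prompt_for(attempt):
    """Pick a prompt from the count of earlier invalid responses."""
    return PROMPTS[min(attempt, len(PROMPTS) - 1)]


def collect_city(recognizer_results):
    """Replay simulated recognizer results until one is accepted.

    recognizer_results is an iterable of (text, confidence) pairs.
    Returns (accepted city or None, list of prompts that were played).
    """
    played = []
    for attempt, (text, confidence) in enumerate(recognizer_results):
        played.append(prompt_for(attempt))
        if text.lower() in KNOWN_CITIES and confidence >= MIN_CONFIDENCE:
            return text, played
    return None, played


def valid_account_number(digits, length=6, low=100000, high=899999):
    """Check digit count and range, as in the account-number example."""
    if not (digits.isascii() and digits.isdigit() and len(digits) == length):
        return False
    return low <= int(digits) <= high


if __name__ == "__main__":
    results = [("mumble", 0.2), ("Dallas", 0.3), ("Dallas", 0.9)]
    print(collect_city(results))
    print(valid_account_number("123456"), valid_account_number("12345"))
```

In VXML itself the attempt counter is maintained by the interpreter, and prompts can be selected by count; the Python loop above only mimics that behavior.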

The functions above are enabled by VXML scripts. In conjunction with VXML, traditional Web application programming techniques and other XML extensions can provide deeper application logic, database operations, interfaces to legacy systems, etc. For example, CCXML (Call Control XML) is a dialect of XML that can be used with VXML to provide call redirection, conferencing, and monitoring based on the occurrence of unplanned events. SALT (Speech Application Language Tags) is an XML-based standard that supports multi-modal applications, such as speaking to your PocketPC to “get my current itinerary” instead of entering a URL.

Why is now the time for VXML?

The market is prime for the emergence of VXML applications, due to several important industry trends:

The World Wide Web has grown and matured.

Web servers that formerly delivered only static content can now generate content dynamically using scripts, server pages, servlets, Java applets, and access to databases and legacy systems. Browsers that were once limited to point-and-click or entry of URLs can now accept entry in forms, templates, and other interactive formats. Advancements in Web data representation using XML make it easy to exchange data across platforms and formats.

[Figure: labels recovered from the graphic. Title: Internet services and content. Source: content providers. Audiences: data access (250 million Internet users); voice access (450 million wireless users); opportunity (1.3 billion telephones). Channels: WAP, WML, VoiceXML. Services: sales automation, news reports, live ports, stock prices, my portfolio, voice-activated calling, unified messaging, traffic, movie listings.]

Figure 3. This typical application shows how VXML can voice-enable a Web site that provides stock quotes.

Powerful programming tools, such as XML-aware editors and word-processors that generate HTML, make it easier for developers to create applications based on XML and its dialects. Improvements in Web bandwidth, performance, and quality of service lead to new opportunities for Web applications and services. VXML takes advantage of all these enhancements.

Automated speech processing has improved.

Better algorithms and acoustic models have vastly improved the capabilities of automated speech-recognition systems. The general growth in computing power has made it feasible for systems to support huge vocabularies with thousands of words. Natural language speech technology can recognize and interpret freestyle, conversational speech containing multiple elements of information and logic.

Advances in speech synthesis using pre-recorded waveforms make text-to-speech synthesized speech sound less robotic and more lifelike. Although VXML doesn’t require either automated speech recognition or speech synthesis, these capabilities make applications more powerful and natural to use.

The World Wide Web has expanded beyond the desktop PC.

Now it reaches laptop PCs, hand-held personal digital assistants (PDAs) with wireless connections, mobile phones supporting Wireless Application Protocol (WAP), and more.

It will soon reach many more devices, such as overnight delivery drop-boxes that record their contents and schedule pickups, and vending machines that reorder supplies when running low. For these futuristic Web devices that might not have room for full-fledged keyboards and screens, speech will be a more natural interface.

Nortel Networks and VXML

For more than a century, Nortel Networks has been an industry leader in voice processing. From the days of Alexander Graham Bell... to the days of Baby Bells... to the transformation of voice onto multi-service packet networks... we have created innovations in voice networking, speech processing systems, and contact center systems. Nortel Networks is unique in having a heritage rooted in both voice and Internet technologies, augmented by the 1999 acquisition of Periphonics* Corporation, a global leader in interactive voice response and advanced speech recognition solutions.

By 2001, Nortel Networks had achieved #1 ranking for IVR solutions, by a wide margin both in the U.S. and worldwide, according to In-Stat. In 2001, the company also garnered the #1 spot in contact center solutions, according to InfoTech and Gartner Dataquest.

With this momentum in customer contact network services, it’s only natural that Nortel Networks is working to make the voice-enabled Web a reality. Nortel Networks is an active contributor to the World Wide Web Consortium (W3C) Voice Browser Working Group, which is responsible for VXML and CCXML specifications.

This commitment is reflected in our self-service portfolio:

• Nortel Networks has developed a VXML interpreter based on the most current draft specification (VoiceXML 2.0, the candidate recommendation as of January 20, 2003).

• Nortel Networks self-service solutions use this VXML interpreter to evolve IVR and advanced speech applications into a logical browser that can access millions of pages of information on the Internet. This VXML interpreter can be integrated into Nortel Networks Media Processing Server (MPS) 1000 and MPS 500, resilient self-service platforms that combine sophisticated transaction handling with application-controlled call handling for enterprises and service providers.

• The Nortel Networks self-service portfolio supports speech recognition for 20 naturally spoken languages, text-to-speech, and speaker verification, and supports multiple modes of operation: small vocabulary (up to 200 words) for basic menu automation; medium vocabulary (from 200 to 2500 words), for example, for travel destinations; and large vocabulary (from 2500 to 10,000+ words) supporting, for example, order entry for a stock brokerage.

• Cross-pollination among development tools. Nortel Networks offers a choice of options for writing self-service applications: (1) the intuitive, graphical development environment of PeriProducer, and (2) the text-based markup language of VXML.

PeriProducer applications are constructed using visual icons, making it easy to understand how they work, and easy to change them. PeriProducer also provides capabilities the VXML specification doesn’t currently address, such as pre-answer processing and integration with SS7/C7. On the other hand, developers familiar with XML-based markup languages may find VXML more comfortable, especially if the data to support transactions comes from Web servers.

The good news is that it’s not an either-or proposition. VXML applications can invoke PeriProducer applications and subroutines, and PeriProducer applications can invoke VoiceXML applications. This integration strategy provides our customers with investment protection and a migration path from one environment to the other, if and when required.

• Support for world-class speech processing engines. Nortel Networks VoiceXML R2.0 supports Nuance* speech recognition, SpeechWorks* OSR speech recognition engines, and Fonix* Faast* text-to-speech synthesis. Additional work in progress includes the speech engines SpeechWorks Speechify text-to-speech, ScanSoft* RealSpeak* text-to-speech, and ScanSoft SpeechPearl* speech recognition.

• Integration with management systems. Nortel Networks VoiceXML R2.0 integrates with the Periphonics PeriView* system for collection and reporting of call statistics, alarms reporting, and management of VXML applications running on MPS self-service platforms.

Summary

VoiceXML is a powerful yet simple language for building dialogs that blend the voice world with the Web, effectively bringing the Web to your telephone to enable innovative new self-service applications. As the standard matures, gaining new features with each iteration, it is gaining widespread adoption, especially by the several hundred members of the VoiceXML Forum.

VXML brings the advantages of Web-based development and content delivery to telephony services, especially to IVR self-service applications:

• Everyone can access the Web. With VXML, any telephone, even the most primitive old-style phones, can become a voice portal into the Web.

• Self-service applications can be more sophisticated. Blending the advanced speech processing capabilities of telephony servers with the virtually limitless data resources of the Web, VXML makes it feasible to implement very powerful, flexible applications that recognize natural, freestyle speech.

• Call treatments can be customized on the fly, providing on-the-spot customization and personalization of the application tailored to each customer based on their input.

• Applications can be developed quickly, using familiar and inexpensive Web development tools, as well as the powerful, graphical environment of PeriProducer, with its pre-packaged building blocks.

• Applications are easy to deploy, because the application logic can reside on any standard Web server, anywhere, and be accessed from any VXML-compliant voice server.

• Code can be reused among applications and platforms. For example, code written for an online banking application can be reused for self-service banking by phone.

• Enterprises can “engage” with customers by providing dynamic, adaptive, personalized services that they can interact with using natural speech and that most ubiquitous access device, the telephone.
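The points above about deployment and code reuse can be made concrete with a small sketch: one piece of business logic serves both a Web page and a voice dialog, and the voice version is simply a VoiceXML document generated on the server. Everything here is hypothetical and for illustration only, including the in-memory `ACCOUNTS` store, the account number, the function names, and the wording of the prompt. A real deployment would return the generated document from a Web server in response to a request from the voice platform.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

VXML_NS = "http://www.w3.org/2001/vxml"  # VoiceXML 2.0 namespace

# Hypothetical in-memory store standing in for a database or legacy system.
ACCOUNTS = {"1001": 2500.75}


def get_balance(account_id: str) -> float:
    """Shared business logic used by both the Web and the phone channel."""
    return ACCOUNTS[account_id]


def render_html(account_id: str) -> str:
    """Visual channel: a minimal Web page showing the balance."""
    balance = get_balance(account_id)
    return f"<html><body><p>Your balance is ${balance:,.2f}.</p></body></html>"


def render_vxml(account_id: str) -> str:
    """Voice channel: a minimal VoiceXML document speaking the balance."""
    balance = get_balance(account_id)
    dollars, cents = divmod(round(balance * 100), 100)
    sentence = f"Your balance is {dollars} dollars and {cents} cents."
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<vxml version="2.0" xmlns="{VXML_NS}">\n'
        '  <form id="balance">\n'
        '    <block>\n'
        f'      <prompt>{escape(sentence)}</prompt>\n'
        '    </block>\n'
        '  </form>\n'
        '</vxml>\n'
    )


def prompt_text(vxml_doc: str) -> str:
    """Parse the generated document (checks well-formedness) and return the prompt."""
    root = ET.fromstring(vxml_doc.encode("utf-8"))
    prompt = root.find(f".//{{{VXML_NS}}}prompt")
    return prompt.text


if __name__ == "__main__":
    print(render_html("1001"))
    print(render_vxml("1001"))
```

Only the rendering differs between the two channels; `get_balance` is written once, which is the kind of reuse the banking example describes.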

For all these advantages and more, Nortel Networks developed VXML capabilities for its self-service portfolio and enabled the development tools for this self-service portfolio to interwork with VXML applications for maximum flexibility and investment protection.

For more information about Nortel Networks IVR and advanced speech processing solutions, including VXML capabilities for building the voice-enabled Web, visit our Web site at www.nortelnetworks.com.

Nortel Networks is an industry leader and innovator focused on transforming how the world communicates and exchanges information. The company is supplying its service provider and enterprise customers with communications technology and infrastructure to enable value-added IP data, voice and multimedia services spanning Wireless Networks, Wireline Networks, Enterprise Networks, and Optical Networks. As a global company, Nortel Networks does business in more than 150 countries. More information about Nortel Networks can be found on the Web at:

www.nortelnetworks.com

For more information, contact your Nortel Networks representative, or call 1-800-4 NORTEL or 1-800-466-7835 from anywhere in North America.

*Nortel Networks, the Nortel Networks logo, the globemark, Periphonics, and PeriView are trademarks of Nortel Networks. SpeechWorks is a trademark of SpeechWorks International Inc. Nuance is a trademark of Nuance Communications Inc. Fonix and Faast are trademarks of Fonix Corporation. ScanSoft, RealSpeak and SpeechPearl are trademarks of ScanSoft Inc. ColdFusion is a trademark of Macromedia Inc.

Copyright © 2003 Nortel Networks. All rights reserved. Information in this document is subject to change without notice. Nortel Networks assumes no responsibility for any errors that may appear in this document.

GSA Schedule GS-35F-0140L
1-888-GSA-NTEL

NN104680-081203

In the United States:
Nortel Networks
35 Davis Drive
Research Triangle Park, NC 27709
USA

In Canada:
Nortel Networks
8200 Dixie Road, Suite 100
Brampton, Ontario L6T 5P6
Canada

In Caribbean and Latin America:
Nortel Networks
1500 Concorde Terrace
Sunrise, FL 33323
USA

In Europe:
Nortel Networks
Maidenhead Office Park
Westacott Way
Maidenhead Berkshire SL6 3QH
UK

In Asia:
Nortel Networks
6/F Cityplaza 4,
Taikooshing,
12 Taikoo Wan Road,
Hong Kong