voice browser original

Upload: akhilsajeedran

Post on 14-Apr-2018


TRANSCRIPT

  • 7/27/2019 Voice Browser Original

    1/27


    Presented By

    Akhil Sajeendran

    R7A CS

    Roll No: 05

    Reg No: 10016175

    SNGCE


    Browser technology is changing very fast these days, and we are moving from the visual paradigm to the voice paradigm. The voice browser is the technology for entering this paradigm. A voice browser is a device which interprets a (voice) markup language and is capable of generating voice output and/or interpreting voice input, and possibly other input/output modalities.


    A voice browser is a device:

    that interprets voice input and interprets voice markup languages to generate voice output.

    that interprets a script which specifies exactly what to verbally present to the user as well as when to present each piece of information.


    Time frame: 1998 to ??

    Hands-free access to the web.

    Pragmatic interface for functionally blind users.


    Speech Recognition

    Speech Synthesis


    Voice input -> VoXML file -> Text

    Automatic speech recognition is the process by which a computer maps an acoustic speech signal to text.

    Speech is first digitized and then matched against a dictionary of coded waveforms. The matches are converted into text.
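As a toy illustration of the matching step described above, the sketch below compares a digitized signal with a small dictionary of stored waveforms and returns the word whose template is closest. The templates, the `recognize` function, and the Euclidean distance measure are all assumptions made for illustration; real recognizers use much richer acoustic models.

```python
import math

# Hypothetical "dictionary of coded waveforms": each word maps to a short
# list of digitized samples. These values are invented for the example.
TEMPLATES = {
    "yes": [0.1, 0.8, 0.9, 0.3, 0.0],
    "no": [0.7, 0.2, 0.1, 0.6, 0.9],
    "stop": [0.5, 0.5, 0.9, 0.9, 0.1],
}


def distance(a, b):
    """Euclidean distance between two equal-length sample sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize(samples, templates=TEMPLATES):
    """Return the word whose stored waveform is closest to the samples."""
    return min(templates, key=lambda word: distance(samples, templates[word]))


# A slightly noisy version of the "yes" template.
heard = [0.15, 0.75, 0.85, 0.35, 0.05]
print(recognize(heard))
```

The closest template wins even when the input does not match it exactly, which is why a fixed dictionary of waveforms can tolerate small variations in how a word is spoken.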


    Text -> VoXML file -> Output (pre-recorded)

    The specification defines a markup language for prompting users via a combination of prerecorded speech, synthetic speech and music.

    You can select voice characteristics (name, gender and age) and the speed, volume, pitch, and emphasis. There is also provision for overriding the synthesis engine's default pronunciation.


    World Wide Web Consortium (W3C)

    Voice Browser Working Group

    Speech Interface Framework


    Established on 26 March 1999.

    Re-chartered through 31 January 2009.

    W3C Team Contacts are Kazuyuki Ashimura and Matt Womer.

    Co-chaired by Jim Larson and Scott McGlashan.


    VoiceXML

    Speech Synthesis

    Speech Recognition

    Speech Grammars

    Semantic Interpretation

    Stochastic Language Models


    VoiceXML is a dialog markup language designed for telephony applications, where users are restricted to voice and DTMF (touch tone) input.

    [Figure: web server, Internet and browser, with the files text.html and text.vxml]
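To make the idea of a dialog markup language concrete, here is a minimal sketch of what a file such as text.vxml might contain, embedded in a Python string so that it can be checked for well-formedness. The form, field name and prompt wording are invented for illustration; only the element names and the `digits` field type are taken from VoiceXML 2.0.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"

# Hypothetical dialog: the field accepts spoken or DTMF (touch tone) digits.
TEXT_VXML = """\
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="account">
    <field name="pin" type="digits">
      <prompt>Please say or key in your PIN.</prompt>
      <noinput>Sorry, I did not hear anything.</noinput>
    </field>
    <block>
      <prompt>Thank you.</prompt>
    </block>
  </form>
</vxml>
"""


def list_prompts(vxml_text):
    """Parse the document and return the text of every <prompt>, in order."""
    root = ET.fromstring(vxml_text)
    return [p.text.strip() for p in root.iter("{%s}prompt" % VXML_NS)]


for line in list_prompts(TEXT_VXML):
    print(line)
```

A voice browser would speak each prompt and wait for voice or keypad input, where a visual browser would render the equivalent HTML page.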


    The specification defines a markup language for prompting users via a combination of prerecorded speech, synthetic speech and music. We can select voice characteristics (name, gender and age) and the speed, volume, pitch, and emphasis. There is also provision for overriding the synthesis engine's default pronunciation.
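The controls listed above correspond to elements of the W3C Speech Synthesis Markup Language (SSML). The sketch below embeds a small SSML 1.0 document in a Python string and parses it to confirm that it is well-formed. The sentence, the audio file name `chime.wav` and the attribute values are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"

# Hypothetical prompt mixing synthetic speech with a prerecorded clip.
SSML = """\
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
  <voice gender="female" age="30">
    Your flight departs at
    <prosody rate="slow" volume="loud" pitch="high">nine thirty</prosody>
    from gate <emphasis level="strong">twelve</emphasis>.
    <audio src="chime.wav">chime</audio>
    <phoneme alphabet="ipa" ph="təˈmɑːtəʊ">tomato</phoneme>
  </voice>
</speak>
"""


def element_names(ssml_text):
    """Return the local names of all elements in document order."""
    root = ET.fromstring(ssml_text)
    return [el.tag.split("}")[1] for el in root.iter()]


print(element_names(SSML))
```

Here `voice` selects the voice characteristics, `prosody` sets speed, volume and pitch, `emphasis` stresses a word, `audio` plays prerecorded material, and `phoneme` overrides the engine's default pronunciation.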


    [Figure: USER, Speech, Speech Grammars, Stochastic Language Models, Semantic Interpretation]


    In most cases, user prompts are very carefully designed to encourage the user to answer in a form that matches context-free grammar rules.

    Speech Grammars allow authors to specify rules covering the sequences of words that users are expected to say in particular contexts. These contextual clues allow the recognition engine to focus on likely utterances, improving the chances of a correct match.
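A speech grammar of this kind can be sketched as a small set of context-free rules together with a check that an utterance is covered by them. The rule names, the vocabulary and the `in_grammar` helper below are hypothetical; W3C speech grammars are written in their own grammar format, not as Python dictionaries.

```python
# Hypothetical grammar: names starting with "$" are rules, everything else
# is a literal word the user is expected to say.
GRAMMAR = {
    "$order": [["i", "want", "$size", "$drink"], ["$size", "$drink", "please"]],
    "$size": [["small"], ["large"]],
    "$drink": [["coffee"], ["tea"]],
}


def match(symbols, words):
    """Return True if the symbol sequence can produce exactly these words."""
    if not symbols:
        return not words
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:
        # Rule reference: try every alternative expansion in turn.
        return any(match(alt + rest, words) for alt in GRAMMAR[head])
    # Literal word: it must equal the next word of the utterance.
    return bool(words) and words[0] == head and match(rest, words[1:])


def in_grammar(utterance, start="$order"):
    """Check whether the utterance is one the grammar expects."""
    return match([start], utterance.lower().split())


print(in_grammar("I want large tea"))
print(in_grammar("how can I help"))
```

Because only a handful of word sequences are accepted, a recognizer guided by such a grammar has far fewer candidates to choose between than one listening for arbitrary speech.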


    In some applications it is appropriate to use open-ended prompts ("How can I help?"). In these cases, context-free grammars are not useful. The solution is to use a stochastic language model.

    Such models specify the probability that one word occurs following certain others. The probabilities are computed from a collection of utterances collected from many users.
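The simplest model of this kind is a bigram model, which estimates the probability of a word given only the word before it. The sketch below computes unsmoothed estimates from three invented utterances; the sample data and the function names are assumptions for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical utterances collected from users.
UTTERANCES = [
    "i want to check my balance",
    "i want to pay my bill",
    "i need to check my bill",
]


def train_bigrams(utterances):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for utterance in utterances:
        words = ["<s>"] + utterance.split()  # <s> marks the utterance start
        for prev, word in zip(words, words[1:]):
            counts[prev][word] += 1
    return counts


def probability(counts, prev, word):
    """Estimate P(word | prev) as a relative frequency (no smoothing)."""
    total = sum(counts[prev].values())
    return counts[prev][word] / total if total else 0.0


model = train_bigrams(UTTERANCES)
print(probability(model, "i", "want"))
print(probability(model, "my", "bill"))
```

With more data the same counts let a recognizer prefer likely word sequences in open-ended speech, although practical systems also need smoothing for word pairs that never occurred in the training utterances.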


    The recognition process matches an utterance to a speech grammar, building a parse tree as a byproduct.

    There are two approaches to harvesting semantic results from the parse tree:

    1. Annotating grammar rules with semantic interpretation tags.

    2. Representing the result in XML.
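Both approaches can be illustrated with a simplified sketch: each alternative in a rule carries a semantic tag, and the collected tags are then serialized as XML. This stand-in tags words directly instead of walking a real parse tree, and the rule names, tag values and the `<result>` format are hypothetical rather than taken from the W3C specifications.

```python
import xml.etree.ElementTree as ET

# Hypothetical tagged rules: each spoken alternative carries a semantic value,
# so different wordings ("large", "big") yield the same result.
TAGGED_RULES = {
    "size": {"small": "S", "large": "L", "big": "L"},
    "drink": {"coffee": "coffee", "tea": "tea", "java": "coffee"},
}


def interpret(utterance):
    """Collect the semantic tag of every recognized word, keyed by rule."""
    result = {}
    for word in utterance.lower().split():
        for rule, alternatives in TAGGED_RULES.items():
            if word in alternatives:
                result[rule] = alternatives[word]
    return result


def to_xml(result):
    """Represent the semantic result as a small XML document."""
    root = ET.Element("result")
    for name, value in result.items():
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")


meaning = interpret("a big java please")
print(meaning)
print(to_xml(meaning))
```

The application then works with the normalized values rather than with the literal words the user happened to say.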


    Voice browsers can be divided into three categories:

    Web Browsing

    Limited Information Access

    Spoken Dialog Systems


    Browse any web page using speech input.

    Parsing for the purpose of voice recognition is done when the page is accessed.

    May or may not produce voice feedback.


    Useful information in limited domains, such as the weather in a city or stock updates.

    Audio feedback.


    Client-server architecture is used.

    Used for connecting to a remote server from a Java applet (client).

    An example is connecting to email servers.


    Voice is a very natural user interface which speeds up browsing.

    Less space requirements.

    Portable voice browsers can also be implemented.

    Practical interface for functionally blind users.

    Users can browse the web while keeping their hands and eyes free for other jobs.


    Voice browsing will become visual (multimodal).

    Can be integrated into an OS.

    Can be integrated into every application.


    Browser technology is changing very fast these days, and we are moving from the visual paradigm to the voice paradigm.

    The voice browser is the technology for entering this paradigm.

    A voice browser is a device which interprets voice input and generates voice output.


