speech and speech recognition resources

Upload: hermiit

Post on 31-May-2018


  • 8/14/2019 Speech and Speech Recognition resources


    SOFTWARE DREAMS AND TALKING MACHINES

    Could someone provide a more informative description?

    5.2: PERFORMING SPEECH SYNTHESIS

    There are several algorithms. The choice depends on the task they are used for. The easiest way is to just record the voice of a person speaking the desired phrases. This is useful if only a restricted set of phrases and sentences is used, e.g. messages in a train station, or schedule information via phone. The quality depends on the way the recording is done.

    More sophisticated, but worse in quality, are algorithms which split the speech into smaller pieces. The smaller those units are, the fewer of them there are, but the quality also decreases. An often used unit is the phoneme, the smallest linguistic unit. Depending on the language used there are about 35-50 phonemes in western European languages, i.e. there are 35-50 single recordings. The problem is combining them, as fluent speech requires fluent transitions between the elements. The intelligibility is therefore lower, but the memory required is small.

    A solution to this dilemma is using diphones. Instead of splitting at the transitions, the cut is done at the center of the phonemes, leaving the transitions themselves intact. This gives about 400 elements (20*20) and the quality increases.

    The longer the units become, the more elements there are, but the quality increases along with the memory required. Other units which are widely used are half-syllables, syllables, words, or combinations of them, e.g. word stems and inflectional endings.

    5.3: REFERENCES/BOOKS ON SYNTHESIS

    BOOKS AND PAPERS

    Douglas O'Shaughnessy, Speech Communication: Human and Machine, Addison-Wesley Series in Electrical Engineering: Digital Signal Processing, 1987.

    D. H. Klatt, "Review of Text-To-Speech Conversion for English", Jnl. of the Acoustical Society of America (JASA), Vol 82, pp 737-793.

    "Talking Machines, Theories, Models and Designs", Eds. G. Bailly & C. Benoit (Elsevier: North Holland).

    I. H. Witten, Principles of Computer Speech, London: Academic Press, Inc., 1982.

    file:///F|/summary/temp/GAR/SOFTWARE_DREAMS_AND_TALKING.HTM (2 of 25) 2004/12/10 02:42:38


    W.B. Kleijn and K.K. Paliwal (Eds.), Speech Coding and Synthesis, Elsevier, Amsterdam, 1995.

    John Allen, Sharon Hunnicut and Dennis H. Klatt, "From Text to Speech: The MITalk System", Cambridge University Press, 1987.

    Survey of the State of the Art in Human Language Technology. Report edited by Ronald Cole et al., with a section on Text-to-Speech Technologies.

    BIBLIOGRAPHIES AND REFERENCE LISTS

    WWW searchable online bibliography for Phonetics and Speech Technology with more than 8000 entries.

    Provided by Institut fur Phonetik at Johann Wolfgang Goethe-Universitat

    Frankfurt.

    Computational Speech Processing: Speech Analysis, Recognition, Understanding, Compression, Transmission, Coding, Synthesis; Text to Speech Systems, Speech to Tactile Displays, Speaker Identification, Prosody Processing: BIBLIOGRAPHY, by Conrad F. Sabourin, 192 volumes, 1187p, ISBN 2-921173-21-2, INFOLINGUA inc., P.O. Box 187 Snowdon, Montreal, H3X 3T4, Canada.

    See also: http://gomer.mlink.net/infolingua.html

    5.4: SPEECH SYNTHESIS ON THE WWW

    Most of the following are links to WWW pages with demonstrations of speech synthesis. Plenty more links are included in the detailed list of speech synthesis software/hardware in Q5.5.

    Speech Synthesis "Museum"
    URL: http://www.cs.bham.ac.uk/~jpi/synth/museum.html
    Maintained by Jon Iles ([email protected]) at the University of Birmingham. Information and speech samples for:
      • YorkTalk
      • Loughborough Sound Images
      • University of Birmingham - FDFS
      • Eurovocs
      • DECtalk
      • AT&T Bell Labs Synthesiser
      • W.A.Ll.C. - Welsh Synthesis from CSTR
      • All-Prosodic Speech Synthesis - IPOX
      • Orator from Bellcore

    Links:
    http://www.cse.ogi.edu/CSLU/HLTsurvey/ch5node1.html
    http://www.uni-frankfurt.de/~ifb/bib_engl.html
    http://gomer.mlink.net/infolingua.html
    http://www.cs.bham.ac.uk/~jpi/synth/museum.html

    Pavarobotti

    WWW demo of the Pavarobotti synthesis technology developed at the National

    Center for Voice and Speech

    Say...
    WWW demo of the rsynth speech synthesis software. The WWW capability was implemented by Axel Belinfante.

    Musee sonore de la synthese de la Parole en francais
    Speech synthesis examples from a series of French language speech synthesisers plus links to other speech synthesis demo pages.
      • ICP-Grenoble
      • CNET-Lannion (with TD-PSOLA)
      • KTH-Stockholm
      • Universite-Mons - several versions

    AT&T Bell Laboratories Voices
    AT&T Bell Laboratories text to speech (TTS) synthesizer.

    Laureate from British Telecom
    WWW interface to the Demo of the Laureate speech synthesis system - not yet commercially available. (This link may be good but it gives odd error messages.)

    ORATOR from Bellcore
    Online demo of the ORATOR system developed at Bellcore.

    SVOX from TIK, ETH in Zurich
    Demo of German speech synthesis from Institut fur Technische Informatik und Kommunikationsnetze.

    Multi-Lingual TTS from Gerhard-Mercator University, Duisburg
    Synthesis in German, English or Japanese.

    TMH: Institutionen for Taloverforing och Musikakustik, Kungliga Tekniska Hogskolan
    Synthesis in Swedish, Finnish, Norwegian, Icelandic, Danish, British and American English, French, German, Italian, Spanish, LA Spanish and Greek.

    Examples of several types of speech synthesis.
    Articulatory Synthesis by HyperASY. SineWave Synthesis. Gestural Computational Model. Pattern Playback system of the 1940's!

    Links:
    http://www.shc.uiowa.edu/fun/pavarobotti/pavarobotti.html
    http://www.shc.uiowa.edu/ncvs_home.html
    http://wwwtios.cs.utwente.nl/say
    http://ophale.icp.grenet.fr/exFr.html
    http://www.research.att.com/cgi-bin/voices.form/
    http://www.bellcore.com/ORATOR/
    http://www.tik.ee.ethz.ch/cgi-bin/w3svox
    http://www.fb9-ti.uni-duisburg.de/demos/speech.html
    http://www.speech.kth.se/info/software.html
    http://www.haskins.yale.edu/Haskins/MISC/special.html

    BeSTspeech from Berkeley Speech Technologies, Inc., (BST)

    Eurovocs Multilingual Speech Synthesis
    Based on Lernout and Hauspie technology.

    HADIFIX German Speech Synthesis
    Provided by the Institut fur Kommunikationsforschung und Phonetik, Universitat Bonn.

    Centigram's TruVoice Demo
    Allows control of speech rate, pitch and other prosodic characteristics.

    Institute of Phonetic Sciences
    Links to lots of on-line speech synthesis demonstrations provided by the Institute of Phonetic Sciences of the Faculty of Arts of the University of Amsterdam.

    Yahoo page on speech generation

    5.5: SPEECH SYNTHESIS SOFTWARE/HARDWARE

    Please email any updates, corrections or additions to the following list. The range of commercially available synthesis software is growing rapidly so any help in keeping it up to date will be appreciated.

    Other lists of speech synthesis software on the WWW include:
      • Kevin Lenzo's list of Macintosh Speech Resources and Apps
      • Speech Toys Speech Synthesis Information

    Links:
    http://www.bestspeech.com/weblang.html
    http://www.elis.rug.ac.be/ELISgroups/speech/research/eurovocs.html
    http://asl1.ikp.uni-bonn.de/~tpo/Hadiq.en.html
    http://www.centigram.com/centigram/TruVoice/index.html
    http://fonsg3.let.uva.nl/IFA-Features.html
    http://www.yahoo.com/Science/Computer_Science/Artificial_Intelligence/Natural_Language_Processing/Speech_Generation/
    http://www.speechtoys.com/spchtoys/
    http://www.cs.cmu.edu/~lenzo/mac_speech_apps.html
    http://www.speechtoys.com/spchtoys/spsyn.html

    IN THE FAQ...

    The following speech synthesis software/hardware is described in the comp.speech FAQ.
      • AsTeR
      • BeSTspeech from Berkeley Speech Technologies, Inc., (BST)
      • TheBigMouth
      • Creative TextAssist and TextAssist API
      • CSRE: Computerized Speech Research Environment
      • DECtalk: Text-to-Speech from Digital
      • Eloquence
      • Emacspeak - A Speech Output Subsystem For Emacs
      • Eurovocs
      • HADIFIX
      • Infovox Product Range
      • IPOX: All Prosodic Speech Synthesis Architecture
      • JSRU
      • Klatt-style synthesiser
      • KPE80 - A Klatt Synthesiser and Parameter Editor
      • "Learph": Trainable text-to-phoneme software by Antonio Lucca
      • Lernout and Hauspie Text-To-Speech (3 products)
      • Lernout and Hauspie Text-To-Speech Windows SDK
      • Macintosh Speech Output Applications
      • MacinTalk
      • Monologue for Windows from First Byte
      • Narrator Translator Library
      • Narrator
      • TextToSpeech Kit (NeXT)
      • Orator from Bellcore
      • AM - A Text-To-Speech Application
      • ProVerbe Speech Engine for Windows
      • ProVoice Developer's Speech Toolkit from First Byte
      • RC Systems V8600/V8601 Text to Speech synthesizer
      • rsynth
      • SENSYN speech synthesizer
      • SGI Developers Toolbox Synthesiser
      • SIMTEL
      • Sound Bytes Developer's Kit
      • spchsyn.exe
      • Speak
      • Speech Manager and PlainTalk
      • Text to Phoneme Program 1
      • Text to phoneme program 2
      • Text to phoneme program 3
      • Tinytalk
      • TrueTalk
      • TruVoice from Centigram
      • WinSpeech

    AsTeR


    Platform: UNIX
    Description: TTS front-end program which encodes structural information about documents in speech synthesis. For more information check out: http://www.research.digital.com/CRL/personal/raman/aster/aster-toplevel.html
    Operation requirements: Lisp: Lucid, clisp
    Contact: T. V. Raman
    WWW page
    Email: [email protected]

    BeSTspeech from Berkeley Speech Technologies, Inc., (BST)

    Platform: ?
    Description: BeSTspeech reads ASCII text with no vocabulary limits. Available for Dutch, English (male and female), French, German, Italian, Portuguese, Spanish, Arabic, Cantonese, Japanese, Korean, Malay, Mandarin and Russian.
    Price: ?
    Contact: Berkeley Speech Technologies, Inc.
    246 Sixth Street, Berkeley, California 94710, USA
    Ph: (510) 841-5083, Fax: (510) 841-5093
    Email: [email protected]
    WWW

    TheBigMouth - a Text to Speech Program

    Platform: NeXT
    Description: Text to speech program based on concatenation of pre-recorded speech segments. NeXT equivalent of "Speak" for Suns.
    Availability: try NeXT archive sites such as sonata.cc.purdue.edu.

    Creative TextAssist

    Platform: Windows
    Description: Based on DECtalk speech synthesis. A detailed technical description of TextAssist is provided on the Creative WWW pages.
    Availability: Creative TextAssist is bundled with most (all?) Creative Sound Blaster audio cards.
    Contact: Creative Labs, Inc.
    Address, phone, email etc unknown
    WWW
    Info

    Links:
    http://www.research.digital.com/CRL/personal/raman/aster/aster-toplevel.html
    http://www.research.digital.com/CRL/personal/raman/raman.html
    http://www.bestspeech.com/index.html

    Creative TextAssist API

    Platform: Windows
    Description: The TextAssist API (TAAPI) is created for Microsoft Windows 3.1x and Windows 95 developers who intend to develop 16-bit Text-to-Speech software applications using Creative's TextAssist speech engine. It supports direct control of speech output characteristics, concurrent playback of text-to-speech and wave files, foreign language support, speech synchronization, and exception dictionaries. It also includes a voice editing tool for creating new custom voices, a Visual Basic Custom Control for high-level text-to-speech support in Visual Basic and other languages, and some sample programs.
    Availability: The TextAssist API is released to registered developers at no cost.
    Contact: WWW

    CSRE: Computerized Speech Research Environment

    Platform: PC
    Description: CSRE is a software system which includes an implementation of the Klatt speech synthesizer. See the CSRE entry in Q1.9 and the AVAAZ WWW pages for more detail.
    Contact: AVAAZ Innovations Inc.
    P.O. Box 8040, 1225 Wonderland Rd. N, London, Ontario, CANADA, N6G 2B0
    Ph: +1-519-472-7944, Fax: +1-519-472-7814
    Email: [email protected]
    WWW

    Links:
    http://www.creaf.com/
    http://www.creaf.com/wwwnew/tech/devcnr/tassist.html
    http://www.icis.on.ca/homepages/avaaz/

    DECtalk Speech Synthesis

    Platform: Windows NT, Alpha with Digital UNIX and RS232 ports
    Description: Converts ordinary text into natural-sounding, intelligible speech. Provides personalized voices and extensive user controls. DECtalk technology is available for the following packaging options.

    DECtalk PC card option:
    An industry-standard ISA/EISA bus card implementation that can be integrated with any Intel 486 processor-based system running DOS or Windows. Applications can be interfaced to the bus via a DOS Terminate and Stay Resident (TSR) driver or a Windows Dynamic Link Library (DLL). This option is available with an external speaker with volume control and headphone jack.

    DECtalk Express external package:
    An external, portable package that you can plug in to any PC or serial port. The external package includes a built-in speaker and headphone jack, plus combined on/off and volume controls and a rechargeable battery pack.

    DECtalk Software solution:
    Software-only text to speech for Alpha or Intel systems running Windows NT or Alpha systems running Digital UNIX. Provides complete speech synthesis capabilities so developers can enhance applications with DECtalk technology. DECtalk Software output can be directed to audio devices, into WAVE files, or into memory buffers.

    Pricing: DECtalk-Speech-Synthesis
    More Information: Digital Equipment Corporation WWW pages
    Ph: 1-800-DIGITAL

    DECtalk Software

    Platform: Digital UNIX and Windows NT
    Description: DECtalk converts standard ASCII text into natural, intelligible speech. Speech output through any audio device is supported by Microsoft Video for Windows or Multimedia Services for Digital UNIX. An API gives developers direct access to text-to-speech functions. Provides nine voice personalities (4 female, 4 male, 1 child). Provides punctuation and tonal control, and supports customized pronunciation of trade jargon and acronyms. A common programming interface works with both Alpha and Intel platforms.
    More Information: Digital Equipment Corporation WWW pages; DECtalk Software page
    Ph: 1-800-DIGITAL

    Links:
    http://www2.service.digital.com/ddi/html/DECtalk-Speech-Synthesis-oi.html
    http://www.digital.com/
    http://www2.service.digital.com/ddi/html/DECtalk-Software.html

    Eloquence

    Platform: Windows, Solaris, SunOS, SGI, RS/6000
    Description: Software based text-to-speech package. Generates waveforms completely algorithmically instead of by concatenating waveforms, for maximum flexibility and naturalism. For instance, when the user requests a deeper voice, the software simulates a larger vocal tract, instead of simply pitch-shifting samples.

    Uses high-level linguistic parsing, which obviates the need for a huge dictionary. Handles numbers, acronyms, currency, etc. Includes a set of annotation symbols for placing stress on particular words, expressing excitement/boredom, etc. Also allows phonetic input. Support for Windows DDL.

    Produces male and female voices for General American English. Dialects under development include Alabama, Brooklyn, and Boston.

    Price: Flexible license agreements on application.
    Availability: Eloquent Technology, Inc.
    2389 North Triphammer Road
    Ithaca, NY 14850
    Ph: (607) 266-7020, Fax: (607) 266-7030
    Email: [email protected]

    Emacspeak - A Speech Output Subsystem For Emacs

    Platform: UNIX, Emacs
    Description: Emacspeak is a speech output system that will allow someone who cannot see to work directly on a UNIX system. Emacspeak is built on top of Emacs. With emacspeak loaded, Emacs provides spoken feedback for everything you do.

    Emacspeak currently supports the new Dectalk Express speech synthesizer, as well as older versions of the Dectalk, e.g. the MultiVoice. See the Emacspeak WWW page, the Emacspeak FAQ or the Emacspeak distribution for additional details.

    Requirements: Requires GNU FSF Emacs 19 (version 19.23 or later) and TCLX 7.3B (Extended TCL) to run Emacspeak.
    Availability: Not known at this time (web sites are gone)
    Contact: T. V. Raman, [email protected]

    Eurovocs

    Platform: Various - RS232 Connection
    Description: Eurovocs is a stand-alone text-to-speech synthesizer which uses the text-to-speech technology of Lernout and Hauspie Speech Products. Available for Dutch, French, German and American English with other languages planned for release soon. One Eurovocs device can support two different languages. Eurovocs can be connected to any computer via a standard serial interface (RS232). It supports personal dictionaries, generation of DTMF tones, and pronunciation of special character sequences such as digit strings, telephone numbers, date and time indications, abbreviations, alphanumeric strings etc.

    Contact: Technologie & Revalidatie
    Postbus 128, B-9000 Gent, Belgium
    Ph: +32-9-264 33 97, Fax: +32-9-264 35 94
    E-mail: [email protected]
    WWW page:
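The Eurovocs entry lists DTMF tone generation among its features. As a rough illustration of what that involves (not Eurovocs code), the sketch below builds one DTMF key as the sum of a row tone and a column tone from the standard keypad frequencies. The function name `dtmf_samples`, the 8 kHz default rate and the 0.1 s duration are assumptions chosen for the example.

```python
import math

# Standard DTMF keypad: each key sounds one row tone plus one column tone (Hz).
ROW_HZ = (697, 770, 852, 941)
COL_HZ = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")

DTMF_FREQS = {
    key: (ROW_HZ[r], COL_HZ[c])
    for r, row in enumerate(KEYPAD)
    for c, key in enumerate(row)
}


def dtmf_samples(key, duration_s=0.1, rate=8000):
    """Return float samples in [-1, 1] for one DTMF key (hypothetical helper)."""
    low, high = DTMF_FREQS[key]
    n = int(round(duration_s * rate))
    return [
        0.5 * (math.sin(2 * math.pi * low * i / rate)
               + math.sin(2 * math.pi * high * i / rate))
        for i in range(n)
    ]
```

Calling `dtmf_samples("5")` gives the two-tone signal for the 5 key; a real device would also insert pauses between digits and scale the samples to its output format.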

    HADIFIX

    Platform: Windows
    Description: German speech synthesis system developed at the Institute for Communication Research and Phonetics, University of Bonn. Provides conversion of input text to phonemes, automatic prediction of stress, phrasing and pitch, and speech generation by concatenation of small units of natural speech. Demisyllables and similar units are used; they comprise all consonants before the vowel and the beginning of the vowel (initial demisyllable) or the end of the vowel and the following consonants (final demisyllable). For example, the word 'Strolch' is formed by concatenating 'Stro' and 'olch'.
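The demisyllable split in the 'Strolch' example can be sketched on spelling alone. This is only an illustration under strong assumptions: HADIFIX works on phonemes and recorded speech, whereas the hypothetical `demisyllables` helper below operates on the letters of a single written syllable and treats a fixed set of letters as vowels.

```python
# Letters treated as vowels for this toy example (an assumption, not HADIFIX's rules).
VOWELS = set("aeiouäöü")


def demisyllables(syllable):
    """Split one written syllable into (initial, final) demisyllables.

    The initial part is the leading consonants plus the vowel; the final
    part is the vowel plus the trailing consonants, so the vowel is shared.
    """
    lower = syllable.lower()
    start = next((i for i, ch in enumerate(lower) if ch in VOWELS), None)
    if start is None:
        raise ValueError("syllable has no vowel: %r" % syllable)
    end = start
    while end < len(lower) and lower[end] in VOWELS:
        end += 1
    return syllable[:end], syllable[start:]


print(demisyllables("Strolch"))  # ('Stro', 'olch')
```

In an actual concatenative system the two halves would be recorded units joined inside the vowel, which is where the transition is least audible.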

    Demo: Windows demo software available. Limited to synthesis of one short text (text.txt) at a time. Speech format limitations too. 1.3MB file.
    WWW page
    On-line demo

    Links:
    http://www.elis.rug.ac.be/ELISgroups/speech/research/eurovocs.html
    ftp://asl1.ikp.uni-bonn.de/pub/hadifix/hadidemo.zip
    http://asl1.ikp.uni-bonn.de/~tpo/Hadifix.en.html
    http://asl1.ikp.uni-bonn.de/~tpo/Hadiq.en.html

    Infovox Product Range

    Description: Multilingual Text-to-speech systems, languages available: American English, British English, German, French, Spanish, Italian, Swedish, Norwegian, Icelandic, Danish and Finnish.

    Product name: INFOVOX 500, PC BOARD
    Product description: Half length expansion board for IBM PC, XT, AT, PS/2 model 30 or compatible personal computers. The board can also be connected via the serial port. Language and control program for downloading into RAM or mounted on EPROMs.
      • Platform: IBM PC, XT, AT, PS/2 model 30 or compatible
      • Delivered standard interface: MS DOS I/O driver

    Product name: INFOVOX 600, OEM BOARD
    Product description: OEM board built with CMOS IC's. Language and control program are stored in on-board fixed memory.
      • Platform: any, Interface: 9-pole D-SUB (RS 232-C) 300-9600 Baud
      • Delivered standard interfaces: MS DOS I/O driver and interface to Apple Speech manager

    Product name: INFOVOX 700, DESKTOP UNIT
    Product description: Desktop unit with built in Infovox 600 to be connected to a computer or terminal via an RS 232-C serial interface. Built in loudspeaker and rechargeable battery for 4 hours use, and control knobs for continuous control of speech volume and speed.
      • Platform: any
      • Delivered standard interfaces: MS DOS I/O driver and interface to Apple Speech manager

    Product name: INFOVOX 650, OEM BOARD
    Product description: OEM board built with CMOS IC's. Language and control program are stored in on-board memory.
      • Platform: any, Interface: 9-pole D-SUB (RS 232-C) 300-9600 Baud
      • Delivered standard interfaces: MS DOS I/O driver and interface to Apple Speech manager

    Product name: INFOVOX 750, DESKTOP UNIT
    Product description: Desktop unit with built in Infovox 650 to be connected to a computer or terminal via an RS 232-C serial interface. Built in loudspeaker and rechargeable battery for 5 hours use, and a control knob for continuous control of speech volume.
      • Platform: any
      • Delivered standard interfaces: MS DOS I/O driver and interface to Apple Speech manager

    Product name: Infovox 210, software for Apple Macintosh
    Product description: Software based text-to-speech conversion. Produces 16 bit and 8 bit sound. Delivered on 3.5" diskettes with user lexicon and complete documentation.
      • Platform: Apple Macintosh with minimum 68030, 33 MHz microprocessor
      • Delivered standard interfaces: Standard interface to Apple Speech manager

    Product name: Infovox 220, software for Microsoft Windows
    Product description: Software based text-to-speech conversion. Produces 16 bit sound and conforms to Microsoft Windows multimedia standard MCI. Delivered on 3.5" diskettes with user lexicon and complete documentation.
      • Platform: IBM compatible PC with minimum 486, 25 MHz microprocessor
      • Delivered standard interfaces: Standard interface to Microsoft Windows 3 and sound boards supporting Microsoft Windows multimedia driver for audio

    Contact: Telia Promotor Infovox AB
    TTS Sales Division
    P.O. Box 2069
    S-171 02 Solna, Sweden
    Ph: +46 8 764 35 00, Fax: +46 8 735 78 76
    Email: [email protected]

    IPOX: All Prosodic Speech Synthesis Architecture

    Description: IPOX is an experimental, all-prosodic speech synthesizer, developed by Arthur Dirksen and John Coleman. IPOX is freely available (after registration) for evaluation and non-profit research purposes.
    Requirements: PC (preferably a fast 486) running Windows 3.1 or higher. Sound output requires a 16-bit Windows-compatible sound card.
    Availability: By WWW

    Links:
    http://www.tue.nl/ipo/people/adirksen/ipox/ipox.htm

    JSRU

    Platform: UNIX and PC
    Cost: 100 pounds sterling (from academic institutions and industry)
    Description: A "C" version of the JSRU system, Version 2.3, is available. It is written in Turbo C but runs on most Unix systems with very little modification. A Form of Agreement must be signed to say that the software is required for research and development only.
    Contact: Dr. E.Lewis [email protected]

    Klatt-style synthesiser

    Platform: Unix
    Cost: Free
    Description: Software posted to comp.speech in late 1992.
    Availability: By ftp from the comp.speech ftp site

    KPE80 - A Klatt Synthesiser and Parameter Editor

    Platform: Unix
    Description: The KPE80 program provides a graphical interface for the implementation of the Klatt 1980 formant synthesiser written by Jon Iles and Nick Ing-Simmons. It was inspired by IGE, a piece of code written by Rob Fletcher.
    Technical Desc.: It is comprised of an X-Window interface and version 3.03 of the synthesiser code. The interface allows users to display and edit Klatt parameters using a graphical display which includes the time-amplitude waveform of both the original speech and its synthetic copy, and some signal analysis facilities. Most of the work in choosing the parameter values to produce the synthetic copy has to be done by the user. KPE will estimate the fundamental frequency contour from an original token; this estimate will need to be amended where errors occur. It is possible to specify the formant trajectories with some precision by overlaying the appropriate formant frequency parameter tracks on the spectrogram of the target waveform. A number of facilities exist to help in the refinement of parameter values: original and synthetic waveforms can be compared aurally, spectrally, and spectrographically using built-in speech analysis facilities.
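For readers unfamiliar with what a Klatt parameter such as a formant frequency or bandwidth controls, the sketch below shows the second-order digital resonator usually described for Klatt-style formant synthesisers. It is not taken from KPE80 or the synthesiser code mentioned above; the function names and the example values (500 Hz formant, 60 Hz bandwidth, 10 kHz sampling rate) are assumptions for illustration.

```python
import math


def resonator_coefficients(freq_hz, bw_hz, rate_hz):
    """Coefficients of y[n] = a*x[n] + b*y[n-1] + c*y[n-2] for one formant."""
    t = 1.0 / rate_hz
    c = -math.exp(-2.0 * math.pi * bw_hz * t)
    b = 2.0 * math.exp(-math.pi * bw_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c  # normalises the gain at 0 Hz to 1
    return a, b, c


def resonate(samples, freq_hz, bw_hz, rate_hz):
    """Filter a list of samples through a single formant resonator."""
    a, b, c = resonator_coefficients(freq_hz, bw_hz, rate_hz)
    y1 = y2 = 0.0  # previous two output samples
    out = []
    for x in samples:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


# Example: impulse response of a 500 Hz formant with 60 Hz bandwidth at 10 kHz.
impulse = [1.0] + [0.0] * 99
response = resonate(impulse, 500.0, 60.0, 10000.0)
```

A formant synthesiser typically chains or sums several such resonators, one per formant, and updates their frequency and bandwidth parameters every few milliseconds; those time-varying tracks are what a parameter editor exposes.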

    Links:
    ftp://svr-ftp.eng.cam.ac.uk/pub/comp.speech/synthesis/klatt.3.04.tar.gz
    http://www.york.ac.uk/~rpf1/IGE.html

    File formats: KPE will read RIFF (.wav) files and SFS files. (SFS is a suite of speech-signal processing programs available free from Phonetics and Linguistics, UCL.)
    Availability:
      • KPE for SunOs 4.1.3 (statically compiled libraries)
      • KPE for Linux (statically compiled libraries)
      • The source code (needs gcc and SUIT to compile)
      • A postscript overview of KPE
      • The SFS distribution
    See also: Public domain Klatt-style speech synthesis code.
    Contact: Andrew Simpson
    Department of Phonetics and Linguistics, University College London
    Wolfson House, 4 Stephenson Way, London NW1 2HE
    Email: [email protected]
    WWW page

    "Learph": Trainable text-to-phoneme software by Antonio Lucca

    Platform: UNIX
    Description: Experimental software which learns text to phoneme translation from examples using decision-tree-like data structures. It is based on the assumption that each letter can correspond to different phoneme strings depending on the context.
    Availability: Examples and source are available on the WWW
    Contact: Antonio Lucca: [email protected]
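The idea that each letter maps to different phoneme strings depending on context can be shown with a very small learner. The sketch below is not the program described above: it replaces the decision-tree-like structure with a two-level lookup (letter with its left and right neighbours, falling back to the letter alone), and the training words, phoneme symbols and function names are all invented for the example.

```python
from collections import Counter, defaultdict


def train(aligned_words):
    """Learn letter-to-phoneme counts from words given as (letter, phonemes) pairs."""
    by_context = defaultdict(Counter)  # (left, letter, right) -> phoneme counts
    by_letter = defaultdict(Counter)   # letter -> phoneme counts (fallback)
    for word in aligned_words:
        letters = [letter for letter, _ in word]
        for i, (letter, phones) in enumerate(word):
            left = letters[i - 1] if i > 0 else "#"
            right = letters[i + 1] if i + 1 < len(letters) else "#"
            by_context[(left, letter, right)][phones] += 1
            by_letter[letter][phones] += 1
    return by_context, by_letter


def transcribe(text, model):
    """Translate text letter by letter, preferring the most specific context seen."""
    by_context, by_letter = model
    out = []
    for i, letter in enumerate(text):
        left = text[i - 1] if i > 0 else "#"
        right = text[i + 1] if i + 1 < len(text) else "#"
        counts = by_context.get((left, letter, right)) or by_letter.get(letter)
        out.append(counts.most_common(1)[0][0] if counts else "?")
    return [p for p in out if p]  # drop letters aligned to silence


# Invented, hand-aligned training examples.
EXAMPLES = [
    [("c", "k"), ("a", "ae"), ("t", "t")],
    [("c", "s"), ("i", "ih"), ("t", "t"), ("y", "iy")],
    [("c", "k"), ("o", "aa"), ("t", "t")],
]
model = train(EXAMPLES)
print(transcribe("cit", model))  # ['s', 'ih', 't']
```

A decision tree generalises this by asking questions about progressively wider context only where the examples disagree, instead of storing a fixed-width window.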

    ernout & Hauspie Text-to-Speech (3 products)

    ernout & Hauspie have three TTS products. The functionality of the products is simiowever, they differ in hardware implementation and other details where describedelow.

    &H tts2000/T: TTS for the Telephony and Telecommunications Market&H tts2000/M: TTS for the Computer and Multimedia Market&H tts3000/C: TTS for the Buisness and Consumer Electronics Marketescription:

    Text to Speech (TTS) software based on parameterized segment concatenation(diphones, triphones and tetraphones) algorithms. Available for US English,German, Dutch, French, Spanish (Castilian), Italian and Korean.

    eneral features include:r The control of volume, speech rate and speech pitch.

    le:///F|/summary/temp/GAR/SOFTWARE_DREAMS_AND_TALKING.HTM (15 of 25)2004/12/10 02:42:38 .

    ftp://pitch.phon.ucl.ac.uk/pub/kpe/kpe80.sun413.tar.Zftp://pitch.phon.ucl.ac.uk/pub/kpe/kpe80.linux.tar.Zftp://pitch.phon.ucl.ac.uk/pub/kpe/kpe80.src.tar.Zftp://pitch.phon.ucl.ac.uk/pub/kpe/OVERVIEW.psftp://pitch.phon.ucl.ac.uk/pub/sfs/mailto:[email protected]://www.phon.ucl.ac.uk/home/andrew/home.htmlhttp://www.silab.dsi.unimi.it/~al367212/lucca/TTS/ttsdoc.htmlmailto:[email protected]:[email protected]://www.silab.dsi.unimi.it/~al367212/lucca/TTS/ttsdoc.htmlhttp://www.phon.ucl.ac.uk/home/andrew/home.htmlmailto:[email protected]://pitch.phon.ucl.ac.uk/pub/sfs/ftp://pitch.phon.ucl.ac.uk/pub/kpe/OVERVIEW.psftp://pitch.phon.ucl.ac.uk/pub/kpe/kpe80.src.tar.Zftp://pitch.phon.ucl.ac.uk/pub/kpe/kpe80.linux.tar.Zftp://pitch.phon.ucl.ac.uk/pub/kpe/kpe80.sun413.tar.Z
  • 8/14/2019 Speech and Speech Recognition resources

    16/25

    OFTWARE DREAMS AND TALKING MACHINES

    - The use of control sequences to customize TTS output (adding pauses, using phonetic input, etc.).
    - Switching between languages at run time.
    - A personal vocabulary editor is available for building exception dictionaries.
    - Readout modes: letter by letter, word by word or sentence by sentence.
    - Input formats: orthographic input, phonetic input, phonetic input with prosodic information.

    s2000/T
    - Output formats: 8 bit mu-law PCM, 8 bit A-law PCM, 16 bit linear PCM.
    - Sampling Frequency: 8 kHz
    - Single channel platform examples: SHARP SH7000, ARM6/ARM7, Intel i960, TI TMS320C31, AT&T DSP3210
    - Multi-channel platform examples: TI TMS320C31, AT&T DSP3210

    s2000/M
    - Output formats: 8/16 bit wave format, 8 bit mu-law PCM, 8 bit A-law PCM, 16 bit linear PCM.
    - Sampling Frequency: 8/10/11.025 kHz
    - Single processor platform examples: ARM6/ARM7, Intel 386/486/Pentium, Motorola 68040
    - Two processor platform examples: {Intel 386/486/Pentium or Motorola 68030} and {ADI ADSP21XX or Motorola 5600X or TI TMS320C25/20C5X}

    s3000/C
    - Output formats: 8 bit mu-law PCM, 8 bit A-law PCM, 16 bit linear PCM.
    - Sampling Frequency: 10 kHz
    - Single processor platform examples: SHARP SH7000, ARM6/ARM7, Intel i960, TI TMS320C31, AT&T DSP3210
    - Two processor platform examples: {SHARP SH7000 or ARM6/ARM7 or Intel 386EX or Motorola 683XX} and {ADI ADSP21XX or Motorola 5600X or TI TMS320C25/C5X or TI TSP50C10}
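
    The 8 bit mu-law PCM output format listed for these products is the companded encoding standardised in ITU-T G.711. As a rough sketch only (this is not L&H code, and the function name linear_to_mulaw is chosen here for illustration), one 16 bit linear PCM sample can be compressed to one mu-law byte along these lines:

```python
BIAS = 0x84   # 132, added before the segment search
CLIP = 32635  # largest magnitude that still fits after adding BIAS


def linear_to_mulaw(sample):
    """Compress one signed 16 bit linear PCM sample to an 8 bit mu-law byte."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS

    # Find the segment (exponent): the position of the highest set bit,
    # counted so that bit 7 is segment 0 and bit 14 is segment 7.
    exponent = 7
    mask = 0x4000
    while exponent > 0 and not (magnitude & mask):
        exponent -= 1
        mask >>= 1

    # Four bits below the leading bit form the mantissa.
    mantissa = (magnitude >> (exponent + 3)) & 0x0F

    # mu-law bytes are stored bit-inverted.
    return ~(sign | (exponent << 4) | mantissa) & 0xFF
```

    Silence (sample value 0) maps to the byte 0xFF, and a sample and its negative differ only in the top bit of the encoded byte.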

    See also: L&H Windows TTS SDK
    More Information: on the Lernout & Hauspie WWW pages (http://www.lhs.com/tts.html)

    Price: Unknown
    Contact: Lernout & Hauspie Speech Products
    00 West Cummings Park, Suite 3100
    Woburn, MA 01801, USA
    Tel: (617) 932 4118
    Fax: (617) 932 9209
    Email: [email protected]

    Lernout & Hauspie Text-to-Speech Windows SDK


    Platform: IBM-Compatible
    Description: The L&H Text-to-Speech software developers kit integrates text-to-speech technology with your own or existing PC applications under Microsoft Windows 3.1. The software converts written text into clear, human-sounding synthetic speech.

    Requirements:
    - IBM-compatible PC 386 DX/33, 8Mb RAM
    - MS DOS 5.0 and MS Windows 3.1 (or higher)
    - SoundBlaster compatible sound board.

    See also: L&H TTS Products
    More Information: on the Lernout & Hauspie WWW pages (http://www.lhs.com/tts.html)

    Price: Unknown
    Contact: Lernout & Hauspie Speech Products
    00 West Cummings Park, Suite 3100
    Woburn, MA 01801, USA
    Tel: (617) 932 4118
    Fax: (617) 932 9209
    Email: [email protected], WWW page (http://www.lhs.com/)

    Macintosh Speech Output Applications

    A comprehensive list of Macintosh Speech Applications is provided by Kevin Lenzo at CMU (http://www.cs.cmu.edu/~lenzo/mac_speech_apps.html).

    The Apple Speech WWW Site (http://www.info.apple.com/apple.speech/) has some useful information.

    MacinTalk

    Platform: Macintosh
    Cost: Free
    Description: Formant based speech synthesis. There is also a program called "tex-edit" which apparently can pronounce English sentences reasonably using MacinTalk.

    Note: MacinTalk doesn't run reliably on Macintoshes with new sound hardware under the latest OS (System 7.1 w/HUD 2.0). More recent software is listed above.

    Availability:
    By anonymous ftp from many archive sites (have a look on archie if you can). tex-edit is on many of the same sites.
    - http://www.riken.go.jp/archives/mac/umich/sound/speech/00index.txt
    - http://jumbo.com/util/mac/speech/


    This article by my friend Denise Lance (http://www.sped.ukans.edu/~dlance/plaintalk.html) will give you some ideas on the more modern speech offerings of Apple/Macintosh. When you have finished reading the article (there are some appropriate notes to read) you can also download English_Text-to-Speech from there.

    Monologue for Windows from First Byte

    Description:
    Monologue is a software program that reads text from the clipboard in Windows 16 or 32 bit applications. It can be found as a bundled product with many sound cards and multimedia general purpose computer systems. Monologue can add the element of speech to virtually any text oriented application. Any pronounceable combination of letters and numbers will be spoken clearly. It can be applied to tasks such as eyes-free proofreading, data verification (e.g. spreadsheets), reading E-mail and more. User-changeable parameters provide control over the sound quality by allowing for changes in pitch and the speed of speech. An exception dictionary saves preferred pronunciations of words and abbreviations.

    Monologue Win32 now includes support for the Microsoft SAPI. Monologue male "SpeechFonts" are available for US English, British English, German, French, Latin American Spanish and Italian. A US English Female SpeechFont is also available. For more detailed information and examples go to the First Byte WWW pages (http://www.firstbyte.davd.com/).

    Availability: Currently bundled with many sound cards and multimedia general purpose computer systems. For pricing, licensing details, and release information see the First Byte WWW pages or email [email protected].

    See also: ProVoice Developer's Speech Toolkit from First Byte
    Contact: First Byte
    9840 Pioneer Ave., Torrance, CA 90503
    Ph: 310-793-0610 Fax: 310-793-0611
    Email: [email protected] or WWW page

    Narrator Translator Library

    Platform: Amiga
    Description:
    A replacement for the Commodore-supplied "translator.library", which is a part of the Narrator speech synthesis package. It implements multi-lingual text-to-speech for an Amiga. The library allows the user to specify the language that the text to be spoken should be translated as. This can be done by setting the default language or by including markup codes in the text, in a similar way to LaTeX or HTML, e.g. "\french{Bonjour}". There is currently support for American English, British English, Swedish, Maori, Finnish, German, Icelandic, Klingon, Polish, Italian, and Welsh.

    Availability:
    The library (but not source) is available by anonymous ftp from Aminet (ftp://ftp.doc.ic.ac.uk/pub/aminet/util/libs/translator42.lha)

    More Information: is available on the WWW (http://www.sans.vuw.ac.nz/~ffranc/translator/index.html)
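
    The "\french{Bonjour}" style of language markup described for the Narrator Translator Library can be illustrated with a small parser. This is a sketch under assumptions: the exact markup grammar accepted by the library is not given here, so the pattern below (a backslash, a language name, then the text in braces, with no nesting) and the name split_by_language are hypothetical.

```python
import re

# Assumed markup shape: \language{text}, no nested braces.
MARKUP = re.compile(r"\\([A-Za-z]+)\{([^{}]*)\}")


def split_by_language(text, default="english"):
    """Return a list of (language, text) runs for a marked-up string."""
    runs = []
    pos = 0
    for match in MARKUP.finditer(text):
        # Plain text before the markup is spoken in the default language.
        if match.start() > pos:
            runs.append((default, text[pos:match.start()]))
        runs.append((match.group(1).lower(), match.group(2)))
        pos = match.end()
    # Trailing plain text after the last markup code.
    if pos < len(text):
        runs.append((default, text[pos:]))
    return runs
```

    Each run could then be handed to the translation rules for its language, with unmarked text falling back to the default language setting.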

    Narrator

    Platform: Amiga
    Description:
    Formant based speech synthesis. Includes an English-to-phoneme translation library, and a SPEAK: pseudo-device for speech output.

    Hardware: Standard Amiga hardware
    Availability: Part of AmigaOS
    See Also: The Narrator Translator Library

    TextToSpeech Kit

    Platform: NeXT Computers
    Description:
    The TextToSpeech Kit does unrestricted conversion of English text to synthesized speech in real-time. The user has control over speaking rate, median pitch, stereo balance, volume, and intonation type. Text of any length can be spoken, and messages can be queued up, from multiple applications if desired. Real-time controls such as pause, continue, and erase are included. Pronunciations are derived primarily by dictionary look-up. The Main Dictionary has nearly 100,000 hand-edited pronunciations which can be supplemented or overridden with the User and Application dictionaries. A number parser handles numbers in any form. A letter-to-sound knowledge base provides pronunciations for words not in the Main or customized dictionaries. Dictionary search order is under user control. Special modes of text input are available for spelling and emphasis of words or phrases. The actual conversion of text to speech is done by the TextToSpeech Server. The Server runs as an independent task in the background, and can handle up to 50 client connections.
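
    The look-up scheme described above (several dictionaries searched in a user-chosen order, with letter-to-sound rules as a last resort) can be sketched as follows. This is not Trillium's implementation: the function names, the sample entries and the phoneme strings are all made up for illustration, and the fallback simply spells the word out instead of applying real letter-to-sound rules.

```python
def spell_out(word):
    """Stand-in for a letter-to-sound rule base: just names the letters."""
    return " ".join(word.upper())


def pronounce(word, dictionaries, fallback=spell_out):
    """Search (name, dict) pairs in order; fall back to rules if no hit.

    Returns (source_name, pronunciation).
    """
    key = word.lower()
    for name, entries in dictionaries:
        if key in entries:
            return name, entries[key]
    return "letter-to-sound", fallback(word)


# Hypothetical sample data; real dictionaries would be far larger.
user_dict = {"calgary": "K AE L G ER IY"}
app_dict = {}
main_dict = {"speech": "S P IY CH", "calgary": "K AE L G AH R IY"}

search_order = [("user", user_dict), ("application", app_dict), ("main", main_dict)]
```

    With this search order a User entry overrides the Main Dictionary; reversing the list gives the Main Dictionary priority, which is the kind of control the description refers to.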

    Misc:
    The TextToSpeech Kit comes in two packages: the Developer Kit and the User Kit. The Developer Kit enables developers to build and test applications which incorporate text-to-speech. It includes the TextToSpeech Server, the TextToSpeech Object, the pronunciation editor PrEditor, several example applications, phonetic fonts, example source code, and developer documentation. The User Kit provides support for applications which incorporate text-to-speech. It is a subset of the Developer Kit.

    Hardware:
    Uses standard NeXT Computer hardware.

    Cost:
    - TextToSpeech User Kit: $175 CDN ($145 US)
    - TextToSpeech Developer Kit: $350 CDN ($290 US)
    - Upgrade from User to Developer Kit: $175 CDN ($145 US)

    Availability: Trillium Sound Research
    500, 112 - 4th Ave. S.W., Calgary, Alberta, Canada, T2P 0H3
    Tel: (403) 284-9278 Fax: (403) 282-6778
    Order Desk: 1-800-L-ORATOR (US and Canada only)
    Email: [email protected]

    Orator Text-to-Speech Synthesizer

    Platform: SUN SPARC, Decstation 5000. Written in C, and therefore portable to other UNIX platforms. Some successful ports: HP, RS-6000, PC-Unix [Linux].
    Description:
    Sophisticated speech synthesis package. Has text preprocessing (for abbreviations, numbers), acronym rules, and human-like spelling routines. Natural-sounding synthesis based on demisyllable concatenation. Has high accuracy for pronunciation of names of people, places and businesses in America; good accuracy for English text; rules for stress and intonation marking; various methods of user control and customization at most stages of processing.

    A new version of the ORATOR system is under development. Both ORATOR and this new "ORATOR II" system are capable of general text synthesis. The ORATOR II system has a more natural-sounding voice.

    Hardware: Runs on common SPARC or Decstation workstations, using their internal audio output capability. At least 16M of memory is recommended.

    More detailed information plus examples of ORATOR synthesis are available on the ORATOR WWW pages (http://www.bellcore.com/ORATOR/)

    Misc 1: A free demo cassette is available.

    Misc 2: Examples of Orator are also available on the University of Birmingham Speech Synthesis "Museum" WWW site (see Q5.4).

    Availability and Pricing: Contact Bellcore's Licensing Office
    Tel: 1-800-521-CORE (521-2673)
    Fax: 1-908-336-2559
    Email to Anthony Lindsey: [email protected]

    PAM - A Text-To-Speech Application

    Platform: Windows
    Description:
    PAM is a talking personal assistant and text reader application. It uses the ProVoice TTS package. PAM will verbally advise about appointments and reminder messages at specified times during the day. It can read text files, clipboard text, and text sent in DDE messages. Using the full verbal interface, PAM can be used by visually challenged individuals. Shareware - thirty day free trial.

    Requirements: Any Windows sound card, speakers or headphones.
    Min. memory - 4 megs, 8 megs recommended.
    A more complete description is available on the JTS homepage (http://www.islandnet.com/~tslemko/homepage.html)

    Availability:
    The shareware and associated files can be downloaded by ftp (ftp://ftp.islandnet.com/jts/)

    Price: $US40 for the registered version.

    Contact: Tom Slemko: [email protected]
    JTS Micro Consulting Ltd
    0931 Lytton Road, RR#4
    Ladysmith, B.C., Canada, V0R 2E0

    ProVerbe Speech Engine for Windows (95 and NT)

    Description: The ProVerbe Speech Engine produces natural sounding speech from written text. Naturalness is achieved by using the TD-PSOLA process from the CNET (France Telecom's research lab), which is based on the concatenation of elementary speech units (including diphones). Supported languages are British English, German, French and Spanish. For multi-channel applications Elan Informatique also provides hardware platforms. Elan Informatique provides an SDK reference document (sdkxe: WinWord6 format in a self extractable compressed format).

    Demo versions:
    - Telephone demonstration: +33-61 17 6701


    - Anonymous ftp: ftp://www.cict.fr/pub/elan/ or ftp://ftp.cict.fr/pub/elan/

    The directory includes the following demos.
    - PVBSEDP.zip: French male voice (4.3MB)
    - PVBFRF.zip: French female voice (4.6MB)
    - PVBSPA.zip: Spanish male voice (4.6MB)
    - PVBGER.zip: German male voice (14.0MB)
    - PVBENG.zip: English male voice (9.9MB)

    The directory also includes synthesis samples for a French male voice, French female voice, English male voice, and a German male voice. The readme file in the directory describes the memory requirements for the demos.
    A CD-ROM with all these demonstrations is available. To request it, please email Elan Informatique.

    Contact: Elan Informatique
    rue Jean Rodier, 31400 TOULOUSE FRANCE
    Contact person: Pierre Delrat
    Phone: +33-61-36-0777 Fax: +33-61-36-0770
    BBS: +33-61-36-0788
    E-mail: [email protected]
    FTP

    ProVoice Developer's Speech Toolkit from First Byte

    Platform: ProVoice Developer's Toolkits are available for DOS, Windows 3.1, Windows 95, Windows NT, OS/2, and Macintosh.
    Description:
    ProVoice allows programmers to add synthesized speech to their applications. Your program passes text strings to the ProVoice speech engine, which translates the text into audible speech. Male and/or female "SpeechFonts" are available for many languages: English, French, German, UK British English, Italian, and Spanish.

    ProVoice converts text to speech in two phases using a set of phonetic translation and pronunciation rules. First, the software analyzes and translates text into "sound descriptors", a phonetic language with pitch, duration, and amplitude codes which are needed to produce stress patterns in phrases and sentences. Rules are used to analyze words, numbers, and punctuation. The second phase converts the intermediate phonetic language into speech signals; algorithms drive distinct speech signals into smooth flowing, continuous, clear speech. Real time synchronization of mouth movement and word boundaries allows animation of a graphical talking character, or highlighting of displayed text as it is spoken.


    Necessary tools and examples are provided for programmers to manipulate the ProVoice speech technology, including installation instructions, extensive sample programs, and complete documentation. In addition, sample code is provided on disk to illustrate speech programming techniques.

    Note 1: First Byte will perform custom work for embedded systems.

    Note 2: ProVoice Windows includes support for the Microsoft SAPI. It will speak through any Windows-supported wave audio device.

    Note 3: Distribution of ProVoice for commercial use is subject to execution of a Commercial Product Distribution License Agreement.

    For more detailed information and examples go to the First Byte WWW page (http://www.firstbyte.davd.com/).

    See also: Monologue for Windows from First Byte

    Price and Availability:
    Contact: First Byte
    9840 Pioneer Ave., Torrance, CA 90503
    Ph: 310-793-0610, Fax: 310-793-0611
    Email: [email protected] or WWW page.

    RC Systems V8600/V8601 Text to Speech synthesizers

    Platform 1: IBM PC: ISA card.
    Platform 2: Interface to PC/104 standard microcontrollers.
    Platform 3: Standalone (or embedded) thru RS232 or parallel printer port or processor bus.
    Description: Converts plain ASCII text to speech. Programmable voices, pitch rate, volume, etc. Built-in DTMF and tone generators.
    Price: $151-$299 US (qty 1)
    Contact: RC Systems
    609 England Avenue, Everett, WA 98203, USA
    Ph: (206) 355-3800 Fax: (206) 355-1098
    Europe: +44181 539-0285

    rsynth

    Platform: Various (including Solaris2.3, SunOS4.1.3, HPUX, SGI Irix4.x, Linux)
    Description: Public domain text-to-speech system assembled from a variety of sources. Supports CMU and BEEP format dictionaries (as described in Q1.10) and now utilises stress marks in the dictionary in synthesising intonation.

    Price: Free
    Misc: Axel Belinfante has implemented a WWW rsynth demo (http://wwwtios.cs.utwente.nl/say)

    Availability: by anonymous ftp (ftp://svr-ftp.eng.cam.ac.uk/pub/comp.speech/synthesis/rsynth-2.0.tar.gz)
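
    The stress marks mentioned for rsynth can be illustrated with the CMU dictionary convention, in which each line holds a word followed by its phonemes, and vowel phonemes carry a trailing digit (0 = unstressed, 1 = primary stress, 2 = secondary stress). The sketch below is not rsynth code: the function names are hypothetical, the sample line in the usage note is only illustrative, and the BEEP format is not handled.

```python
import re

# A phoneme symbol is upper-case letters, optionally followed by a stress digit.
PHONE = re.compile(r"^([A-Z]+)([012])?$")


def parse_entry(line):
    """Split one dictionary line into (word, [(phoneme, stress), ...]).

    stress is 0, 1 or 2 for vowels and None for unmarked phonemes.
    """
    word, *symbols = line.split()
    phones = []
    for symbol in symbols:
        match = PHONE.match(symbol)
        if match is None:
            raise ValueError(f"unexpected phoneme symbol: {symbol!r}")
        stress = int(match.group(2)) if match.group(2) else None
        phones.append((match.group(1), stress))
    return word, phones


def primary_stress_index(phones):
    """Index of the first phoneme with primary stress, or None."""
    for index, (_, stress) in enumerate(phones):
        if stress == 1:
            return index
    return None
```

    For a line such as "SPEECH  S P IY1 CH", parse_entry returns the word together with its phonemes, and primary_stress_index points at the IY vowel; an intonation model could then place a pitch accent on that syllable.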

    SENSYN speech synthesizer

    Platform: PC, Mac, Sun, and NeXT
    Rough Cost: $300
    Description:
    This formant synthesizer produces speech waveform files based on the (Klatt) KLSYN88 synthesizer. It is intended for laboratory and research use. Note that this is NOT a text-to-speech synthesizer, but creates speech sounds based upon a large number of input variables (formant frequencies, bandwidths, glottal pulse characteristics, etc.) and would be used as part of a TTS system. Includes full source code.

    Availability: Sensimetrics Corporation
    4 Sidney Street, Cambridge MA 02139.
    Fax: (617) 225-0470; Tel: (617) 225-2442.
    Email: [email protected]
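
    To show how a formant frequency and bandwidth turn into a filter, here is a sketch of the second-order digital resonator commonly used as the building block of Klatt-style formant synthesizers. It is not taken from SENSYN or KLSYN88, and the function names are chosen here for illustration; it only shows the standard difference equation y[n] = A*x[n] + B*y[n-1] + C*y[n-2].

```python
import math


def resonator_coefficients(freq_hz, bandwidth_hz, sample_rate_hz):
    """Coefficients (A, B, C) for one formant resonator with unity gain at DC."""
    t = 1.0 / sample_rate_hz
    c = -math.exp(-2.0 * math.pi * bandwidth_hz * t)
    b = 2.0 * math.exp(-math.pi * bandwidth_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c
    return a, b, c


def resonate(samples, freq_hz, bandwidth_hz, sample_rate_hz):
    """Filter a sequence of samples through one formant resonator."""
    a, b, c = resonator_coefficients(freq_hz, bandwidth_hz, sample_rate_hz)
    y1 = y2 = 0.0  # the two previous output samples
    output = []
    for x in samples:
        y = a * x + b * y1 + c * y2
        output.append(y)
        y2, y1 = y1, y
    return output
```

    Passing a glottal source signal through several such resonators in series, one per formant, gives the basic cascade arrangement; the frequencies and bandwidths are exactly the kind of input variables the description mentions.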

    SGI Developers Toolbox Synthesiser

    Platform: SGI
    Description: The SGI Developer Toolbox 4.0 CDROM contains a basic public domain text-to-speech program in the publics/speak directory. The directory includes man pages and source.

    Availability: on the SGI Developer Toolbox 4.0 CDROM

    SIMTEL

    A wide range of speech related software, sound-blaster software and signal processing software for PCs is available on SimTel and its mirror sites. It can be obtained by ftp from:

    - ftp://www.cdrom.com/pub/simtelnet/msdos/sound/

    Note: Voicemaker - The archives include the program Voicemaker which synthesises speech.

    GOOD HUNTING AND ENJOY!
