Post on 10-Jan-2016


  • Speech and Language Technologies in the Next Generation Localisation CSET

    Prof. Andy Way, School of Computing, DCU

  • Overview of Presentation

    Speech & Language Technologies in the NGL CSET
    Facilitating Optimal Multilingual NGL Applications
    Key Research Challenges
    Novel Research Tracks
    Typical LSP Translation Process
    Key Integration Challenges
    Concluding Remarks

  • ILT - Integrated Language Technologies

    Prof. Andy Way, ILT Area Coordinator

  • ILT: Facilitating Optimal Multilingual NGL Applications

    [Diagram labels]
    Components: Speech Technologies, Machine Translation, Text Processing
    Inputs and outputs: Text Input, Speech Input, Text Output, Speech Output
    Examples: bulk localisation, personalisation

  • Machine Translation: Significance

    For our industrial partners, the volume of material needing translation is increasing while budgets remain the same
    In the EU, there are now 23 official languages (506 language pairs), and the number is expanding
    In the US, there is huge investment in translation between Arabic, Chinese and Urdu, and English

    Automation is the only option (especially for PL)
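
The figure of 506 language pairs for 23 languages is consistent with counting each translation direction separately. A minimal sketch of that arithmetic follows; the function name is illustrative and not taken from the presentation.

```python
def directed_language_pairs(n_languages: int) -> int:
    """Count ordered (source, target) pairs among n distinct languages.

    Each language can be translated into every other language,
    so each direction is counted as its own pair.
    """
    if n_languages < 0:
        raise ValueError("n_languages must be non-negative")
    return n_languages * (n_languages - 1)


if __name__ == "__main__":
    print(directed_language_pairs(23))  # 23 * 22 = 506
```

Because the count grows quadratically, each additional official language adds many new pairs, which is the scaling pressure the slide points to.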

  • MT: Key Research Challenges

    Enhanced Translation Quality
    Faster Translation Times
    Scalability
    Other Modalities (Speech, SMS etc.)

  • The State-of-the-Art

    Source: [not recoverable]
    Reference: The two sides highlighted the role of the World Trade Organization (WTO)
    Baseline: The two sides on the role of the WTO

  • Improving the State-of-the-Art

    Our MT systems have knowledge of syntax
    Parts of speech (nouns, verbs etc.)
    Roles in sentences (subject, object etc.)

    Result: better translation quality

    Source: [not recoverable]
    Reference: The two sides highlighted the role of the World Trade Organization (WTO)
    Baseline: The two sides on the role of the WTO
    Our System: The two sides reaffirmed the role of the WTO

  • Improving the State-of-the-Art

    Result: better translation quality (especially where end-users are concerned)
    DCU Arabic-to-English system ranked first at international MT evaluation in Oct. 2007

    Source: [not recoverable]
    Reference: Mahmoud Abbas: The wall and settlements will not bring Israel security
    Baseline: Mahmoud Abbas, the wall and settlements will provide security to Israel
    Our System: Mahmoud Abbas, the wall and settlements will not provide security for Israel

  • MT Novel Research: Handling Different Types of Text

    Translating patent applications, doctors' prescriptions or visa applications are different tasks, as the content is different
    So is the form
    Build a different MT system for each task, using our industrial partners' documentation

  • Text Processing: Significance and Challenges

    If texts are automatically annotated with:

    syntactic information (e.g. subject, object), today's MT systems can learn the syntax required for improved output quality and improved processing of multilingual queries (DCM)

    text-type and genre information, this helps our MT systems disambiguate text and improve translation quality

    localisation information (e.g. Andy Way), then the workflows of our industrial partners (currently done manually) can be significantly improved (cf. LOC)

  • Speech Technology: Significance

    Speech interfaces for eyes-busy, hands-busy scenarios

    Speech recognition and synthesis systems which can deal with
    a potentially unlimited vocabulary
    multiple (and non-native) speakers
    multiple languages

    and can be tightly integrated with MT

    localisation & personalisation; volume & scalability; access

  • Speech Technology: Challenges

    [Diagram labels]
    themoreitsnowsthemoreitgoes
    demoreisnowsdemoregoes
    linguistic competence of native speaker
    rules and vocabulary of system
    performance of (native) speaker

  • Speech Technology: Innovations

    Robust & Novel Speech Recognition Engine which integrates explicit linguistic knowledge

    [Diagram labels]
    themoreitsnowsthemoreitgoes
    demoreisnowsdemoregoes
    linguistic competence of native speaker
    performance of (native) speaker

  • Innovations: Speech Recognition & MT

    Robust & Novel Speech Recognition Engine which integrates explicit linguistic knowledge
    Tight coupling with MT Engines

    [Diagram labels]
    themoreitsnowsthemoreitgoes
    detverkarhavaritenstorstormhurmn
    Jemehreschneitdestomehresgeht
    linguistic competence of native speaker

  • Innovations: MT & Speech Synthesis

    Robust & Novel Speech Synthesis Engine which integrates explicit linguistic knowledge
    Tight coupling with MT Engines

    [Diagram labels]
    themoreitsnowsthemoreitgoes
    detverkarhavaritenstorstormhurmn
    Jemehreschneitdestomehresgeht

  • Typical LSP Translation Process & Machine Translation

    Requirement: minimal disruption of this process

    TM match score < 50 %: expensive

    50 % < TM match score < 70 %: medium

    TM match score > 70 %: cheap
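
The three cost classes above can be sketched as a simple threshold function. This is an illustration only: the function name is hypothetical, and because the slide does not say which class a score of exactly 50 % or exactly 70 % falls into, the sketch assumes both boundary values count as "medium".

```python
def tm_cost_class(match_score: float) -> str:
    """Map a translation-memory match score (0-100 %) to a cost class.

    Thresholds follow the slide: below 50 % is expensive, above 70 % is
    cheap, and anything in between is medium.
    Assumption: scores of exactly 50 or 70 are treated as "medium",
    since the slide leaves the boundaries unspecified.
    """
    if not 0.0 <= match_score <= 100.0:
        raise ValueError("match_score must be between 0 and 100")
    if match_score < 50.0:
        return "expensive"
    if match_score <= 70.0:
        return "medium"
    return "cheap"


if __name__ == "__main__":
    for score in (30, 60, 85):
        print(score, tm_cost_class(score))
```

Any MT integration that raises the effective match score of a segment would, under this scheme, move it into a cheaper class without changing the surrounding process.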

  • Key Integration Challenges

    Use MT to automatically upgrade some TM matches to a cheaper cost class, cf. Dynamic Translation Memory [Bicici and Dymetman, 2008]
    Linking MT automatic evaluation metrics with post-editing cost
    Ensuring that MT omissions are highlighted
    Enforcing customer terminology
    Deal with markup, tags
    Produce true-cased translations

    Integrate into pre-existing workflows!
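
One common way to approach the markup and tags challenge listed above is to replace inline tags with placeholders before translation and restore them afterwards. The sketch below illustrates that general technique; it is not presented as the approach taken in the project. The placeholder format, the function names and the `translate` callable are all hypothetical, and the sketch assumes the MT engine passes placeholders through unchanged.

```python
import re
from typing import Callable, List, Tuple

# Matches simple inline tags such as <b>, </b> or <ph id="1"/>.
TAG_PATTERN = re.compile(r"<[^<>]+>")


def mask_tags(segment: str) -> Tuple[str, List[str]]:
    """Replace each tag with a numbered placeholder and return both parts."""
    tags: List[str] = []

    def _replace(match: "re.Match[str]") -> str:
        tags.append(match.group(0))
        return f"__TAG{len(tags) - 1}__"

    return TAG_PATTERN.sub(_replace, segment), tags


def restore_tags(translated: str, tags: List[str]) -> str:
    """Put the original tags back in place of their placeholders."""
    for index, tag in enumerate(tags):
        translated = translated.replace(f"__TAG{index}__", tag)
    return translated


def translate_with_tags(segment: str, translate: Callable[[str], str]) -> str:
    """Translate a segment while protecting its inline markup.

    `translate` stands in for any MT engine call (hypothetical here).
    """
    masked, tags = mask_tags(segment)
    return restore_tags(translate(masked), tags)


if __name__ == "__main__":
    # str.upper stands in for a real MT engine in this demonstration.
    print(translate_with_tags("Click <b>Save</b> now", str.upper))
```

A production system would also need to verify that every placeholder survives translation and handle reordering of paired tags, which this sketch does not attempt.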

  • Concluding Remarks

    For ILT, ramp-up is almost complete, with over 30 new researchers in addition to pre-existing PIs, postdoctoral researchers and PhD students
    Large interest from industrial partners, both large and small
    Input from LOC, DCM and SF
    Significant role in CNGL demonstrators
    Research tools and industrial prototypes
    Well placed to succeed in going beyond TMs

  • Speech & Language Technologies in the NGL CSET

    Thanks for listening!

    Questions?

    http://www.cngl.ie
