Speech and Language Technologies in the Next Generation Localisation CSET
Prof. Andy Way, School of Computing, DCU
-
Overview of Presentation
Speech & Language Technologies in the NGL CSET
Facilitating Optimal Multilingual NGL Applications
Key Research Challenges
Novel Research Tracks
Typical LSPs Translation Process
Key Integration Challenges
Concluding Remarks
-
ILT - Integrated Language Technologies
Prof. Andy Way
ILT Area Coordinator
-
ILT: Facilitating Optimal Multilingual NGL Applications
Speech Technologies
Machine Translation
Text Input, Text Output
Speech Input, Speech Output
Text Processing
e.g. bulk localisation
e.g. personalisation
-
Machine Translation: Significance
For our industrial partners, the volume of material needing translation is increasing, while budgets remain the same
In the EU, there are now 23 official languages (506 language pairs), and the number is expanding
In the US, there is huge investment in Arabic, Chinese and Urdu ↔ English translation
Automation the only option (especially for PL)
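The figure of 506 language pairs is consistent with counting ordered (source, target) pairs among 23 languages: 23 × 22 = 506. A minimal sketch of that arithmetic follows; the function name is illustrative and not taken from the presentation.

```python
def directed_language_pairs(n_languages: int) -> int:
    """Count ordered (source, target) pairs, excluding a language paired with itself."""
    return n_languages * (n_languages - 1)


print(directed_language_pairs(23))  # 506
```

Each additional official language adds two new directions per existing language, which is why the number of pairs grows much faster than the number of languages.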
-
MT: Key Research Challenges
Enhanced Translation Quality
Faster Translation Times
Scalability
Other Modalities (Speech, SMS etc.)
-
The State-of-the-Art
Source:
Reference: The two sides highlighted the role of the World Trade Organization (WTO)
Baseline: The two sides on the role of the WTO
-
Improving the State-of-the-Art
Our MT systems have knowledge of syntax
Parts of speech (nouns, verbs etc.)
Roles in sentences (subject, object etc.)
better translation quality
Source:
Reference: The two sides highlighted the role of the World Trade Organization (WTO)
Baseline: The two sides on the role of the WTO
Our System: The two sides reaffirmed the role of the WTO
-
The State-of-the-Art
Source:
Reference: Mahmoud Abbas: The wall and settlements will not bring Israel security
Baseline: Mahmoud Abbas, the wall and settlements will provide security to Israel
Our System: Mahmoud Abbas, the wall and settlements will not provide security for Israel
-
Improving the State-of-the-Art
better translation quality (especially where end-users are concerned)
DCU Arabic→English system ranked first at an international MT evaluation in Oct. 2007
Source:
Reference: Mahmoud Abbas: The wall and settlements will not bring Israel security
Baseline: Mahmoud Abbas, the wall and settlements will provide security to Israel
Our System: Mahmoud Abbas, the wall and settlements will not provide security for Israel
-
MT Novel Research: Handling Different Types of Text
Translating patent applications, or doctors' prescriptions, or visa applications: different tasks, as the content is different
So is the form
Build different MT systems for each different task, using our industrial partners' documentation
-
Text Processing: Significance and Challenges
If texts are automatically annotated with:
syntactic information (e.g. subject, object), today's MT systems can learn the syntax required for improved output quality and improved processing of multilingual queries (DCM)
text-type and genre information, this helps our MT systems disambiguate text and improve translation quality
localisation information (e.g. Andy Way), then the workflows of our industrial partners (currently done manually) can be significantly improved (cf. LOC)
-
Speech Technology: Significance
Speech interfaces for eyes-busy, hands-busy scenarios
Speech recognition and synthesis systems which can deal with
potentially an unlimited vocabulary
multiple (and non-native) speakers
multiple languages
and can be tightly integrated with MT
localisation & personalisation; volume & scalability; access
-
Speech Technology: Challenges
themoreitsnowsthemoreitgoes
demoreisnowsdemoregoes
linguistic competence of native speaker
rules and vocabulary of system
performance of (native) speaker
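The unsegmented strings on this slide illustrate that a recogniser receives a continuous stream with no word boundaries. As a rough, text-only analogy (not the recognition engine described in this presentation), the sketch below recovers word boundaries using dynamic programming over a toy vocabulary. The vocabulary and the function name `segment` are assumptions made purely for illustration.

```python
from typing import Optional

# Toy vocabulary, assumed for illustration only.
VOCAB = {"the", "more", "it", "snows", "goes"}


def segment(stream: str, vocab: set[str]) -> Optional[list[str]]:
    """Return one segmentation of `stream` into vocabulary words, or None."""
    # best[i] holds a segmentation of stream[:i], or None if none exists.
    best: list[Optional[list[str]]] = [None] * (len(stream) + 1)
    best[0] = []
    for end in range(1, len(stream) + 1):
        for start in range(end):
            prefix = best[start]
            if prefix is not None and stream[start:end] in vocab:
                best[end] = prefix + [stream[start:end]]
                break
    return best[-1]


print(segment("themoreitsnowsthemoreitgoes", VOCAB))
print(segment("demoreisnowsdemoregoes", VOCAB))
```

With this toy vocabulary the first string segments cleanly, while the non-native rendering yields no segmentation at all. That mismatch between what a speaker actually produces and what the system's rules and vocabulary cover is the gap the slide points to.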
-
Speech Technology: Innovations
themoreitsnowsthemoreitgoes
demoreisnowsdemoregoes
linguistic competence of native speaker
performance of (native) speaker
Robust & Novel Speech Recognition Engine
which integrates explicit linguistic knowledge
-
Innovations: Speech Recognition & MT
themoreitsnowsthemoreitgoes
detverkarhavaritenstorstormhurmn
Jemehreschneitdestomehresgeht
linguistic competence of native speaker
Robust & Novel Speech Recognition Engine
which integrates explicit linguistic knowledge
Tight coupling with MT Engines
-
Innovations: MT & Speech Synthesis
themoreitsnowsthemoreitgoes
detverkarhavaritenstorstormhurmn
Jemehreschneitdestomehresgeht
Robust & Novel Speech Synthesis Engine
which integrates explicit linguistic knowledge
Tight coupling with MT Engines
-
Typical LSPs Translation Process
Requirement: minimal disruption of this process
& Machine Translation
TM match score < 50 %: expensive
50 % < TM match score < 70 %: medium
TM match score > 70 %: cheap
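The three pricing bands above can be expressed as a small lookup. This is a minimal sketch: the function name `tm_cost_class` is hypothetical, and because the slide does not say which band scores of exactly 50 % or 70 % belong to, the code assumes both fall into the medium band.

```python
def tm_cost_class(match_score: float) -> str:
    """Map a TM match score (a percentage from 0 to 100) to a cost class."""
    if not 0 <= match_score <= 100:
        raise ValueError("match score must be between 0 and 100")
    if match_score < 50:
        return "expensive"
    # Assumption: the slide leaves exactly 50 and exactly 70 unspecified;
    # here both are treated as "medium".
    if match_score <= 70:
        return "medium"
    return "cheap"


for score in (30, 60, 85):
    print(score, tm_cost_class(score))
```

Under these bands, any technique that raises a segment's effective match score across a threshold moves it into a cheaper class.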
-
Key Integration Challenges
Use MT to automatically upgrade some TM matches to a cheaper cost class, cf. Dynamic Translation Memory [Bicici and Dymetman, 2008]
Linking MT automatic evaluation metrics with post-editing cost
Ensuring that MT omissions are highlighted
Enforcing customer terminology
Deal with markup, tags
Produce true-cased translations
Integrate into pre-existing workflows!
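One common way to approach the markup item above is to mask inline tags with placeholders before translation and restore them afterwards. The sketch below is an illustration under stated assumptions, not the method used in this work: it assumes XML-like inline tags, and `translate` is a hypothetical callable standing in for an MT engine that leaves the placeholders untouched.

```python
import re
from typing import Callable

# Matches simple XML-like inline tags such as <b> or </b>.
TAG_PATTERN = re.compile(r"<[^<>]+>")


def mask_tags(segment: str) -> tuple[str, list[str]]:
    """Replace each inline tag with a numbered placeholder."""
    tags: list[str] = []

    def _mask(match: re.Match) -> str:
        tags.append(match.group(0))
        return f"__TAG{len(tags) - 1}__"

    return TAG_PATTERN.sub(_mask, segment), tags


def unmask_tags(segment: str, tags: list[str]) -> str:
    """Put the original tags back in place of their placeholders."""
    for index, tag in enumerate(tags):
        segment = segment.replace(f"__TAG{index}__", tag)
    return segment


def translate_with_tags(segment: str, translate: Callable[[str], str]) -> str:
    """Mask tags, call the (hypothetical) MT engine, then restore the tags."""
    masked, tags = mask_tags(segment)
    return unmask_tags(translate(masked), tags)


# Stand-in "engine" that only upper-cases the text, for demonstration.
print(translate_with_tags("Click <b>Save</b> now", lambda text: text.upper()))
```

A real engine may reorder or drop placeholders, so a production workflow would also need to verify that every placeholder survives translation.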
-
Concluding Remarks
For ILT, ramp-up almost complete: over 30 new researchers in addition to pre-existing PIs, postdoctoral researchers and PhD students
Large interest from industrial partners, both large and small
Input from LOC, DCM and SF
Significant role in CNGL demonstrators
Research tools → Industrial prototypes
Well placed to succeed in going beyond TMs
-
Speech & Language Technologies in the NGL CSET
Thanks for listening!
Questions?
http://www.cngl.ie