Overview of technologies for translators and language service providers Belinda Maia University of Porto



Translator as Language Services Provider

• MUST HAVE KNOWLEDGE OF:
– Science and Technology
– National and International Economics, Politics, Law and Current Affairs
– Multimedia
– Human Language Technologies - HLT
– Information Society Technologies - IST

• MUST BE:
– A Multidisciplinary Communicator
– A Multimedia Communicator
– AND an Intercultural Communicator

Translator as Intercultural Communicator

• MUST HAVE KNOWLEDGE OF:
– Psycholinguistics
– Contrastive linguistics
– Sociolinguistics
– Cultural theory
– Literary theory

• MUST BE:
– Multi-lingual and multi-culturally sensitive

Translator as Multimedia Communicator

• MUST HAVE KNOWLEDGE OF:
– General IT as user
– Special IT for translators – MT, CAT etc
– Subtitling and Dubbing programmes
– Web Pages
– ETC

• MUST BE:
– Computer literate and aware of new media

Information Society Technologies

• European Programme at: http://cordis.europa.eu/ist/

• Focus on:
– Technology for providing information
– Language as vehicle of information
– Language as structuring knowledge
– Knowledge management

HLT (1) Calls for (research) proposals

• 1999/2000

• MLIS (Multilingual Information Society)
– the provision of multilingual language resources over global networks
– the development of multilingual networked services

HLT (2) Calls for (research) proposals

• 2000/2001

• Multilingual communication services and appliances
– Multilingual e-service and e-commerce
– Natural and multilingual interactivity
– Multilingual web
– Multimodal and multi-sensorial dialogue modes

HLT (3) Calls for (research) proposals

• 2002/6 – Focus on:
– Knowledge and Interface Technologies
  • Multi-modal interfaces
  • Semantic-based knowledge systems
– Cognitive systems
– Bio-inspired Intelligent Information Systems

Multimodal Interfaces

• Multilingual Communication – facilitating translation for unrestricted domains, especially for spontaneous (unrestricted) or ill-formed (speech) inputs in task-oriented settings.

• Areas to be addressed include:

Multimodal Interfaces

• human-to-human;

• human-to-things;

• human-to-self;

• human-to-content;

• device-to-device;

• human-to-embodied robots.

Multimodal Interfaces

• Areas to be addressed include:

• speech-to-speech translation;

• statistical/mixed approaches to translation;

• adaptive techniques, incorporating learning;

• robustness of approach.

Don’t forget

• HLT research proposals are for cutting-edge technology

• The results will be in the future

• But the future is coming!

Technology FOR Translators

• Machine translation (MT)

• Machine assisted translation (MAT)

• Internet for information retrieval

• Corpora use

• Terminology Management

• Multimedia tools

• Summarisation and Revision

MT – a threat, a solution or a tool?

• A threat?
– Under present circumstances - No

• A solution?
– Partially > ‘gist’ translation

• A tool?
– Increasingly > + pre- and post-editing
– OR Human Assisted MT (HAMT)

Online MT - uses

• Training in awareness of lexical and syntactic difficulties for both human and machine translation

• Our experiment with METRA

• It gets hundreds of hits per day, so who is using it?

• A lot of translators...!

MAHT Commercial Programmes

• SDL + TRADOS - Check

• http://www.sdl.com/

• http://www.trados.com/

• DÉJÀ VU - http://www.atril.com/

• STAR - TRANSIT http://www.star-group.net/eng/home.html

• WORDFAST - http://www.wordfast.net/

MAHT Basic tools

• Translation memories (TMs) + concordancer

• TM created:
– As translator works
– Using text aligner on previous texts + translations

• Terminology database created:
– Pre-translation by terminologist / company / translator
– Post-translation by aligning terms in text and translation
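
A minimal sketch of what a TM lookup and a concordance search do, assuming a toy in-memory TM. The segments, the Portuguese translations, the function names and the 0.7 threshold are all invented for illustration; commercial tools use their own storage formats and matching algorithms.

```python
from difflib import SequenceMatcher

# Hypothetical toy translation memory: source segment -> stored translation.
TM = {
    "Save the file before closing.": "Guarde o ficheiro antes de fechar.",
    "Open the file menu.": "Abra o menu Ficheiro.",
}


def similarity(a, b):
    """Return a 0-1 similarity ratio between two segments, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def lookup(segment, memory, threshold=0.7):
    """Return (score, source, target) for the best match at or above
    the threshold, or None if nothing in the memory is close enough."""
    best = None
    for source, target in memory.items():
        score = similarity(segment, source)
        if score >= threshold and (best is None or score > best[0]):
            best = (score, source, target)
    return best


def concordance(term, memory):
    """Return every (source, target) pair whose source contains the term."""
    return [(s, t) for s, t in memory.items() if term.lower() in s.lower()]


if __name__ == "__main__":
    # A new segment that differs slightly from a stored one (a "fuzzy match").
    print(lookup("Save the files before closing.", TM))
    # All stored segments that mention "file", with their translations.
    for source, target in concordance("file", TM):
        print(source, "=>", target)
```

The translator still decides whether to accept, edit or reject the suggested translation; the lookup only retrieves what was translated before.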

MAHT Additional tools

• Spelling and grammar checkers – in Word

• Machine Translation

• File formatting facilities

• Terminology > knowledge databases

• Project Management facilities

• ETC

• For further details come to the commercial sessions on Wednesday!

eCoLoRe Training kits for TM technology

• Problem: OK – we have bought the TM software for our university – but it is empty!

• Solutions?

• Make your own TMs

• eCoLoRe at http://ecolore.leeds.ac.uk/

Translation technology - Needs

• To find, keep and re-use information

• To work within multimedia technology

• Good understanding of Linguistics

• Understanding of how/why spelling and grammar checkers, MT, and other HLTs do(n’t) work

Using the Internet

• To find information
– Understanding how the internet works
– Using browsers intelligently

• To keep information
– Collecting site links
– Downloading useful information

• To convert information to knowledge
– Studying special subjects

Internet information

• Eurodicautom, online terminology, glossaries, dictionaries

• On-line encyclopedias – e.g. Wikipedia

• Translators’ pages

• Translators’ forums and mailing lists

• Systematic finding, analysing and storage of relevant information / knowledge

Monolingual Corpora as tools

• Large quantities of varied types of text

• British National Corpus (BNC) – online at: http://sara.natcorp.ox.ac.uk/lookup.html

• Linguateca – Portuguese corpora – online at: http://www.linguateca.pt

• PLEASE inform us of others!

Multilingual Corpora as tools

• EU documents at: http://europa.eu.int/

• Parallel corpora (Translation Memories?)
– E.g. COMPARA > EN & PT (literary) online at: http://www.linguateca.pt – 1 million x 2

• Comparable corpora – originals in different languages, but same domain and/or genre

Corpora - uses

• Monolingual corpora – finding the right word or collocation

• Multilingual / parallel corpora – finding terminology and translation suggestions

• Comparable corpora – discovering expert terminology and local text conventions
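
To illustrate "finding the right word or collocation" in a monolingual corpus, here is a rough sketch of a keyword-in-context (KWIC) search and a count of the words that follow a keyword. The sample text and the function names are invented; real corpus interfaces such as the BNC's add tokenisation, tagging and statistics that are not shown here.

```python
import re
from collections import Counter


def kwic(text, keyword, width=30):
    """Return (left context, match, right context) for each whole-word
    occurrence of the keyword, ignoring case."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        lines.append((left, m.group(0), right))
    return lines


def right_collocates(text, keyword):
    """Count the word immediately following each occurrence of the keyword.
    Note: this toy version does not stop at sentence boundaries."""
    tokens = re.findall(r"\w+", text.lower())
    key = keyword.lower()
    return Counter(
        tokens[i + 1] for i, tok in enumerate(tokens[:-1]) if tok == key
    )


if __name__ == "__main__":
    # Hypothetical three-sentence "corpus".
    sample = "Heavy rain fell. The heavy rain stopped. A heavy load."
    for left, word, right in kwic(sample, "heavy", width=12):
        print(f"{left:>12} [{word}] {right}")
    print(right_collocates(sample, "heavy").most_common())
```

With a large corpus, the most frequent right-hand neighbours give a first indication of which collocations are usual.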

Terminology > Knowledge

From:

• The ‘right word’

• Glossaries / dictionaries

• Databases

• Thesauri

• Conceptual organization

• Ontologies

• Knowledge databases

Corpógrafo – integrated suite of online tools

• Corpora construction and analysis

• Semi-automatic term extraction

• Concept databases

• Traditional terminology fields

• Semi-automatic extraction of definitions and semantic relations

• Visualization of concept systems / ontologies

• Produced by Linguateca – PoloCLUP and freely available at: http://www.linguateca.pt/corpografo
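
As a rough idea of what "semi-automatic term extraction" can mean, the sketch below proposes term candidates by counting word sequences that neither start nor end with a function word. This is a generic frequency-based illustration, not a description of how Corpógrafo itself works; the stopword list, sample text and function name are assumptions. The "semi-" part is the human step: a terminologist reviews the candidate list.

```python
import re
from collections import Counter

# Small, hypothetical English stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "are",
             "for", "on", "with", "by"}


def candidate_terms(text, max_len=3, min_freq=2):
    """Return (candidate, frequency) pairs for word sequences of up to
    max_len words that occur at least min_freq times and do not begin
    or end with a stopword. Sequences may cross sentence boundaries."""
    # Letters only (accented letters included), optionally hyphenated.
    tokens = re.findall(r"[^\W\d_]+(?:-[^\W\d_]+)*", text.lower())
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            counts[" ".join(gram)] += 1
    return [(term, freq) for term, freq in counts.most_common()
            if freq >= min_freq]


if __name__ == "__main__":
    sample = ("A translation memory stores segments. "
              "The translation memory is searched by the concordancer. "
              "Translation memory tools reuse segments.")
    for term, freq in candidate_terms(sample):
        print(freq, term)
```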

Multimedia translation

• Localization

• XML

• Sub-titling

• Dubbing

• Web-pages

• Software for interpreters

• Speech-to-speech machine translation & interpretation?

Other skills ± software

• Revision

• Translation evaluation

• Summarization

• Terminology management

• Information retrieval

• Project management

Linguistics

• Essential training for translators
– General linguistics
– Contrastive linguistics

• Translators > language experts > new specializations
– Natural language processing
– Translation and terminology tools

Group work

How much of this technology do you

• Use?

• Find useful?

• Know about?

• Believe to be useful?

• Don’t know about?

• Want to find out more about?

• Believe to be (ir)relevant to translating as a profession?