taus mt showcase, sovee smart engine 2.0, a leap beyond base moses technology, scott gaskill, sovee

Wednesday, 4 June

Sovee Smart Engine 2.0: A Leap Beyond Base Moses Technology

Scott Gaskill, Sovee

TAUS Machine Translation Showcase 2014, Dublin (Ireland)

The research within the project MosesCore leading to these results has received funding from the European Union 7th Framework Programme, grant agreement no 288487

Upload: taus-enabling-better-translation

Post on 20-Aug-2015


TRANSCRIPT


Presented by: Scott Gaskill

Christopher Klapp

June 4, 2014

     MT  Showcase  

3  

I skate to where the puck is going to be, not where it has been.

Wayne Gretzky, Hockey Star

4  

Where is the world going?

CNNTech, “Google boss: Entire world will be online by 2020,” April 2013, http://www.cnn.com/2013/04/15/tech/web/eric-schmidt-internet

Kenya stat from ITU, 2013. Photo used by permission of Deseret News.

2016: the world will have Internet connectivity

By the end of this decade everyone in the world will be on the Web, with mobile access growing as the preferred interface

In Kenya, 99% of Internet connections are mobile

5  

We are entering the Convergence era: translation will be a utility embedded in every app, device and screen. Businesses will prosper by finding new customers in new markets…. Consumers will become world-wise, communicating as if language barriers never existed.

Jaap van der Meer, Director of TAUS, 2013

6  

Translation Memory – Is More Better?

If we simply add an additional 1,000 TM lines to a database of 40-60 billion, will we see better translations?

Knowing how to use the data is key

7  

Challenges: Technology, approach & process

Progress in first 60 years | Progress Needed by 2016
Engines for < 150 languages | Engines for > 6000 languages
< 3% of the world’s content translated | All content translated
Cloud-based speed providing more servers for translation | 92 billion servers
Statistical translation introduced, but “fuzzy logic” does not deliver quality businesses need | Quality improvement to standards required to meet world commerce demand

8  

Technology Challenge: 6800 languages

Minimum Server Requirements: 4 × ( n(n-1) / 2 )

Generic SMT: 92 million

9.2 billion – based on 100 businesses

92 billion – based on 1000 customers

Not valued as practical – infinite servers required

MT Assets (cascades)

Generic SMT

Generic SMT, Domain

Generic SMT, Domain, Customer

Generic SMT, Domain, Customer, Project
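The figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes n is the 6800 languages named in the slide title and that the totals for 100 businesses and 1000 customers are simple multiples of the base figure. The slide does not say what the factor 4 represents, so it is kept as a plain multiplier; the function and parameter names are illustrative only.

```python
from math import comb


def engines_needed(languages: int, multiplier: int = 4) -> int:
    """Slide formula: multiplier * n(n-1)/2, where n(n-1)/2 is the number of language pairs."""
    return multiplier * comb(languages, 2)  # comb(n, 2) == n(n-1)/2


base = engines_needed(6800)
print(f"base figure:     {base:,}")          # 92,466,400 (about 92 million)
print(f"x 100 businesses: {base * 100:,}")   # 9,246,640,000 (about 9.2 billion)
print(f"x 1000 customers: {base * 1000:,}")  # 92,466,400,000 (about 92 billion)
```

Under these assumptions the rounded values match the 92 million, 9.2 billion and 92 billion shown on the slide.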

9  

Accuracy  Challenge  

Relevant Segments   General Corpus   Adequacy

Accuracy

General MT (30-40%)

TM (40-60%)

Post Editing (up to 100%)

Preparing new project / import TM / CAT

Leverage Exact Fuzzy Match

Post Edit

Review

Deliver to customer

Gather past TM

Package and send TM to SMT provider

Clean, tokenize data (prepare data)

Train-Tune-Test (3Ts)

Repeat until viewed as acceptable (repeat with customer data each time)
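The data preparation and Train-Tune-Test steps listed above can be illustrated with a small, generic sketch. This is not Sovee's or Moses's actual pipeline: the helper names (`clean`, `tokenize`, `split_3t`), the naive regex tokenizer, and the sample TM pairs are all assumptions made for illustration.

```python
import random
import re


def clean(pairs):
    """Normalize whitespace and drop empty or duplicate (source, target) pairs."""
    seen, out = set(), []
    for src, tgt in pairs:
        pair = (" ".join(src.split()), " ".join(tgt.split()))
        if all(pair) and pair not in seen:
            seen.add(pair)
            out.append(pair)
    return out


def tokenize(text):
    """Naive tokenizer: lowercased words and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text.lower())


def split_3t(pairs, tune_size, test_size, seed=0):
    """Shuffle reproducibly, then split into train / tune / test sets."""
    shuffled = pairs[:]
    random.Random(seed).shuffle(shuffled)
    test = shuffled[:test_size]
    tune = shuffled[test_size:test_size + tune_size]
    train = shuffled[test_size + tune_size:]
    return train, tune, test


# Invented sample TM: one near-duplicate and one empty source segment.
tm = [("Hello  world!", "¡Hola mundo!"), ("Hello world!", "¡Hola mundo!"),
      ("", "vacío"), ("Good morning.", "Buenos días."),
      ("Thank you.", "Gracias."), ("See you soon.", "Hasta pronto.")]

pairs = clean(tm)
train, tune, test = split_3t(pairs, tune_size=1, test_size=1)
print(len(pairs), len(train), len(tune), len(test))  # 4 2 1 1
print(tokenize("Hello world!"))                      # ['hello', 'world', '!']
```

In the workflow on the slide, this whole sequence is repeated with each new batch of customer data until the output is judged acceptable.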

10

Post Editing   Learning Engine   SMT Workflow

Segments are not just a string of text – they are living, learning entities

Process Real-time Automation and Integration

Sovee Smart Engine 2.0

11  

Smart Engine Advantages

Language from Scratch

Seamless integration to Post Editing workflow

Training / Learning

Efficiency Gains (what we have seen)

Post editing – 50%+ improvement

TM / MT management and training – 100% improvement

Update MT on the fly

Watch it learn before your eyes

Never leave the post editing environment

12  

Learned Translations

Relevant Segments   General Corpus   Adequacy

Accuracy

General MT

Domain

Organization

Customer

Project

Asset

Tags

Cascading Assets   Sovee Smart Engine MT

Learned Segments

Segment output

1   2

Asset Synchrony (CAT Tools)

Post editing interface   Smart Engine
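One way to picture cascading assets is a lookup that prefers the most specific level holding a learned segment and falls back toward General MT. The sketch below is an assumption about how such a cascade could work, not a description of the Sovee Smart Engine: the level order is taken from the labels on the slide, "Tags" is left out because its role is not explained, and the sample data is invented.

```python
# Most general level first, most specific last (order assumed from the slide labels).
CASCADE = ["general_mt", "domain", "organization", "customer", "project", "asset"]


def lookup(segment, assets):
    """Return (translation, level) from the most specific level that knows the segment."""
    for level in reversed(CASCADE):
        translation = assets.get(level, {}).get(segment)
        if translation is not None:
            return translation, level
    return None, None


# Hypothetical learned segments: a golf domain overrides the general translation.
assets = {
    "general_mt": {"drive": "conducir"},
    "domain": {"drive": "golpe de salida"},
}

print(lookup("drive", assets))  # ('golpe de salida', 'domain')
print(lookup("putt", assets))   # (None, None)
```

The design choice illustrated here is that a more specific asset never has to be merged into the general engine; it simply takes precedence at lookup time.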

13  

Asset  Push  (Past  TM)  

Real-time progressive translation cycle (Sovee MT, save / push post edits)

1

2
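The two numbered steps, pushing past TM and then cycling between machine output and saved post edits, can be sketched as a small loop. This is a toy model under stated assumptions, not the Sovee implementation: the class `LearningEngine`, its method names, and the placeholder base MT function are all hypothetical.

```python
class LearningEngine:
    """Toy engine that prefers learned segments over base MT output."""

    def __init__(self, base_mt):
        self.base_mt = base_mt  # callable: source text -> draft translation
        self.learned = {}       # learned segments, keyed by source text

    def push_tm(self, tm_pairs):
        """Step 1: load past TM as learned segments."""
        self.learned.update(tm_pairs)

    def translate(self, source):
        """Return a learned segment if one exists, otherwise the base MT draft."""
        if source in self.learned:
            return self.learned[source]
        return self.base_mt(source)

    def save_post_edit(self, source, edited):
        """Step 2: store a post edit so the next request uses it immediately."""
        self.learned[source] = edited


engine = LearningEngine(base_mt=lambda s: f"<draft: {s}>")  # placeholder base MT
engine.push_tm({"Hello": "Hola"})
print(engine.translate("Hello"))      # Hola
print(engine.translate("Nice shot"))  # <draft: Nice shot>
engine.save_post_edit("Nice shot", "Buen tiro")
print(engine.translate("Nice shot"))  # Buen tiro
```

The point of the sketch is that no retraining step sits between saving a post edit and seeing it in the next translation.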

14  

Demo  

15  

Seamless Integration

Apps

Websites

eCommerce

eLearning

Videos

Podcasts

Software

Live chat

Text Messages

Email

     

“Convergence  Era”  


Japan   Sovee Smart Engine Translation   USA

Yukiko (Japan): ホールインワンを決めたよ!

Robert (USA): I just scored a hole-in-one! Original: ホールインワンを決めたよ!

         Japan  

SNAG  

17  

Jack  Nicklaus  Learning  Leagues  

Languages: Spanish and Japanese

In Process: 10 more languages

Video and Training Materials for Golf Instruction

18  

R.E.  Michel  

19  

Questions?