
Page 1:

Getting Computers to Understand What They Read (Or Hear)

Christopher Manning

http://nlp.stanford.edu/

Computer Forum 2012

Page 2:

The future was …

A vast quantity of information, contained in knowledge bases, with artificial intelligence systems for reasoning over it.

Page 3:

The future is …

A vast quantity of information in an ugly mess known as The Web.

Page 4:

But it's all indexed and easily searchable, and, for humans, most of the time it actually works amazingly well.

Page 5:

But how can we use it to get computers to do more advanced tasks, ones which require getting knowledge from language and putting facts together?

Page 6:

We need machine reading.

Page 7:

We need more than word counts.

Page 8:

Extracting Knowledge

Textual abstract: A summary for humans

  "The Lawrence Livermore National Laboratory (LLNL) in Livermore, California is a scientific research laboratory founded by the University of California in 1952."

Structured knowledge: A summary for machines (relations between entities)

  LLNL EQ Lawrence Livermore National Laboratory
  LLNL LOC-IN California
  Livermore LOC-IN California
  LLNL IS-A scientific research laboratory
  LLNL FOUNDED-BY University of California
  LLNL FOUNDED-IN 1952
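For concreteness, the structured summary can be stored as (subject, relation, object) triples. The sketch below is illustrative only (it is not the extraction system described in the talk); it puts the LLNL facts into that form and runs a trivial query over them.

```python
# Illustrative only: the structured summary as (subject, relation, object)
# triples, plus a trivial query. Not the extraction system from the talk.
from collections import namedtuple

Triple = namedtuple("Triple", ["subject", "relation", "object"])

facts = [
    Triple("LLNL", "EQ", "Lawrence Livermore National Laboratory"),
    Triple("LLNL", "LOC-IN", "California"),
    Triple("Livermore", "LOC-IN", "California"),
    Triple("LLNL", "IS-A", "scientific research laboratory"),
    Triple("LLNL", "FOUNDED-BY", "University of California"),
    Triple("LLNL", "FOUNDED-IN", "1952"),
]

# Once text has been reduced to triples, machines can answer simple queries.
founded_by = [t.object for t in facts
              if t.subject == "LLNL" and t.relation == "FOUNDED-BY"]
print(founded_by)  # ['University of California']
```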

Page 9:

Machine Reading with Distant Supervision
[Mintz et al., ACL 2009; Surdeanu et al. 2011]

• If we had relations marked in texts, we could train a conventional relation extraction system …
• Can we exploit the abundant found information about relations – such as from DBpedia or Freebase – to bootstrap systems for machine reading?
• Method: use the database as "distant supervision" of the text (sketched below)
• The challenge is dealing with the "noise" that enters the picture
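As a hedged sketch of the distant supervision idea (not the actual Mintz et al. or Surdeanu et al. system): every sentence that mentions both arguments of a known knowledge-base fact is treated as a noisy positive training example for that fact's relation.

```python
# Hedged sketch of distant supervision (not the Mintz et al. / Surdeanu et al.
# system): knowledge-base facts label any sentence mentioning both arguments,
# producing plentiful but noisy training data for a relation extractor.

kb_facts = {
    ("Vince McMahon", "WWE"): "FOUNDED",
    ("Fort Erie", "Ontario"): "IS-IN",
}

sentences = [
    "Vince McMahon founded WWE.",
    "Fort Erie is a town in Ontario.",
    "Vince McMahon appeared on WWE television last night.",
]

training_examples = []
for (arg1, arg2), relation in kb_facts.items():
    for sentence in sentences:
        if arg1 in sentence and arg2 in sentence:
            # Every co-occurrence gets the KB relation as its label. The third
            # sentence shows the catch: it mentions both arguments but does not
            # express FOUNDED -- this is the "noise" the slide refers to.
            training_examples.append((sentence, arg1, arg2, relation))

for example in training_examples:
    print(example)
```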

Page 10:

Results

• Precision of extracted facts: about 70%
• New relations learned:

  Montmartre IS-IN Paris
  Fyodor Kamensky DIED-IN Clearwater
  Fort Erie IS-IN Ontario
  Upton Sinclair WROTE Lanny Budd
  Vince McMahon FOUNDED WWE
  Thomas Mellon HAS-PROFESSION Judge

Page 11:

Where syntactic knowledge helps

How useful are syntactic representations for this goal?

  Back Street is a 1932 film made by Universal Pictures, directed by John M. Stahl, and produced by Carl Laemmle Jr.

– Back Street and John M. Stahl are far apart in the surface string
– But they are close together in a dependency parse

Page 12:

Stanford Dependencies as a representation for relation extraction

The little boy jumped over the fence.

  det(boy-3, The-1)
  amod(boy-3, little-2)
  nsubj(jumped-4, boy-3)
  det(fence-7, the-6)
  prep_over(jumped-4, fence-7)

[The slide shows the dependency graph (nsubj, prep_over, det, amod edges) alongside the phrase-structure parse of the sentence.]

[de Marneffe & Manning 2008]
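To make the representation concrete, the following sketch (plain Python, not Stanford CoreNLP or the Stanford Parser) stores the typed dependencies above as an undirected graph and finds the path between two words; this is the sense in which related content words end up close together.

```python
# Plain-Python sketch (not Stanford CoreNLP): the typed dependencies above as
# an undirected graph, with breadth-first search for the path between words.
from collections import deque

dependencies = [                      # (relation, head, dependent)
    ("det", "boy-3", "The-1"),
    ("amod", "boy-3", "little-2"),
    ("nsubj", "jumped-4", "boy-3"),
    ("det", "fence-7", "the-6"),
    ("prep_over", "jumped-4", "fence-7"),
]

graph = {}
for _, head, dependent in dependencies:
    graph.setdefault(head, set()).add(dependent)
    graph.setdefault(dependent, set()).add(head)

def shortest_path(start, goal):
    """Breadth-first search over the dependency graph."""
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbour in graph.get(path[-1], ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None

# "boy" and "fence" both attach to "jumped", so the path has length two even
# though the words are separated in the surface string.
print(shortest_path("boy-3", "fence-7"))  # ['boy-3', 'jumped-4', 'fence-7']
```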

Page 13:

Stanford Dependencies as a representation for relation extraction

• Stanford Dependencies favor short paths between related content words

[Björne et al. 2009: over ¾ …]

Page 14:

How do we design a human language understanding system?

• Most systems use a pipeline of processing stages (see the sketch below):
  – Tokenize
  – Part-of-speech
  – Named entities
  – Syntactic parse
  – Semantic roles
  – Coreference
  – …
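As a rough illustration of such a pipeline (the stage functions below are made-up placeholders, not any particular toolkit's API), each stage consumes the previous stage's output:

```python
# Rough sketch of a pipeline architecture. The stage functions are placeholder
# stand-ins invented for illustration, not a real toolkit's API.

def tokenize(text):
    return text.split()                                  # stand-in tokenizer

def tag_parts_of_speech(tokens):
    return [(token, "NN") for token in tokens]           # stand-in POS tagger

def recognize_named_entities(tagged_tokens):
    return [(token, pos, "O") for token, pos in tagged_tokens]  # stand-in NER

PIPELINE = [tokenize, tag_parts_of_speech, recognize_named_entities]

def analyze(text):
    """Run each stage on the previous stage's output, in order."""
    result = text
    for stage in PIPELINE:
        result = stage(result)
    return result

print(analyze("The little boy jumped over the fence."))
```

The joint-inference work on the next slide replaces this strictly sequential arrangement by modeling several of these phases at once.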

Page 15:

Probabilistic joint inference helps component tasks
[Finkel & Manning, NAACL 2009, 2010]

[Chart: Named Entity Recognition F1-score on OntoNotes, by section (ABC, CNN, MNB, NBC, PRI, VOA), comparing the Baseline against Joint Inference.]

• Goal: joint modeling of the many phases of linguistic analysis; here, parsing and named entities
• Fixed 24% of named entity boundary errors and of incorrect label errors
• 22% improvement in parsing scores

Page 16:

How can we understand relationships between pieces of text?

• Can one conclude one piece of text from another?
  – Emphasis is on handling the variability of linguistic expression
• This textual inference technology would enable:
  – Semantic search: lobbyists attempting to bribe U.S. legislators
    The A.P. named two more senators who received contributions engineered by lobbyist Jack Abramoff in return for political favors.
  – Question answering: Who bought J.D. Edwards?
    Thanks to its recent acquisition of J.D. Edwards, Oracle will soon be able …
  – Customer email response
  – Paraphrase and contradiction detection

Page 17:

Natural Logic [MacCartney & Manning 2008, 2009]

• Natural logic attempts to capture valid inferences from their surface linguistic forms
• A revival of Aristotelian syllogistics
• An example (OK, the example is contrived, but it compactly exhibits containment, exclusion, and implicativity …):

  P: Jimmy Dean refused to move without blue jeans.
  H: James Dean didn't dance without pants.
  → yes

Page 18:

7 basic entailment relations

  symbol   name                                     example
  P = Q    equivalence                              couch = sofa
  P ⊏ Q    forward entailment (strict)              crow ⊏ bird
  P ⊐ Q    reverse entailment (strict)              European ⊐ French
  P ^ Q    negation (exhaustive exclusion)          human ^ nonhuman
  P | Q    alternation (non-exhaustive exclusion)   cat | dog
  P _ Q    cover (exhaustive non-exclusion)         animal _ nonhuman
  P # Q    independence                             hungry # hippo

(The original slide also shows a Venn diagram for each relation.)

Relations are defined for all semantic types: tiny ⊏ small, hover ⊏ fly, kick ⊏ strike, this morning ⊏ today, in Beijing ⊏ in China, everyone ⊏ someone, all ⊏ most ⊏ some
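The names in the table have simple set-theoretic readings. The toy sketch below (an illustration over a tiny made-up universe, not part of the original system) classifies a pair of sets into one of the seven relations by checking containment, disjointness, and exhaustiveness:

```python
# Toy illustration (not part of the original system) of the set-theoretic
# readings behind the seven relations, over a tiny made-up universe.

def basic_relation(P, Q, universe):
    disjoint = not (P & Q)
    exhaustive = (P | Q) == universe
    if P == Q:
        return "= (equivalence)"
    if P < Q:
        return "⊏ (forward entailment)"
    if P > Q:
        return "⊐ (reverse entailment)"
    if disjoint and exhaustive:
        return "^ (negation)"
    if disjoint:
        return "| (alternation)"
    if exhaustive:
        return "_ (cover)"
    return "# (independence)"

universe = {"cat", "dog", "person", "crow", "sparrow", "rock"}
human    = {"person"}
nonhuman = universe - human
animal   = universe - {"rock"}
cat, dog = {"cat"}, {"dog"}
crow, bird = {"crow"}, {"crow", "sparrow"}

print(basic_relation(human, nonhuman, universe))   # ^ (negation)
print(basic_relation(cat, dog, universe))          # | (alternation)
print(basic_relation(crow, bird, universe))        # ⊏ (forward entailment)
print(basic_relation(animal, nonhuman, universe))  # _ (cover)
```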

Page 19:

Lexical entailment classification

P: Jimmy Dean refused to move without blue jeans
H: James Dean did n't dance without pants

  edit index   1    2    3    4    5    6    7    8
  edit type    SUB  DEL  INS  INS  SUB  MAT  DEL  SUB
  lex feats    strsim=0.67, implic: –/o, cat:aux, cat:neg, hypo, hyper
  lex entrel   =    |    =    ^    ⊐    =    ⊏    ⊏

Page 20:

Entailment projection

P: Jimmy Dean refused to move without blue jeans
H: James Dean did n't dance without pants

  edit index     1    2    3    4    5    6    7    8
  edit type      SUB  DEL  INS  INS  SUB  MAT  DEL  SUB
  lex feats      strsim=0.67, implic: –/o, cat:aux, cat:neg, hypo, hyper
  lex entrel     =    |    =    ^    ⊐    =    ⊏    ⊏
  projectivity   ↑    ↑    ↑    ↑    ↓    ↓    ↑    ↑
  atomic entrel  =    |    =    ^    ⊏    =    ⊏    ⊏

(Note the inversion under downward projectivity: the lexical ⊐ at edit 5 becomes an atomic ⊏.)

Page 21:

Entailment composition

P: Jimmy Dean refused to move without blue jeans
H: James Dean did n't dance without pants

  edit index     1    2    3    4    5    6    7    8
  edit type      SUB  DEL  INS  INS  SUB  MAT  DEL  SUB
  lex feats      strsim=0.67, implic: –/o, cat:aux, cat:neg, hypo, hyper
  lex entrel     =    |    =    ^    ⊐    =    ⊏    ⊏
  projectivity   ↑    ↑    ↑    ↑    ↓    ↓    ↑    ↑
  atomic entrel  =    |    =    ^    ⊏    =    ⊏    ⊏
  composition    =    |    |    ⊏    ⊏    ⊏    ⊏    ⊏

The last composition value, ⊏, is the final answer: the premise entails the hypothesis. ✓

For example, composing | with ^:
  fish | human
  human ^ nonhuman
  ⇒ fish ⊏ nonhuman
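The composition row is produced by joining adjacent relations left to right. The sketch below encodes only a fragment of the join table, just enough to reproduce the fish/human example above; the full table and the real implementation are in MacCartney & Manning (2008, 2009).

```python
# Illustrative fragment of natural-logic relation composition (join). Only a
# couple of join-table entries are encoded, enough to reproduce the example
# above; the full table is in MacCartney & Manning (2009).

EQ, FWD, REV, NEG, ALT, COV, IND = "=", "⊏", "⊐", "^", "|", "_", "#"

JOIN = {
    (ALT, NEG): FWD,   # x | y and y ^ z  =>  x ⊏ z
    (FWD, FWD): FWD,   # forward entailment is transitive
}

def join(r1, r2):
    if r1 == EQ:       # equivalence leaves the other relation unchanged
        return r2
    if r2 == EQ:
        return r1
    return JOIN.get((r1, r2), IND)   # unlisted combinations: fall back to "#"

# The slide's example: fish | human, human ^ nonhuman  =>  fish ⊏ nonhuman
print(join(ALT, NEG))   # ⊏
```

Running the join cumulatively across the atomic relations in the table is what yields the composition row above.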

Page 22:

Multiword paraphrases

• But this system is not so good at working out "multiword paraphrases":
  – walked inland ≈ moved away from the coast
  – Pollack said the plaintiffs failed to show that Merrill and Blodget directly caused their losses ≈ Basically, the plaintiffs did not show that omissions in Merrill's research caused the claimed losses

Page 23:

Hierarchical Deep Learning: Unsupervised Recursive Autoencoder

Page 24:

Recursive autoencoders capture semantic similarity

Page 25:

Recursive autoencoders for full-sentence paraphrase detection

Experiments on the Microsoft Research Paraphrase Corpus (Dolan et al. 2004)
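To give a sense of the model family, here is a minimal NumPy sketch of a single recursive-autoencoder composition step (an illustration, not the actual system behind these experiments): two child vectors are encoded into a parent vector, which is trained to reconstruct the children, and this is applied bottom-up over a parse tree.

```python
# Minimal NumPy sketch of one recursive-autoencoder composition step
# (illustration only, not the actual system behind these experiments).
import numpy as np

rng = np.random.default_rng(0)
d = 4                                              # toy embedding dimension

W_enc = rng.normal(scale=0.1, size=(d, 2 * d))     # encoder weights
b_enc = np.zeros(d)
W_dec = rng.normal(scale=0.1, size=(2 * d, d))     # decoder weights
b_dec = np.zeros(2 * d)

def encode(child1, child2):
    """Compose two child vectors into a single parent vector."""
    return np.tanh(W_enc @ np.concatenate([child1, child2]) + b_enc)

def reconstruction_error(child1, child2):
    """Autoencoder objective: how well the parent reconstructs its children."""
    parent = encode(child1, child2)
    reconstruction = W_dec @ parent + b_dec
    target = np.concatenate([child1, child2])
    return 0.5 * np.sum((reconstruction - target) ** 2)

# Toy word vectors; in the real model these come from learned embeddings and
# the encoder is applied recursively, bottom-up, over a parse of the sentence.
little, boy = rng.normal(size=d), rng.normal(size=d)
print(reconstruction_error(little, boy))
```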

Page 26:

Language is inherently connected to people

  "… the common misconception [is] that language use has primarily to do with words and what they mean. It doesn't. It has primarily to do with people and what they mean."

  Clark & Schober, 1992, "Asking questions and influencing answers"

Page 27:

What does it mean?

  A: Was the movie good?
  B: Hysterical. We laughed so hard.

  Was it a good movie?   YES / NO?

The outpouring of social language use on the web lets us learn what people mean (as never before).

Page 28:

Review ratings can teach modifier scales

Page 29:

Grounded learning of answer interpretations

  A: Is this hurricane season extraordinary?
  B: Very unusual in the sense of how many storms we've had.

• We learn "contingent oppositions":

  A: Is Obama qualified?
  B: I think he is young.

Page 30:

Envoi

• Probabilistic models have given us very good tools for analyzing human language sentences
• We can extract participants and their relations with good accuracy
• There is exciting work in text understanding and inference based on these foundations
• This provides a basis for computers to do higher-level tasks that involve knowledge & reasoning
• But much work remains to achieve the language competence of science fiction robots …
