learning to interpret natural language instructions
DESCRIPTION
Learning to Interpret Natural Language Instructions. Monica Babeş-Vroman + , James MacGlashan * , Ruoyuan Gao + , Kevin Winner * Richard Adjogah * , Marie desJardins * , Michael Littman + and Smaranda Muresan ++ *Department of Computer Science and Electrical Engineering - PowerPoint PPT PresentationTRANSCRIPT
![Page 1: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/1.jpg)
Learning to Interpret Natural Language Instructions
Monica Babeş-Vroman+ , James MacGlashan*, Ruoyuan Gao+ , Kevin Winner*
Richard Adjogah* , Marie desJardins* , Michael Littman+ and Smaranda Muresan++
*Department of Computer Science and Electrical EngineeringUniversity of Maryland, Baltimore County
+Computer Science Department, Rutgers University++School of Communication and Information, Rutgers University
NAACL-2012 WORKSHOPFrom Words to Actions: Semantic Interpretation in an Actionable Context
![Page 2: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/2.jpg)
Outline
• Motivation and Problem• Our approach• Pilot study• Conclusions and future work
![Page 3: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/3.jpg)
Motivation
Bring me the red mug that I left in the conference room
![Page 4: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/4.jpg)
Motivation
![Page 5: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/5.jpg)
Motivation
![Page 6: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/6.jpg)
Train an artificial agent to learn to carry out complex multistep tasks specified in natural language.
Problem
![Page 7: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/7.jpg)
Another Example of A Task of pushing an object to a room. Ex : square and red room
Abstract task: move obj to color room
move square to red room move star to green room go to green room
![Page 8: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/8.jpg)
Outline
• What is the problem?• Our approach• Pilot study• Conclusions and future work
![Page 9: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/9.jpg)
Training Data:
S2
F1 F2 F3 F4 F5 F6
S3
S4
S1
Object-oriented Markov Decision Process (OO-MDP) [Diuk et al., 2008]
![Page 10: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/10.jpg)
“Push the star into the teal room”
![Page 11: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/11.jpg)
“Push the star into the teal room”
Task Learning from Demonstrations
Semantic Parsing
Task Abstraction
F2 F6
Our Approach
push(star1, room1)P1(room1, teal)
F1 F2 F3 F4 F5 F6
![Page 12: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/12.jpg)
“Push the star into the teal room”
Task Learning from Demonstrations
Semantic Parsing
Task Abstraction
F2 F6
Our Approach
“teal”push(star1, room1)P1(room1, teal) P1 color
“push“ objToRoom
![Page 13: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/13.jpg)
“Push the star into the teal room”
Task Learning from Demonstrations
Semantic Parsing
Task Abstraction
Our Approach
![Page 14: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/14.jpg)
π*
S1
S2 S3
S430%70%
0%100%
100%0%
0%0%
RL
Task Learning from Demonstration
= Inverse Reinforcement Learning (IRL)
• But first let’s see Reinforcement Learning Problem
S1
S2 S3
S4MDP (S, A, T, γ)
R
![Page 15: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/15.jpg)
πE
S1
S2 S3
S430%70%
0%100%
100%0%
0%0%
Task Learning From Demonstrations
• Inverse Reinforcement Learning
![Page 16: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/16.jpg)
πE
S1
S2 S3
S430%70%
0%100%
100%0%
0%0%
Task Learning From Demonstrations
• Inverse Reinforcement Learning
![Page 17: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/17.jpg)
πE
S1
S2 S3
S430%70%
0%100%
100%0%
0%0%
R
Task Learning From Demonstrations
• Inverse Reinforcement Learning
S1
S2 S3
S4MDP (S, A, T, γ)
IRL
![Page 18: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/18.jpg)
• Maximum Likelihood IRL[M. Babeş, V. Marivate, K. Subramanian and M. Littman 2011]
θ Pr( )R π
Boltzmann Distribution
update in the direction of ∇ Pr( )
φ
MLIRL
Assumption: Reward function is a linear combination of a known set of features. (e.g., )
Goal: Find θ to maximize the probability of the observed trajectories.
(s,a): (state, action) pairR: reward functionφ: feature vectorπ: policyPr( ): probability of demonstrations
),(),(),R( asasasi ii
![Page 19: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/19.jpg)
“Push the star into the teal room”
Task Learning from Demonstrations
Semantic Parsing
Task Abstraction
Our Approach
![Page 20: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/20.jpg)
Semantic model (ontology)
NP (w, ) → Adj (w1, ), NP (w2, ): Φi NP (w, ) → Det (w1, ), NP (w2, ): Φi
…Syntax + Semantics + Ontological Constraints
Grammar
PARSING
Semantic representationtext
Semantic Parsing• Lexicalized Well-Founded Grammars (LWFG)
[Muresan, 2006; 2008; 2012]
![Page 21: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/21.jpg)
Empty Semantic model
Grammar
PARSING
X1.is=teal, X.P1=X1, X.isa=roomteal room
Semantic Parsing
• Lexicalized Well-Founded Grammars (LWFG) [Muresan, 2006; 2008; 2012]
room1
teal
P1 uninstantiated variable
≅
NP (w, ) → Adj (w1, ), NP (w2, ): Φi NP (w, ) → Det (w1, ), NP (w2, ): Φi
…
![Page 22: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/22.jpg)
Syntax/Semantics/Ontological Constraints
Grammar
PARSING
X1.is=green, X.color=X1, X.isa=room
teal room
Semantic Parsing
• Lexicalized Well-Founded Grammars (LWFG) [Muresan, 2006; 2008; 2012]
room1
color≅
Semantic model (ontology)
green
NP (w, ) → Adj (w1, ), NP (w2, ): Φi NP (w, ) → Det (w1, ), NP (w2, ): Φi
…
![Page 23: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/23.jpg)
Learning LWFG• [Muresan and Rambow 2007; Muresan, 2010; 2011]
Currently learning is done offline No assumption of existing semantic model
(loud noise, )
(the noise, )
…
Small Training Data
RELATIONAL LEARNING MODEL
LWFG
NP
NP
PP
Representative Examples
(loud noise)
(the noise)
(loud clear noise)
(the loud noise)
…
Generalization SetNP
NP
NP
NP
NP (w, ) → Adj (w1, ), NP (w2, ):Φi NP (w, ) → Det (w1, ), NP (w2, ):,Φi
…
![Page 24: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/24.jpg)
Using LWFG for Semantic Parsing
• Obtain representation with same underlying meaning despite different syntactic forms“go to the green room” vs “go to the room that is green”
–Dynamic Semantic ModelCan start with an empty semantic modelAugmented based on feedback from the TA componentpush blockToRoom, teal green
![Page 25: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/25.jpg)
“Push the star into the teal room”
Task Learning from Demonstrations
Semantic Parsing
Task Abstraction
Our Approach
![Page 26: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/26.jpg)
Task Abstraction
• Creates an abstract task to represent a set of similar grounded tasks– Specifies goal conditions and reward function in
terms of OO-MDP propositional functions
• Combines information from SP and IRL and provides feedback to both– Provides SP domain specific semantic information– Provides IRL relevant features
![Page 27: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/27.jpg)
Similar Grounded TasksGoal ConditionobjectIn(o2, l1)Other Terminal FactsisSquare(o2)isYellow(o2)isRed(l1)isGreen(l2)agentIn(o46, l2)…
Goal ConditionobjectIn(o15, l2)Other Terminal FactsisStar(o15)isYellow(o15)isGreen(l2)isRed(l3)objectIn(o31, l3)…
Some grounded tasks are similar in logical goal conditions, but different in object references and facts about those referenced objects
![Page 28: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/28.jpg)
Abstract Task Definition
• Define task conditions as a partial first order logic expression
• The expression takes propositional function parameters to complete it
)(ObjectIn o,l Locationl Object,o
)isGreen()isStar()isYellow(
)(ObjectIn
loo
lo , Locationl Object,o
![Page 29: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/29.jpg)
System Structure
Semantic Parsing Inverse Reinforcement Learning (IRL)
Task Abstraction
![Page 30: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/30.jpg)
Pilot StudyUnigram Language Model
Data: (Si, Ti).Hidden Data: zji : Pr(Rj | (Si, Ti) ).
Parameters: xkj : Pr(wk | Rj ).
![Page 31: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/31.jpg)
Pilot Study
Data: (Si, Ti).Hidden Data: zji : Pr(Rj | (Si, Ti) ).
Parameters: xkj : Pr(wk | Rj ).
Unigram Language Model
Pr(“push”|R1) Pr(“room”|R1)...
Pr(“push”|R2) Pr(“room”|R2)...
![Page 32: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/32.jpg)
Pilot StudyNew Instruction, SN : “Go to the green room.”
Pr(“push”|R1) Pr(“room”|R1)...
Pr(“push”|R2) Pr(“room”|R2)...
422
711
101.4)|Pr()|Pr(
,106.8)|Pr()|Pr(
Nk
Nk
SwkN
SwkN
RwRS
RwRS
![Page 33: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/33.jpg)
Pilot StudyNew Instruction, S’N : “Go with the star to green.”
Pr(“push”|R1) Pr(“room”|R1)...
Pr(“push”|R2) Pr(“room”|R2)...
5
'22
7
'11
101.2)|Pr()|'Pr(
,103.8)|Pr()|'Pr(
Nk
Nk
SwkN
SwkN
RwRS
RwRS
![Page 34: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/34.jpg)
Outline
• What is the problem?• Our approach• Pilot study• Conclusions and future work
![Page 35: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/35.jpg)
Conclusions
• Proposed new approach for training an artificial agent to learn to carry out complex multistep tasks specified in natural language, from pairs of instructions and demonstrations
• Showed pilot study with a simplified model
![Page 36: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/36.jpg)
Current/Future Work• SP, TA and IRL integration• High-level multistep tasks• Probabilistic LWFGs– Probabilistic domain (semantic) model– Learning/Parsing algorithms with both hard and soft
constraints• TA – add support for hierarchical task definitions
that can handle temporal conditions• IRL – find a partitioning of the execution trace
where each partition has it own reward function
![Page 37: Learning to Interpret Natural Language Instructions](https://reader036.vdocuments.us/reader036/viewer/2022062500/5681591e550346895dc64736/html5/thumbnails/37.jpg)
Thank you!