cognitive robotics, embodied cognition and human-robot ... · cognitive robotics, embodied...
TRANSCRIPT
Cognitive Robotics, Embodied Cognition and Human-Robot
Interaction
Greg Trafton, Ph.DNaval Research Laboratory
Wednesday, November 3, 2010
Report Documentation Page Form ApprovedOMB No. 0704-0188
Public reporting burden for the collection of information is estimated to average 1 hour per response, including the time for reviewing instructions, searching existing data sources, gathering andmaintaining the data needed, and completing and reviewing the collection of information. Send comments regarding this burden estimate or any other aspect of this collection of information,including suggestions for reducing this burden, to Washington Headquarters Services, Directorate for Information Operations and Reports, 1215 Jefferson Davis Highway, Suite 1204, ArlingtonVA 22202-4302. Respondents should be aware that notwithstanding any other provision of law, no person shall be subject to a penalty for failing to comply with a collection of information if itdoes not display a currently valid OMB control number.
1. REPORT DATE SEP 2010
2. REPORT TYPE N/A
3. DATES COVERED -
4. TITLE AND SUBTITLE Cognitive Robotics, Embodied Cognition and Human-Robot Interaction
5a. CONTRACT NUMBER
5b. GRANT NUMBER
5c. PROGRAM ELEMENT NUMBER
6. AUTHOR(S) 5d. PROJECT NUMBER
5e. TASK NUMBER
5f. WORK UNIT NUMBER
7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES) Naval Research Laboratory Washington, USA
8. PERFORMING ORGANIZATIONREPORT NUMBER
9. SPONSORING/MONITORING AGENCY NAME(S) AND ADDRESS(ES) 10. SPONSOR/MONITOR’S ACRONYM(S)
11. SPONSOR/MONITOR’S REPORT NUMBER(S)
12. DISTRIBUTION/AVAILABILITY STATEMENT Approved for public release, distribution unlimited
13. SUPPLEMENTARY NOTES See also ADA560467. Indo-US Science and Technology Round Table Meeting (4th Annual) - Power Energyand Cognitive Science Held in Bangalore, India on September 21-23, 2010. U.S. Government or FederalPurpose Rights License
14. ABSTRACT The last decade has seen a strong push for more and better autonomous systems. In many cases,autonomous systems and people need to interact to accomplish joint goals. We have been working ondeveloping autonomous systems that interact with people in a variety of ways. Our approach has beenfrom a strong cognitive science viewpoint: we build models of how people think, perceive, and behave, andthen put those models on our robots. I will describe our research paradigm and show severaldemonstrations of our working robots (through video).
15. SUBJECT TERMS
16. SECURITY CLASSIFICATION OF: 17. LIMITATION OF ABSTRACT
SAR
18. NUMBEROF PAGES
36
19a. NAME OFRESPONSIBLE PERSON
a. REPORT unclassified
b. ABSTRACT unclassified
c. THIS PAGE unclassified
Standard Form 298 (Rev. 8-98) Prescribed by ANSI Std Z39-18
Quick Background
• As a cognitive scientist, I am interested in building computational cognitive models of people
• Our models are process descriptions of how people think, reason, perceive, etc. They express behavior through a series of human representations and strategies
• Building models of people facilitates human-robot interaction: People have expectations on how to interact with other people
Wednesday, November 3, 2010
Online vs. Offline Embodied Cognition
• A recent drumbeat in cognitive science is that cognition is for action (embodied cognition)
• We are building embodied models for cognitive robotics and human-robot interaction
• Online Cognition: Our perceptual and motoric processing are primarily for here and now interactions (Brooks)
• Offline Cognition: Even when decoupled from the environment, thinking is grounded in body-based mechanisms (sensing and motor control)
Wednesday, November 3, 2010
Cognitive Models
• We build models for online and offline cognition using a computational cognitive architecture (ACT-R; Anderson et al., 2004; Anderson, 2007)
• A cognitive architecture is a specification of the structure of the brain at a level of abstraction that explains how it achieves the function of the mind (Anderson, 2007)
• Hybrid symbolic and sub-symbolic system
• Connects with psychological data (experiments) and neuroscience data (fMRI)
• ACT-R models are precise process descriptions of how people perform different tasks
Wednesday, November 3, 2010
ACT-R/E (Embodied)
Procedural ModuleM
atc
hin
g
Selection
Intentional
module
Goal
Buffer
Temporal
Module
Temporal
Buffer
Imaginal
Module
Declarative
Module
Retrieval
Buffer
Visual
Module
Visual
Buffer
Aural
Module
Aural
Buffer
Motor
Buffer
Motor
Module
Configural
Buffer
Configural
Module
Manipulative Buffer
Manipulative Module
Vocal
Buffer
Vocal
Module
Robot Sensors and Effectors
Environment
Imaginal
Buffer
Execu
tion
Wednesday, November 3, 2010
ACT-R can make predictions about brain regions (fMRI)
Wednesday, November 3, 2010
Embodied Cognitive Modeling
• We use an MDS robot (Trafton et al., 2010) [Mobile/Dextrous/Social] named Octavia
• 2 color cameras (eyes)
• ToF camera (swiss ranger in forehead)
• Segway base
• 52 DOFs
Wednesday, November 3, 2010
ACT-R/E• We’ve hooked up ACT-R/E to our robotic sensors
and effectors
• Visual input enters into ACT-R’s visicon and is then processed the same way ACT-R processes any other visual stimuli
• Spatial information is also extracted through our sensors
• ACT-R/E motor commands move the robot
• Eyes follow attention, then head, then body
• Auditory information enters into the audicon
• We have lots of good sensor / perception systems which I will not talk about here...
Wednesday, November 3, 2010
Gaze-Following
• As a concrete example of online embodied cognition, I will show a classic gaze-following experiment with young children from Corkum & Moore (1998)
• Debate in the literature about maturation (via stages) vs. learning
• Gaze following is important for joint attention, early social development, theory of mind, later interaction
Wednesday, November 3, 2010
Corkum & Moore, 1998Participants
• 3 ages:
• 6-7 months old
• 8-9 months old
• 10-11 months old
Wednesday, November 3, 2010
Corkum & Moore, 1998Procedure
• Child came in and sat on their parent’s lap
• Parent was blindfolded so they could not give the child cues
• Experimenter got child’s attention so the child would look at the experimenter
Child
Experimenter
Toy 1 Toy 2
Wednesday, November 3, 2010
Corkum & Moore, 1998Procedure
• The experimenter looked at one of two toys
• The experimenter did not point or talk to the child while looking at the toy
Child
Experimenter
Toy 1 Toy 2
Wednesday, November 3, 2010
Corkum & Moore, 1998Procedure
• If the child looked at the same toy the experimenter did, it was coded as a “joint gaze”
Child
Experimenter
Toy 1 Toy 2
Wednesday, November 3, 2010
Corkum & Moore, 1998Procedure
• If the child looked at the toy the experimenter was not looking at, it was coded as a “non-target gaze”
• Navel gazing and non-looks were not coded
Child
Experimenter
Toy 1 Toy 2
Wednesday, November 3, 2010
Corkum & Moore, 1998Procedure
• There were three phases to the experiment
• For all trials, experimenter looked at a toy
Phase # Trials Description
Baseline 4 (2 on each side) Toy did not light up
Shaping 4 (2 on each side)Regardless of what infant did, toy lit up and rotated when experimenter
gazed at it
Testing 20 (10 on each side)Toy lit up and rotated only if both experimenter and infant gazed at it
Wednesday, November 3, 2010
Corkum & Moore, 1998Scoring
• Accuracy was scored as
• Perfect gaze-following = 100%
• Random gaze-following = 50%
• Perfect anti-gaze-following = 0% (should not happen)
# Joint Gazes
Total # Gazes
Wednesday, November 3, 2010
Corkum & Moore, 1998Research Questions
• At what age does gaze-following naturally occur?
• Can children who did not show gaze-following at baseline learn to follow gaze?
• This “learning to follow gaze” hypothesis was quite novel when it came out
Wednesday, November 3, 2010
Corkum & Moore, 1998Results
• At Baseline:
• 6-7 m.o.: Chance
• 8-9 m.o.: Chance
• 10-11 m.o.: Joint!
• After Training:
• 6-7 m.o.: Chance
• 8-9 m.o.: Joint!
• 10-11 m.o.: Joint!
6−7 8−9 10−11 6−7 8−9 10−11
Gaze−FollowingFrom Corkum and Moore (1998)
Age (months)
Accu
racy
of f
ollo
win
g ga
ze (%
)
0
25
50
75
100
BaselineAfter training
Wednesday, November 3, 2010
Corkum & Moore
• Corkum & Moore showed that gaze-following can be increased during a lab setting but:
• They just showed that children could learn how to follow gaze, not the mechanism of learning (e.g., what type of learning)
• Were 10-11 m old children at the right “stage?” Or was it continuous learning?
• They did not show that the experimental trial learning was what happened during ‘normal’ development
• There are no process descriptions of this finding
• So we built an ACT-R/E model (details light here)
Wednesday, November 3, 2010
ACT-R/E Model of Gaze-Following
• When the model is young, it looks around the world for interesting objects
• When it habituates to an object, it looks for another interesting object
• When it looks at the same object a person does, it gets a small reward, which is propagated back in time
• In “real-life” the object of gaze is relevant to both parent and child (Deak et al., 2008) [reward is tied to relevance]
• As model ages, it gradually learns to select the right spatial information and follow gaze
Wednesday, November 3, 2010
ACT-R/E Model of Gaze-Following
• Age is modeled by providing the model with opportunities to follow gaze
• The older the model, the more gaze-following experience it has had
• After aging each model, we ran it through the exact same experimental procedure that Corkum & Moore (1998) used
• We collected accuracy at each age group for the experiment and compared it to the empirical data
Wednesday, November 3, 2010
ACT-R/E Model of Gaze-Following
• Dots are model fits
• R2 = .95
• RMSD = .3
• All model points are within 95% CI
• Process support for learning hypothesis, not stage maturation
Trafton, Harrison, Fransen, & Bugajska, 2009
6−7 8−9 10−11 6−7 8−9 10−11
Gaze−FollowingFrom Corkum and Moore (1998)
Age (months)
Accu
racy
of f
ollo
win
g ga
ze (%
)
0
25
50
75
100
●
●
●
●
●
●
BaselineAfter training
Wednesday, November 3, 2010
Embodied Cognitive Modeling
• We also ran our model up to 10-11 m on our MDS robot, Octavia
• Several reasons for emphasizing embodied cognition
• No cheating allowed: very explicit theory of spatial cognition, learning, development
• Must deal with the complexity of the environment (dynamic, 3D, etc): the complexity of the real world tests the model, theory, framework
Wednesday, November 3, 2010
Model of 10-11 month old
Wednesday, November 3, 2010
We can also use head position to infer what someone sees
Wednesday, November 3, 2010
ACT-R/E Model of Level 1 Visual PT
• Dots are model fits
• R2 = .99
• RMSD = 10.4
• All model points are within 95% CI
• Trafton, Harrison, Fransen, & Bugajska, 2009
Accuracy of Level 1 Perspective−TakingFrom Moll and Tomasello (2006)
Mea
n pr
opor
tion
of h
andi
ng ta
rget
0
20
40
60
80
100
18 months 24 months
● ●●
●
ControlExperimental
Wednesday, November 3, 2010
Beyond Gaze-Following
• Gaze following is a strong example of online embodied cognition: very much “here and now”
• What about another classic cognitive phenomena that uses offline embodied cognition: Theory of Mind?
Wednesday, November 3, 2010
Theory of Mind
• Recognition that others can have beliefs, desires and intentions different from your own
• Typically develops around age 4
• Ability to infer what the beliefs, desires and intentions of others
• Critical ability for interacting with others
• The Sally-Anne task is one of the typical tasks used to study Theory of Mind
Wednesday, November 3, 2010
Sally-Anne Task(Baron-Cohen et al., 1985)
A child watches while:1. Sally puts a marble in her box
2. Sally leaves the room
3. Anne moves the marble to Anne’s box
4. Sally returns to the room
5. The child is asked where Sally believes the marble is
!" #"!"
#"
$" %"
#"
&"
!" #"
'"
Wednesday, November 3, 2010
Theory of Mind
• Wellman (2001) performed a meta-analysis on the existing experiments with the Sally-Anne false-belief task (and there were very many)
• False-beliefs are beliefs that one may have but which are not actually true in the physical world
• They built a statistical model (not a process model)
• We built a process model of the Sally-Anne task and ran it many times at different age levels
• (Had a combination of maturation + learning)
Wednesday, November 3, 2010
Results
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
20 30 40 50 60 70 80 90 100
Pro
porti
on C
orre
ct
Age (Months)
662 Child Development
only what we termed primary conditions. These wereconditions in which (1) subjects were within 14months of each other in age, (2) less than 20% of theinitially tested subjects were dropped from the re-ported data analyses (due to inattention, experimen-tal error, or failing control tasks), and (3) more than80% of the subjects passed memory and/or realitycontrol questions (e.g., “Where did Maxi put thechocolate?” or “Where is the chocolate now?”). Ourreasoning was that age trends are best interpretable ifeach condition’s mean age represents a relatively nar-row band of ages; interpretation of answers to the tar-get false-belief question is unclear if a child cannot re-member key information, does not know where theobject really is, or cannot demonstrate the verbal facil-ity needed to answer parallel control questions. Inmost of the studies, few subjects were dropped, veryhigh proportions passed the control questions, andages spanned a year or less, so primary conditions in-cluded 479 (81%) of the total 591 conditions available.The primary conditions are enumerated in Table 1;they were compiled from 68 articles that contained128 separately reported studies. Of the 479 primaryconditions, 362 asked the child to judge someoneelse’s false belief; we began our analyses by concen-trating on these conditions. On average in the pri-mary conditions, 3% of children were dropped from acondition, children were 98% correct on control ques-tions, and ages ranged 10 months around their meanvalues.
In an initial analysis only age was considered as afactor. As shown in Figure 2, false-belief performancedramatically improves with age. Figure 2A showseach primary condition and the curve that best fits thedata. The curve plotted represents the probability ofbeing correct at any age. At 30 months, the youngestage at which data were obtained, children are morethan 80% incorrect. At 44 months, children are 50%correct, and after that, children become increasinglycorrect. Figure 2B shows the same data, but in thiscase the dependent variable, proportion correct, istransformed via a logit transformation. The formulafor the logit is:
,
where “ln” is the natural logarithm, and “
p
” is theproportion correct. With this transformation, 0 rep-resents random responding, or even odds of predict-ing the correct answer versus the incorrect answer.(When the odds are even, or 1, the log of 1 is 0, so thelogit is 0.) Use of this transformation has three majorbenefits. First, as is evident in Figure 2B, the curvilin-ear relation between age and proportion correct is
logit ! ln p1 p–------------! "
# $
straightened, yielding a linear relation that allowssystematic examination of the data via linear regres-sion; second, the restricted range inherent to propor-tion data is eliminated, for logits can range fromnegative infinity to positive infinity; and third, thetransformation yields a dependent variable and ameasure of effect size that is easily interpretable interms of odds and odds ratios (see, e.g., Hosmer &Lemeshow, 1989).
The top line of Table 2 summarizes the initial anal-ysis of age alone in relation to correct performance
Figure 2 Scatterplot of conditions with increasing age show-ing best-fit line. (A) raw scatterplot with log fit; (B) proportioncorrect versus age with linear fit. In (A), each condition is rep-resented by its mean proportion correct. In (B), those scores aretransformed as indicated in the text.
Welman, 2001 Our data
R2 = 0.55 R2 = 0.56
Hiatt & Trafton, 2010Wednesday, November 3, 2010
Theory of Mind
• Work so far has focused on development
• It is impossible to program every situation an autonomous agent will encounter, so we have focused on learning because people are the best, most robust learning systems we know of
• Besides building theoretical models of people (which advances cognitive science by providing process models of how people think, learn, embodied cognition, etc.), we can show the robustness of our models by using/demonstrating them in a different context
Wednesday, November 3, 2010
Wednesday, November 3, 2010
Embodied Cognition• We build computational cognitive models that takes
embodiment seriously
• Our models are process models of how people think
• Our models match human data at multiple levels
• We take our models and put them on robots so they are functional and have an embodied presence and interaction
• Our approach is radically different from typical robotics architectures (and from other EC researchers)
• Our last video shows interactive behavior with a focus on computational vision and perception...
Wednesday, November 3, 2010
Wednesday, November 3, 2010
Thanks!
Wednesday, November 3, 2010
•
• • I