Innorobo 2016 Keynote - Making robots dream to face open environments

Making robots dream to face open environments Stéphane Doncieux

Upload: innoecho-innorobo

Post on 23-Jan-2017


TRANSCRIPT

Page 1: Innorobo 2016 Keynote - Making robots dream to face open environments

Making robots dream to face open environments

Stéphane Doncieux

Page 2: Innorobo 2016 Keynote - Making robots dream to face open environments

What a machine can do

YASKAWA BUSHIDO PROJECT / industrial robot vs sword master

Deep Blue vs G. Kasparov 1997

Motion Problem resolution

Page 3: Innorobo 2016 Keynote - Making robots dream to face open environments
Page 4: Innorobo 2016 Keynote - Making robots dream to face open environments
Page 5: Innorobo 2016 Keynote - Making robots dream to face open environments

Doncieux, S. (to appear) Creativity: A Driver for Research on Robotics in Open Environments, Intellectica

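[Figure: Performance plotted against Context for Robot A and Robot B. The context axis has one "Known" region and two "Unknown" regions; performance in the unknown regions is marked "???".]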

Page 6: Innorobo 2016 Keynote - Making robots dream to face open environments

How can a robot face a new environment?

1. Robustness

2. Learning

3. Development

Page 7: Innorobo 2016 Keynote - Making robots dream to face open environments

Manual development

Autonomous development

1. Robustness

Page 8: Innorobo 2016 Keynote - Making robots dream to face open environments

Manual development

Autonomous development

Learning

2. Learning

[Figure: reward scale running from Low to High, with a label A.]

Learning the action to apply in a state to maximize reward.

Sutton, R. S., & Barto, A. G. (1998). Reinforcement learning: An introduction. Cambridge: MIT press.
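The idea of learning which action to apply in each state to maximize reward can be illustrated with tabular Q-learning, one of the methods covered by Sutton & Barto. The sketch below is not taken from the talk: the five-state corridor, the reward of 1 at the last state, the parameter values and the function names `q_learning` and `greedy_policy` are all assumptions chosen to keep the example small.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor.

    States are 0..n_states-1. Action 0 moves left, action 1 moves right.
    Reaching the last state gives reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on the number of steps per episode
            if rng.random() < epsilon:
                action = rng.randrange(2)  # explore
            else:
                # exploit, breaking ties at random
                best = max(q[state])
                action = rng.choice(
                    [a for a in (0, 1) if q[state][a] == best])
            if action == 0:
                next_state = max(0, state - 1)
            else:
                next_state = min(goal, state + 1)
            done = next_state == goal
            reward = 1.0 if done else 0.0
            # the goal is terminal, so nothing is bootstrapped from it
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Return the highest-valued action for each state."""
    return [max((0, 1), key=lambda a: row[a]) for row in q]
```

Calling `greedy_policy(q_learning())` gives one action per state. The table only works because the corridor already comes as a handful of discrete states, which is the kind of representation a robot in an open environment does not get for free.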

Page 9: Innorobo 2016 Keynote - Making robots dream to face open environments

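2. Learning: continuous actions & states

[Figure: evolutionary algorithm loop with the steps Random generation, Evaluation, Selection, Variation and Termination. Evaluation maps a Genotype to a Fitness; the example values shown are 8.3 and 00110100111. Evaluation detail: Genotype, Phenotype, Behavior (in the Environment, from the Initial conditions), Fitness.]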

Evolutionary Robotics

Mouret, J.-B., Bredeche, N., & Doncieux, S. (2015). La robotique évolutionniste. Pour la Science, n°87, April-June 2015.

Doncieux, S., Bredeche, N., Mouret, J.-B., & Eiben, A. E. (2015). Evolutionary Robotics: What, Why, and Where to. Frontiers in Evolutionary Robotics. doi:10.3389/frobt.2015.00004
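The loop named on this slide (random generation, evaluation, selection, variation, termination) can be sketched as a minimal evolutionary algorithm over bit-string genotypes. This is an illustration under stated assumptions, not the speaker's implementation: in evolutionary robotics the evaluation step would run a robot or a simulation and score its behavior, whereas the `evaluate` function here is a toy stand-in that counts 1 bits. The population size, mutation rate and truncation selection are likewise arbitrary choices.

```python
import random


def evaluate(genotype):
    """Toy fitness: number of 1 bits (stand-in for a robot evaluation)."""
    return sum(genotype)


def evolve(genome_len=12, pop_size=20, generations=100,
           mutation_rate=0.05, seed=0):
    """Minimal evolutionary loop over bit-string genotypes."""
    rng = random.Random(seed)
    # Random generation of the initial population
    population = [[rng.randint(0, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluation: rank individuals by fitness, best first
        ranked = sorted(population, key=evaluate, reverse=True)
        # Termination: stop once the best possible fitness is reached
        if evaluate(ranked[0]) == genome_len:
            break
        # Selection: keep the better half as parents
        parents = ranked[: pop_size // 2]
        # Variation: each parent yields one child by bit-flip mutation
        children = [
            [1 - gene if rng.random() < mutation_rate else gene
             for gene in parent]
            for parent in parents
        ]
        population = parents + children
    best = max(population, key=evaluate)
    return best, evaluate(best)
```

`evolve()` returns the best genotype found and its fitness. Because the search needs only a fitness value, it can work directly on a low-level encoding, at the cost of many evaluations.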

Page 10: Innorobo 2016 Keynote - Making robots dream to face open environments


Kober, J., Bagnell, J. A., & Peters, J. (2013). Reinforcement learning in robotics: A survey. The International Journal of Robotics Research, 32(11), 1238–1274. doi:10.1177/0278364913495721

2. Learning: the representation is critical!

• Reinforcement Learning: fast, but requires an efficient representation
• Evolutionary Robotics: low-level representation, but slow…

Page 11: Innorobo 2016 Keynote - Making robots dream to face open environments

3. Development

Weng, J. (2004). Developmental robotics : Theory and experiments. International Journal of Humanoid Robotics, 1(2), 199–236.

Manual development

Autonomous development

Page 12: Innorobo 2016 Keynote - Making robots dream to face open environments

3. Development: insights from psychology

The importance of redescribing knowledge representations

« A specifically human way to gain knowledge is for the mind to exploit internally the information that it has already stored (both innate and acquired), by redescribing its representations or, more precisely, by iteratively re-representing in different representational formats what its internal representations represent » [Karmiloff-Smith 1996]

When to restructure and consolidate knowledge?

« Sleep consolidates recent memories and, concomitantly, could allow insight by changing their representational structure. » [Wagner, 2004]

Kick-off meeting DREAM, Paris, 26/01/2015

Page 13: Innorobo 2016 Keynote - Making robots dream to face open environments

Deferred Restructuring of Experience in Autonomous Machines

H2020 FET Proactive « Knowing, doing, being » 01/2015-12/2018

http://www.robotsthatdream.eu/ https://twitter.com/robotsthatdream

3. Development: changing representations

Daytime experience (large batch)

Daytime

Consolidated knowledge:
- task-relevant features
- task contexts
- abstract knowledge
- new motivations

No initial policy
No single task
Motivations:
- curiosity
- satisfying humans
- global mission

Behavior exploration
Knowledge improvement

Knowledge adaptation

Small batch

Skill
Knowledge validation

Sequence of learning episodes driven by motivations

New situation:
- no reprogramming
- fast adaptation

Knowledge sharing between robots:
- better generalization
- faster learning

Nighttime

Dream

Collective scale

Individual scale

Knowledge restructuring

Transfer from STM to LTM
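The day/night split in this diagram, with raw experience gathered during the day and restructured later during a transfer from STM to LTM, can be illustrated with a toy sketch. Everything below is hypothetical and far simpler than anything the DREAM project describes: the class name `DreamingAgent`, the (context, action, reward) tuples and the "keep the best mean-reward action per context" rule are assumptions made only to show the deferred-processing pattern.

```python
from collections import defaultdict


class DreamingAgent:
    """Hypothetical agent with a short-term buffer and a long-term store."""

    def __init__(self):
        self.stm = []  # raw (context, action, reward) tuples from the day
        self.ltm = {}  # context -> (best known action, its mean reward)

    def act_and_record(self, context, action, reward):
        """Daytime: store raw experience without restructuring it."""
        self.stm.append((context, action, reward))

    def dream(self):
        """Nighttime: summarize the day's batch, then empty the buffer."""
        totals = defaultdict(lambda: [0.0, 0])  # (context, action) -> [sum, n]
        for context, action, reward in self.stm:
            entry = totals[(context, action)]
            entry[0] += reward
            entry[1] += 1
        for (context, action), (total, count) in totals.items():
            mean = total / count
            # keep only the best-scoring action seen for each context
            if context not in self.ltm or mean > self.ltm[context][1]:
                self.ltm[context] = (action, mean)
        self.stm.clear()  # the transfer from STM to LTM is complete
```

A day of calls to `act_and_record` followed by one call to `dream` leaves a compact summary in `ltm` and an empty `stm`. The sketch only aggregates; it does not change the representation itself, which is the harder problem the slide refers to as knowledge restructuring.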

Page 14: Innorobo 2016 Keynote - Making robots dream to face open environments

1. Learning: direct policy search (neuroevolution)
- Task-agnostic representations
- Slow learning, limited generalization
- Generates examples of behaviours

2. Representation redescription
- Passive analysis
- Discrete actions and sensors to consider

3. Learning: discrete reinforcement learning
- Task-specific representations
- Fast learning, good generalization
- Learning 10 to 100 times faster

Page 15: Innorobo 2016 Keynote - Making robots dream to face open environments

Development: bootstrapping simple manipulation skills

1. Day 1: sensori-motor babbling
2. «Night»: learning to manipulate an object in simulation
3. Day 2: back to reality

Page 16: Innorobo 2016 Keynote - Making robots dream to face open environments

Thank you! Questions?

https://twitter.com/SDoncieux http://people.isir.upmc.fr/doncieux