learning control at intermediate reynolds...

51
Learning Control at Intermediate Reynolds Numbers Experiments with perching UAVs and flapping-winged flight Russ Tedrake Assistant Professor MIT Computer Science and Artificial Intelligence Lab IROS 2008 Learning Workshop September 22, 2008 Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Upload: others

Post on 25-Aug-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Learning Control atIntermediate Reynolds Numbers

Experiments with perching UAVsand flapping-winged flight

Russ Tedrake

Assistant ProfessorMIT Computer Science and Artificial Intelligence Lab

IROS 2008 Learning WorkshopSeptember 22, 2008

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 2: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

The Burden of Learning Control

Reinforcement learning has the potential to generatehigh-performance nonlinear, adaptive control policies forcomplicated systems...

but our techniques must be validated vs. more traditionalcontrol techniques.

There are many problems in robotics where traditionalnonlinear control offers relatively few solutions

Underactuated systems, many stochastic systems, ...Learning control can quickly become the state-of-the-art

Control in complicated fluid dynamics

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 3: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

The Burden of Learning Control

Reinforcement learning has the potential to generatehigh-performance nonlinear, adaptive control policies forcomplicated systems...

but our techniques must be validated vs. more traditionalcontrol techniques.

There are many problems in robotics where traditionalnonlinear control offers relatively few solutions

Underactuated systems, many stochastic systems, ...Learning control can quickly become the state-of-the-art

Control in complicated fluid dynamics

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 4: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

The Burden of Learning Control

Reinforcement learning has the potential to generatehigh-performance nonlinear, adaptive control policies forcomplicated systems...

but our techniques must be validated vs. more traditionalcontrol techniques.

There are many problems in robotics where traditionalnonlinear control offers relatively few solutions

Underactuated systems, many stochastic systems, ...Learning control can quickly become the state-of-the-art

Control in complicated fluid dynamics

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 5: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

The Burden of Learning Control

Reinforcement learning has the potential to generatehigh-performance nonlinear, adaptive control policies forcomplicated systems...

but our techniques must be validated vs. more traditionalcontrol techniques.

There are many problems in robotics where traditionalnonlinear control offers relatively few solutions

Underactuated systems, many stochastic systems, ...

Learning control can quickly become the state-of-the-art

Control in complicated fluid dynamics

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 6: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

The Burden of Learning Control

Reinforcement learning has the potential to generatehigh-performance nonlinear, adaptive control policies forcomplicated systems...

but our techniques must be validated vs. more traditionalcontrol techniques.

There are many problems in robotics where traditionalnonlinear control offers relatively few solutions

Underactuated systems, many stochastic systems, ...Learning control can quickly become the state-of-the-art

Control in complicated fluid dynamics

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 7: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

The Burden of Learning Control

Reinforcement learning has the potential to generatehigh-performance nonlinear, adaptive control policies forcomplicated systems...

but our techniques must be validated vs. more traditionalcontrol techniques.

There are many problems in robotics where traditionalnonlinear control offers relatively few solutions

Underactuated systems, many stochastic systems, ...Learning control can quickly become the state-of-the-art

Control in complicated fluid dynamics

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 8: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Example: Landing on a perch

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 9: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

The State-of-the-Art in Perching

Approach 1: Morphing plane Approach 2: Over-powered hover

Both sacrifice performance to use linear control on modifiedvehicles (can’t compete w/ birds!)

Can learning nonlinear control produce superior performanceon existing vehicles?

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 10: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Technical challenges for control

Dynamics quickly get too complicated for conventionalcontrol design

Fluid dynamics are time-varying and very nonlinearCFD simulations in these regimes can take days to computeSevere lack of compact (design accessible) models

Limited control authority

Flow is only partially observableStalls result in intermittent losses of control authority“Underactuated control” - control actions have long-termconsequences

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 11: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Learning Control for Perching

Accurate Navier-Stokes simulations takes days to compute,but...

Model-based Reinforcement Learning:1 Learn approximate model of unsteady dynamics from real flight

data2 Formulate the goal of control as the long-term optimization of

a scalar cost3 Offline model-based numerical optimal control4 Online model-free optimal control (“learning”)

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 12: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Experiment Design

Glider (no propellor)

Dihedral (passive rollstability)

Offboard sensing andcontrol

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 13: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

System Identification

Nonlinear rigid-body vehicle model

Linear (w/ delay) actuator modelReal flight data

Very high angle-of-attack regimesSurprisingly good match to theoryVortex shedding

Lift Coefficient

−20 0 20 40 60 80 100 120 140 160−1.5

−1

−0.5

0

0.5

1

1.5

Angle of Attack

Cl

Glider DataFlat Plate Theory

Drag Coefficient

−20 0 20 40 60 80 100 120 140 160−0.5

0

0.5

1

1.5

2

2.5

3

3.5

Angle of Attack

Cd

Glider DataFlat Plate Theory

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 14: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

A dynamic model

(x, z)m, I

θF

F

g

w

el

lw

el

−φ

Planar dynamics

Aerodynamics from model

State: x = [x , y , θ, φ, x , y , θ]

Only actuator is the elevatorangle, u = φ

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 15: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Feedback Control Design

Optimal feedback control formulation:

Jπ(x) = min[(x− xd)T Q(x− xd), Jπ(x′)

],

x′ = f (x, π(x)))

x is the estimated state (from motion capture)π is the feedback policy, commanding elevator angle as afunction of statef is the identified system modelQ is a positive definite cost matrix

Discretize dynamics on a mesh over state space

Optimized Dynamic Programming algorithm approximates theoptimal policy, π∗(x), from model

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 16: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Glider Perching

Enters motion capture @ 6 m/s.

Perch is < 3.5 m away.

Entire trajectory @ 1 second.

RequiresSeparation!

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 17: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Flow visualization (very preliminary)

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 18: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Flow visualization (very preliminary)

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 19: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Transition to Outdoors: Gust Response

Current experiments are in “still air”

Outdoor flights require robustness to

ambient wind conditionspersistent flow structures (e.g., around the perch)short-term “gust” disturbances

Capture statistical environment model

Instrument motion capture arena with known aerodynamicdisturbances, obstacles, and wind

Already performed detailed LTV controllability analysis withdifferent actuator configurations

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 20: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

The Intermediate Reynolds Number regime

Reynolds number (Re) - dimensionless quantity that correlateswith the resulting kinematics of the fluid

Low Re - viscousity dominates, flow is laminarHigh Re - turbulenceIntermediate Re - complicated, but structured flow (eg, vortexshedding). Glider perching example is Re 50,000 down to Re15,000.

At Intermediate Re:

Lots of interesting control problemsAlmost no good control solutions

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 21: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Bird-scale flapping flight

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 22: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Autonomous Flapping-Winged Flight

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 23: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Motivation

Bird-scale flapping vehicles will not surpass the speed orefficiency of fixed-wing aircraft for steady-level flight in still air

Propellors produce thrust very efficientlyAircraft airfoils can be highly optimized (for speed orefficiency)

But looking more closely...

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 24: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Motivation

Bird-scale flapping vehicles will not surpass the speed orefficiency of fixed-wing aircraft for steady-level flight in still air

Propellors produce thrust very efficientlyAircraft airfoils can be highly optimized (for speed orefficiency)

But looking more closely...

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 25: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Efficient flying machines

An albatross can fly for hours (or even days) without flapping,even migrating upwind (exploiting gradients in the shear layer)

Butterflies migrate thousands of kilometers, carried by thewind

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 26: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Super-maneuverability

Peregrine falcons have been clocked at 240+ mph in dives,and have the agility to snatch moving prey

Bats have been documented...

Catching prey on their wingsManuevering through thick rain forests at high speedsMaking high speed 180 degree turns...

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 27: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Manipulating the Air

Birds far surpass the performance of our best engineeredsystems (especially UAVs) in metrics of efficiency,acceleration, and maneuverability.

The secret:

Birds (and fish, ...) exploit unsteady aerodynamics atintermediate Re

A manipulation problem

Requires unconventional mechanical and control designsOnce you start thinking of bird flight as manipulating the air,it becomes harder to appreciate fixed-wing flight

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 28: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Manipulating the Air

Birds far surpass the performance of our best engineeredsystems (especially UAVs) in metrics of efficiency,acceleration, and maneuverability.

The secret:

Birds (and fish, ...) exploit unsteady aerodynamics atintermediate Re

A manipulation problem

Requires unconventional mechanical and control designsOnce you start thinking of bird flight as manipulating the air,it becomes harder to appreciate fixed-wing flight

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 29: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Manipulating the Air

Birds far surpass the performance of our best engineeredsystems (especially UAVs) in metrics of efficiency,acceleration, and maneuverability.

The secret:

Birds (and fish, ...) exploit unsteady aerodynamics atintermediate Re

A manipulation problem

Requires unconventional mechanical and control designsOnce you start thinking of bird flight as manipulating the air,it becomes harder to appreciate fixed-wing flight

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 30: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Manipulating the Air

Birds far surpass the performance of our best engineeredsystems (especially UAVs) in metrics of efficiency,acceleration, and maneuverability.

The secret:

Birds (and fish, ...) exploit unsteady aerodynamics atintermediate Re

A manipulation problem

Requires unconventional mechanical and control designsOnce you start thinking of bird flight as manipulating the air,it becomes harder to appreciate fixed-wing flight

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 31: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Example: Efficient swimming upstream

from George Lauder’s Lab at Harvard(Liao et al, 2003)

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 32: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Prospects for machine learning

Key observations about fluid-body interactions at intermediateRe

Considerable previous work in system identification permits theuse of approximate modelsWon’t always be able to discretize the state spaceRelatively compact policies (few parameters) can generate alarge repetoire of behaviors

Formal analysis of the policy gradient algorithms reveals:

Performance (via SNR) degrades with the number of controlparametersPerformance is (locally) invariant to the complexity of theplant dynamics

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 33: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Prospects for machine learning

Key observations about fluid-body interactions at intermediateRe

Considerable previous work in system identification permits theuse of approximate modelsWon’t always be able to discretize the state spaceRelatively compact policies (few parameters) can generate alarge repetoire of behaviors

Formal analysis of the policy gradient algorithms reveals:

Performance (via SNR) degrades with the number of controlparametersPerformance is (locally) invariant to the complexity of theplant dynamics

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 34: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

The Heaving Foil

work with Jun Zhang (NYU Courant)

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 35: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

The Heaving Foil

[Vandenberghe et al., 2006]

Rigid, symmetric wing

Driven vertically

Free to rotatehorizontally

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 36: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Symmetry breaking leads to forward flight

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 37: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Flow visualization

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 38: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Effect of flapping frequency

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 39: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

The control problem

Previous work only used sinusoidal trajectories

Optimize stroke form to maximize the “efficiency” of forwardflight

Add vertical load cell (measures Fz(t))Dimensionless cost of transport:

cmt =

∫T|Fz(t)z(t)|dt

mg∫T

x(t)dt

Fortunately

min cmt = min

∫T|Fz(t)z(t)|dt∫

Tx(t)dt

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 40: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

The control problem

Previous work only used sinusoidal trajectories

Optimize stroke form to maximize the “efficiency” of forwardflight

Add vertical load cell (measures Fz(t))Dimensionless cost of transport:

cmt =

∫T|Fz(t)z(t)|dt

mg∫T

x(t)dt

Fortunately

min cmt = min

∫T|Fz(t)z(t)|dt∫

Tx(t)dt

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 41: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Prospects for optimization

CFD model[Alben and Shelley, 2005]

Takes approximately 36 hours to simulate 30 flaps

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 42: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Prospects for optimization

CFD model[Alben and Shelley, 2005]

Takes approximately 36 hours to simulate 30 flaps

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 43: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Experimental Optimization

Can we perform the optimization directly in the fluid?

Direct policy search

Needs to be robust to noisy evaluationsMinimize number of trials required

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 44: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Optimized policy gradient

The basic algorithm (weight perturbation):

Perturb the control parameters, p by some amount z fromN(0, σ)Perform the update:

∆p = −η(cmt(p + z)− cmt(p))z

Strong performance guarantees

E [∆p] ∝ −∂cmt

∂p.

Poor performance (requires many trials)

SNR optimized policy gradient [Roberts and Tedrake, 2009]

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 45: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Learning results

0 10 20 30 40 504

5

6

7

8

9

10

11

12

13

Trial

Rewa

rd

IC − Sine WaveIC − Smoothed Square Wave

learns in about 10,000 flaps (@ 15 minutes)

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 46: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Learning results (cont.)

0 0.2 0.4 0.6 0.8 1−30

−20

−10

0

10

20

30

t/T

z (m

m)

Executed Waveforms

Initial WaveformIntermediate WaveformFinal Waveform

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 47: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

A dynamic explanation

Forward speed is linear in flapping frequency

from experimentsstatement about average speed

Drag forces quadratic in speed (F ∝ ρSV 2)

Triangle wave obtains highest average speed w/ minimal drag

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 48: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Implications

Enabling tool for experimental fluid dynamicists

Suggests that motor learning algorithms could produceefficient control solutions in fluids

Suggests that we can use this to control robotic birds

Exciting prospects for online learning in changingenvironments

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 49: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Summary

Nonlinear, underactuated control w/ imperfect models viamachine learning control (birds don’t solve Navier-Stokes)

Allows our machines to exploit unsteady flow effects

Soon, robotic birds will:

Fly efficiently and autonomouslyOutperform fixed-wing vehicles in maneuverability

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 50: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

Acknowledgements

Students:

John RobertsRick CoryZack JackowskiSteve Proulx

Funding:

MIT, MIT CSAILNEC corporationLincoln Labs ACCMicrosoft

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers

Page 51: Learning Control at Intermediate Reynolds Numberspeople.csail.mit.edu/russt/iros2008_workshop_talks/Tedrake.pdfCurrent experiments are in \still air" Outdoor ights require robustness

References

Alben, S. and Shelley, M. (2005).Coherent locomotion as an attracting state for a free flappingbody.Proceedings of the National Academy of Sciences,102(32):11163–11166.

Roberts, J. W. and Tedrake, R. (2009).Signal-to-noise ratio analysis of policy gradient algorithms.In To appear in Advances of Neural Information ProcessingSystems (NIPS) 21, page 8.

Vandenberghe, N., Childress, S., and Zhang, J. (2006).On unidirectional flight of a free flapping wing.Physics of Fluids, 18.

Russ Tedrake, MIT CSAIL Learning Control at Intermediate Reynolds Numbers