November 16, Learning
TRANSCRIPT
Multi-Robot Systems
CSCI 7000-006, Monday, November 16, 2009
Nikolaus Correll
Course summary
• What are multi-robot systems and when to use them
• Reactive algorithms for multi-robot swarms
• Deliberative algorithms for multi-robot teams
• Gradient-based modeling and control
• Probabilistic modeling and control
• Optimization
• Learning (today)
Upcoming
• Fall break
• November 30-December 11: project presentations
  – Teach your peers about a specific aspect of multi-robot systems
  – Recall background and theory from the class
  – Present your project and results
• Final reports due December 18
Presentations

| Monday | Wednesday | Friday | Monday | Wednesday | Friday |
|---|---|---|---|---|---|
| Mikael | Vijeth | Maciej | Apratim | Patrick | Marek |
| Neeti | Jason | Gregory | Stephen | Monish | Anthony |
| Ben | Rhonda | Swamy | Peter | | |

Topics: Particle filters, Reactive Swarms, Gradient-based approaches, Swarm Intelligence, Large-scale distributed systems, Multi-Robot Teams

You are giving the lecture this day! Coordinate among yourselves to present common material! We want to recall what we have seen in the course and learn something!
Today
• Learning in multi-robot systems
• Genetic algorithms and Particle Swarm Optimization
• Advantages of GA and PSO in distributed systems
Genetic Algorithms

• Encode controllers parametrically (e.g., Braitenberg parameters) into strings (chromosomes)
• Evaluate with robot test runs
• Cross over and mutate chromosomes
• Repeat until an end criterion is met

http://en.wikipedia.org/wiki/Genetic_algorithm
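The encode/evaluate/cross-over/mutate loop above can be sketched as follows. This is a minimal illustration, not the algorithm from any cited paper: the quadratic fitness is a stand-in for a real robot test run, and the selection scheme and parameter names are ours.

```python
import random

random.seed(0)

def fitness(chromosome):
    # Hypothetical stand-in for a robot test run: reward parameters
    # close to an arbitrary target value of 0.5.
    return -sum((g - 0.5) ** 2 for g in chromosome)

def crossover(a, b):
    # One-point cross-over between two parent chromosomes.
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(chromosome, rate=0.1):
    # Perturb each gene with small Gaussian noise at the given rate.
    return [g + random.gauss(0, 0.1) if random.random() < rate else g
            for g in chromosome]

def evolve(pop_size=20, genes=4, generations=50):
    population = [[random.random() for _ in range(genes)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluate, then keep the better half (truncation selection).
        scored = sorted(population, key=fitness, reverse=True)
        parents = scored[:pop_size // 2]
        # Refill the population with mutated cross-over offspring.
        population = parents + [
            mutate(crossover(random.choice(parents), random.choice(parents)))
            for _ in range(pop_size - len(parents))
        ]
    return max(population, key=fitness)

best = evolve()
```

In a multi-robot setting, the `fitness` call is where each chromosome would be downloaded to a robot and scored on a test run.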
Particle Swarm Optimization

• Controller parameters span the search space
• Instances of controllers are particles in the search space
• Particles fly through the search space
  – Direction
  – Velocity
  – Inertia
• Attraction to positions with best results, both for the individual particle and for neighborhoods
• Optimization algorithm
  – Evaluate controllers
  – Update particles
  – Continue

(Figure: fitness landscape over Parameter 1 and Parameter 2, showing a particle's current position and speed, its own best solution, its neighbors' best solution, and the next position.)

Swarm Intelligence (The Morgan Kaufmann Series in Artificial Intelligence) by Russell C. Eberhart, Yuhui Shi, and James Kennedy, 2001.
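The update rule described above (inertia plus attraction to the particle's own best and its neighborhood's best) can be sketched as follows. This is a minimal illustration with a ring neighborhood; the fitness function is a stand-in for a controller evaluation, and the coefficient values are common defaults, not taken from the slides.

```python
import random

random.seed(1)

def fitness(pos):
    # Hypothetical stand-in for evaluating a controller on a robot.
    return -sum(x ** 2 for x in pos)

DIM, SWARM, ITERS = 2, 10, 60
W, C1, C2 = 0.7, 1.5, 1.5   # inertia, personal and neighborhood attraction

pos = [[random.uniform(-5, 5) for _ in range(DIM)] for _ in range(SWARM)]
vel = [[0.0] * DIM for _ in range(SWARM)]
pbest = [p[:] for p in pos]          # each particle's own best position

for _ in range(ITERS):
    for i in range(SWARM):
        # Ring neighborhood: best of left neighbor, self, right neighbor.
        hood = [pbest[(i - 1) % SWARM], pbest[i], pbest[(i + 1) % SWARM]]
        nbest = max(hood, key=fitness)
        for d in range(DIM):
            vel[i][d] = (W * vel[i][d]
                         + C1 * random.random() * (pbest[i][d] - pos[i][d])
                         + C2 * random.random() * (nbest[d] - pos[i][d]))
            pos[i][d] += vel[i][d]
        if fitness(pos[i]) > fitness(pbest[i]):
            pbest[i] = pos[i][:]

best = max(pbest, key=fitness)
```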
Single-Robot Learning Example: Modular Robots
• Gait generated by a Central Pattern Generator (CPG)
• Find parameters for CPG that maximize forward motion
Yvan Bourquin, Self-Organization of Locomotion in Modular Robots, M.Sc. Thesis, University of Sussex & EPFL
Gait optimization results
Yvan Bourquin, Self-Organization of Locomotion in Modular Robots, M.Sc. Thesis, University of Sussex & EPFL
Parallel Learning with Multi-Agent Optimization

• Standard technique with multi-agent optimization: evaluate serially at each iteration
• Very slow evolution
• In multi-robot systems, evaluations can be performed in parallel

J. Pugh and A. Martinoli. Multi-Robot Learning with Particle Swarm Optimization. In International Conference on Autonomous Agents and Multiagent Systems (AAMAS), pages 441-448, 2006.
Example 2: Obstacle Avoidance – Group Learning, Individual Fitness

• Artificial neural network control
• Fitness function* rewards speed, straight movement, and avoiding obstacles:

  Φ = V · (1 − √Δv) · (1 − i)

• V = average wheel speed, Δv = difference between wheel speeds, i = value of the most active proximity sensor (all normalized to [0, 1])

*Floreano, D. and Mondada, F. (1996) Evolution of Homing Navigation in a Real Mobile Robot. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 26(3), 396-407.
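The Floreano and Mondada fitness can be computed directly from the three normalized terms; a short sketch (the two example sensor readings are made up):

```python
import math

def fitness(V, dv, i):
    # V: average wheel speed, dv: difference between wheel speeds,
    # i: most active proximity sensor reading; all assumed in [0, 1].
    return V * (1.0 - math.sqrt(dv)) * (1.0 - i)

# Fast, straight motion far from obstacles scores the maximum:
fast_straight_clear = fitness(V=1.0, dv=0.0, i=0.0)
# Spinning in place next to a wall scores close to zero:
spinning_near_wall = fitness(V=0.5, dv=0.9, i=0.8)
```

The three factors are multiplied, so a controller must do well on all three criteria at once: any single bad term drives the whole fitness toward zero.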
Parallel Learning Results

• 20 individuals/particles for GA/PSO, divided among 20 robots, evolved for 100 iterations
• Results averaged over 100 trials
• "Best-effort comparison"

(Figures: performance of best evolved controllers; average performance throughout evolution.)

J. Pugh and A. Martinoli. Multi-Robot Learning with Particle Swarm Optimization. In International Conference on Autonomous Agents and Multiagent Systems (AAMAS), pages 441-448, 2006.
Communication-Based Neighborhoods
Ring Topology - Standard
2-Closest – Model 1
Radius r (40 cm) – Model 2
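The three neighborhood models can be sketched as follows for particles that live on robots with known positions. The example coordinates and function names are ours; the slide only specifies the three rules (ring topology, 2-closest, radius 40 cm).

```python
import math

# Example robot positions in meters (made up for illustration).
positions = [(0.0, 0.0), (0.3, 0.0), (1.0, 0.0), (1.2, 0.1), (3.0, 3.0)]

def ring(i, n):
    # Standard ring topology: index-based neighbors, independent of position.
    return [(i - 1) % n, (i + 1) % n]

def k_closest(i, pts, k=2):
    # Model 1: the k spatially closest other robots.
    others = sorted((j for j in range(len(pts)) if j != i),
                    key=lambda j: math.dist(pts[i], pts[j]))
    return others[:k]

def in_radius(i, pts, r=0.4):
    # Model 2: every robot within communication radius r (e.g. 0.4 m).
    return [j for j in range(len(pts))
            if j != i and math.dist(pts[i], pts[j]) <= r]
```

The ring gives every particle a fixed-size neighborhood regardless of where robots are, while the two communication-based models make neighborhood size depend on the actual spatial distribution, including the possibility of isolated robots with no neighbors at all.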
Performance of best controllers after evolution (figures: varying communication range and topology)

• Both GA and PSO are sensitive to algorithmic parameters
• Difficult to compare and to design without analytical foundations
• What is the relation between embodiment, learning-algorithm parameters, and fitness?

J. Pugh and A. Martinoli. Multi-Robot Learning with Particle Swarm Optimization. In International Conference on Autonomous Agents and Multiagent Systems (AAMAS), pages 441-448, 2006.
Varying Communication Range: Results

Average swarm performance during evolution

J. Pugh and A. Martinoli. Multi-Robot Learning with Particle Swarm Optimization. In International Conference on Autonomous Agents and Multiagent Systems (AAMAS), pages 441-448, 2006.
Example 3: Aggregation – Group learning, group fitness
• Same neural network setup as in obstacle avoidance
• Additional capability – sense relative positions of other nearby robots
• Additional inputs to neural network – center of mass (x,y) of detected robots
• Fitness function:
where robRP(i) is the number of robots in range of robot i

J. Pugh and A. Martinoli. Multi-Robot Learning with Particle Swarm Optimization. In International Conference on Autonomous Agents and Multiagent Systems (AAMAS), pages 441-448, 2006.
Group Learning and Credit Assignment: Results
Performance of best controllers after evolution
Search Scenario

• Use a team of e-pucks (mobile robot with 7 cm diameter)
• Robots must find (within 10 cm) targets in a 4 m x 4 m arena
• Robots can sense "intensity", e.g. loudness of an audio source
• Once found, targets are instantly moved to a new location
• Search continues indefinitely, though controller parameters may be changed
Example 4: Bacteria-Inspired Search Algorithm

• E. coli chemotaxis:
  – Move
  – Check gradient
  – If positive, keep direction
  – If negative, tumble
• Replicate this approach for searching robots
• Add collaboration: instead of tumbling, go towards the nearby robot with the strongest detection

J. Pugh and A. Martinoli. Distributed Adaptation in Multi-Robot Search using Particle Swarm Optimization. In Proceedings of the 10th International Conference on the Simulation of Adaptive Behavior, Lecture Notes in Computer Science, pages 393-402, 2008.
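The run-and-tumble loop can be sketched for a single robot as follows. `STEP_SIZE` is the parameter named on the later slide; the intensity field and the target position are made-up stand-ins for a real audio source.

```python
import math
import random

random.seed(2)

STEP_SIZE = 0.1          # adaptable parameter from the slides
TARGET = (2.0, 2.0)      # hypothetical source location

def intensity(pos):
    # Hypothetical detected intensity: decays with distance to the target.
    return 1.0 / (1.0 + math.dist(pos, TARGET))

pos = (0.0, 0.0)
heading = random.uniform(0, 2 * math.pi)
last = best = intensity(pos)

for _ in range(500):
    # Move one step along the current heading.
    pos = (pos[0] + STEP_SIZE * math.cos(heading),
           pos[1] + STEP_SIZE * math.sin(heading))
    now = intensity(pos)
    if now < last:
        # Gradient negative: tumble (pick a new random heading).
        # The collaborative variant would instead turn towards the
        # nearby robot with the strongest detection.
        heading = random.uniform(0, 2 * math.pi)
    last = now
    best = max(best, now)
```

Runs along a positive gradient are kept, while negative-gradient steps trigger a random reorientation, producing a biased random walk toward higher intensity.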
Bacteria-Inspired Search Algorithm

• Improved search performance through parameter optimization
• Adaptable parameters:
  – STEP_SIZE
  – RL_RANGE
  – CW_LIMIT
  – CCW_LIMIT
Search Experiments

• Group performance: number of targets found (within 10 cm) in the evaluation span
• Individual fitness: average detected power intensity, used for controller evaluation
• Detected intensity for robot i of all targets j:
• Distance detections are inaccurate due to background noise
• Optimal parameter set is affected by the number and power of targets
Search Adaptation Results

• 50 robots, 3 targets with power 10, 120-second evaluations
• Compare small, medium, and large PSO neighborhoods
• Results averaged over 250 trials

(Figures: average individual fitness; average group performance.)
J. Pugh and A. Martinoli. Distributed Adaptation in Multi-Robot Search using Particle Swarm Optimization. In Proceedings of the 10th International Conference on the Simulation of Adaptive Behavior, Lecture Notes in Computer Science, pages 393-402, 2008.
Simulation vs. Real Robots

• Number of performance evaluations is usually very high → infeasible on real robots
• Simulation is an abstraction
  – Might not model noise accurately
  – Might not model inter-robot variations accurately
• Solution: multi-level modeling
  – 90% of the evaluations using a model/simulation
  – 10% using real hardware
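The 90/10 split can be sketched as follows. Both evaluators are hypothetical stand-ins: `objective` is a made-up score, and the Gaussian term imitates hardware noise.

```python
import random

random.seed(3)

counts = {"sim": 0, "real": 0}

def objective(params):
    # Made-up controller score shared by both evaluation levels.
    return -sum((p - 0.5) ** 2 for p in params)

def evaluate(params, real_fraction=0.1):
    # Route roughly 10% of evaluations to hardware, 90% to simulation.
    if random.random() < real_fraction:
        counts["real"] += 1
        return objective(params) + random.gauss(0, 0.01)  # noisy "hardware"
    counts["sim"] += 1
    return objective(params)                              # noise-free model

scores = [evaluate([random.random(), random.random()]) for _ in range(1000)]
```

The cheap simulated evaluations carry most of the optimization, while the occasional hardware runs keep the learner anchored to the real robot's noise and variations.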
Summary
• Population-based optimization algorithms are well suited for learning in multi-robot systems (parallelization)
• Main difficulty: selecting an appropriate fitness function
• GA and PSO are heuristics: they offer no performance guarantees and are highly susceptible to parameter choices and algorithmic variations
• This is in contrast to analytical optimization