
Distributed Intelligent Systems – W6

A Multi-Level Modeling Methodology for Swarm Robotics

Outline

• Multi-Level Modeling Methodology
  – Rationale
  – Assumptions, implementation
• Model Calibration
  – Methods
  – Open issues
• Examples
  – Linear (obstacle avoidance)
  – Nonlinear (stick-pulling)
• Combined modeling and machine-learning methods
• Diversity and specialization

Modeling Rationale and Methodology Overview

A First Achievement using Real Robots (Beckers, Holland, and Deneubourg, 1994)

Swarm Robotics: A New Engineering Discipline?

• Why does it work?
• What are the principles?
• Is it a new paradigm or just an isolated experiment?
• If yes, can we precisely define it?
• Can we generalize these results to other tasks and experimental scenarios?
• How can we design an efficient and robust SR system? Methods?
• How can we optimize an SR system?

All these questions suggest that formal methods, and therefore modeling, could help!

Rationale for Probabilistic Modelling – Distributed Intelligent Systems

• Even if the control of an individual node is deterministic, its interaction with the environment and teammates in the real world is noisy and only statistically predictable → probabilistic models

• Some distributed intelligent systems might be self-organized: randomness is one of the key ingredients of self-organization (see week 1) and lies at the core of the exploration-exploitation balance of these systems. Coordinated collective behavior based on self-organization is statistically predictable using appropriate probabilistic models!

Rationale for Probabilistic Modelling – Swarm Robotics

• Swarm-robotic systems represent a specific class of distributed intelligent systems: artificial, self-locomoted, scalable, etc.

• Swarm-robotic systems often consist of miniature/small-scale nodes. The smaller the units are, the more they might be affected by noise (crude sensors/actuators/communication devices; limited computation): miniaturization increases the need for probabilistic modeling!

• It is not completely clear how much we can generalize the methodology proposed here to non-robotic platforms and to other classes of experiments which do not fulfill its assumptions (ongoing work)

Rationale for Multi-Level Modelling

• Different levels, different system/design parameters: certain low-level parameters (e.g., body morphology, sensor placement) are captured explicitly only with very realistic models; others (e.g., communication range, number of individuals) are also captured at higher levels; diversity can be studied with microscopic models but is difficult to study with macroscopic ones; the approach is system-oriented as opposed to single-layer oriented (e.g. control, communication)

• Different levels, different generalization power: the higher the abstraction, the better the generalizing power (e.g., other hardware platforms, other experimental constraints, other classes of experiments)

• Different levels, different computational cost: quantitatively accurate models often have to be solved numerically; the higher the abstraction, the lower the computational cost

Modeling Levels

[Figure: modeling levels linked by successive approximations; axes: abstraction, experimental time]

• Target swarm system: multiple units
• Microscopic models: unit represented individually
  – S&A-based simulator
  – Kinematic point simulator
  – Multi-agent system
  – …
• Macroscopic models: single representation for the swarm
  – Master equations
  – Chemical reaction networks
  – Rate equations (ODE)
  – …

Multi-Level Modeling Methodology

[Figure: the four levels below, illustrated with Ss/Sa state machines and compared through common metrics]

• Target system (physical reality): info on controller, S&A, communication, morphology and environmental features
• Microscopic – Module-based: intra-robot (e.g., S&A, transceiver) and environment (e.g., physics) details reproduced faithfully
• Microscopic – Agent-based: multi-agent models, only relevant robot features captured, 1 agent = 1 robot
• Macroscopic: rate equations, mean field approach, whole swarm

\[ \frac{dN_n(t)}{dt} = \sum_{n'} W(n \mid n', t)\, N_{n'}(t) - \sum_{n'} W(n' \mid n, t)\, N_n(t) \]

Originality and Differences with other Model-Based Approaches

• Specifically targeted to (miniature) swarm-robotic systems: exploit robust control design techniques for resource-constrained individual nodes (e.g. BB, ANN); incorporate microscopic constraints (e.g., technology, environment) from the start; bottom-up modeling and multi-level system-centered design approach

• Different from traditional model-based techniques in robotics: top-down approaches, start from an idealized node model and relax assumptions; compensate with potentially expensive/sophisticated technology and control algorithms to meet the requirements; (multi-level) control-centered design approach

• Different from traditional modeling in biology (and other natural sciences): macroscopic models that are as simple as possible and target a given scientific question; free parameters + fitting based on macroscopic measurements, since microscopic information is often not available/accurate

Modeling Assumptions

1. Probabilistic FSM description for environment and multi-agent system; arbitrary state granularity
2. Semi-Markovian properties: the system's future state is a function of the current state (and possibly of the amount of time spent in it)
3. Nonspatial metrics for swarm performance
4. Mean field approach (well-mixed system): the mean spatial distribution of agents over multiple runs is assumed to be homogeneous, as if they were randomly hopping on the arena
5. Linear superposition of object/robot detection areas (sparse object/robot distribution; no overcrowding, no overlapping detection areas)

Assumptions 1 and 2

• We work with states

[Figure: state Sx with duration Tx, entered with probability pin and left with probability pout]

  – Sx: state x
  – Tx: duration of state x
  – pin, pout: probabilities to enter and leave state x

• States can characterize both robots and the environment

Assumptions 3, 4, 5

• Trajectories and object locations do not count -> 1D Monte Carlo simulation
• N objects of type i -> N x (prob. to encounter i)

[Figure: objects O1 and O2 in the 2D physical space mapped to segments (O1, O2, free space) of a 1D probability space]

Experimental Validation of Assumptions 3, 4, 5 – (1)

[Figure: nonembodied obstacles (= detection surfaces), varied in position and shape]

Numerical example (mean ± std dev, 3 locations, 100 h simulated time):

Size     Square         Rect.         Round          All shapes     Geometry
robot    0.31 ± 0.04    0.3 ± 0.03    0.32 ± 0.02    0.31 ± 0.03    0.31

Experimental Validation of Assumptions 3, 4, 5

[Figure: symmetry of stick distribution; labels: # sticks, Default]

From Microscopic to Macroscopic Models: Theoretical Background

Microscopic Level

p(n,t) = probability of an agent to be in the state n at time t

If Markov properties are fulfilled (neglect distributions):

\[ \Delta p(n,t) = p(n, t+\Delta t) - p(n,t) = \sum_{n'} p(n, t+\Delta t \mid n', t)\, p(n', t) - \sum_{n'} p(n', t+\Delta t \mid n, t)\, p(n, t) \]

• First sum: inflow; second sum: outflow
• p(n', t): probability the agent was in a given state n'
• p(n, t+∆t | n', t): transition probability
• Both sums run over all possible states n' the agent can be in

Transition rate:

\[ W(n \mid n'; t) = \lim_{\Delta t \to 0} \frac{p(n, t+\Delta t \mid n', t)}{\Delta t} \]

Macroscopic Level

Left and right side of the microscopic equation: averaging over the total number of agents, dividing by ∆t, limit ∆t → 0; neglect distributions of the stochastic variables (mean field approach).

Rate equation (time-continuous):

\[ \frac{dN_n(t)}{dt} = \sum_{n'} W(n \mid n', t)\, N_{n'}(t) - \sum_{n'} W(n' \mid n, t)\, N_n(t) \]

• First sum: inflow; second sum: outflow
• n, n' = states of the agents (all possible states at each instant)
• Nn = average fraction (or mean number) of agents in state n at time t
• W = transition rates (linear, nonlinear)

Time-discrete version (k = iteration index, T = time step, often left out):

\[ N_n((k+1)T) = N_n(kT) + \sum_{n'} T\, W(n \mid n', kT)\, N_{n'}(kT) - \sum_{n'} T\, W(n' \mid n, kT)\, N_n(kT) \]
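As a minimal sketch of the time-discrete rate equation, the update N_n((k+1)T) = N_n(kT) + inflow − outflow can be written as one generic step function. The function name, the matrix layout `W[n][m]` (rate from state m to state n), the constant rates and the two-state numbers are assumptions made for illustration only; in nonlinear models W would itself depend on N.

```python
def rate_equation_step(N, W, T=1.0):
    """Advance the mean state occupancies N by one time step T.

    N[n]    : mean number (or fraction) of agents in state n
    W[n][m] : transition rate from state m to state n (diagonal ignored)
    """
    size = len(N)
    new_N = list(N)
    for n in range(size):
        for m in range(size):
            if m == n:
                continue
            new_N[n] += T * W[n][m] * N[m]  # inflow: agents arriving in n from m
            new_N[n] -= T * W[m][n] * N[n]  # outflow: agents leaving n towards m
    return new_N


# Hypothetical two-state example: state 0 = search, state 1 = avoidance.
W = [[0.0, 0.25],   # rate avoidance -> search
     [0.1, 0.0]]    # rate search -> avoidance
N = [10.0, 0.0]     # all robots start in search
N_next = rate_equation_step(N, W)
print(N_next)
```

Because every outflow from one state is an inflow to another, the total number of agents is conserved by construction.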

Time Discretization: The Engineering Recipe

Time-discrete vs. time-continuous models:

1. Assess the time resolution needed for your swarm performance metrics
2. Consider the essence of unit-to-unit interactions: probabilistic/deterministic, role of asynchronicity, …
3. Choose whenever possible the most computationally efficient model: time-discrete models are less computationally expensive than emulation of continuity (e.g. Runge-Kutta, etc.); in our systems/metrics there is no evidence of decreased prediction accuracy
4. Advantage of time-discrete models: a single common sampling rate can be defined among different abstraction levels

Often not even a tradeoff: just use the appropriate instrument for the appropriate problem!

Model Parameters

• Gray-box approach: a priori information about the system is exploited
  – multi-agent system
  – # of agents
  – technological and environmental constraints
• Models should not only explain but also have predictive power: the mapping between model parameters and design choices should be as explicit as possible (the higher the abstraction level, the more difficult it is)
• Two types of parameters for micro-AB and macro:
  – State durations (e.g., interaction times with objects)
  – State-to-state transition probabilities (e.g., encountering probabilities)

From the Target System to Macroscopic Models: An Incremental Bottom-Up Recipe in Seven Steps

Recipe: Target System & Metric(s)

1. Perform your basic design choices for the single robot: HW (e.g., robot morphology, S&A technology, etc.); SW (e.g., control architecture)
2. Define your system performance metric(s)

Recipe: Module-Based Model & FSM

3. Implement your design choices faithfully in a module-based microscopic model (in principle even running the same control code; component library provided)
4. Capture the control structure with a finite number of states of interest (semi-Markovian properties must be fulfilled) and generate a corresponding FSM

[Figure: FSM with states Search, Avoidance and Grip; transitions labeled "Obstacle detected" and "Obstacle avoided"]

Recipe: Microscopic-AB Model

5. Approximate local interactions and intra-robot details and develop an agent-based model

[Figure: robotic system (N PFSM; N = total # agents), organized in castes 1 … n with robots R11 … R1l, …, Rn1 … Rnm, coupled (e.g., manipulation, sensing) to the environment (Q PFSM; Q = total # objects) with objects O11 … O1p, …, Oq1 … Oqr]

Recipe: Macroscopic Model

6. Approximate micro-to-macro mapping (mean field): exploit the AB blueprint and build the macroscopic model

[Figure: robotic system (1 PFSM, castes 1 … n) coupled to the environment (1 PFSM, object types 1 … q)]

• Single representation for the whole swarm
• Average quantities
• Central tendency prediction (1 run)
• Continuous quantities: +1 ODE per state for all robotic castes and object types (metric/task dependent!)
• −1 ODE if substituted with conservation equations (e.g., total # of robots, total # of objects of type q, …)

Recipe: Model Calibration

• Two types of parameters for micro-AB and macro:
  – State durations (e.g., interaction times with objects)
  – State-to-state transition probabilities (e.g., encountering probabilities)
• A good calibration procedure should:
  – Maximize the quantitative matching between the target system and the model according to one or several performance metrics
  – Minimize a posteriori fitting
  – Have a low cost in terms of experimental time and hardware availability
  – Maximize model predictive power by allowing the integration of detailed parameters anchored to physical/technological reality

7. Calibrate using the following procedures:
  – Method 1: run “orthogonal” experiments on local, a priori known interactions (robot-to-robot, robot-to-environment) → use these values for all types of interaction that occur
  – Method 2: use all a priori known information (e.g., geometry, technology) without running experiments → get initial guesses → fine-tune parameters automatically on the target experiment with as small a subset of the target system as possible

Model Calibration – Method 1 [Correll & Martinoli ISER 2004]

Delays & Discretization Interval

1. Measure all interaction times of interest in your system, i.e. those which might influence the swarm performance metrics. Note: often “delay states” can summarize all you need without getting into the details of what is going on within the state.
2. Consider only average values (parameter distributions might also be considered in the future; the modeling methodology does not prevent doing so)
3. For time-discrete systems: choose the time step T as the greatest common factor of all the measured delays (e.g., 3 s obstacle avoidance, 4 s object manipulation, T = 1 s) -> no rounding error. Note: more accuracy in parameter measurement means in this case more computational cost when simulating
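The choice of the time step as the greatest common factor of the measured delays can be sketched as follows. Expressing the delays as integer milliseconds and the helper name `common_time_step` are assumptions for this illustration; the 3 s and 4 s delays are the ones quoted in the slide.

```python
from functools import reduce
from math import gcd


def common_time_step(delays_ms):
    """Return the greatest common factor of integer delays (in milliseconds)."""
    return reduce(gcd, delays_ms)


delays_ms = [3000, 4000]                 # 3 s obstacle avoidance, 4 s object manipulation
T_ms = common_time_step(delays_ms)       # 1000 ms, i.e. T = 1 s
steps = [d // T_ms for d in delays_ms]   # each delay as a whole number of time steps
print(T_ms, steps)
```

Measuring the delays more finely (e.g., 3.05 s instead of 3 s) shrinks the common factor and therefore increases the number of iterations to simulate, which is the cost mentioned in the note above.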

Geometric Probabilities gi (Normalized Detection Areas)

gs, gw, … are functions of sensor range, behavior, robot's and object's size, …: interaction characterization!

Example: stick with detection area As

\[ g_s = A_s / A_{arena} \]

Encountering Probabilities

1. Measure the geometric probabilities of detection gi
2. Calculate the encountering rate ri [s^-1] for the object i from the geometric probabilities gi:

\[ r_i = \frac{v\, W_s}{A_s}\, g_i \]

   As = detection area of the smallest object
   v = mean robot speed
   Ws = robot's detection width for the smallest object (center-to-center)

3. For time-discrete models, calculate the encountering probabilities pi (per time step) from the encountering rates:

\[ p_i = r_i T \]

NOTE: see ISER 04; slightly different from IJRR04 (decoupled time and space)!
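A minimal numeric sketch of this calibration chain, assuming the relations g_i = A_i/A_arena, r_i = (v W_s / A_s) g_i and p_i = r_i T. All numeric values and function names below are hypothetical and chosen only to make the arithmetic concrete.

```python
def encountering_rate(g_i, v, W_s, A_s):
    """Encountering rate r_i [1/s] from the geometric probability g_i."""
    return v * W_s / A_s * g_i


def encountering_probability(g_i, v, W_s, A_s, T):
    """Encountering probability per time step: p_i = r_i * T."""
    return encountering_rate(g_i, v, W_s, A_s) * T


# Hypothetical values (not taken from the lecture):
A_arena = 1.0   # arena area [m^2]
A_s = 0.01      # detection area of the smallest object [m^2]
v = 0.08        # mean robot speed [m/s]
W_s = 0.1       # detection width for the smallest object [m]
T = 1.0         # time step [s]

g_s = A_s / A_arena                                   # geometric probability
p_s = encountering_probability(g_s, v, W_s, A_s, T)   # probability per time step
print(p_s)
```

For the smallest object itself the expression reduces to r_s = v W_s / A_arena, i.e. the area swept by the robot per second divided by the arena area. The resulting p_i should remain well below 1 for the per-step probability interpretation to make sense.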

Model Calibration - Practice

• Assumptions (well-mixed, linear overlap of areas) might be only partially fulfilled
• We do not capture distributions in the model parameters, only deterministic average values; distributions might more faithfully capture:
  – Controller type (e.g. distal vs. proximal)
  – Active vs. passive objects (e.g., robot vs. puck)
  – Size/movability of objects (e.g., wall, blade vs. seed)
  – Embodiment vs. non-embodiment (e.g., area vs. real obstacle)
  – Way of measuring your metrics (e.g., egocentric, allocentric)
  – Impact on the considered swarm performance metric through error propagation (clear decoupling between parameter and structure inaccuracies of the model)

Allocentric Parameter Calibration

• From a bird's-eye view perspective, using an external system (e.g. supervisor in Webots, tracking system in reality)
• Collision with an obstacle defined by a fixed distance between the obstacle and the robot
• High variance on the state durations, little variance (only due to different obstacle types such as walls, robots, corners, other objects) in the encountering probability

Egocentric Parameter Calibration

• From the internal robot perspective
• Collision with an obstacle based on behavioral transition and/or sensory perceptual threshold (might be slightly different for different types of controllers)
• Medium variance on the state durations and encountering probabilities (both due to the obstacle avoidance maneuvering, sensor and actuator noise, etc.)

Model Calibration - Practice

Matching between model calibration and model validation: the example of allocentric vs. egocentric calibration

Issue: model structure based on egocentric information (controller) and metric usually based on allocentric performance (environment-based)

[Figure: Na*/N0, micro-MB to micro-AB comparison (different controllers; calibration scenarios: static with allocentric vs. egocentric calibration and allocentric metric)]

Model Calibration - Practice

Bin distribution of interaction time Ta (mean Ta = 25 × 50 ms = 1.25 s)

[Figure: # of collisions vs. collision time; curves: micro-MB, proximal controller, allocentric; micro-MB, distal controller, allocentric; micro-AB/macro, deterministic delay; micro-AB/macro, probabilistic delay]

Model Calibration - Practice

Geometric probability g: example of transition in space from search to obstacle avoidance (1 moving robot, 1 dummy robot, Webots measurements, egocentric)

[Figure panels: distal controller (rule-based); proximal controller (Braitenberg, linear)]

Model Calibration – Method 2 [Correll & Martinoli, DARS 2006]

Distributed Boundary Coverage

• Case study: turbine inspection
• 25 blades
• Up to 40 Alice II robots
• Metric: time to complete the inspection
• Suite of algorithms (reactive to deliberative) and models
• In this course: focus on reactive algorithms and introduce the modeling methodology

Distributed Reactive Algorithms

[Figure panels: 10 Alices, 25 blades, reactive blade-to-blade movement; 20 Alices, 25 blades, obstacle avoidance only]

Parameter Calibration – Starting from Method 1

• Encountering probability
  – Detection area
  – Robot speed
  – Sensor range
  – Ad hoc experiments
• Interaction times
  – Ad hoc experiments
• Improvement by data-fitting
  – Can avoid ad hoc experiments for enc. probabilities
  – Can relax one assumption

Relaxing Modeling Assumptions

5. Linear superposition of object/robot detection areas (sparse object/robot distribution; no overcrowding, no overlapping detection areas)

• Solution: constrained system identification technique [Correll & Martinoli, DARS-2006]
• Results (6 unknown par.)

[Figure: initial guess (geometry, specs), experiment, optimal parameterized model]

Current Limitations

• Durations calibrated with previous method (constrained)
• Method has been tested with the whole system in place (including the targeted size in terms of number of robots).

Linear Example: Wandering and Obstacle Avoidance

A Simple Linear Model

Example: search (moving forwards) and obstacle avoidance

[Figure © Nikolaus Correll 2006]

A Simple Example

[Figure: from the deterministic robot's flowchart (Start, Search, "Obstacle?" Y/N, Avoidance), via nonspatiality and microscopic characterization, to the probabilistic agent's flowchart (Start, Search, "Obstacle?" with probabilities pa and 1-pa, Avoidance with duration τa, return with ps) and to the PFSM with states Ss and Sa, transition probabilities pa and ps, and delay τa]

Linear Model – Probabilistic Delay

[Figure: PFSM with states Search and Avoidance (Ta), transition probabilities pa and ps]

Ta = mean obstacle avoidance duration
pa = probability of moving to obstacle avoidance
ps = probability of resuming search
Ns = average # robots in search
Na = average # robots in obstacle avoidance
N0 = # robots used in the experiment
k = 0, 1, … (iteration index)

\[ N_s(k+1) = N_s(k) - p_a N_s(k) + p_s N_a(k) \]
\[ N_a(k+1) = N_0 - N_s(k+1) \]
\[ p_s = 1/T_a \]

Initial conditions: Ns(0) = N0; Na(0) = 0
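The probabilistic-delay model, i.e. Ns(k+1) = Ns(k) − pa Ns(k) + ps Na(k) with ps = 1/Ta and Na(k+1) = N0 − Ns(k+1), can be iterated in a few lines. This is a sketch only: the function name and the parameter values (N0 = 10, pa = 0.05, Ta = 4 time steps) are hypothetical.

```python
def simulate_probabilistic_delay(N0, pa, Ta, steps):
    """Iterate Ns(k+1) = Ns(k) - pa*Ns(k) + ps*Na(k) with ps = 1/Ta."""
    ps = 1.0 / Ta
    Ns, Na = [float(N0)], [0.0]          # initial conditions: Ns(0) = N0, Na(0) = 0
    for k in range(steps):
        ns_next = Ns[k] - pa * Ns[k] + ps * Na[k]
        Ns.append(ns_next)
        Na.append(N0 - ns_next)          # conservation of the number of robots
    return Ns, Na


# Hypothetical parameters, with Ta expressed in time steps.
Ns, Na = simulate_probabilistic_delay(N0=10, pa=0.05, Ta=4, steps=200)
steady_state = 10 / (1 + 0.05 * 4)       # N0 / (1 + pa*Ta)
print(Ns[-1], steady_state)
```

After the transient, the simulated Ns settles at N0/(1 + pa Ta), which is the value obtained by setting Ns(k+1) = Ns(k) in the recurrence.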

Linear Model – Deterministic Delay

[Figure: PFSM with states Search and Avoidance (Ta), transition probability pa and deterministic return after Ta]

Ta = mean obstacle avoidance duration
pa = probability of moving to obstacle avoidance
Ns = average # robots in search
Na = average # robots in obstacle avoidance
N0 = # robots used in the experiment
k = 0, 1, … (iteration index)

\[ N_s(k+1) = N_s(k) - p_a N_s(k) + p_a N_s(k - T_a) \]
\[ N_a(k+1) = N_0 - N_s(k+1) \]

Initial conditions and causality: Ns(0) = N0; Na(0) = 0; Ns(k) = Na(k) = 0 for all k < 0!
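The deterministic-delay variant, Ns(k+1) = Ns(k) − pa Ns(k) + pa Ns(k − Ta) with Ns(k) = 0 for k < 0, differs only in that robots return to search exactly Ta steps after leaving it. The sketch below assumes Ta is an integer number of time steps; the function name and parameter values are hypothetical.

```python
def simulate_deterministic_delay(N0, pa, Ta, steps):
    """Iterate Ns(k+1) = Ns(k) - pa*Ns(k) + pa*Ns(k - Ta), Ns(k) = 0 for k < 0."""
    Ns, Na = [float(N0)], [0.0]                      # Ns(0) = N0, Na(0) = 0
    for k in range(steps):
        delayed = Ns[k - Ta] if k - Ta >= 0 else 0.0  # causality: nothing before k = 0
        ns_next = Ns[k] - pa * Ns[k] + pa * delayed
        Ns.append(ns_next)
        Na.append(N0 - ns_next)                      # conservation of the number of robots
    return Ns, Na


# Hypothetical parameters, with Ta expressed in time steps.
Ns_det, Na_det = simulate_deterministic_delay(N0=10, pa=0.05, Ta=4, steps=200)
print(Ns_det[-1], 10 / (1 + 0.05 * 4))
```

During the first Ta steps no robot has yet finished its avoidance maneuver, so Ns only decreases; afterwards it relaxes towards the same steady state N0/(1 + pa Ta) as the probabilistic-delay model.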

Linear Model – Sample Results

[Figure: Na*/N0. Panels: micro-AB to micro-MB comparison (different controllers, static scenarios, allocentric measures); micro-AB to macro comparison (same robot density, but the wall surface becomes smaller with bigger arenas)]

Steady State Analysis

• Nn(k+1) = Nn(k) for all states n of the system → Nn*
• Note 1: equivalent to the differential-equation condition dNn/dt = 0
• Note 2: for time-delayed equations it is easier to perform the steady-state analysis in the Z-space, but the t-space is also ok (see IJRR-04)
• For our linear example (time-delay option):

\[ N_s^* = \frac{N_0}{1 + p_a T_a} \qquad N_a^* = \frac{N_0\, p_a T_a}{1 + p_a T_a} \]

Ex.: normalized mean number of robots in search mode at steady state as a function of time for obstacle avoidance

[Figure; axis label: group size]
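As a worked sketch of where the steady-state values N_s* = N_0/(1 + p_a T_a) and N_a* = N_0 p_a T_a/(1 + p_a T_a) come from, the steps below use the t-space route. The intermediate sum for N_a(k) is not on the slide; it follows from the delayed recurrence together with the initial conditions, and is stated here as a derivation aid.

```latex
% Probabilistic delay: set N_s(k+1) = N_s(k) = N_s^* and use p_s = 1/T_a, N_a^* = N_0 - N_s^*
\[ 0 = -p_a N_s^* + \frac{1}{T_a}\,(N_0 - N_s^*)
   \;\Rightarrow\; N_s^* = \frac{N_0}{1 + p_a T_a} \]

% Deterministic delay: robots in avoidance at step k are those that entered
% during the previous T_a steps
\[ N_a(k) = \sum_{j=k-T_a}^{k-1} p_a N_s(j)
   \;\Rightarrow\; N_a^* = p_a T_a N_s^* \]

% Conservation N_s^* + N_a^* = N_0 then gives the same result for both options
\[ N_s^* = \frac{N_0}{1 + p_a T_a}, \qquad
   N_a^* = \frac{N_0\, p_a T_a}{1 + p_a T_a} \]
```

Both delay options therefore share the same steady state; they differ only in the transient.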

Nonlinear Example – Stick-Pulling

A Case Study: Stick-Pulling

Physical Set-Up

• 2-6 robots
• 4 sticks
• 40 cm radius arena

[Figure labels: proximity sensors, arm elevation sensor, IR reflective band]

Collaboration via indirect communication

Systematic Experiments

[Figure panels: real robots; module-based model]

• [Martinoli and Mondada, ISER, 1995]
• [Ijspeert et al., AR, 2001]

Experimental and Realistic Simulation Results

• Real robots (3 runs) and realistic simulations (10 runs)
• System bifurcation as a function of #robots/#sticks

[Figure: curves for Nrobots > Nsticks and Nrobots ≤ Nsticks]

Geometric Probabilities

\[ p_s = A_s / A_a, \qquad p_r = A_r / A_a, \qquad p_R = p_r\,(N_0 - 1) \]
\[ p_w = A_w / A_a \]
\[ p_{g1} = p_s \]
\[ p_{g2} = R_g\, p_s \]

Aa = surface of the whole arena

From Reality to Abstraction

[Figure: from the deterministic robot's flowchart, via nonspatiality and microscopic characterization, to the probabilistic agent's flowchart and the PFSM]

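Full Macroscopic Model

• 6 states: 5 DE + 1 cons. EQ
• Ti, Ta, Td, Tc ≠ 0; Txyz = Tx + Ty + Tz
• TSL = Shift Left duration
• [Martinoli et al., IJRR, 2004]

For instance, for the average number of robots in searching mode:

\[
\begin{aligned}
N_s(k+1) = {} & N_s(k) - [\Delta_{g1}(k) + \Delta_{g2}(k) + p_w + p_R]\, N_s(k) \\
& + \Delta_{g1}(k - T_{cga})\, \Gamma(k; T_a)\, N_s(k - T_{cga}) \\
& + \Delta_{g2}(k - T_{ca})\, N_s(k - T_{ca}) + \Delta_{g2}(k - T_{cda})\, N_s(k - T_{cda}) \\
& + p_w N_s(k - T_a) + p_R N_s(k - T_{ia})
\end{aligned}
\]

with time-varying coefficients (nonlinear coupling):

\[ \Delta_{g1}(k) = p_{g1}\,[M_0 - N_g(k) - N_d(k)] \]
\[ \Delta_{g2}(k) = p_{g2}\, N_g(k) \]
\[ \Gamma(k; T_{SL}) = \prod_{j = k - T_g - T_{SL}}^{k - T_{SL}} [1 - p_{g2} N_s(j)] \]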

Swarm Performance Metric

Collaboration rate: # of sticks per time unit

Mean # of collaborations at iteration k:

\[ C(k) = p_{g2}\, N_s(k - T_{ca})\, N_g(k - T_{ca}) \]

Mean collaboration rate over Te:

\[ C_t = \frac{1}{T_e} \sum_{k=0}^{T_e} C(k) \]

Results (Standard Arena)

[Figure legend: Micro-MB (10 runs), Micro-AB (100 runs), Macro (1 run)]

Discrepancies because of ODE approximations (nonlinearities + discrete exact vs. average quantities)

Results: 4 x #Sticks, #Robots and Arena Area

[Figure legend: Micro-MB (10 runs), Micro-AB (100 runs), Macro (1 run)]

Reducing the Macroscopic Model

Goal: reach mathematical tractability

Ti, Ta, Td, Tc << Tg → Ti = Ta = Td = Tc = 0

Nonlinear coupling!

Reduced Macroscopic Model

[Figure: PFSM with states Search and Grip]

Ns = average # robots in searching mode
Ng = average # robots in gripping mode
N0 = # robots used in the experiment
M0 = # sticks used in the experiment
Γ = fraction of robots that abandon pulling
Te = maximal number of iterations
k = 0, 1, … Te (iteration index)

\[ N_s(k+1) = N_s(k) - p_{g1}[M_0 - N_g(k)]\, N_s(k) + \underbrace{p_{g2} N_g(k) N_s(k)}_{\text{successful}} + \underbrace{p_{g1}[M_0 - N_g(k - T_g)]\, \Gamma(k; 0)\, N_s(k - T_g)}_{\text{unsuccessful}} \]
\[ N_g(k+1) = N_0 - N_s(k+1) \]
\[ \Gamma(k; 0) = \prod_{j = k - T_g}^{k} [1 - p_{g2} N_s(j)] \]

Initial conditions and causality: Ns(0) = N0, Ng(0) = 0; Ns(k) = Ng(k) = 0 for all k < 0
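A sketch of how the reduced macroscopic model could be iterated numerically, assuming the recurrence Ns(k+1) = Ns(k) − pg1[M0 − Ng(k)]Ns(k) + pg2 Ng(k)Ns(k) + pg1[M0 − Ng(k−Tg)]Γ(k;0)Ns(k−Tg), with Γ(k;0) the product of [1 − pg2 Ns(j)] for j = k−Tg … k and Ng(k+1) = N0 − Ns(k+1). The function name and all parameter values are hypothetical; the collaboration term C(k) = pg2 Ns(k) Ng(k) uses a zero delay Tca, consistent with the reduced model.

```python
def simulate_reduced_stick_pulling(N0, M0, pg1, pg2, Tg, Te):
    """Iterate the reduced search/grip model for Te steps (Tg in time steps)."""
    Ns, Ng, C = [float(N0)], [0.0], []         # Ns(0) = N0, Ng(0) = 0

    def ns(j):                                 # causality: Ns(j) = 0 for j < 0
        return Ns[j] if j >= 0 else 0.0

    def ng(j):                                 # causality: Ng(j) = 0 for j < 0
        return Ng[j] if j >= 0 else 0.0

    for k in range(Te):
        gamma = 1.0                            # fraction of robots that abandon pulling
        for j in range(k - Tg, k + 1):
            gamma *= 1.0 - pg2 * ns(j)
        successful = pg2 * Ng[k] * Ns[k]       # collaborations release both robots' partner
        unsuccessful = pg1 * (M0 - ng(k - Tg)) * gamma * ns(k - Tg)  # grip timeouts
        ns_next = Ns[k] - pg1 * (M0 - Ng[k]) * Ns[k] + successful + unsuccessful
        C.append(successful)                   # mean # of collaborations at iteration k
        Ns.append(ns_next)
        Ng.append(N0 - ns_next)                # conservation of the number of robots
    return Ns, Ng, C


# Hypothetical parameters: 4 robots, 4 sticks, Tg = 20 time steps.
Ns, Ng, C = simulate_reduced_stick_pulling(N0=4, M0=4, pg1=0.01, pg2=0.005, Tg=20, Te=500)
mean_rate = sum(C) / len(C)                    # mean collaboration rate over the run
print(Ns[-1], Ng[-1], mean_rate)
```

Sweeping Tg in such a loop and recording the mean rate is one way to look numerically for an optimal gripping time.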

Results Reduced Micro-AB Model

[Figure panels: 4 robots, 4 sticks, Ra = 40 cm; 16 robots, 16 sticks, Ra = 80 cm]

• Micro-AB (100 runs) and macro models overlap
• Only qualitative agreement with micro-MB/real robots results

Steady State Analysis (Reduced Macro Model)

• Steady-state analysis [Nn(k+1) = Nn(k)] → it can be demonstrated that:

\[ \exists\, T_g^{opt} \quad \text{for} \quad \frac{N_0}{M_0} \le \frac{2}{1 + R_g} \]

with N0 = number of robots, M0 = number of sticks, and Rg ∝ approaching angle for collaboration

• Counterintuitive conclusion: an optimal Tg can also exist in scenarios with more robots than sticks if the collaboration is very difficult (i.e. Rg very small)!

Analysis Verification (Micro-AB and Macro Full Model)

Example (collaboration very difficult):

\[ \tilde{R}_g = \frac{1}{10} R_g \]

[Figure: 20 robots and 16 sticks (optimal Tg)]

• The optimal Tg can be computed numerically by integrating the full model ODEs or solving the full model steady-state equations

Optimal Gripping Time

• Steady-state analysis → the optimal gripping time can be computed analytically in the simplified model (numerically approximated value):

\[ T_g^{opt} = \frac{1}{\ln\!\left(1 - p_{g1} R_g \frac{N_0}{2}\right)} \, \ln \frac{1 - \frac{\beta}{2}(1 + R_g)}{1 - \frac{\beta}{2}} \quad \text{for} \quad \beta \le \beta_c = \frac{2}{1 + R_g} \]

with β = N0/M0 = ratio robots-to-sticks

[Lerman et al, Alife Journal, 2001], [Martinoli et al, IJRR, 2004]

Journal Publications using the Same Modeling Framework

Stick Pulling

• [Martinoli, Easton, Agassounon, Int. J. of Robotics Res., 2004]
• [Lerman, Galstyan, Martinoli, Ijspeert, Artificial Life, 2001]
• [Ijspeert, Martinoli, Billard, Gambardella, Auton. Robots, 2001]

Object Aggregation

• [Agassounon, Martinoli, Easton, Autonomous Robots, 2004]
• [Martinoli, Ijspeert, Mondada, Robotics and Auton. Systems, 1999]

Wireless-Based Cohesive Swarming

• [Winfield, Liu, Nembrini, Martinoli, Swarm Intelligence J., 2008]

Combined Modeling and Machine-Learning Techniques

Rationale for Combined Methods (1)

• Any level of modeling (micro-MB, micro-AB, or macro) allows us to consider certain parameters and leave out others; models, as an expression of reality abstraction, can be considered as more or less coarse “filters” of reality

• Combined modeling/machine-learning techniques can be used at any of the abstraction levels; machine-learning techniques will explore the design parameters explicitly represented at a given level of abstraction

• Depending on the features of the hyperspace to be searched (size, continuity, noise, etc.), appropriate machine-learning techniques should be used (e.g., single-agent vs. multi-agent techniques); the different mapping policies (e.g., individual/group, public/private, homogeneous/heterogeneous) are “orthogonal” and can be applied to different microscopic levels

• One particular optimization problem is system identification: the performance to optimize is the matching with reality (or with a lower abstraction level). See model calibration theory, calibration method #2, this lecture.

Rationale for Combined Methods (2)

[Figure: the four modeling levels with their Ss/Sa state machines; axes: abstraction, cost of optimization/design]

• Target system + ML = adaptation with HW in the loop (on-board or off-board)
• MB-Microscopic + ML (see Week 05 examples using GA and PSO); for instance, low-level design parameters can be learned
• AB-Microscopic + ML (see this lecture's examples); for instance, diversity and specialization can be studied
• Macroscopic + ML? Most of the time not needed since very fast + continuous; mainly homogeneous systems; standard numerical optimization techniques/systematic search can be used

In-Line Adaptive Learning

Heavy Artillery such as GA/PSO Is Not Always the Most Appropriate Solution…

• Simple individual learning rules combined with collective flexibility can achieve extremely interesting results
• Simplicity and low computational cost mean possible embedding on simple, real robots
• Can be used for all sorts of parameters, for instance also those controlling the robot activity for a given task in a threshold-based algorithm (see lecture Week 7)

In-Line Adaptive Learning (Li, Martinoli, Abu-Mostafa, 2001)

Algorithm Parameters

• GTP: Gripping Time Parameter
• ∆d: learning step
• d: direction
• Underlying low-pass filter for measuring the performance

[Figure: low-pass filter; adapting rules for the learning step. From Li et al., Adaptive Behavior, 2004]

In-Line Adaptive Learning

Differences with gradient descent methods:
• Fixed rules for calculating step increase/decrease → limited descent speed → no gradient computation → more conservative but more stable
• Randomness for getting out of local minima (no momentum)
• Underlying low-pass filter is part of the algorithm

Differences with Reinforcement Learning:
• No learning history considered (only previous step)

Differences with basic In-Line Learning:
• Adaptive step → faster and more stable at convergence

Co-Learning in a Collaborative Framework – Homogeneous Systems

Baseline: Homogeneous Team

Three orthogonal axes to consider (extremities or balanced solutions are possible):

• Individual and group fitness
• Private (non-sharing of parameters) and public (parameter sharing) policies
• Homogeneous vs. heterogeneous systems

Example with binary encoding of candidate solutions

Sample Results – Homogeneous System

[Figure: two panels, stick-pulling rate (1/min) vs. initial gripping time parameter (sec), for 2, 3, 4, 5 and 6 robots. Panels: short averaging window (filter cut-off f high); long averaging window (filter cut-off f low). Curves: learned (mean + std dev); systematic (mean only)]

Note: 1 parameter for the whole group!

Co-Learning in a Collaborative Framework – Heterogeneous Systems

Viable Policies for Exploring Het. Teams

Three orthogonal axes to consider (extremities or balanced solutions are possible):

• Individual and group fitness
• Private (non-sharing of parameters) and public (parameter sharing) policies
• Homogeneous vs. heterogeneous systems

[Figure annotations on the policy combinations: viable for exploring heterogeneous solutions; not scalable; heterogeneity allowed but eventually roughly homogeneous solution via shuffling around of candidate solutions; homogeneity enforced]

4 robots, one per color, AB-micro + learning

Key question: does team diversity enhance performance? I.e., can individuals become specialized?

Performance ratio between 2-caste and homogeneous system (MB- and AB-microscopic models, systematic)

Diversity and Specialization

Impact of Diversity on Performance (Li, Martinoli, Abu-Mostafa, 2004)

[Figure: stick-pulling rate ratio (1 to 1.3) vs. number of robots (2 to 6); curves: 2-caste, global; heterogeneous, global; heterogeneous, local]

Performance ratio between heterogeneous (full and 2-castes) and homogeneous groups AFTER learning

Notes:
• large Tm (long averaging window)
• only private strategies
• global = group
• local = individual

Diversity Metrics (Balch 1998)

Entropy-based diversity measure introduced in AB-04 could be used for analyzing threshold distributions

Simple entropy:

\[ H(R, h) = -\sum_{i=1}^{m} p_i \log p_i \]

pi = portion of the agents in cluster i; m clusters in total; h = taxonomic level parameter

Social entropy:

\[ D(R) = \int_0^{\infty} H(R, h)\, dh \]

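Example – Simple Entropy

• R = {r1, r2, r3}
• n = 3 (three swarm points)
• bi-dimensional space
• define a distance: Euclidean distance
• h = taxonomic level parameter
• m = number of clusters

[Figure: three points r1, r2, r3 with pairwise distances 3, 4 and 6]

The cluster proportions always satisfy

\[ \sum_{i=1}^{m} p_i = 1 \]

h < 3, m = 3 (clusters c1, c2, c3):

\[ H(R) = -\sum_{i=1}^{3} p_i \log p_i = H\!\left(\tfrac13, \tfrac13, \tfrac13\right) = -3 \cdot \tfrac13 \log \tfrac13 = 0.477 \]

3 ≤ h < 4, m = 2 (clusters c1, c2):

\[ H(R) = -\sum_{i=1}^{2} p_i \log p_i = H\!\left(\tfrac13, \tfrac23\right) = -\tfrac13 \log \tfrac13 - \tfrac23 \log \tfrac23 = 0.159 + 0.117 = 0.276 \]

4 ≤ h < 6, m = 2 (clusters c1, c2):

\[ H(R) = -\sum_{i=1}^{2} p_i \log p_i = H\!\left(\tfrac13 + \tfrac13 \cdot \tfrac12,\; \tfrac13 + \tfrac13 \cdot \tfrac12\right) = H\!\left(\tfrac12, \tfrac12\right) = -\tfrac12 \log \tfrac12 - \tfrac12 \log \tfrac12 = 0.301 \]

h ≥ 6, m = 1 (cluster c1):

\[ H(R) = -\sum_{i=1}^{1} p_i \log p_i = H\!\left(\tfrac33\right) = -\log 1 = 0 \]

Check with overlapping clusters!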

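Example – Social Entropy

\[ D(R) = \int_0^{\infty} H(R, h)\, dh = 3 \times H\!\left(\tfrac13, \tfrac13, \tfrac13\right) + 1 \times H\!\left(\tfrac13, \tfrac23\right) + 2 \times H\!\left(\tfrac12, \tfrac12\right) + 0 = 2.309 \]

Note: in contrast to the simple entropy values above, D(R) ≥ 1 here

Contrast with R = {r1, r2, r3} and r1 = r2 = r3 (homogeneous swarm): for any h ≥ 0 → single cluster → D(R) = 0!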
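The numbers of the three-point example can be re-derived with a few lines of code. The sketch assumes base-10 logarithms (which reproduce the 0.477, 0.276 and 0.301 values) and takes the cluster proportions per interval of h directly from the example rather than computing the clustering; the function name and the `levels` table are illustrative.

```python
from math import log10


def simple_entropy(proportions):
    """H = -sum(p_i * log10(p_i)); zero proportions contribute nothing."""
    return -sum(p * log10(p) for p in proportions if p > 0)


# (interval width in h, cluster proportions) for the three-point example.
# For h >= 6 there is a single cluster, so H = 0 and the integral stops growing.
levels = [
    (3.0, [1 / 3, 1 / 3, 1 / 3]),   # 0 <= h < 3, m = 3
    (1.0, [1 / 3, 2 / 3]),          # 3 <= h < 4, m = 2
    (2.0, [1 / 2, 1 / 2]),          # 4 <= h < 6, m = 2 (shared point split evenly)
]

# Social entropy: integral of the piecewise-constant H(R, h) over h.
social_entropy = sum(width * simple_entropy(p) for width, p in levels)
print(round(simple_entropy(levels[0][1]), 3), round(social_entropy, 2))
```

Since H(R, h) is piecewise constant in h, the integral reduces to a sum of interval widths times entropies; a homogeneous swarm has a single cluster for every h and therefore D(R) = 0.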

Differences with Plain Euclidean Diversity Measure

• Underlying difference measure might be the same (e.g. Euclidean distance)
• Social entropy looks for possible clustering of the vectors (i.e. possible castes), while Euclidean diversity just looks at how spread out/diverse the vectors are in general

[Formula annotations: components in all dimensions; all points from any other point]

Specialization Metric

Specialization metric introduced in AB-04:

S = specialization; D = social entropy; R = swarm performance

Notes
• Idea: “weighting diversity with performance”
• This is useful when the number of tasks to be solved is not well-defined or it is difficult to assess the task granularity a priori. In such cases the mapping between task granularity and caste granularity might not be trivial (one-to-one mapping? How many sub-tasks for a given main task? etc.; see the limited performance of a caste-based solution in the stick-pulling experiment)
• Could be used for analyzing specialization arising from a variable-threshold division of labor algorithm (see lecture Week 7)

Sample Results for Standard Sticks

Set-up:
• 2 serial grips needed to get the sticks out
• 4 sticks, 2-6 robots, 80 cm arena

[Figure panels: Diversity, Specialization, Relative Performance]

• Specialization more important for small teams
• Local p > global p
• Enforced caste: pays the price for odd team sizes
• Flat curves, difficult to tell whether diversity brings performance
• Specialization higher with global when needed, drops more quickly when not needed
• Enforcing caste: “low-pass filter” effect

Sample Results for Long Sticks

Set-up:
• 4 serial grips needed to get the sticks out
• 16 sticks, 8-24 robots, 4 times bigger area

[Figure panels: Diversity, Specialization, Relative Performance]

• Specialization more important for small teams
• Global p > local p
• Enforced caste: helps because of original large space (= # of robots)
• Flat curves, difficult to tell whether diversity brings performance
• However, consistent with standard set-up
• Specialization higher with global when needed, drops more quickly when not needed
• Not clear why specialization grows with team size and local

Some Remarks on the Results

• Global performance is usually less noisy, so the part of diversity that increases performance is higher with global performance (“specialization when needed”)

• Default sticks: when local and global performance are almost aligned (i.e. “by doing well locally I do well globally”), local performance achieves slightly better results thanks to direct reinforcement (no credit assignment problem)

• Long sticks: when local and global performance are not aligned (i.e. performance is aligned only for grip k; for grips 1 … k-1 there is no assurance that a given “baton passing” action will lead to the stick being pulled out), global reinforcement achieves better results at the price of a bigger credit assignment problem, usually resulting in slower learning

Conclusion

Take Home Messages

• Models help in understanding and generalizing properties of real-time distributed intelligent systems
• Two main levels of models: micro and macro
• Often macroscopic models using the mean field approach result in nonlinear, time-delayed ODEs
• Multi-level modeling allows for different approximations and accuracy/computation trade-offs
• A methodology with precise inter-level mapping has been developed for a given class of experiments
• If carefully designed, models also allow for system optimization and for closing the loop between analysis and synthesis
• Different modeling levels can be combined with machine learning for design and optimization purposes
• AB-microscopic models allow for efficiently studying diversity and specialization issues
• Specialization is the part of diversity that improves performance
• The diversity and specialization level of a heterogeneous swarm can be quantitatively measured

Additional Literature – Week 6

Books
• Balch T. and Parker L. E. (Eds.), Robot Teams: From Diversity to Polymorphism. A K Peters, Natick, MA, USA.
• Ross S. M., Introduction to Probability Models. Academic Press, San Diego, CA, USA, 1997.

Papers
• Ijspeert A. J., Martinoli A., Billard A., and Gambardella L. M., “Collaboration through the Exploitation of Local Interactions in Autonomous Collective Robotics: The Stick Pulling Experiment”. Autonomous Robots, 11(2):149–171, 2001.
• Lerman K. and Galstyan A., “Mathematical model of foraging in a group of robots: Effect of interference”. Autonomous Robots, 13(2):127–141, 2002.
• Murciano A., Millán J. del R., and Zamora J., “Specialization in multi-agent systems through learning”. Biological Cybernetics, 76:375–382, 1997.
• Wolpert D. H. and Tumer K., “Optimal payoff functions for members of collectives”. Advances in Complex Systems, 4:265–279, 2001.
• Matarić M. J., “Using communication to reduce locality in distributed multi-agent learning”. Journal of Experimental and Theoretical Artificial Intelligence, 10:357–369, 1998.
