2D-environment navigation using a neural network
Matlab simulation of a neural network for 2D environment navigation with obstacle avoidance
Cognitive Robotics Project
Di Lecce Arturo
POLITECNICO DI MILANO
2D-environment navigation using a neural network
Abstract
The purpose of this project is to simulate a robot that can navigate a 2D environment while avoiding obstacles.
What we need:
1. a robot with some kind of obstacle sensors
2. a controller for obstacle avoidance
Arturo Di Lecce POLITECNICO DI MILANO
The robot
The robot is a two-wheeled, circular rover equipped with:
1) touch sensors
2) range finder sensors
[Figure: robot diagram showing the head direction, the robot body (diameter = 1 robot unit), and the sensor range indicators]
The robot - touch sensors
Two strips of touch sensors:
- left: from 0° to 45° with respect to the head direction; detects objects at 0.6 r.u.
- right: from -45° to 0° with respect to the head direction; detects objects at 0.6 r.u.
The robot - range finder sensors
Two range finder sensors:
- left: from 0° to 45° with respect to the head direction; starts to detect objects at 11 r.u.
- right: from -45° to 0° with respect to the head direction; starts to detect objects at 11 r.u.
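The sensor geometry above can be illustrated with a small Python sketch (the original simulation is in Matlab; this is not the author's code). It assumes a single point obstacle, distances measured from the robot center, and counterclockwise-positive bearings, so that the left sector is [0°, 45°]. The function name `sensor_readings` and the returned keys are hypothetical.

```python
import math

TOUCH_RANGE = 0.6               # r.u., touch sensor range from the slides
FINDER_RANGE = 11.0             # r.u., range finder range from the slides
HALF_FOV = math.radians(45.0)   # each sensor covers a 45 degree sector


def sensor_readings(robot_xy, heading, obstacle_xy):
    """Return which of the four sensors detect a point obstacle.

    Assumptions (not from the slides): point obstacle, distance measured
    from the robot center, counterclockwise-positive bearings.
    """
    dx = obstacle_xy[0] - robot_xy[0]
    dy = obstacle_xy[1] - robot_xy[1]
    distance = math.hypot(dx, dy)

    # Bearing relative to the head direction, wrapped to [-pi, pi].
    raw = math.atan2(dy, dx) - heading
    bearing = math.atan2(math.sin(raw), math.cos(raw))

    in_left = 0.0 <= bearing <= HALF_FOV
    in_right = -HALF_FOV <= bearing < 0.0

    return {
        "touch_left": in_left and distance <= TOUCH_RANGE,
        "touch_right": in_right and distance <= TOUCH_RANGE,
        "range_left": in_left and distance <= FINDER_RANGE,
        "range_right": in_right and distance <= FINDER_RANGE,
    }
```

A real map would require ray casting against obstacle pixels; the point-obstacle check only shows how the two ranges and the two angular sectors combine.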
Neural network controller
“Learning Anticipation via Spiking Networks: Application to Navigation Control” (Arena et al., 2009)
● Describes a 2-layer neural network controller for robot navigation in 2D spaces, capable of obstacle avoidance and target approaching
● We focus on obstacle avoidance
Neural network controller - the spiking network (1/3)
The neuron model used for this controller is the one proposed by Eugene M. Izhikevich in his paper “Simple Model of Spiking Neurons” (2003):
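For reference, the model as published by Izhikevich (2003), with v the membrane potential in mV, u the recovery variable, I the input current, and time in ms:

```latex
\begin{aligned}
\frac{dv}{dt} &= 0.04\,v^{2} + 5\,v + 140 - u + I \\
\frac{du}{dt} &= a\,(b\,v - u) \\
\text{if } v \ge 30\ \text{mV:}\quad & v \leftarrow c, \qquad u \leftarrow u + d
\end{aligned}
```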
Neural network controller - the spiking network (2/3)
Depending on four parameters (a, b, c, d), the model reproduces the spiking and bursting behavior of known types of cortical neurons
● a is the time scale of the recovery variable u(t)
● b is the sensitivity of the recovery variable u(t) to subthreshold fluctuations of v(t)
● c is the after-spike reset value of the membrane potential v(t)
● d is the after-spike reset increment of the recovery variable u(t)
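To make the role of the four parameters concrete, here is a minimal Python sketch that integrates a single Izhikevich neuron with forward Euler (the original simulation is in Matlab). The parameter values are the "regular spiking" set from Izhikevich's paper, not necessarily those used in this project; the function name, the time step, and the constant input current are assumptions.

```python
def simulate_izhikevich(current, a=0.02, b=0.2, c=-65.0, d=8.0,
                        duration_ms=1000.0, dt=0.25):
    """Integrate one Izhikevich neuron; return the list of spike times (ms).

    Defaults are the regular-spiking parameters from Izhikevich (2003);
    dt and duration_ms are illustrative choices.
    """
    v = c          # membrane potential (mV), started at the reset value
    u = b * v      # recovery variable
    spike_times = []
    steps = int(duration_ms / dt)
    for k in range(steps):
        # Forward Euler step using the values from the previous step.
        dv = 0.04 * v * v + 5.0 * v + 140.0 - u + current
        du = a * (b * v - u)
        v += dt * dv
        u += dt * du
        if v >= 30.0:
            # Spike: record it, reset v to c and increment u by d.
            spike_times.append((k + 1) * dt)
            v = c
            u += d
    return spike_times


if __name__ == "__main__":
    print(len(simulate_izhikevich(10.0)), "spikes in 1 s with I = 10")
```

Counting the returned spikes per control step is what turns the neuron output into a motor command later in the slides.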
Neural network controller - the spiking network (3/3)
Neural network controller – controller structure (1/3)
Composed of:
● 2 unconditioned stimuli neurons (US)
● 2 conditioned stimuli neurons (CS)
● a constant source of neural spikes (Go-On)
● 2 motor neurons RMgo & LMgo
● 2 motor neurons RMturn & LMturn
Neural network controller – controller structure (2/3)
[Figure: controller network diagram with unconditioned stimuli neurons, conditioned stimuli neurons, Go-On motor neurons, and Turn motor neurons]
Neural network controller – controller structure (3/3)
Two kinds of inputs to second-layer neurons:
● excitatory synapses (like USL → RMTURN), marked with an arrow
● inhibitory synapses (like USL → RMGO), marked with a dot
Neural network controller – controller integration
[Figure: controller integration. Inputs: left touch sensor, right touch sensor, left range finder, right range finder. Outputs: left motor, right motor]
Neural network controller – controller inputs
● Unconditioned stimuli & touch sensors
An unconditioned stimulus is generated if the distance between the robot and the nearest obstacle is <= 0.6 robot units
● Conditioned stimuli & range finders
With the input function used, the range finder sensors start to fire at a distance of approximately 11 robot units (d0 is the obstacle distance)
Neural network controller – neuron input
The synaptic input to a generic neuron j is given by:
Where:
● wij represents the weight of the synapse from neuron i to neuron j
● ti is the instant at which a generic neuron i, connected to neuron j, emits a spike
● The function ε(t) is
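A synaptic input built from these definitions has the form I_j(t) = Σ w_ij ε(t - t_i), summed over the spikes of every presynaptic neuron i. The Python sketch below assumes this form and, as a stand-in for ε, an alpha-function kernel ε(δ) = (δ/τ)·exp(1 - δ/τ) for δ >= 0 and 0 otherwise. The kernel shape, the value of τ, and the function names are assumptions, not taken from the slides.

```python
import math

TAU_MS = 5.0  # hypothetical kernel time constant (ms)


def epsilon(delta_ms, tau_ms=TAU_MS):
    """Assumed alpha-function kernel: zero before the spike, peak of 1 at tau."""
    if delta_ms < 0.0:
        return 0.0
    return (delta_ms / tau_ms) * math.exp(1.0 - delta_ms / tau_ms)


def synaptic_input(t_ms, weights, spike_times, tau_ms=TAU_MS):
    """Sum w_ij * epsilon(t - t_i) over all presynaptic spikes.

    weights[i] is w_ij for presynaptic neuron i (negative if inhibitory);
    spike_times[i] is the list of spike instants t_i of that neuron.
    """
    total = 0.0
    for w_ij, times in zip(weights, spike_times):
        for t_i in times:
            total += w_ij * epsilon(t_ms - t_i, tau_ms)
    return total
```

With this kernel, one excitatory spike (w = 2) and one inhibitory spike (w = -1), both 5 ms old, give a net input of 1 at the kernel peak.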
Neural network controller – controller outputs
A single motor input is the sum of:
● the number of spikes emitted by the GoOn neuron
● the number of spikes emitted by the Turn neuron
Neural network controller – synaptic weights learning
Synaptic weights for the CS → Turn motor neuron connections are continuously learned according to the STDP rule:
Where Δt is the difference between the spiking time of the presynaptic neuron and that of the postsynaptic one, and the remaining symbols are parameters of the learning algorithm.
To prevent the weights from increasing steadily, weight decay has been introduced:
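A minimal Python sketch of a pair-based STDP update with weight decay, using the common exponential form: potentiation A+·exp(Δt/τ+) when the presynaptic spike comes first (Δt < 0) and depression -A-·exp(-Δt/τ-) when it comes after (Δt > 0). All numeric values, the multiplicative decay, and the clipping bounds are hypothetical placeholders; the slides name the parameters (A+, A-, Tau+, Tau-) but their values are not given here.

```python
import math

# Hypothetical learning parameters (placeholders, not from the slides).
A_PLUS = 0.02
A_MINUS = 0.02
TAU_PLUS_MS = 20.0
TAU_MINUS_MS = 20.0
DECAY = 0.01               # assumed fraction of the weight lost per update
W_MIN, W_MAX = 0.0, 1.0    # assumed saturation bounds


def stdp_delta_w(dt_ms):
    """Weight change for dt_ms = t_pre - t_post."""
    if dt_ms < 0.0:
        # Pre fires before post: potentiation.
        return A_PLUS * math.exp(dt_ms / TAU_PLUS_MS)
    if dt_ms > 0.0:
        # Pre fires after post: depression.
        return -A_MINUS * math.exp(-dt_ms / TAU_MINUS_MS)
    return 0.0


def update_weight(w, dt_ms):
    """Apply decay, then the STDP change, then clip to [W_MIN, W_MAX]."""
    w_new = (1.0 - DECAY) * w + stdp_delta_w(dt_ms)
    return min(W_MAX, max(W_MIN, w_new))
```

The decay term pulls unused weights back toward zero, while the clipping makes saturation at the minimum or maximum value explicit.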
Neural network controller – controller behavior (1/2)
● The motor driver signal depends on the number of spikes in the output neurons assigned to the motor
● GoOn motor neurons generate the spike train needed to let the robot advance in the forward direction
● The spikes of the Turn motor neurons are then added to those of the GoOn neurons
● In the presence of collisions, GoOn neurons are inhibited and the forward movement is suppressed
FORWARD MOVEMENT
● When the left and right motor neurons emit an equal number of spikes, the robot moves forward with a speed proportional to the number of spikes. In the absence of conditioned stimuli, the amplitude of the forward movement is about 0.3 r.u. per step
Neural network controller – controller behavior (2/2)
ROTATION
● When the left and right motor neurons emit a different number of spikes, the robot rotates. The angle of rotation (in the counterclockwise direction) is:
θ = 0.14 * Δns, where Δns = nr - nl
● We count the spikes emitted by both the left and the right neuron, and the robot advances with a speed proportional to the smaller of the two counts
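The forward-movement and rotation rules can be combined into a single pose update, sketched here in Python (the original simulation is in Matlab). The rotation gain 0.14 comes from the slides and is assumed to be in radians; the forward gain per spike, the rotate-then-advance order, and the function name `step_pose` are assumptions.

```python
import math

ROTATION_GAIN = 0.14   # from the slides; radians per spike assumed
FORWARD_GAIN = 0.1     # hypothetical: r.u. advanced per spike


def step_pose(x, y, theta, n_left, n_right):
    """Update the pose from the spike counts of the left and right motors.

    The robot rotates counterclockwise by 0.14 * (n_right - n_left), then
    advances by an amount proportional to min(n_left, n_right).
    """
    theta_new = theta + ROTATION_GAIN * (n_right - n_left)
    advance = FORWARD_GAIN * min(n_left, n_right)
    x_new = x + advance * math.cos(theta_new)
    y_new = y + advance * math.sin(theta_new)
    return x_new, y_new, theta_new
```

With equal counts (for example 3 and 3) the heading is unchanged and the robot moves straight ahead; with n_left = 2 and n_right = 4 it turns counterclockwise by 0.28 and advances according to the smaller count.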
Simulation - Intro
Main simulation function:
Parameters:
● posn: start position of the robot [x, y, θ]
● map_name: relative path of the map image (.png)
● steps: number of simulation steps
● w: initial weights (if 0, weights are initialized inside the function)
● debug: flag for debugging and showing some graphs
● drawPath: flag for drawing the robot movements at the end of the simulation
Returns:
● posn: final position of the robot
● w: weights at the end of the simulation
Simulation – neural network representation
[Slide image; labels: membrane potentials, recovery variables, inputs]
Simulation – some settings
Simulation – soft sensor
offset = π/2, delta = π/2
Simulation – neuron input function
Simulation – setting inputs
+ if excitatory input, - if inhibitory input
Touch sensors input
Range finders input
Simulation – weights update
Simulation – controller (1/2)
Retrieve the number of spikes of the LM & RM go and turn neurons
Go forward
Rotate
Simulation – controller (2/2)
Compute new position
Update distances
Simulation 1 – Results (1/3)
First simulation: 200 timesteps
weights have not yet been learned!
Simulation 1 – Results (2/3)
If the touch sensors detect an obstacle, forward movement is inhibited and the robot turns
Simulation 1 – Results (3/3)
Simulation 2 – Results (1/3)
After 1600 timesteps
From 1600 to 2300 timesteps
Simulation 2 – Results (2/3)
Weights are learned, and if the range sensors detect an obstacle, the robot turns
Simulation 2 – Results (3/3)
Weight learning curves
Simulation – Some problems (1/2)
Weights reach their maximum/minimum values after some k simulation steps
In the Arena et al. simulation, this doesn't happen!
Simulation – What can be wrong?
Simulation results depend on many factors:
● Neuron model parameters (a, b, c, d) – influence the spike rate, sensitivity and default membrane potential
● STDP parameters (A+, A-, Tau+, Tau-) – influence the weight learning rate
● Simulation step: lower values increase the precision of the spike time variables, so the weights are learned better, but the simulation time grows
● Robot's speed
● Soft touch/range sensor algorithm
● Different simulation environment (Arena et al. used C++)
Simulation – Goin' further ...
● It is possible to integrate target approaching control by adding an interneuron layer to the controller and a visual input sensor (a camera)