Segunda Escuela de Invierno para Divulgacion de la Robotica, 18-22 Aug 2008, Valparaiso, Chile

Fuzzy Evolutionary Behavior Learning by a Mobile Robot

&

Entropy Based Diversity Measures for Mobile Robot Navigation

Tomás V. Arredondo


Fuzzy Evolutionary Behavior Learning by a Mobile Robot

Goals of this talk

Introduce soft computing
Introduce some applications of soft computing in robotics
Describe some recent work done at UTFSM on this topic

Khepera


Why is robotics hard?

Complex environments are almost impossible to characterize (model): they are dynamic and unpredictable, and some points are not accessible to the robot
The robot and its environment form a dynamic system
The state of a robot's sensors is a function of its environment and its actions
It is not easy to know a priori what the effects of the actions of the robot's internal mechanisms will be
Robotics has to be grounded in reality
Traditional AI uses symbolic logic to represent the world without any relation to reality

MARGE (NCSU)


Robotic architectures

Conventional design (top-down)
Hierarchical methodology
Requires relatively exact models of the world
Uses symbolic logic to represent the environment
Works well for static, controlled environments

Sensors -> Perception -> Modeling -> Planning -> Task Execution -> Motor Control -> Actuators


Robotic architectures (cont)

Behavior Based Robotics
Parallel-agent methodology
Each agent executes a small portion of a task
More complex functions require groups of agents working together
Actions can be tied directly to what is sensed (action-based robotics)
Non-symbolic representations
Agents can inhibit or help one another


Robotic architectures (cont)

Over time the robot may need to build a cognitive model of the world.

These models try to represent the world, the robot, and their interaction.

SLAM (Simultaneous Localization and Mapping) refers to the process of representing the environment with geometric maps and localizing the robot using those maps.

Finally, movements have to be planned for the robot to execute.

Our maps and actions will always contain errors, which is why the robot should be able to learn from its errors and correct them

Kismet Robot (MIT)


Introduction to fuzzy logic

Fuzzy logic is an extension of traditional (Boolean) logic that uses notions of set membership closer to the way humans think.

The concept of a fuzzy subset was introduced by L.A. Zadeh in 1965 as a generalization of a traditional exact subset (crisp subset).

Crisp subsets use Boolean logic with exact values, for example binary logic, which uses values of 1 or 0 in its operations.

Lotfi Zadeh


A crisp set:

μS : X -> {0,1}

μS(x) = 1 if x is a member of S

μS(x) = 0 if x is not a member of S

[Figure: characteristic function μS(x) plotted against x, taking the value 1 on S]


Membership function of x in the fuzzy set F: μF(x)

Example: μF(x) is the degree of membership of x in F (the degree of coldness measured by the variable x)

[Figure: μF(x), from 0 to 1, plotted against x (°C) from -40 to 30, with regions labeled cold, more or less cold, not so cold, and definitely not cold]


Fuzzy IF-THEN rules

A fuzzy IF-THEN rule has the form

IF x is A THEN y is B

where A and B are linguistic values defined by fuzzy sets on the universes X and Y.

The part IF x is A is called the antecedent or premise, while the part THEN y is B is called the consequent or conclusion.

Honda Asimo


Traditional inference systems

Controllers typically interact with the external world through exact (non-fuzzy) values

speed -> controller -> gasoline flow

If the controller uses fuzzy logic, some conversion will be necessary


Inference systems using fuzzy logic (cont)

This is called fuzzification and defuzzification.

exact input -> fuzzifier -> fuzzy controller -> defuzzifier -> exact output


Example: Mamdani FLC (Fuzzy Logic Controller)

This example is a temperature control system...

[Figure: block diagram with the FL engine, Heater, Cooler, and the environment to be controlled; signals labeled Cmd, Temp, and Output]


Example: Mamdani system for temperature control

Membership function of the input e:

e(k) = -1.0 F (it is a little warm): enegative(-1) = 0.5, ezero(-1) = 0.5, epositive(-1) = 0

Membership function of the input ∆e:

∆e(k) = -2.5 F (it is warming up a little): ∆enegative(-2.5) = 0.5, ∆ezero(-2.5) = 0.5, ∆epositive(-2.5) = 0

[Figure: μ(e) with fuzzy sets negative, zero, and positive over e from -4 to 4]

[Figure: μ(∆e) with fuzzy sets negative, zero, and positive over ∆e from -10 to 10]
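The membership grades quoted on this slide can be reproduced with a small fuzzifier. This is only a sketch: it assumes a triangular "zero" set centered at 0 and saturating "negative" and "positive" shoulders at half of the plotted range (2 for the -4 to 4 axis, 5 for the -10 to 10 axis). The slide does not state the shapes, and the function name `fuzzify` is hypothetical.

```python
def fuzzify(value, half_range):
    """Return (negative, zero, positive) membership grades for one input.

    Assumed shapes (not given on the slide): 'zero' is a triangle centered
    at 0 with half-width half_range; 'negative' and 'positive' are shoulders
    that saturate at -half_range and +half_range.
    """
    t = max(-1.0, min(1.0, value / half_range))  # clamp the scaled input to [-1, 1]
    negative = max(0.0, -t)
    zero = 1.0 - abs(t)
    positive = max(0.0, t)
    return negative, zero, positive


grades_at_minus_1 = fuzzify(-1.0, 2.0)    # input on the -4..4 axis
grades_at_minus_2_5 = fuzzify(-2.5, 5.0)  # input on the -10..10 axis
```

Under these assumptions both inputs fall halfway between the "negative" and "zero" sets, which matches the 0.5 / 0.5 / 0 grades listed on the slide.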


Fuzzy Logic in Robotics

How can an autonomous robot use fuzzy logic to handle the many decisions and control problems it faces?
The robot has a much larger input space than typical fuzzy applications
The number of rule evaluations grows exponentially with the number of inputs
As the number of rules grows, writing them by hand becomes very difficult
Real-time operation is limited to specific hardware


Example of Fuzzy Logic in Robotics: MARGE

The MARGE robot used independent fuzzy controllers (agents)

The fuzzy rules specified the behavior of each agent independently, and the output of each agent is a singleton

Arbitration rules make the decision when the outputs of the independent agents contradict or compete with one another

This architecture mirrors the way biological brains operate, since there is no centralized data repository and communication between regions is sometimes limited


Neural Networks

This example of a multilayer neural network is able to solve the XOR problem

The values on the lines indicate weights and the values in the circles indicate thresholds

The nonlinear function is a step function with a value of 1 if the threshold is exceeded and 0 if it is not

[Figure: XOR network with inputs x1 and x2 and output y; the connection weights are 1 and -2, and the thresholds are 1.5 and 0.5]

X1 X2 Y
0  0  0
0  1  1
1  0  1
1  1  0

O_i = f(S_i)

S_i = w_i^T x = \sum_{j=1}^{n} w_{ij} x_j + w_{i0} x_0 = \sum_{j=0}^{n} w_{ij} x_j, with x_0 = 1
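A minimal sketch of a step-function network that computes XOR with the weights and thresholds visible on the slide. The wiring is an assumption, since the figure did not survive: a hidden unit with threshold 1.5 receives both inputs with weight 1, and the output unit with threshold 0.5 receives both inputs with weight 1 and the hidden unit with weight -2.

```python
def step(s, threshold):
    """Step activation: 1 if the weighted sum exceeds the threshold, else 0."""
    return 1 if s > threshold else 0


def xor_net(x1, x2):
    # Hidden unit (assumed wiring): fires only when both inputs are 1.
    hidden = step(1 * x1 + 1 * x2, 1.5)
    # Output unit: fires when at least one input is 1, unless the hidden
    # unit vetoes it through the -2 weight.
    return step(1 * x1 + 1 * x2 - 2 * hidden, 0.5)


truth_table = [(a, b, xor_net(a, b)) for a in (0, 1) for b in (0, 1)]
```

With this wiring `truth_table` reproduces the X1, X2, Y table of the slide.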


Learning in Robotics

Learning in robotics requires the ability to generalize from limited sets of training information
It generally requires some kind of feedback

[Figure: supervised learning of a neuron; labels: inputs xi1 ... xin, weights wi1 ... win, bias, f, Si, Oi, teacher, Ti, error calculation, Ei, learning algorithm]


Genetic Algorithms

What is a genetic algorithm (GA)?
Genetic algorithms (GAs) are search and optimization algorithms based on the mechanisms of natural selection and genetics.

GAs use the following mechanisms:
survival of the fittest organisms within a population
character sequences (generally 1s and 0s) in strings as a representation of the organisms' DNA
random methods for generating the population and for its reproduction

Population (generation = n) -> random reproduction mechanism -> Population (generation = n+1)

How do genetic algorithms (GAs) work?
GAs use the following:
An encoding scheme that stores each member's information in strings (DNA of 1s and 0s)
A fitness evaluation of each member
Random selection of the members that will reproduce
Crossover of the members' information
Mutation of the information
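The ingredients listed above (bit-string encoding, fitness evaluation, random selection, crossover, mutation) can be sketched in a few lines. This is a toy illustration, not the GA used in the talk: the objective (count the 1s), the truncation-plus-random parent selection, the single-point crossover, and all parameter values are assumptions chosen to keep the example short.

```python
import random


def fitness(bits):
    """Toy objective (an assumption): the number of 1s in the string."""
    return sum(bits)


def crossover(a, b, rng):
    """Single-point crossover of two parents."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]


def evolve(pop_size=20, length=16, generations=30, rate=0.01, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    first_best = max(fitness(ind) for ind in pop)
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]      # fitter half may reproduce
        children = [pop[0]]                 # elitism: the best survives unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)   # random choice of two parents
            children.append(mutate(crossover(a, b, rng), rate, rng))
        pop = children
    return first_best, max(pop, key=fitness)


first_best, best = evolve()
```

Because the best individual is always copied into the next generation, the best fitness can never decrease from one generation to the next.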


GAs and ANNs in the YAKS simulator

YAKS (Yet Another Khepera Simulator)
A common way of building robots is action-based robotics
In this method the robot depends directly on its sensor inputs (and ANNs) to determine its actions (movements)

GAs and ANNs in YAKS (cont)

The weights of the robot's neural network are evolved using GAs

GAs and ANNs in YAKS (cont)

The idea is for the Khepera robot to be able to:
navigate in its environment
avoid walls
recharge its battery when it runs down


IRMA: Fuzzy, GAs and ANNs (cont)

The YAKS simulator uses a population of solutions (neural networks) to optimize routes in different rooms according to a fitness function

IRMA: Fuzzy, GAs and ANNs (cont)

In YAKS the robots learn using the GA in the simulator and a fitness function that penalizes routes for:
collisions
fully draining the battery
not returning home

IRMA: Fuzzy, GAs and ANNs

As part of the IRMA project, work was done on integrating fuzzy motivations, GAs, and ANNs in the YAKS simulator

IRMA: Fuzzy, GAs and ANNs (cont)

YAKS was modified by adding a function that computes the fuzzy fitness F.
F is computed from values obtained from the route taken by the robot and from the motivations (m1, m2, m3, m4), each between 0 and 1: homing, curiosity, orientation, energy
The part of F associated with orientation is computed using a SOM (Self Organizing Map) network.


IRMA: Fuzzy, GAs and ANNs (cont)

There are 4 fuzzy variables: homing, curiosity, orientation, energy

Each fuzzy variable has 5 possible values: very low, low, medium, high, very high

This gives a total of 625 (5^4) rules for computing the fuzzy fitness F.

[Figure: membership functions μ(X) for very low, low, medium, high, and very high, with X from 0 to 1 marked at 0, 0.25, 0.5, 0.75, and 1.0]
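A short sketch of the five fuzzy values and of the rule count. The triangular shape with peaks at 0, 0.25, 0.5, 0.75 and 1.0 and a half-width of 0.25 is an assumption read off the axis marks; the names `LEVELS`, `CENTERS` and `memberships` are hypothetical.

```python
LEVELS = ["very low", "low", "medium", "high", "very high"]
CENTERS = [0.0, 0.25, 0.5, 0.75, 1.0]   # assumed peaks of the triangles
HALF_WIDTH = 0.25                        # assumed spacing between peaks


def memberships(x):
    """Degree of membership of x (in 0..1) in each of the five fuzzy values."""
    return {name: max(0.0, 1.0 - abs(x - c) / HALF_WIDTH)
            for name, c in zip(LEVELS, CENTERS)}


# One rule per combination of values of the 4 fuzzy variables.
rule_count = len(LEVELS) ** 4
```

With these shapes any input activates at most two neighboring values, so most of the 625 rules have zero weight for a given robot run.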


IRMA: Fuzzy, GAs and ANNs (cont)

The action and training scheme in IRMA:

AEM (Action Based Environment Modelling) is the method that uses a SOM to try to recognize the environment in which the robot is running (after prior training).

[Figure: Sensor_1 ... Sensor_8 feed a network that drives Motor_L and Motor_R; blocks labeled AEM, Fuzzy Fitness, GA (Reproduction, Mutation, Elite), and Modify weights]


IRMA: Fuzzy, GAs and ANNs (cont)

AEM and training of the SOM network

[Figure: for Environment 1 ... Environment N, the robot's actions form an action sequence that is chain coded into an environment vector (1:P) and fed to the SOM]


Questions?


Entropy Based Diversity Measures for Mobile Robot Navigation

Tomás Arredondo V., Wolfgang Freund, and Cesar Muñoz

Departamento de Electrónica

Universidad Técnica Federico Santa María

Valparaíso, Chile

http://profesores.elo.utfsm.cl/~tarredondo/cursos.html


Outline

Introduction
Robotic System Description
Entropy Measures
Experimental Evaluation
Results
Summary and Conclusions


Introduction

Why is robotics so hard?
It should be grounded in “reality”
Some points are hard to reach during navigation
The robot and its environment form a dynamic system
Errors and noise in the models accumulate
Traditional AI uses symbolic logic, which has no relation to the complexity of reality (Arkin 1998)
Complex environments can’t be characterized easily: they are dynamic and unpredictable


Introduction (2)

Conventional design (top-down)
Hierarchical architecture
Exact world models
Works well for static, controlled environments

Sensors -> Perception -> Modeling -> Planning -> Task Execution -> Motor Control -> Actuators


Introduction (3)

Behavior based design (Brooks)
Parallel tasks (agents)
Each agent executes a small part of the overall process
Behaviors can be directly related to what is sensed
Few symbolic representations
Separate tasks can suppress (or overrule) inputs or inhibit outputs
Feedback is given mainly through the environment


Introduction (4)

Our recent research in behavior based robotics has focused on providing more natural and intuitive interfaces between robots and people.

Our motivation based approach follows this trend by decoupling specific robot behavior using an intuitive interface based on biological motivations (e.g. curiosity, hunger, etc.).


Introduction (5)

Having diversity during training can provide for the emergence of more robust and adaptable systems capable of coping with a variety of environmental challenges.

To the best of our knowledge a quantitative method of measuring diversity in the context of robotic training environments is not currently available.


Introduction (6)

Toward this goal we propose using entropy based methods for measuring motivation and environmental diversity.

Using these entropy based measures we investigate the effects of environmental and motivation diversity on robotic fitness.



System Description – Robot Configuration

The robot configuration has two DC motors and eight (six front and two back) infrared proximity sensors used to detect nearby obstacles.

These sensors provide 10 bit output values (with 6% noise).

Robot navigation is performed in the YAKS simulator by providing sensor values directly into a NN (8-5-2) that drives left and right motors (with 6% noise) for 500 steps.

The robot is not constrained by its battery since 100% charge level allows more than 1000 steps.


System Description – Robot Configuration (2)

To reduce the search space of behaviors, we use a limited number of actions for the robot to execute in each step. Using AEM based encoding, four actions are used:
00: Go straight on
01: Turn 30º left
10: Turn 30º right
11: Turn 180º

In AEM, a SOM network is used to measure the robot's ability to determine which room it is navigating in (localization).

Inputs of 1000 steps for all rooms were used in training. These were presented alternately to the SOM network for 10000 iterations. The network had a linear output layer of 128 r-nodes.

The SOM neighborhood threshold was set to the 10 nearest nodes.
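As an illustration of the two-bit action encoding, the sketch below decodes a bit string into the four actions listed on this slide. The function name `decode_actions` and the choice to store a whole run as one string of 0s and 1s are assumptions for the example, not details taken from the talk.

```python
# Two-bit action codes as listed on the slide.
ACTIONS = {
    "00": "go straight on",
    "01": "turn 30 degrees left",
    "10": "turn 30 degrees right",
    "11": "turn 180 degrees",
}


def decode_actions(bits):
    """Split a bit string into two-bit codes and map each to its action."""
    if len(bits) % 2 != 0:
        raise ValueError("action string must have an even number of bits")
    return [ACTIONS[bits[i:i + 2]] for i in range(0, len(bits), 2)]


sequence = decode_actions("000111")
```

A run of 1000 steps would then correspond to a 2000-bit string under this hypothetical storage format.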


System Description – Robot Configuration (3)

In our implementation, the robot generates an internal zone map (55 x 55) in which the zones are marked with various values: obstacles are indicated with a value of -1, those not visited by the robot are marked with a 0, and the visited ones with a 1.
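A minimal sketch of such a zone map, using the three cell values from the slide. The helper names and the `explored_fraction` measure (visited zones divided by all non-obstacle zones) are assumptions added for illustration; the talk defines its exploration fitness relative to an optimum that is not specified here.

```python
OBSTACLE, UNVISITED, VISITED = -1, 0, 1


def new_zone_map(size=55):
    """Square zone map with every zone initially unvisited."""
    return [[UNVISITED] * size for _ in range(size)]


def explored_fraction(zone_map):
    """Visited zones divided by all zones that are not obstacles."""
    free = sum(cell != OBSTACLE for row in zone_map for cell in row)
    visited = sum(cell == VISITED for row in zone_map for cell in row)
    return visited / free if free else 0.0


demo = new_zone_map(4)       # small map, for illustration only
demo[0][0] = OBSTACLE
demo[1][1] = demo[1][2] = demo[2][2] = VISITED
fraction = explored_fraction(demo)   # 3 visited out of 15 free zones
```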


System Description – GA

TRAINING: A GA selects the robot with the best fuzzy fitness in each generation during training
Population size: 200
Crossover operator: Random crossover
Selection method: Elite strategy selection
Mutation rate: 1%
Generations: 90


System Description – Fuzzy Fitness

The motivation set (M) considered for robotic fitness includes: homing (m1), curiosity (m2), energy (m3), and orientation (m4)

During training, the robot performs its behavior and a set of normalized (0 - 1) fitness values (f1 - f4) corresponding to the performed task are obtained from the simulator:
proper action termination and escape from original neighborhood area (f1),
amount of area explored (f2),
percent of battery usage (f3),
environment recognition (f4).


System Description – Fuzzy Fitness (2)

When training, these fitness values are calculated after a robot completes each run:
f1 = 1 − (final distance to home/maximum distance).
f2: percentage area explored relative to the optimum.
f3: estimated percent total energy consumption considering all steps taken.
f4: determined by having the robot identify which room it is in (r-node) and comparing that with the correct one, using a previously trained SOM network.



Finally, the fuzzy fitness (F) is calculated using the fitness values provided by the simulator (f1 – f4) and the various motivation (m1 – m4) settings at the time of exploration

F is calculated based on TSK fuzzy logic

Fitness values f1 – f4 are fuzzy variables:



TSK fuzzy logic does not require defuzzification
Each rule output is a function: If x is A and y is B then fn = g(x, y)

Example:
If X is small and Y is small then f1 = -x + y + 1
If X is small and Y is large then f2 = -y + 3
If X is large and Y is small then f3 = -x + 3
If X is large and Y is large then f4 = x + y + 3

We used the min membership rule to determine the weight wn of each rule output fn

Fitness is the weighted sum of all fn: F = w1f1 + w2f2 + ... + wnfn
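The four example rules can be evaluated directly. The sketch below makes two assumptions that the slide does not state: inputs lie in 0..1 with linear memberships small(v) = 1 - v and large(v) = v, and the weighted sum is divided by the sum of the weights (the usual TSK weighted average). Each rule weight is the min of its two antecedent memberships, as on the slide.

```python
def small(v):
    return 1.0 - v   # assumed membership of v in "small", for v in 0..1


def large(v):
    return v         # assumed membership of v in "large", for v in 0..1


def tsk_output(x, y):
    """Evaluate the four example rules and combine them by weighted average."""
    rules = [
        (min(small(x), small(y)), -x + y + 1),
        (min(small(x), large(y)), -y + 3),
        (min(large(x), small(y)), -x + 3),
        (min(large(x), large(y)), x + y + 3),
    ]
    total_weight = sum(w for w, _ in rules)
    return sum(w * f for w, f in rules) / total_weight


corner = tsk_output(0.0, 0.0)   # only the small/small rule fires
middle = tsk_output(0.5, 0.5)   # all four rules fire with weight 0.5
```

No separate defuzzification step appears: each rule already yields a crisp number, and the weights only blend them.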


System Description – Fuzzy Fitness (5)

Sample: Fuzzy Fitness rules, N=5 & M=4 (5^4 = 625 Rules)

if (f1 == VL) and (f2 == VL) and (f3 == VL) and (f4 == VL) then f[1] = m1f1K[1] + m2f2K[1] + m3f3K[1] + m4f4K[1]
if (f1 == L) and (f2 == VL) and (f3 == VL) and (f4 == VL) then f[2] = m1f1K[2] + m2f2K[1] + m3f3K[1] + m4f4K[1]
...
if (f1 == H) and (f2 == L) and (f3 == VL) and (f4 == VL) then f[9] = m1f1K[4] + m2f2K[2] + m3f3K[1] + m4f4K[1]
if (f1 == VH) and (f2 == L) and (f3 == VL) and (f4 == VL) then f[10] = m1f1K[5] + m2f2K[2] + m3f3K[1] + m4f4K[1]
...
if (f1 == H) and (f2 == VH) and (f3 == VH) and (f4 == VH) then f[624] = m1f1K[4] + m2f2K[5] + m3f3K[5] + m4f4K[5]
if (f1 == VH) and (f2 == VH) and (f3 == VH) and (f4 == VH) then f[625] = m1f1K[5] + m2f2K[5] + m3f3K[5] + m4f4K[5]


Fuzzy Fitness (6)

Fuzzy Fitness:
1- Calculate μ
2- Using rule index: n
3- Calculate weight of each rule: w[n] using min μ of fitness values
4- Calculate f[n] of each rule using X, Y and K
5- Calculate F

Algorithm FuzzyFitness
Input:
  N : number of fuzzy motivations;
  M : number of membership functions per motivation;
  X[N] : array of motivation values preset;
  Y[N] : array of fitness values;
  K[N] : array of coefficients;
  μ[N][M] : matrix of membership values for each motivation’s associated fitness;
Variables:
  w[n] : the weight for each fuzzy rule being evaluated;
  f[n] : the estimated fitness;
  n, x0, x1, . . . , xN : integers;
Output:
  F : the fuzzy fitness value calculated;
begin
  n := 1;
  for each x1, x2, . . . , xN := 1 step 1 until M do
  begin
    w[n] := min{μ[1][x1], μ[2][x2], . . . , μ[N][xN]};
    f[n] := \sum_{i=1}^{N} X[i] Y[i] K[x_i];
    n := n + 1;
  end;
  F := (\sum_{i=1}^{M^N} w[i] f[i]) / (\sum_{i=1}^{M^N} w[i]);
end;
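The FuzzyFitness steps can be sketched in runnable form as follows. Several details are assumptions made for the example: the membership functions are taken to be M evenly spaced triangles on 0..1, the coefficient values in `K` are invented, and rules with zero weight are skipped because they do not change the result. The structure (min for the rule weight, sum of X[i]·Y[i]·K[x_i] for the rule output, weighted average for F) follows the slides.

```python
from itertools import product


def triangular(v, center, half_width):
    """Assumed triangular membership function."""
    return max(0.0, 1.0 - abs(v - center) / half_width)


def fuzzy_fitness(X, Y, K):
    """X: motivation settings, Y: fitness values from the run, K: coefficients."""
    N, M = len(Y), len(K)
    step = 1.0 / (M - 1)
    # mu[i][j]: membership of fitness value Y[i] in fuzzy value j.
    mu = [[triangular(y, j * step, step) for j in range(M)] for y in Y]
    numerator = denominator = 0.0
    for combo in product(range(M), repeat=N):          # one rule per combination
        w = min(mu[i][combo[i]] for i in range(N))      # rule weight (min rule)
        if w == 0.0:
            continue                                    # rule does not fire
        f = sum(X[i] * Y[i] * K[combo[i]] for i in range(N))
        numerator += w * f
        denominator += w
    return numerator / denominator if denominator else 0.0


K = [0.2, 0.4, 0.6, 0.8, 1.0]                  # hypothetical coefficients
F = fuzzy_fitness([0, 1, 0, 0], [0.5, 0.5, 0.5, 0.5], K)
```

In this call every fitness value sits exactly on the "medium" peak, so a single rule fires and F reduces to m2 · f2 · K[medium].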



Entropy Measures

We use entropy to measure motivation and environmental diversity.

Our concept of diversity follows the well established definition of entropy as a measure of the uncertainty (which generates a diversity of outcomes) in a system.


Entropy Measures – Motivation Diversity

We define a motivation set M as {m1, m2, . . ., mn}.

Toward the calculation of motivation diversity H(M), we consider the corresponding probabilities for {m1, m2, . . . , mn} as {p1, p2, . . . , pn}.

We compute the entropy of the random variable M using:

H(M) = -\sum_{i=1}^{n} p_i \log(p_i)
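A small sketch of this computation using the standard Shannon entropy. Two points are assumptions: the probabilities p_i are obtained by normalizing the motivation values so that they sum to 1, and the logarithm base defaults to e. Terms with p_i = 0 are skipped, following the usual convention that 0 · log 0 = 0.

```python
import math


def motivation_entropy(motivations, base=math.e):
    """Shannon entropy of a motivation set after normalizing it to sum to 1."""
    total = sum(motivations)
    if total <= 0:
        raise ValueError("at least one motivation must be positive")
    probabilities = [m / total for m in motivations if m > 0]
    return -sum(p * math.log(p, base) for p in probabilities)


single = motivation_entropy([0, 1, 0])            # one motivation only
even_pair = motivation_entropy([0.5, 0.5], base=2)
```

A single active motivation gives zero entropy (no diversity), while spreading the weight evenly over more motivations increases it.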


Entropy Measures – Environmental Diversity

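Environmental diversity is calculated using an extension of the local image entropy (LIE) method.

Using LIE, an estimate of the information content of a pixel (x, y), based on a histogram of the pixel values in its neighborhood (w, h), is calculated:

E_{w,h}(x, y) = -\sum_{k} p_{w,h}(k) \log p_{w,h}(k)

where p_{w,h}(k) is the relative frequency of pixel value k in the neighborhood.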


Entropy Measures – Environmental Diversity (2)

A part of an image (e.g. neighborhood or window) is interpreted as a signal of k different states with the local entropy Ew,h(x, y) determining the observer's uncertainty about the signal.

Extending this method to a complete image (e.g. environment), and to obtain a measure of its diversity, we compute the average local image entropy (ALIE) for all pixels in the room.
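The averaging step can be sketched on a small occupancy grid. This is an illustration under stated assumptions, not the authors' implementation: the room is a list of rows whose cells hold discrete values (for example 0 for free and 1 for obstacle), the window extends w // 2 and h // 2 cells on each side of the pixel and is clipped at the room border, and the entropy is measured in bits.

```python
import math
from collections import Counter


def local_entropy(grid, x, y, w, h):
    """Entropy (bits) of the cell values in the window centered on (x, y)."""
    rows, cols = len(grid), len(grid[0])
    window = [grid[j][i]
              for j in range(max(0, y - h // 2), min(rows, y + h // 2 + 1))
              for i in range(max(0, x - w // 2), min(cols, x + w // 2 + 1))]
    total = len(window)
    return -sum(c / total * math.log2(c / total)
                for c in Counter(window).values())


def average_local_entropy(grid, w, h):
    """Mean of the local entropy over every cell of the grid."""
    rows, cols = len(grid), len(grid[0])
    return sum(local_entropy(grid, x, y, w, h)
               for y in range(rows) for x in range(cols)) / (rows * cols)


empty_room = [[0, 0], [0, 0]]
mixed_room = [[0, 1], [1, 0]]
alie_empty = average_local_entropy(empty_room, 3, 3)
alie_mixed = average_local_entropy(mixed_room, 3, 3)
```

An empty room scores zero, and a room whose windows are half free and half occupied scores the maximum of 1 bit for two-valued cells.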


Entropy Measures – Environmental Diversity (3)

Following the definition of entropy, neighborhood width and height were set at twice the robot's diameter so that the robot should have close to a maximum uncertainty of traversing through the neighborhood when Ew,h(x, y) = 0.5.

Clearly, if a neighborhood is empty or full, the robot will have either no difficulty or no chance of traversing the terrain, and hence the uncertainty for that neighborhood will be zero.

[Figure: neighborhood window of width w and height h]



Experimental Evaluation

Of primary interest was to study the impact of diversity on the robot’s ability to learn behaviors.

Two types of diversity were analyzed: environmental training diversity and motivation diversity.


Evaluation – Environmental Training Diversity

We tested training in 15 different square rooms: r1 - r15.

These rooms have an increasing number of obstacles in a fairly random distribution with the condition that any one obstacle should not preclude the robot from reaching any area of the room.


Evaluation – Environmental Training Diversity (2)

Rooms layout



In order to analyze the sensitivity of the environmental diversity metric we first computed the average local image entropy of each room given different square window sizes.



ALIE of rooms r1 – r15



In order to evaluate the impact of environmental training diversity on the robot's navigation behavior, we trained the robot in each room for 90 generations.

Fuzzy motivations (m1, m2, m3, m4) were (0, 1, 0, 0).

After the training phase, the best individual from the GA population was selected to run its respective testing phase in all rooms.


Evaluation – Motivation Diversity

We used 66 sets of fuzzy motivation criteria.

Average fitness values are given for motivations (m1,m2,m3) ranging from (0, 1, 0) to (0.4, 0.3, 0.3) with values changing in increments of 0.1.

The population was trained for 90 generations in each room, and tested in all rooms using the best individual.

Average fuzzy fitness values were calculated using the various fitness values f1 − f3.




Results

Ten complete runs were performed of each experiment (each run consisting of one training and 10 test executions) and only average values are reported in our results.


Results – Environmental Diversity


Results – Motivation Diversity

[Figure: Average Fitness (0.0 to 1.6) plotted against Entropy (0.2 to 1)]




Summary and Conclusions

ALIE was shown to be an effective measure of a training environment's potential for producing highly fit robots.

The average fitness obtained during testing is clearly dependent on training room average local image entropy.

A training environment with too much environmental diversity is as unsuitable as one with not enough diversity.


Summary and Conclusions (2)

Motivation diversity results are somewhat counterintuitive: one could expect that simply diversifying motivation values would yield higher overall fitness, but this is clearly not the case due to robotic system constraints.

Fitness was generally lower with more diverse motivations, but the obtained behaviors demonstrated very good capability (e.g. room exploration, battery usage, etc.) and were in close agreement with the specified motivations.

In our experiments, environment recognition (e.g. SOM) was generally successful in determining the robot's location.


Summary and Conclusions (3)

Entropy based measures can help analyze complex systems such as robotic training environments.

Future work includes utilizing fuzzy motivations within hybrid architectures and parametric studies (e.g. linkages between motivations).

We are currently working on an implementation of these methods in physical robots.


References

1. Arkin, R.: Behavior-Based Robotics. MIT Press, Cambridge (1998)

2. Park, H., Kim, E., Kim, H.: Robot Competition Using Gesture Based Interface. LNAI, Vol. 3353. Springer-Verlag, Berlin (2005) 131-133

3. Jensen, B., Tomatis, N., Mayor, L., Drygajlo, A., Siegwart, R.: Robots Meet Humans - Interaction in Public Spaces. IEEE Transactions on Industrial Electronics, Vol. 52, No. 6 (2006) 1530-1546

4. Arredondo, T., Freund, W., Muñoz, C., Navarro, N., Quirós, F.: Fuzzy Motivations for Evolutionary Behavior Learning by a Mobile Robot. LNAI, Vol. 4031. Springer-Verlag, Berlin (2006) 462-471


References (2)

5. Huitt, W.: Motivation to learn: An overview. Educational Psychology Interactive. http://chiron.valdosta.edu/whuitt/col/motivation/motivate.html (2001)

6. Tan, K.C., Goh, C.K., Yang, Y.J., Lee, T.H.: Evolving better population distribution and exploration in evolutionary multi-objective optimization. European Journal of Operations Research 171 (2006) 463-495

7. Chalmers, D.J.: The evolution of learning: An experiment in genetic connectionism. Proceedings of the 1990 Connectionist Models Summer School. M. Kaufmann, San Mateo, CA (1990) 81-90

8. YAKS simulator website: http://www.his.se/iki/yaks


References (3)

9. Yamada, S.: Recognizing environments from action sequences using self-organizing maps. Applied Soft Computing 4 (2004) 35-47

10. Kohonen, T.: The self-organizing map. Proceedings of the IEEE 78(9) (1990) 1464-1480

11. Cover, T., Thomas, J.: Elements of Information Theory. Wiley, New York (1991)

12. Handmann, U., Kalinke, T., Tzomakas, C., Werner, M., Weelen, W.v.: An image processing system for driver assistance. Image and Vision Computing 18 (2000) 367-376

13. Arredondo, T., Freund, W., Muñoz, C.: Entropy Based Diversity Measures in Evolutionary Mobile Robot Navigation. Lecture Notes in Computer Science, Vol. 5027. Springer, Berlin (2008) 129-138


Any questions?
