AN AUTOMATIC REGROUPING MECHANISM
TO DEAL WITH STAGNATION
IN PARTICLE SWARM
OPTIMIZATION
A Thesis
by
GEORGE I. EVERS
Submitted to the Graduate School of the
University of Texas-Pan American
In partial fulfillment of the requirements for the degree of
MASTER OF SCIENCE
May 2009
Major Subject: Electrical Engineering
COPYRIGHT 2009 George Evers
All Rights Reserved
ABSTRACT
Evers, George I., An Automatic Regrouping Mechanism to Deal with Stagnation in
Particle Swarm Optimization. Master of Science (MS), May, 2009, 96 pp., 10 tables, 32
illustrations, 31 references, 6 titles.
Particle Swarm Optimization (PSO), which was intended to be a population-based global
search method, is known to suffer from premature convergence prior to discovering the
true global minimizer. In this thesis, a novel regrouping mechanism is proposed, which
aims to liberate particles from the state of premature convergence. This is done by
automatically regrouping the swarm once particles have converged to within a pre-
specified percentage of the diameter of the search space. The degree of uncertainty
inferred from the distribution of particles at premature convergence is used to determine
the magnitude of the regrouping per dimension. The resulting PSO with regrouping
(RegPSO) provides a mechanism more efficient than repeatedly restarting the search by
making good use of the state of the swarm at premature convergence. Results suggest
that RegPSO is less problem-dependent and consequently provides more consistent
performance than the comparison algorithms across the benchmark suite used for testing.
DEDICATION
The dynamics of living amongst various cultures have been fascinating,
while the constancy of family is refreshing.
Thanks for always being there!
ACKNOWLEDGEMENT
Thanks to Dr. Mounir Ben Ghalia for his guidance and
for helping me present my ideas clearly.
TABLE OF CONTENTS
Page
ABSTRACT ....................................................................................................................... iii
DEDICATION ................................................................................................................... iv
ACKNOWLEDGEMENT .................................................................................................. v
TABLE OF CONTENTS ................................................................................................... vi
LIST OF TABLES ............................................................................................................. ix
LIST OF FIGURES ............................................................................................................. x
CHAPTER I. INTRODUCTION ...................................................................................... 1
Motivation for Particle Swarm Optimization .................................................................. 1
Optimization: A Brief Overview ................................................................................. 1
Gradient-Based Methods ............................................................................................. 3
Population-Based Heuristics........................................................................................ 4
PSO as a Member of Swarm Intelligence .................................................................... 6
Research Motivation ....................................................................................................... 8
Research Objectives ...................................................................................................... 14
Empirical Determination of Quality Parameters ....................................................... 14
Development of Regrouping Mechanism for Gbest PSO ......................................... 14
Testing of RegPSO .................................................................................................... 15
Data Comparison ....................................................................................................... 15
Explore Applicability of RegPSO to Simple Uni-Modal Problems .......................... 15
Contributions ................................................................................................................. 15
High-Quality PSO Parameters ................................................................................... 15
Development of Efficient Regrouping Mechanism ................................................... 16
Development of Regrouping Model Specifically for Uni-Modal Case ..................... 16
CHAPTER II. PARTICLE SWARM OPTIMIZATION ALGORITHM ...................... 17
Problem Formulation..................................................................................................... 17
Evolution of the PSO Algorithm ................................................................................... 17
Original PSO Algorithm ............................................................................................ 17
“Lbest” PSO .............................................................................................................. 20
Inertia ......................................................................................................................... 20
Velocity Clamping ..................................................................................................... 28
Standard “Gbest” PSO .............................................................................................. 35
Illustration of Premature Convergence ......................................................................... 42
CHAPTER III. EMPIRICAL SEARCH FOR QUALITY PSO PARAMETERS .......... 48
Rastrigin Experiment Outlined...................................................................................... 48
Independent Validation of “Social Only” PSO ............................................................. 50
Socially Refined PSO .................................................................................................... 53
CHAPTER IV. REGROUPING PARTICLE SWARM OPTIMIZATION .................. 59
Motivation for Regrouping............................................................................................ 59
Detection of Premature Convergence ........................................................................... 60
Swarm Regrouping ........................................................................................................ 61
“Gbest” PSO Continues as Usual .................................................................................. 63
Two-Dimensional Demonstration of the Regrouping Mechanism ............................... 66
CHAPTER V. TESTING AND COMPARISONS ......................................................... 76
Comparison with Standard “Gbest” & “Lbest” PSO’s ................................................. 77
Comparison with Socially Refined PSO ....................................................................... 80
Comparison with MPSO ............................................................................................... 82
Comparison with OPSO ................................................................................................ 83
RegPSO for Simple Uni-Modal Problems .................................................................... 86
CHAPTER VI. CONCLUSIONS ................................................................................... 88
REFERENCES ................................................................................................................ 91
APPENDIX (BENCHMARKS) ...................................................................................... 94
BIOGRAPHICAL SKETCH ........................................................................................... 96
LIST OF TABLES
Page
Table II-1: Effect of Velocity Clamping Percentage with Static Inertia Weight .............. 31
Table II-2: Effect of Velocity Clamping with Linearly Decreased Inertia Weight .......... 33
Table III-1: “Social-only” Gbest PSO with Slightly Negative Inertia Weight ................. 52
Table III-2: “Socially Refined” PSO with Slightly Negative Inertia Weight ................... 55
Table III-3: “Socially Refined” PSO with Small, Negative Inertia Weight ..................... 57
Table V-1: RegPSO Compared to Gbest and Lbest PSO with Neighborhood Size 2 ...... 78
Table V-2: RegPSO Compared with Socially Refined PSO ............................................ 81
Table V-3: RegPSO Compared with MPSO ..................................................................... 83
Table V-4: OPSO Compared with RegPSO both with and without Cauchy Mutation .... 85
Table V-5: A RegPSO Model for Solution Refinement Rather than Exploration ............ 87
LIST OF FIGURES
Page
Figure II-1: Velocity Clamping Pseudo Code .................................................................. 30
Figure II-2: Rastrigin Benchmark Used for 2-D Illustration ............................................ 39
Figure II-3: Swarm Initialization (Iteration 0) .................................................................. 40
Figure II-4: First Velocity Updates (Iter. 1)...................................................................... 40
Figure II-5: First Position Updates (Iter. 1) ...................................................................... 41
Figure II-6: Second Velocity Updates (Iter. 2) ................................................................. 41
Figure II-7: Second Position Updates (Iter. 2) .................................................................. 42
Figure II-8: Swarm Initialization (Iter. 0) ......................................................................... 44
Figure II-9: Converging (Iter. 10) ..................................................................................... 44
Figure II-10: Exploratory Cognition and Momenta (Iter. 20) ........................................... 45
Figure II-11: Convergence Continues (Iter. 30) ............................................................... 45
Figure II-12: Momenta Wane (Iter. 40) ............................................................................ 46
Figure II-13: Premature Convergence (Iter. 102) ............................................................. 46
Figure IV-1: RegPSO pseudo code ................................................................................... 65
Figure IV-2: Swarm Regrouped (Iter. 103) ...................................................................... 66
Figure IV-3: PSO in New Search Space (Iter. 113) .......................................................... 67
Figure IV-4: Swarm Migration (Iter. 123) ........................................................................ 67
Figure IV-5: New Well Considered (Iter. 133) ................................................................. 68
Figure IV-6: Most Bests Relocated (Iter. 143) ................................................................. 68
Figure IV-7: Swarm Collapses (Iter. 153) ........................................................................ 69
Figure IV-8: Horizontal Uncertainty (Iter. 163) ............................................................... 69
Figure IV-9: Uncertainty Remains (Iter. 173) .................................................................. 70
Figure IV-10: Becoming Convinced (Iter. 183) ............................................................... 70
Figure IV-11: Premature Convergence Detected (Iter. 219) ............................................ 71
Figure IV-12: Second Regrouping (Iter. 220)................................................................... 71
Figure IV-13: Swarm Migration (Iter. 230) ...................................................................... 72
Figure IV-14: Best Well Found (Iter. 240) ....................................................................... 72
Figure IV-15: Swarm Collapsing (Iter. 250)..................................................................... 73
Figure IV-16: Particles Swarm to the Newly Found Well (Iter. 260) .............................. 73
Figure IV-17: Convergence (Iter. 270) ............................................................................. 74
Figure IV-18: Effect of Regrouping on Cost Function Value .......................................... 74
Figure V-1: Mean Behavior of RegPSO on 30D Rastrigin .............................................. 80
CHAPTER I
INTRODUCTION
Motivation for Particle Swarm Optimization
Optimization: A Brief Overview
Optimization is the search for a set of variables that either maximize or minimize a
scalar cost function, f(x). The n-dimensional decision vector, x, consists of the n
decision variables over which the decision maker has control. The cost function is
multivariate since it depends on more than one decision variable, as is common of real-
world relationships. The decision maker desires a more efficient method than trial and
error by which to obtain a quality decision vector, which is why optimization techniques
are employed.
In general, the literature focuses on minimization since the maximum of any cost
function, f(x), is mathematically equivalent to the minimum of its additive inverse,
-f(x). In other words, any scalar function to be optimized may be treated wholly as a
minimization problem due to the symmetric relationship between the cost function and its
additive inverse across the hyperplane f(x) = 0.
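As a small illustration of this equivalence, consider the following sketch. The one-dimensional cost function here is hypothetical, chosen only for this example; a coarse grid search stands in for a real optimizer.

```python
# Minimal sketch: maximizing f(x) is equivalent to minimizing -f(x).
# The cost function below is hypothetical, chosen only for illustration.

def f(x):
    return -(x - 3.0) ** 2 + 5.0  # concave; maximum value 5 at x = 3

# Brute-force search over a coarse grid (illustration only).
grid = [i * 0.01 for i in range(-1000, 1001)]

x_max = max(grid, key=f)                # maximize f directly
x_min = min(grid, key=lambda x: -f(x))  # minimize the additive inverse

assert x_max == x_min                   # both searches find the same point
assert abs(f(x_max) - 5.0) < 1e-6
```

Both searches select the same grid point, which is why the literature can restrict attention to minimization without loss of generality.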
When each decision variable is allowed to assume all real, integer, or other values
making up the n-dimensional search space, the optimization is said to be unconstrained.
If there are further limitations on the allowable values of any decision variable, the
optimization is said to be constrained. Boundary constraints, which specify a maximum
and/or minimum value for any or all decision variables, are not necessarily considered to
constitute constrained optimization, though this would literally be the case.
If the Rocky Mountain Range, with its hills and valleys, represents an
optimization function, with the goal of the optimization problem being to find the
geographical coordinates that minimize the altitude of the function, the bottom of each
valley and depression would be a local minimum in reference to the altitude, which is the
cost function’s value. The n-dimensional coordinates or decision vector at which a local
minimum occurs is called a local minimizer or local minimum point; the decision vector
to be optimized consists of longitude in the horizontal dimension and latitude in the
vertical dimension. Since the goal for this example would be to find the lowest altitude
of the mountain range, one might simply head in a downward direction from the current
location, which would lead him to a local minimum; however, one would not necessarily
have a reason to believe that location to be the global minimum. Local optimization (LO)
methods seek to find a local minimum and, more importantly, its corresponding local
minimizer, while global optimization (GO) methods attempt to find the global minimum,
or lowest function value, and its corresponding global minimizer.
An explorer could try walking all over the mountain range recording the local
minimum (as measured by altitude) at each local minimizer (as measured in two-
dimensional space by longitude and latitude) in order to find the global minimum and its
global minimizer; but this kind of exhaustive search would be quite inefficient. As a
more efficient example of global minimization, consider a team of explorers searching
for local minima independently while sharing information with each other via walkie
talkies: in this way, the team of explorers would have a much better chance to find the
global minimum quickly and efficiently since each agent would be aware of the quality of
regions in various directions.
The Rocky Mountain range, with its multiple local minima, is an example of a
multi-modal function. As an example of a uni-modal function, imagine a relatively
smooth crater on the surface of the moon. It has only one minimum, which does not
necessarily require a team of explorers to locate.
Gradient-Based Methods
Gradient-based optimization can be likened to an explorer with a device to
calculate rate of descent in any particular direction he or she might choose to explore in
order to determine the most feasible direction in which to search. This device might also
approximate second derivative or Hessian information, by which to infer the rate of
descent of the already calculated rate of descent. Using such derivative information, a
Taylor polynomial could conceivably be constructed in each direction for our explorer so
that expected hills and valleys are generated as a contour map, and the most feasible
direction can be inferred based upon all data available in the vicinity. Our explorer,
though well equipped with an expected contour map, has only information about his
greater vicinity, which does not necessarily help him navigate from his location to the
desired global minimizer if the search space is large. Regardless of how much time he
takes to calculate the curvatures in various directions from his immediate vicinity, he has
very little information of valleys far from him. He may even be deceived by local terrain
and end up at the bottom of a steep valley that, based upon all available information, is
the best minimum available but in reality is only a local minimum to which the explorer
has prematurely settled.
To escape from the state of premature convergence, one option is to simply restart
the algorithm. This is analogous to having our explorer restart his search. However,
since the path he followed was derived entirely from his instrument’s calculations based
on the terrain visible around him, restarting from the same position would
deterministically lead to premature convergence in the same region. This is because there
is no randomness or stochasticity in the algorithm to help him avoid deceit by local
terrain. Consequently, restarting a gradient-based search requires initializing explorers to
different locations each time, though they could still be deceived by some prominent
topographical feature such as a steep valley.
Population-Based Heuristics
For these reasons, when our explorer is amidst potentially deceitful multi-modal
terrain, he might benefit more from communication with other agents dispersed
elsewhere than from the solitary use of time-consuming and deterministic calculations.
In this way, the nature of the search changes from a series of highly analytical decisions
by one agent to the synchronous movements of multiple agents.
The fact that various agents are dispersed throughout the search space allows for
consideration of multiple regions simultaneously so that deceit by any one local region is
unlikely to occur unless all agents converge to the same region before reaching the global
minimizer, in which case it is said that the agents have prematurely converged. To
further hinder premature convergence, population-based approaches tend to employ some
form of stochasticity or randomness. For example, in genetic algorithms (GA) [1-3], the
decision vector or location of each agent is considered to be a sort of DNA, and
beneficial random mutations are seized upon by offspring. Randomness, in the
algorithmic world at least, is generated by a separate algorithm, a pseudo-random number
generator, which deterministically transforms each seemingly random number into the
next so that experiments can be reproduced scientifically despite the apparent
“randomness.”
In addition to offering more resistance to premature convergence than do
gradient-based methods, the computational simplicity of population-based optimization
methods allows progress to be made in a more time-efficient manner. Population-based
approaches may be able to further reduce computational complexity by lending
themselves more easily to parallel processing. For example, using one processor per
agent might allow one phase of the code to be executed in parallel; the other phase would
then extract from each agent’s memory the new function value for a simple comparison
and write the location of the best agent back to each memory location for consideration
by all other agents before re-entering the parallel phase. If this could be done, it would
reduce the computational complexity from O(s*k) to O(k), where “s” is the number of
agents employed and “k” is the expected number of iterations.
Heuristic approaches have not necessarily been proven to produce the global
minimum with every trial or to be applicable in all cases. Rather, they have been
demonstrated to work well in general.
The speed of a population-based search heuristic can be measured in iterations,
function evaluations, or real time. Since each particle evaluates its function value at each
iteration, the number of function evaluations conducted per iteration is equal to the
number of search agents. Function evaluations seem to be the most popular measure.
Real time is not generally used since the time required to run a simulation on one
computer might not equal the time required on another computer, making real-time
comparisons from paper to paper practically impossible. Furthermore, real simulation
times may vary even on one computer due to system heating and background activity or
other activity by the user.
The time required for a function evaluation, and therefore also the time required
for an iteration, which is a set of function evaluations, depend on the computational
complexity of the algorithm, which would be better reflected by a measure of real time.
However, it is not practical to ask all researchers to use the same system for comparison,
so the traditional function evaluations will be used herein. The reader is cautioned,
however, that an algorithm requiring fewer function evaluations or iterations than another
is not necessarily faster in real time if the seemingly quicker algorithm is computationally
more complex. To compare efficiencies in real time, one could call the different
algorithms according to a cleverly alternating pattern that might involve random selection
until the desired trial size had been collected for each algorithm; however, as mentioned
previously, such an approach would make for a standalone paper whose results would not
compare well with those of other authors using different computers.
PSO as a Member of Swarm Intelligence
Particle Swarm Optimization (PSO) was introduced in 1995 by social
psychologist James Kennedy and professor and chairman of electrical and computer
engineering Russell C. Eberhart to simulate the natural swarming behavior of birds as
they search for food [4]. The test function used was f(x) = (x1 - 100)^2 + (x2 - 100)^2,
which has a minimum function value of zero at Cartesian coordinates (100, 100). In
math, this would be called a three-dimensional function as it is graphed on the three-
dimensional Cartesian coordinate system; however, in optimization the focus is on the
number of dimensions in the decision vector: since there are two decision variables to be
optimized, this is referred to as a two-dimensional optimization problem. In other words,
this particular function has two decision variables, x1 and x2, to be optimized such that
the resulting decision vector, x, minimizes the cost function, f(x).
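The two-dimensional test function described above can be written out directly; this sketch follows the formula, with the function name chosen only for illustration.

```python
# Kennedy and Eberhart's "corn field" test function, written out directly
# from the formula above: f(x) = (x1 - 100)^2 + (x2 - 100)^2.

def f(x):
    return (x[0] - 100.0) ** 2 + (x[1] - 100.0) ** 2

# Two decision variables, so this is a two-dimensional optimization problem;
# the global minimizer is (100, 100) with a minimum function value of zero.
assert f([100.0, 100.0]) == 0.0
assert f([0.0, 0.0]) == 20000.0
```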
Kennedy & Eberhart considered the global minimizer of their test function as a
type of corn field and were curious to see whether the swarm of particles would
successfully flock toward the food. As the swarm flocked toward location (100, 100),
this algorithm mimicking the social interaction of swarming or schooling creatures was
verified to be an optimization algorithm. Since that time, PSO has been shown to
converge quickly relative to other population-based optimization algorithms such as GA
while still offering good solution quality [5].
Swarm intelligence is a type of multi-agent system whereby individual agents
behave according to simple rules but interact to produce a surprisingly capable collective
behavior. PSO is one form of swarm intelligence since each particle flies through the
search space by updating its individual velocity at regular intervals toward both the best
position or location it personally has found (i.e. the personal best), and toward the
globally best position found by the entire swarm (i.e. the global best). Since the function
value of each particle is evaluated at regular intervals to determine which particle offers
the lowest function value, and since that information affects the velocity, and by
implication the direction, of every other particle, an interestingly capable collective
behavior emerges. The global best, or in some forms of PSO the neighborhood best, is
stored to a memory location that all particles access and utilize to determine their
individual velocities.
individual velocities. This models the social act of communication.
While particles begin searching between each decision variable’s initial boundary
values, these are not necessarily boundary constraints since particles are generally
allowed to search outside of each decision variable’s range of values. Some forms of
PSO, however, use the initial boundary values as boundary constraints to prevent
particles from exploring outside a fixed search space. PSO has also been shown to be
applicable to constrained nonlinear optimization problems [6].
While global optimization algorithms such as PSO are most naturally applied to
the optimization of multimodal cost functions, they can optimize unimodal functions as
well.
Other examples of swarm intelligence are Ant Colony Optimization (ACO) [7]
[8] [9] and Stochastic Diffusion Search (SDS) [10].
Research Motivation
While population-based heuristics are less susceptible to deceit due to their use of
stochasticity and direct reliance upon function values rather than derivative information,
they are nonetheless susceptible to premature convergence, which is especially the case
when there are many decision variables or dimensions to be optimized. The more
communication that occurs between agents, the more similar they tend to become until
converging to the same region of the search space. In particle swarm, if the region
converged to is a local well containing a local minimum, there may initially be hope for
escape via a sort of momentum built into the algorithm via the inertial term; over time,
however, particles’ momenta decrease until the swarm settles into a state of stagnation,
from which the basic algorithm does not offer a mechanism of escape.
While allowing particles to continue in this state may lead to solution refinement
or exploitation following the initial phase of exploration, it has been observed empirically
that after enough time, velocities may become so small that at their expected rate of
decrease, even the nearest solution may be eliminated from the portion of the search
space particles can practically be expected to reach in later iterations. In traditional PSO,
when no better global best is found by any other particle for some time, all particles
converge about the existing global best, potentially eliminating even the nearest local
minimizer.
Van den Bergh appears to have solved this particular problem with his
Guaranteed Convergence PSO (GCPSO), which uses a different velocity update equation
for the best particle [11] [12]. Because the best particle’s personal best and the global
best lie at the same point, traditional PSO inhibits its explorative abilities: the particle is
so strongly pulled toward that one point that only its waning momentum and its
accelerations toward that point keep it exploring at all. GCPSO is therefore said to
guarantee convergence to a local minimizer.
There is still a problem, however, in that particles tend to converge to a local
minimizer before encountering a true global minimizer. Addressing this problem, Van
den Bergh developed multi-start PSO (MPSO) which automatically triggers a restart
when stagnation is detected. Various criteria for detecting premature convergence were
tested in order to avoid the undesirable state of stagnation [12]: (i) Maximum Swarm
Radius, which defines stagnation as having occurred when the particle with the greatest
Euclidean distance from global best reaches a minimum threshold distance, taken as a
percentage of the original swarm radius, (ii) Cluster Analysis, which terminates the
current search when a certain percentage of the swarm has converged to within a pre-
specified Euclidean distance, and (iii) Objective Function Slope, which records the
number of iterations over which no significant improvement has been seen in the function
value returned by the global best, and terminates the current search when that number
reaches a pre-specified maximum. The first two criteria monitor the proximity of
particles to one another, and the latter monitors whether improvement has been seen
recently in the function value being optimized. Which is better seems to depend on
which is the cause of the problem: (i) proximity of particles to one another making
exploration unlikely to impossible, or (ii) function value not improving over time. Since
the former seems to be the cause of the latter, measuring particles’ proximities directly
seems like the better idea, which is consistent with the fact that Van den Bergh found the
Maximum Swarm Radius and Cluster Analysis methods to outperform the Objective
Function Slope method.
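The Maximum Swarm Radius criterion can be sketched as below. The function name and the threshold value are illustrative; the specific normalization (search-space diameter versus original swarm radius) and threshold used in this thesis are developed in later chapters.

```python
import math

# Sketch of the Maximum Swarm Radius stagnation test described above:
# premature convergence is declared when even the particle farthest from
# the global best lies within a small fraction of the search-space diameter.
# The threshold value is illustrative only.

def premature_convergence(positions, gbest, diameter, threshold=1.1e-4):
    radius = max(
        math.dist(x, gbest)  # Euclidean distance of each particle from gbest
        for x in positions
    )
    return radius / diameter < threshold
```

A tightly clustered swarm triggers the test; a dispersed one does not, which is what makes proximity a direct measure of the problem rather than a symptom of it.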
Restarting in MPSO refers to starting the search anew with a different sequence of
random numbers generated so that even initial positions are different than they were in
previous searches. At restart, particles lose their memories of the previous search so that
each search is independent of those previously conducted. After each independent
search, the global best is compared to the best global best of previous searches. After a
pre-specified number of restarts have completed, the best of all global bests is proposed
as the most desirable decision vector found over all searches.
Following this logic, one wonders if there might be a more efficient mechanism
by which the swarm could “restart.” It was thought that restarting on the original search
space might cause unnecessarily repetitious searching of regions not expected to contain
quality solutions. GCPSO might even allow the swarm to escape local optima if
parameters were designed with exploratory intentions, but this approach would
effectively leave the rest of the swarm trailing almost linearly behind the globally best
particle’s random movements, which would not be ideal. So a mechanism became
desirable by which the swarm could efficiently regroup in a region small enough to avoid
unnecessarily redundant search, yet large enough to escape wells containing local minima
in order to try to prevent stagnation while retaining memory of only one global best
rather than a history of the best of them. Consequently, there is one continuous search
with each grouping making use of previous information rather than a series of
independent searches.
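The regrouping idea motivated here can be sketched as follows. The exact per-dimension range formula and regrouping factor used by RegPSO are developed in Chapter IV; the function name and the factor value below are illustrative assumptions for this sketch.

```python
import random

# A rough sketch of regrouping: particles are re-scattered uniformly about
# the global best within a per-dimension range scaled from the swarm's
# spread at premature convergence, capped by the original search range.
# The regrouping factor shown is an illustrative placeholder.

def regroup(positions, gbest, original_range, factor=6.0 / (5.0 * 1.1e-4)):
    n = len(gbest)
    # per-dimension uncertainty: maximum deviation of any particle from gbest
    spread = [max(abs(x[j] - gbest[j]) for x in positions) for j in range(n)]
    # new range: scaled uncertainty, never larger than the original range
    new_range = [min(original_range[j], factor * spread[j]) for j in range(n)]
    # regroup particles uniformly centered on the global best
    return [[gbest[j] + new_range[j] * (random.random() - 0.5) for j in range(n)]
            for _ in positions]
```

Because the new range is derived from the swarm's own distribution at collapse, each regrouping reuses the information gathered so far rather than discarding it as a restart would.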
In 1995, James Kennedy and Russell C. Eberhart observed that if each particle is
drawn toward its neighborhood best or local best instead of directly toward the global
best of the entire swarm, particles are less likely to get stuck in local optima [13].
Neighborhoods in this Lbest PSO overlap so that information about the global best is still
transmitted throughout the swarm but more slowly so that more exploration is likely to
occur before convergence, reducing the likelihood of premature convergence. The PSO
literature seems to have focused primarily on global best PSO (Gbest PSO) due to its
relatively quick initial convergence; however, hasty decisions may be of lower quality
than those made after due consideration, and Lbest PSO appears generally to produce
higher quality solutions if given enough time to do so. Since Gbest PSO is more popular,
it is often referred to simply as PSO; however, Lbest PSO should not be overlooked.
Lbest PSO still suffers from premature convergence in some cases, as demonstrated
rather severely on the Rastrigin benchmark, on which the standard Gbest PSO also
suffers.
Wang et al. applied an opposition-based learning scheme to PSO (OPSO) along
with a Cauchy mutation of the global best so that particles are less likely to be attracted to
the same position [14]. The main objective of OPSO with Cauchy mutation is to help
avoid premature convergence on multi-modal functions. Using opposition-based learning,
two different positions are evaluated for each selected particle: the particle’s own
position and the position opposite the center of the swarm. Only for a particle lying
exactly at the center of the swarm are these two positions the same.
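The reflection step can be sketched in a few lines. This is a minimal illustration of the opposite-position idea, not the authors’ implementation; the function name and the list-based representation are assumptions:

```python
def opposite_position(x, swarm):
    """Reflect position x through the centroid (center) of the swarm."""
    n = len(x)
    center = [sum(p[j] for p in swarm) / len(swarm) for j in range(n)]
    return [2.0 * c - xj for c, xj in zip(center, x)]

# With the swarm centered at (2, 2), the opposite of (0, 2) is (4, 2);
# a particle sitting exactly at the center is its own opposite.
swarm = [[0.0, 2.0], [2.0, 0.0], [4.0, 4.0]]
print(opposite_position([0.0, 2.0], swarm))  # [4.0, 2.0]
```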
Worasucheep proposed a PSO with stagnation detection and dispersion (PSO-DD) that
detects stagnation by monitoring changes in mean velocity and best function
value, reinvigorates the swarm with velocities one hundred times larger than their levels
at stagnation, and disperses particles by up to one-tenth of one percent of the range on
each dimension [15]. In this way, diversity is infused back into the system so that the
search can continue rather than restarting and searching anew. While this idea improves
performance on some benchmarks, performance suffers considerably on the Rosenbrock
benchmark.
Balancing between the explorative tendencies of Lbest PSO and the quick
convergence of Gbest PSO, Parsopoulos and Vrahatis with their Unified PSO (UPSO)
iteratively take a weighted average of the velocities proposed by each [16, 17]. In this
way, each particle has available for consideration its personal best, its neighborhood best,
and the swarm’s global best. Particles can consequently be thought of as being more
informed. However, it may be redundant for the personal best to be considered in both
the Gbest and Lbest velocities before weighting, which could conceivably cause it to be
over-represented unless its cognitive acceleration coefficient is decreased to account for this.
Rather than averaging together the two algorithms, it might be computationally simpler to
give the velocity update equation direct access to all three bests. UPSO or some variant,
due to its incorporation of Lbest PSO, may be able to reduce the effect of premature
convergence, but published data has so far focused on the number of iterations necessary
to converge to a pre-specified solution quality and on the relative performance of UPSO,
rather than on the absolute performance of the algorithm, which would indicate how well
it avoids premature convergence in approximating the global minimizer and would
facilitate comparison with other published results.
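The unification step amounts to a per-dimension weighted average; a minimal sketch, where the unification factor u and the function name are illustrative rather than notation taken from [16, 17]:

```python
def unified_velocity(v_gbest, v_lbest, u):
    """Weighted average of the velocities proposed by Gbest and Lbest PSO:
    u = 1 recovers pure Gbest behavior, u = 0 pure Lbest behavior."""
    return [u * vg + (1.0 - u) * vl for vg, vl in zip(v_gbest, v_lbest)]

# Halfway between a Gbest-proposed and an Lbest-proposed velocity:
print(unified_velocity([2.0, 4.0], [0.0, 2.0], 0.5))  # [1.0, 3.0]
```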
Once the swarm has converged prematurely, there are at least five options: (i)
terminate the search and accept the best decision vector found as the proposed solution,
(ii) allow the search to continue and hope that the swarm will slowly refine the quality of
the proposed solution, though it is likely only an approximation of a local minimizer
rather than the desired global minimizer, (iii) restart the swarm from new locations and
search again to see if a better solution can be found as in MPSO, (iv) somehow flag
regions of the space to which particles have prematurely converged as already explored
and restart the algorithm so that each successive search is more likely to encounter the
global minimizer, or (v) reinvigorate the swarm by introducing diversity so the search can
continue more or less from the current location without having to restart and re-search
low quality regions of the search space.
Binkley and Hagiwara’s velocity-based reinitialization (VBR) shares with Van
den Bergh’s MPSO the idea of maintaining a list of global bests at stagnation. Rather
than restarting on the entire search space, however, the swarm is reinvigorated by
reinitializing velocities, which seems to be more efficient since the entire search space
does not necessarily need to be searched again. At the end of the search, the best of all
global bests is returned as the optimal value. Stagnation here is defined as the median
velocity dropping below a pre-specified threshold. The relatively difficult Rastrigin and
Rosenbrock benchmarks still present difficulty for the algorithm when the search space
consists of many dimensions [16], which is the case of primary concern in this thesis.
Research Objectives
Empirical Determination of Quality Parameters
This research firstly searches for high-quality parameters capable of performing
well in general to see how effectively proper parameter selection can prevent stagnation.
If these parameters are of high enough quality, the stagnation problem will be considered
solved. Otherwise, the regrouping concept will be developed and tested using these
parameters as a basis for comparison.
Development of Regrouping Mechanism for Gbest PSO
The second task is to develop a regrouping mechanism to liberate particles from
entrapping local wells or otherwise deceptive terrain in order to allow continued progress.
The resulting algorithm is called Regrouping PSO (RegPSO).
Testing of RegPSO
Testing will be conducted on a benchmark suite consisting of common uni-modal
and multi-modal problems of varying levels of difficulty, including the incorporation of
noise.
Data Comparison
The results of testing will be compared with (a) Gbest PSO and Lbest PSO using
somewhat standard parameters found to work well, (b) the high-quality empirically
determined parameters for Gbest PSO, and (c) other approaches that have been developed
for solving the stagnation problem.
Explore Applicability of RegPSO to Simple Uni-Modal Problems
The idea of regrouping is to help the swarm escape from the state of premature
convergence, which is primarily troublesome on multi-modal problems. However, the
potential applicability of the concept to the simple uni-modal case will be explored as
well.
Contributions
High-Quality PSO Parameters
Many parameter combinations will be tested in order to find a combination that
works well across the benchmark suite in conjunction with Gbest PSO. The resulting
parameters will serve not only as a comparison basis for RegPSO but also as a good means to
delay stagnation for applications which do not allow sufficient time for regrouping to
take effect.
Development of Efficient Regrouping Mechanism
A regrouping mechanism is developed by which to liberate particles from the
state of premature convergence so that exploration can continue. This regrouping
mechanism will make use of the state of the swarm when premature convergence is
detected in order to re-organize the swarm according to information inferred from the
swarm state. The regrouping mechanism should work better than simply restarting on the
same search space repeatedly and should still be applicable to a variety of problem types.
Development of Regrouping Model Specifically for Uni-Modal Case
Whereas the previous contribution is expected to be useful on multi-modal
functions due to its exploratory intentions, it is desirable to show the applicability of the
same mechanism to the uni-modal case. It will be shown that RegPSO can have
parameters selected so as to regroup in a tiny region in order to help particles refine
solution quality or “exploit” the proposed solution.
CHAPTER II
PARTICLE SWARM OPTIMIZATION ALGORITHM
Problem Formulation
The goal of any optimization problem is to maximize or minimize an objective
function f(x), where x is the decision vector consisting of n dimensions, or decision
variables, each a real number. Since maximization of any function f(x) is equivalent to
minimization of −f(x), the literature generally focuses on minimization without loss of
generality.

Solution x* is a global minimizer of f(x) if and only if f(x*) ≤ f(x) for all x in the
domain of f(x). The unconstrained minimization problem of consideration here can be
formulated as

    minimize f(x), where f: ℝⁿ → ℝ.    (2.1)
Evolution of the PSO Algorithm
Original PSO Algorithm
The basic idea of particles searching individually while communicating with each
other concerning the global best, in order to produce a more capable collective search,
applies to all forms of PSO, from the originally conceived algorithm through the more
capable models available today.
Particle swarm, as originally published [4], consisted of a swarm of particles each
moving or flying through the search space according to velocity update equation

    v_i(k+1) = v_i(k) + c_1 r_{1i}(k) ∘ [p_i(k) − x_i(k)] + c_2 r_{2i}(k) ∘ [g(k) − x_i(k)]    (2.2)

where
    v_i(k) is the velocity vector of particle i at iteration k,
    x_i(k) is the position vector of particle i at iteration k,
    p_i(k) is the n-dimensional personal best of particle i found from initialization
    through iteration k,
    g(k) is the n-dimensional global best of the swarm found from initialization
    through iteration k,
    c_1 is the cognitive acceleration coefficient, so named for its term’s use of the
    personal best, which can be thought of as a cognitive process whereby a particle
    remembers the best location it has encountered and tends to return to that state,
    c_2 is the social acceleration coefficient, so named for its term’s use of the global
    best, which attracts all particles, simulating social communication,
    r_{1i}(k) and r_{2i}(k) are vectors of pseudo-random numbers with components
    selected from uniform distribution U(0,1) at iteration k, and
    ∘ is the Hadamard operator representing element-wise multiplication.
The farther a particle is from its personal best, the larger ‖p_i − x_i‖ is and the
stronger the acceleration toward that point is expected to be. Notice that if a particular
dimension of the current position is greater than the same dimension of the personal best,
the acceleration on that dimension is negative, which means that the particle is pulled
back toward that location on that dimension. Of course, this implies that when the
personal best lies ahead of the current position, the particle will accelerate in the positive
direction toward the personal best, so that each particle is always pulled toward its
personal best on each dimension. Similarly, the farther a particle is from the global best,
the larger ‖g − x_i‖ is and the stronger the acceleration toward that point. The cognitive
and social acceleration coefficients, c_1 and c_2, determine the respective strengths of those
pulls and the relative importance of each best.
When each dimension of the social and cognitive terms is multiplied by a different
random number, the acceleration is not necessarily directed straight toward the
corresponding best. Were the same random number used on all dimensions, each pull
would be straight toward its best. Either way, particles are accelerated in two different
directions at once, so that they do not actually move straight toward either best.
At each iteration, the previous velocity is altered by both accelerations (and, in the
later variants discussed below, scaled by an inertia weight) in order to produce the
velocity of the next iteration.
Treating each iteration as a unit time step, a position update equation can be stated
as

    x_i(k+1) = x_i(k) + v_i(k+1)    for i = 1, 2, …, s.    (2.3)
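Equations (2.2) and (2.3) can be sketched for a single particle as follows. This is an illustrative implementation with assumed names, drawing an independent U(0,1) value per dimension for the Hadamard products:

```python
import random

def pso_step(x, v, p, g, c1, c2):
    """One update of velocity (2.2) and position (2.3) for one particle.
    x: current position, v: velocity, p: personal best, g: global best."""
    n = len(x)
    r1 = [random.random() for _ in range(n)]  # components of r_1i(k) ~ U(0,1)
    r2 = [random.random() for _ in range(n)]  # components of r_2i(k) ~ U(0,1)
    v_new = [v[j] + c1 * r1[j] * (p[j] - x[j]) + c2 * r2[j] * (g[j] - x[j])
             for j in range(n)]
    x_new = [x[j] + v_new[j] for j in range(n)]  # unit time step: x(k+1) = x(k) + v(k+1)
    return x_new, v_new
```

Note that when a particle sits exactly at both bests, the cognitive and social terms vanish regardless of the random draws, and the particle simply coasts on its previous velocity.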
“Lbest” PSO
Though Eberhart and Kennedy published the Lbest version the same year as the
Gbest version [13], it was Gbest PSO that gained prominence – apparently for its quick
initial convergence [5]. The only difference between the two is that the velocity update
equation of Lbest PSO uses a neighborhood best rather than the global best as explained
in the research motivation section. “Lbest” PSO often outperforms Gbest PSO, as
demonstrated in Table V-1, since hasty decisions often compromise solution quality
when taking more time would be practical; for real-time implementations or cases of
limited available data, however, the ability to make quick decisions, even if imperfect,
becomes valuable, so Gbest PSO may be better suited to such applications.
The velocity update equation of Lbest PSO can be formulated in vector notation as

    v_i(k+1) = v_i(k) + c_1 r_{1i}(k) ∘ [p_i(k) − x_i(k)] + c_2 r_{2i}(k) ∘ [l_i(k) − x_i(k)]    (2.4)

where l_i(k) is the local or neighborhood best of particle i at iteration k.
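One common way to realize overlapping neighborhoods is a ring topology, in which particle i’s neighborhood is itself plus its index-adjacent particles. The sketch below is illustrative; the thesis does not specify this exact scheme here:

```python
def neighborhood_best(pbests, pbest_values, i, radius=1):
    """Return l_i: the best personal best among particles whose indices
    lie within `radius` of i on a ring (indices wrap around)."""
    s = len(pbests)
    neighbors = [(i + d) % s for d in range(-radius, radius + 1)]
    j_best = min(neighbors, key=lambda j: pbest_values[j])  # minimization
    return pbests[j_best]
```

Because neighborhoods overlap, a good position found by one particle spreads around the ring a few particles per iteration rather than instantly, which is what slows the transmission of the global best relative to Gbest PSO.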
Inertia
Static Inertia Weight & Constriction Coefficient
There was a weakness inherent in velocity update equations (2.2) and (2.4) that
was fixed by the introduction of an inertia weight. For the following derivation, let k = 0
be the iteration at which particles have their positions and, optionally, their velocities
randomly initialized. Then for any particle i, the velocity at iteration k = 1 is

    v_i(1) = v_i(0) + c_1 r_{1i}(0) ∘ [p_i(0) − x_i(0)] + c_2 r_{2i}(0) ∘ [g(0) − x_i(0)].    (2.5)
Since a particle has only one position, x_i(0), from which to choose in order to determine
its personal best, p_i(0), of necessity p_i(0) = x_i(0), and the middle term of equation (2.5)
is zero, so the particle’s velocity at iteration k = 1 can more succinctly be expressed as

    v_i(1) = v_i(0) + c_2 r_{2i}(0) ∘ [g(0) − x_i(0)].    (2.6)
Using (2.2) again, the velocity of particle i at iteration k = 2 is

    v_i(2) = v_i(1) + c_1 r_{1i}(1) ∘ [p_i(1) − x_i(1)] + c_2 r_{2i}(1) ∘ [g(1) − x_i(1)].    (2.7)
Substituting the value found in (2.6) for v_i(1), the velocity at the second iteration
following initialization becomes

    v_i(2) = v_i(0) + c_2 r_{2i}(0) ∘ [g(0) − x_i(0)]
             + c_1 r_{1i}(1) ∘ [p_i(1) − x_i(1)] + c_2 r_{2i}(1) ∘ [g(1) − x_i(1)]    (2.8)
with the substituted values written first for emphasis. By velocity update equation (2.2),
the velocity of particle i at iteration k = 3 is

    v_i(3) = v_i(2) + c_1 r_{1i}(2) ∘ [p_i(2) − x_i(2)] + c_2 r_{2i}(2) ∘ [g(2) − x_i(2)].    (2.9)
Substituting for v_i(2) the value found in (2.8), the velocity at the third iteration
following initialization becomes

    v_i(3) = v_i(0) + c_2 r_{2i}(0) ∘ [g(0) − x_i(0)]
             + c_1 r_{1i}(1) ∘ [p_i(1) − x_i(1)] + c_2 r_{2i}(1) ∘ [g(1) − x_i(1)]
             + c_1 r_{1i}(2) ∘ [p_i(2) − x_i(2)] + c_2 r_{2i}(2) ∘ [g(2) − x_i(2)]    (2.10)

with the substituted values written first for emphasis.
By mathematical induction, it can be seen that

    v_i(k+1) = v_i(0) + c_1 Σ_{a=1}^{k} r_{1i}(a) ∘ [p_i(a) − x_i(a)]
                      + c_2 Σ_{a=0}^{k} r_{2i}(a) ∘ [g(a) − x_i(a)].    (2.11)
Because the personal bests and global best can only improve over time, v_i(k+1) should
rely more heavily upon recent bests than upon early values. Yet (2.11) shows that the
early information in p_i(a) and g(a), for a much smaller than k, is given just as much
opportunity to affect v_i(k+1) as the higher quality information of later iterations, since
the information of all iterations is summed without any weighting scheme by which to
increase the relative importance of the higher quality information of later iterations.
This problem is remedied by introducing either an inertia weight [17, 18], ω ∈ (0,1), or
constriction coefficient [19], χ ∈ (0,1), into velocity update equation (2.2) according to

    v_i(k+1) = ω v_i(k) + c_1 r_{1i}(k) ∘ [p_i(k) − x_i(k)] + c_2 r_{2i}(k) ∘ [g(k) − x_i(k)]    (2.12)

or

    v_i(k+1) = χ (v_i(k) + c_1 r_{1i}(k) ∘ [p_i(k) − x_i(k)] + c_2 r_{2i}(k) ∘ [g(k) − x_i(k)]).    (2.13)
Equation (2.13), which is from Clerc’s constriction models, can be rewritten as

    v_i(k+1) = χ v_i(k) + χ c_1 r_{1i}(k) ∘ [p_i(k) − x_i(k)] + χ c_2 r_{2i}(k) ∘ [g(k) − x_i(k)],    (2.14)

which then simplifies to

    v_i(k+1) = χ v_i(k) + c_3 r_{1i}(k) ∘ [p_i(k) − x_i(k)] + c_4 r_{2i}(k) ∘ [g(k) − x_i(k)]    (2.15)

where

    c_3 = χ c_1 and c_4 = χ c_2.    (2.16)

Since (2.15) is mathematically equivalent to (2.12) as a result of the acceleration
coefficients being set arbitrarily by the user prior to execution, converting between the
velocity update equation of the constriction coefficient models (2.13) and the standard
velocity update equation with inertia weight (2.12) is straightforward using (2.16).
However, the mathematical equivalence of the velocity update equations does not render
the constriction models mathematically equivalent to standard PSO since the position
updates for the former vary by type such that the velocity vector in those models does not
simply carry particles from their previous positions to their new positions. In this sense,
the velocity concept is redefined by the constriction models.
The constriction models are used in conjunction with Clerc’s equation

    χ = 2κ / |2 − φ − √(φ² − 4φ)|,  where φ = c_1 + c_2,  φ > 4,  κ ∈ [0,1],    (2.17)

which recommends a value for the constriction coefficient based on preselected values of
the acceleration coefficients, where smaller values of κ lead to quick convergence and
larger values allow more exploration.
Equation (2.17) is based on theoretical studies of particle trajectories, but since it
was hoped that the constriction models would eliminate the need for velocity clamping
[19], the calculations that led to (2.17) did not account for the velocity clamping value,
though it affects particles’ trajectories. This is unfortunately somewhat of a weakness in
the model: empirical testing suggests that all PSO parameters are inter-related, so (2.17)
would be more useful if it accounted for the velocity clamping value, which has
continued to be beneficial, as discussed in the following section.
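As a numeric check of (2.16) and (2.17), the commonly used c_1 = c_2 = 2.05 with κ = 1 yields χ ≈ 0.72984 and equivalent acceleration coefficients of ≈ 1.49618, matching the parameters listed for Table II-1. The function names below are illustrative:

```python
import math

def constriction_coefficient(c1, c2, kappa=1.0):
    """Clerc's constriction coefficient, equation (2.17); valid for
    phi = c1 + c2 > 4 and kappa in [0, 1]."""
    phi = c1 + c2
    if phi <= 4.0:
        raise ValueError("(2.17) assumes phi = c1 + c2 > 4")
    return 2.0 * kappa / abs(2.0 - phi - math.sqrt(phi * phi - 4.0 * phi))

chi = constriction_coefficient(2.05, 2.05)   # ≈ 0.72984
c3, c4 = chi * 2.05, chi * 2.05              # (2.16): ≈ 1.49618 each
print(round(chi, 5), round(c3, 5))           # 0.72984 1.49618
```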
The same process that led to (2.11), beginning instead with velocity update equation
(2.12) with inertia weight, leads to

    v_i(k+1) = ω^{k+1} v_i(0) + c_1 Σ_{a=1}^{k} ω^{k−a} r_{1i}(a) ∘ [p_i(a) − x_i(a)]
                              + c_2 Σ_{a=0}^{k} ω^{k−a} r_{2i}(a) ∘ [g(a) − x_i(a)].    (2.18)
So long as the inertia weight has a magnitude less than one, (2.18) shows that past
personal bests are expected to have less effect on a particle’s velocity at iteration k+1
than more recent personal bests, due to the effect of multiplication at each iteration by the
inertia weight, ω. This makes sense conceptually since recent bests, both global and
personal, are expected to be of higher quality than past bests. However, past bests could
still have more effect on a particle’s overall velocity than recent bests for a while at the
beginning of the search, since ‖g(a) − x_i(a)‖, at least, is generally more significant in early
iterations when the swarm is more spread out.
Additionally, a particle’s initial velocity, which is not derived from any
information but randomly initialized to lie between the upper and lower velocity
clamping values, has less effect over time. This too makes sense, because its main
benefit comes in early iterations, where it provides momentum by which to propel the
best particle; after some time it effectively becomes noise diluting actual information.
Setting ω = 1 would make velocity update equation (2.12) with inertia weight
equivalent to velocity update equation (2.2) without inertia weight, so that (2.12) can be
accepted without a rigorous proof demonstrating its superiority to (2.2), since it simply
provides more options. So long as ω ∈ (0,1), velocity update (2.12) helps particles forget
their lower-quality past positions in order to be more affected by the higher-quality
information of late, which seems to make more sense conceptually.
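The forgetting effect in (2.18) is easy to see numerically: the social or cognitive information from iteration a is scaled by ω^(k−a), so older terms carry geometrically smaller weights. A small illustration using the static weight listed for Table II-1:

```python
def information_weights(omega, k):
    """Weight omega**(k - a) that (2.18) applies to the term contributed
    at iteration a when forming the velocity v(k+1)."""
    return [omega ** (k - a) for a in range(k + 1)]

w = information_weights(0.72984, 10)
# The most recent term (a = k) keeps full weight 1, while the term from
# iteration a = 0 has already decayed to roughly 0.043.
print(round(w[0], 3), w[-1])  # 0.043 1.0
```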
Time-Varying Inertia Weight
Decreasing the inertia weight over time would still allow the swarm to gradually
forget early information of relatively low quality, as in the static case, due to the iterative
multiplication of all past information by a fraction of one as in equation (2.18). For the
decreasing weight, however, information is forgotten more quickly than were the initial
value held constant. This time-decreasing weighting of information may provide more
balance between early and recent information since early information is forgotten at a
slower rate than later information due to the use of relatively large weights early in the
simulation. In other words, all memory is adversely affected, but short-term memory is
affected most. This potentially more balanced weighting of early information with late
information might help the standard algorithm postpone premature convergence to
candidate solutions of later iterations when appropriate initial and final values are used.
The decreasing inertia weight also allows early weights to be larger than were a
static weight used throughout the search. This corresponds to larger velocities early in
the search than would otherwise be seen, which may help postpone premature
convergence by facilitating exploration early in the search. The rate of decrease from
initial weight to final weight depends on the expected length of the simulation since the
step size is a fraction of the total number of iterations expected; hence, the amount of
time spent in the relatively explorative phase, as determined by the amount of time for
which the decreasing weight is larger than the value that would have been used for a
static weight, also depends on the expected length of the simulation. Table II-2 shows
data generated by decreasing the inertia weight gradually from 0.9 to 0.4 over the course
of 800,000 function evaluations.
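The decreasing schedule described above can be sketched as a linear interpolation over the expected length of the run. This is an illustrative helper; whether the weight is stepped per iteration or per function evaluation is an implementation choice:

```python
def linearly_decreasing_weight(k, k_max, w_start=0.9, w_end=0.4):
    """Linearly interpolate the inertia weight from w_start down to w_end
    over the expected run length of k_max iterations."""
    return w_start - (w_start - w_end) * (k / k_max)

# Start, midpoint, and end of a 1,000-iteration run:
print(linearly_decreasing_weight(0, 1000),
      linearly_decreasing_weight(500, 1000),
      linearly_decreasing_weight(1000, 1000))
```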
Increasing the inertia weight, on the other hand, would cause past information to
be forgotten more rapidly than recent information due to the weighting distribution, thus
tremendously increasing the importance of the higher quality information of later
iterations. For the right range of values, this could conceptually lead to quicker initial
convergence due to less diversity being maintained; however, this could adversely affect
solution quality on difficult functions by upsetting the balance between exploration and
exploitation.
Quick convergence is desirable when successfully converging to a global
minimizer, but it is undesirable when the search is so hasty as to converge prematurely to
a local minimizer. There is a delicate balance to achieve in order to search efficiently yet
thoroughly. The time-varying weight attempts to improve that balance as inferred from
equation (2.18), which shows that at any iteration a particle’s velocity vector is the result
of weighted attractions toward past information, which frames a time-varying inertia
weight as affecting the balance between the rates of short-term and long-term
forgetfulness.
The first study to vary the inertia weight decreased it with the idea that this would
help particles converge upon and refine a solution by reducing velocities over time. This
appeared to work better over the thirty trials conducted [17]; but with only one
benchmark tested, it is conceivable that this might have been a characteristic of that
particular benchmark, which would be consistent with the findings of Meissner et al. [20],
who used particle swarm to optimize its own parameters with very different parameters
being proposed per benchmark – including an increasing inertia weight on some
benchmarks and a decreasing weight on others. Since that experiment used Gbest PSO,
which tends to stagnate before reaching a global minimizer, the parameter combinations
recommended are likely not ideal, though they may be approximations of quality local
minimizers.
Whereas [20] found an increasing weight to outperform on some benchmarks,
[21] suggested that an increasing inertia weight outperformed on all benchmarks tested;
however, a different formulation of PSO was used so that the quicker convergence
claimed could not be attributed to the increasing weight alone. In an attempt to reproduce
the results of [21] using standard Gbest PSO, increasing the inertia weight from 0.4 to 0.9
with the same swarm size of 40 particles, acceleration constants 1.49618, and 1,000
iterations as used in the paper resulted in worse performance on all nine benchmarks
relative to decreasing the weight from 0.9 to 0.4. Therefore, decreasing the weight
appears better than increasing it, at least for the range between 0.9 and 0.4. When the
static weight was compared to decreasing, however, only the Ackley and Rastrigin
benchmarks saw much improvement from decreasing the weight; and performance on
Rosenbrock suffered from the decrease, so that decreasing the inertia weight is not
always best as can be seen by comparing the data of Table II-2 with that of Table II-1.
It is noteworthy that Naka and Fukuyama showed a decrease from 0.9 to 0.4 to
considerably outperform decreases from 2.0 to 0.9 and from 2.0 to 0.4 on their particular
state estimation problem [22], but they did not generate any comparison data using the
static inertia weight. Table II-1 and Table II-2 compare the performance of static and
decreasing inertia weights on some popular benchmark problems.
Velocity Clamping
Eberhart and Kennedy introduced velocity clamping, which helps particles take
reasonably sized steps in order to comb through the search space rather than bouncing
about excessively [13]. Clerc had hoped to alleviate the need for velocity clamping with
his constriction models [19]. Eberhart, however, showed clamping to improve
performance even when parameters are selected according to a simplified constriction
model (2.17) [18]. Clerc then compared equation (2.13) with velocity clamping to his
other constriction models without velocity clamping and concurred that velocity
clamping does offer considerable improvement even when parameters are selected
according to (2.17), so that the constriction models have not eliminated the benefit of
velocity clamping [19]. Consequently, velocity clamping has become a standard feature
of PSO.
Velocity clamping is done by first calculating the range of the search space on
each dimension, which is done by subtracting the lower bound from the upper bound.
For example, if each dimension of the search space is defined by lower and upper bounds
[−100, 100], the range of the search space is 200 per dimension. Velocities are then
clamped to a percentage of that range according to

    v_j^max = λ · range_j(Ω),  λ ∈ (0,1]    (2.19)

where

    range_j(Ω) = x_j^U − x_j^L,  for j = 1, 2, …, n,    (2.20)

and search space Ω is defined in (2.21).
For the commonly used clamping value of λ = 0.5, if the center of the search space
lies at the origin of Euclidean space, the maximum velocity is simply the upper bound of
the search space; for example, a search space defined by [−100, 100] on any dimension
would have v_j^max = 100. However, it is not desirable to define maximum velocity
explicitly in terms of the upper bound of the search space since this assumes that the
origin of Euclidean space will always be either the center or lower bound of the intended
search space so that the upper bound is proportional to the range to be explored – neither
of which is necessarily a valid assumption since some application problems have decision
variables, such as length, defined only for positive values, the lower bound of which may
not even be zero. Consequently, it is preferable to define the maximum velocity more
generally in terms of the range of the search space as in (2.19). If, for example, particles
are initialized on [100, 300] for any particular decision variable, they should logically
have the same maximum velocity as if they were initialized on [-100, 100], since the
same distance is expected to be traversed in either case.
The same maximum velocity should be applied in both the positive and negative
directions in order to avoid biasing the search in either the positive or negative direction.
The following pseudo code shows how velocities proposed by velocity update equation
(2.12) are clamped prior to usage in position update equation (2.3).
    if v_ij(k+1) > +v_j^max
        v_ij(k+1) = +v_j^max
    else if v_ij(k+1) < −v_j^max
        v_ij(k+1) = −v_j^max
    end if

Figure II-1: Velocity Clamping Pseudo Code
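The pseudo code of Figure II-1, together with (2.19), can be sketched in Python as follows (an illustration; the names are assumptions):

```python
def clamp_velocity(v, v_max):
    """Clamp each component of v to [-v_max[j], +v_max[j]] per Figure II-1;
    v_max[j] is lambda times the range of dimension j, per (2.19)."""
    return [max(-vm, min(vm, vj)) for vj, vm in zip(v, v_max)]

# lambda = 0.15 on a search space of [-100, 100] per dimension: range = 200,
# so v_max = 30 on each dimension.
v_max = [0.15 * 200.0] * 3
print(clamp_velocity([45.0, -12.0, -80.0], v_max))  # [30.0, -12.0, -30.0]
```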
As noted by Engelbrecht [23], clamping a particle’s velocity changes not only the
step size, but usually also the particle’s direction since changing any component of a
vector changes that vector’s direction unless each component should happen to be
reduced by the same percentage. This should not be thought of as a problem, however,
since each dimension is to be optimized independently, and the particle still moves
toward the global best on each dimension, though at a less intense speed. Since the
maximum iterative movement toward global best on any dimension is clamped, particles
may be thought of as combing the search space a bit more thoroughly than were their
velocities unclamped.
Though the same velocity clamping percentage of fifty percent is used in most
papers for the sake of comparison, the value does not appear to have been optimized yet.
Liu et al. suggested a value of fifteen percent [24], which has been empirically verified to
work well as shown in Table II-1 and Table II-2.
Table II-1: Effect of Velocity Clamping Percentage with Static Inertia Weight
Gbest PSO; 800,000 function evaluations; s = 20, c1 = c2 = 1.49618, ω = 0.72984

Benchmark (n)          Statistic   λ = 0.15      λ = 0.5       λ = 1         No Clamping
Ackley (30)            Median      2.4952        3.1206        3.6812        3.6544
                       Mean        2.5311        3.6524        3.9281        4.003
                       Minimum     4.4409e-15    1.5017        0.9313        1.6462
                       Maximum     5.4122        7.0836        7.8162        9.6772
                       Std. Dev.   1.0969        1.4975        1.6167        1.8326
Griewangk (30)         Median      0.040416      0.049122      0.052756      0.042845
                       Mean        0.1182        0.055008      0.12784       0.070346
                       Minimum     0             0             0             0
                       Maximum     2.2675        0.15666       0.95838       0.46229
                       Std. Dev.   0.3289        0.044639      0.21229       0.094686
Quadric (30)           Median      1.9076e-80    1.6824e-79    4.4508e-79    9.1067e-80
                       Mean        4.7513e-74    4.1822e-75    4.0909e-75    2.4315e-76
                       Minimum     1.3422e-84    4.146e-84     2.253e-83     6.3926e-84
                       Maximum     1.469e-72     2.0732e-73    1.8646e-73    1.1208e-74
                       Std. Dev.   2.4174e-73    2.9314e-74    2.6362e-74    1.5841e-75
Quartic w/ noise (30)  Median      0.0016193     0.00272       0.0023147     0.0030738
                       Mean        0.0027877     0.0039438     0.0044615     0.0053736
                       Minimum     0.00042632    0.00060861    0.00077887    0.00069323
                       Maximum     0.016272      0.019695      0.067881      0.031762
                       Std. Dev.   0.0030928     0.0040209     0.0095446     0.0063947
Rastrigin (30)         Median      51.7378       70.64194      75.11921      83.0789
                       Mean        56.1753       71.63686      75.81567      83.636
                       Minimum     24.874        42.78316      38.80337      41.7882
                       Maximum     91.536        116.4097      133.324       136.3089
                       Std. Dev.   14.4256       17.1532       22.04992      19.63376
Rosenbrock (30)        Median      1.6095e-9     5.35546e-9    2.34906e-8    4.828e-9
                       Mean        1.2763        2.06915       1.49279       1.87934
                       Minimum     1.7652e-17    2.68986e-18   1.2411e-18    3.05779e-19
                       Maximum     9.9657        13.315        10.101        18.6845
                       Std. Dev.   2.1661        3.1387        2.42602       3.60868
Schaffer’s f6 (2)      Median      0             0             0             0
                       Mean        0.0038864     0.0033034     0.0025261     0.004275
                       Minimum     0             0             0             0
                       Maximum     0.0097159     0.0097159     0.0097159     0.0097159
                       Std. Dev.   0.0048081     0.0046492     0.004305      0.0048718
Sphere (30)            Median      0             0             0             0
                       Mean        4.6936e-322   2.4703e-323   0             6.4229e-323
                       Minimum     0             0             0             0
                       Maximum     2.332e-320    8.745e-322    3.4585e-323   2.7223e-321
                       Std. Dev.   0             0             0             0
Weighted Sphere (30)   Median      0             0             0             0
                       Mean        5.0889e-322   1.0869e-321   6.9169e-323   4.1007e-322
                       Minimum     0             0             0             0
                       Maximum     2.4012e-320   5.3903e-320   1.5563e-321   1.6601e-320
                       Std. Dev.   0             0             0             0
Clamping velocities to fifteen percent provided noticeably better performance in
median and mean values on multi-modal functions of high dimensions, where cautious
step sizes in light of new information proved most beneficial; Griewangk was the
exception, since one poorly performing trial significantly affected the mean function
value. Smaller step sizes seem to have helped avoid premature convergence to
sub-optimal local minimizers. It appears that the standard velocity clamping value of fifty
percent widely used in the literature can be improved upon, and fifteen percent seems to
work well in agreement with Liu’s observation based on primarily different benchmarks
of low dimensions [24].
To determine whether fifteen percent is also a good clamping percentage in
conjunction with the linearly decreasing weight, each trial was repeated in Table II-2
using the same initial positions and sequences of random numbers used to generate each
row of Table II-1.
Table II-2: Effect of Velocity Clamping Percentage with Decreasing Inertia Weight
Gbest PSO; 800,000 evaluations; s = 20, c1 = c2 = 1.49618, ω from 0.9 to 0.4 linearly

Benchmark (n)          Statistic   λ = 0.15      λ = 0.5       λ = 1         No Clamping
Ackley (30)            Median      7.9936e-15    7.9936e-15    7.9936e-15    2.36167
                       Mean        1.1191e-14    1.0196e-14    9.4147e-15    3.58269
                       Minimum     4.4409e-15    4.4409e-15    4.4409e-15    3.9968e-14
                       Maximum     4.3521e-14    2.931e-14     2.2204e-14    20.8328
                       Std. Dev.   8.0648e-15    4.9149e-15    3.2892e-15    3.77189
Griewangk (30)         Median      0.012319      0.022141      0.01109       0.017239
                       Mean        0.022023      0.028174      0.018645      0.025321
                       Minimum     0             0             0             0
                       Maximum     0.090322      0.11254       0.11942       0.11743
                       Std. Dev.   0.024071      0.027334      0.023904      0.02732
Quadric (30)           Median      1.2644e-17    3.6766e-17    3.8379e-16    1.3274e-15
                       Mean        2.3189e-14    1.4071e-12    8.577e-13     8.9083e-11
                       Minimum     2.6219e-22    1.0224e-22    9.158e-20     7.0557e-21
                       Maximum     8.6952e-13    5.0725e-11    1.6719e-11    4.0182e-9
                       Std. Dev.   1.2438e-13    7.6037e-12    3.3624e-12    5.6939e-10
Quartic w/ noise (30)  Median      0.0015335     0.0019085     0.0022996     0.0024834
                       Mean        0.0015241     0.0021906     0.002396      0.0028831
                       Minimum     0.00033276    0.00086674    0.00081094    0.00075789
                       Maximum     0.0028314     0.007024      0.0071013     0.0078216
                       Std. Dev.   0.00065649    0.0010925     0.0013029     0.0015665
Rastrigin (30)         Median      24.3765       25.8689       30.3462       40.7933
                       Mean        25.252        27.4808       31.4805       42.6439
                       Minimum     13.9294       8.95463       10.9445       18.9042
                       Maximum     42.7832       48.7529       57.7075       72.6317
                       Std. Dev.   7.06661       8.29488       9.95111       11.5052
Rosenbrock (30)        Median      8.63847       8.655254      6.576994      11.81142
                       Mean        18.859        23.60514      16.49861      29.77723
                       Minimum     0.000292151   5.062084e-5   3.950323e-5   8.374797e-6
                       Maximum     81.5558       143.8422      103.6101      571.7925
                       Std. Dev.   25.9117       32.00213      24.48977      82.17547
Schaffer’s f6 (2)      Median      0             0             0             0
                       Mean        0             0             0             0
                       Minimum     0             0             0             0
                       Maximum     0             0             0             0
                       Std. Dev.   0             0             0             0
Sphere (30)            Median      2.8331e-106   9.4358e-106   1.5402e-104   1.4202e-101
                       Mean        1.0834e-94    1.5939e-88    1.5349e-90    1.6172e-90
                       Minimum     6.4959e-117   1.0629e-123   8.4934e-115   3.1766e-114
                       Maximum     4.885e-93     7.969e-87     7.5037e-89    7.9451e-89
                       Std. Dev.   6.9319e-94    1.127e-87     1.061e-89     1.1233e-89
Weighted Sphere (30)   Median      9.6085e-104   1.0392e-102   3.1051e-103   3.8037e-99
                       Mean        4.4182e-93    6.1741e-91    4.9532e-96    7.185e-90
                       Minimum     7.6884e-121   5.7277e-115   1.4453e-115   5.6992e-112
                       Maximum     1.6078e-91    3.0868e-89    1.5059e-94    3.5536e-88
                       Std. Dev.   2.3941e-92    4.3654e-90    2.4592e-95    5.0246e-89
On Rastrigin and the noisy Quartic, results were again better for velocities
clamped to fifteen percent of the range of the search space, which was the best percentage
from Table II-1. The best results on Rosenbrock in conjunction with the decreasing
weight were obtained by clamping velocities to a maximum step size equal to the full
range of the search space, which differs from what was seen in Table II-1. Other
performance differences were not of considerable magnitude. Fifteen percent appears to
be the best velocity clamping value for most of the benchmarks tested in Table II-1 and
Table II-2.
Comparison between Table II-1 and Table II-2 shows that the decreasing inertia weight had an adverse effect on performance on the simple uni-modal Quadric, Sphere, and Weighted Sphere benchmarks as well as on the more difficult uni-modal Rosenbrock, whereas it improved performance on the multi-modal Ackley, Griewangk, Rastrigin, and Schaffer’s f6 and on the essentially multi-modal Quartic with noise. Comparing the numbers of iterations required to produce a small function value showed that the static weight produced quicker convergence across the benchmark suite; however, this quicker convergence led to stagnation on multi-modal functions sooner than did the slower convergence achieved by the linearly decreased weight. In other words, the weighting of past information via the time-decreasing inertia weight is problem-dependent: the balance achieved by decreasing the weight was best for relatively difficult multi-modal functions, for which the increased exploration resulting from a larger weight in early iterations proved beneficial, while the quicker convergence of the static weight was better for the simpler,
uni-modal functions.
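The linear schedule compared above reduces to a one-line formula; the function name is illustrative, with the 0.9-to-0.4 endpoints taken from the Table II-2 header.

```python
def linear_inertia_weight(k, k_max, w_start=0.9, w_end=0.4):
    """Inertia weight decreased linearly from w_start at iteration 0
    to w_end at iteration k_max, as in the runs of Table II-2."""
    return w_start - (w_start - w_end) * (k / k_max)

# The weight passes from 0.9 through 0.65 to 0.4 over the run:
ws = [linear_inertia_weight(k, 1000) for k in (0, 500, 1000)]
```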
Standard “Gbest” PSO
Objective
Elaborating on (2.1), repeated below for convenience, the optimization problem considered herein is to

    minimize f(x),   f : R^n → R,   (2.1)

where f is the objective function, or cost function, of an application problem. The decision vector, x ∈ R^n, consists of the n decision variables to be optimized, thus producing the most desirable function value. A decision vector is called the global minimizer if it produces the optimal function value, called the true global minimum. Even though (2.1) is considered an unconstrained optimization problem, in practice only solutions belonging to a subset Ω ⊂ R^n are considered feasible. The search space is defined by the subset

    Ω = [x_1^L, x_1^U] × [x_2^L, x_2^U] × ⋯ × [x_n^L, x_n^U] ⊂ R^n,   (2.21)

where x_j^L and x_j^U are, respectively, the lower and upper bounds of the search space along dimension j for j = 1, 2, …, n.
Initializations
For a particle swarm of size s, each particle, x_i = [x_i^1, x_i^2, …, x_i^n], represents a potential solution to the optimization problem, where particles are indexed i = 1, 2, …, s. The swarm is initialized by randomizing particles’ positions about the center, center(Ω), of the search space, Ω, using random numbers drawn from a uniform distribution so that no portion of the search space is preferred over any other, as would result from random numbers being drawn from a normal distribution. This can be done according to (2.22) below:

    x_i(k = 0) = center(Ω) + r_i ∘ range(Ω) − (1/2)·range(Ω),   (2.22)

where r_i = [r_i^1, r_i^2, …, r_i^n] with each r_i^j ~ U(0,1) randomly selected, range(Ω) = [x_1^U − x_1^L, …, x_n^U − x_n^L], and

    center(Ω) = [(x_1^L + x_1^U)/2, (x_2^L + x_2^U)/2, …, (x_n^L + x_n^U)/2],

which is usually known in advance rather than calculated.
The personal bests are then initialized to be the same as the initial positions since there are no past positions with which to compare:

    p_i(k = 0) = x_i(k = 0).   (2.23)
Let P(k) = {p_1(k), p_2(k), …, p_s(k)} be the set of all personal bests at iteration k. In Gbest PSO, the global best is initialized and iteratively updated to be the best of all personal bests according to

    g(k) = arg min_{p_i(k) ∈ P(k)} f(p_i(k)).   (2.24)
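Equations (2.22)–(2.24) can be sketched in a few lines; the helper names are hypothetical, and a seeded NumPy generator stands in for whatever random-number sequences an implementation records.

```python
import numpy as np

rng = np.random.default_rng(0)  # stands in for saved random sequences

def initialize_swarm(s, x_min, x_max):
    """Uniform initialization over the search space Omega per (2.22),
    with personal bests set to the initial positions per (2.23)."""
    x_min = np.asarray(x_min, float)
    x_max = np.asarray(x_max, float)
    center = (x_min + x_max) / 2.0          # center(Omega)
    width = x_max - x_min                   # range(Omega)
    r = rng.random((s, x_min.size))         # r_i^j ~ U(0,1)
    x = center + r * width - width / 2.0    # (2.22)
    return x, x.copy()                      # positions and personal bests

def global_best(p, f):
    """Gbest initialization/update per (2.24): best of all personal bests."""
    return p[np.argmin([f(pi) for pi in p])]

sphere = lambda z: float(np.sum(z ** 2))
x, p = initialize_swarm(s=20, x_min=[-100.0, -100.0], x_max=[100.0, 100.0])
g = global_best(p, sphere)
```

Note that center + r∘range − range/2 is algebraically the same as x^L + r∘(x^U − x^L), so every position lands inside Ω.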
The value of each particle’s velocity along dimension j is initially randomized to lie within [−v_j^max, v_j^max] according to

    v_i(k = 0) = 2·r_i ∘ v^max − v^max,   (2.25)

where r_i = [r_i^1, r_i^2, …, r_i^n] with each r_i^j ~ U(0,1) randomly selected,
and subsequently clamped to lie within the same range since particles should only need to step through some maximum percentage of the search space per iteration. Before velocity clamping was implemented, particles were prone to roam far outside the bounds of the search space [25]. The value of v_j^max is selected as a percentage, λ, of the range of the search space along dimension j according to (2.19) and (2.20), repeated below [26]:

    v_j^max = λ·range_j(Ω),   λ ∈ (0, 1],   (2.19)

    range_j(Ω) = x_j^U − x_j^L,   for j = 1, 2, …, n.   (2.20)

range_j(Ω) represents the range of the search space along dimension j, where Ω can be thought of as the hypercube to be searched, with range_j(Ω) being the length of that hypercube along dimension j. The velocity clamping percentage, λ, is usually chosen within the range [0.1, 0.5].
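Equations (2.19), (2.20), and (2.25) translate directly into a velocity-initialization sketch; the names are hypothetical and the Rastrigin-style bounds are used only as an example.

```python
import numpy as np

rng = np.random.default_rng(1)

def v_max_per_dimension(x_min, x_max, lam=0.15):
    """(2.19)-(2.20): v_j^max = lam * (x_j^U - x_j^L), with lam in (0, 1]."""
    return lam * (np.asarray(x_max, float) - np.asarray(x_min, float))

def initialize_velocities(s, v_max):
    """(2.25): each component drawn uniformly from [-v_j^max, +v_j^max]."""
    r = rng.random((s, v_max.size))   # r_i^j ~ U(0,1)
    return 2.0 * r * v_max - v_max

v_max = v_max_per_dimension([-5.12] * 30, [5.12] * 30)   # example bounds
v = initialize_velocities(s=10, v_max=v_max)
```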
Iterative Swarm Motion
Iteratively, particle i moves from its current position to a new position along velocity vector v_i = [v_i^1, v_i^2, …, v_i^n] according to position update equation (2.3), restated below, where the velocity may be thought of as being multiplied by a unit time step of one iteration:

    x_i(k + 1) = x_i(k) + v_i(k + 1),   for i = 1, 2, …, s.   (2.3)
The velocity is first calculated according to velocity update equation (2.12),
restated below.
    v_i(k + 1) = ω·v_i(k) + c1·r_{1i}(k) ∘ (p_i(k) − x_i(k)) + c2·r_{2i}(k) ∘ (g(k) − x_i(k)),
        for i = 1, 2, …, s.   (2.12)
As the stepping process according to velocity and position update equations (2.12)
and (2.3) continues, particles update their personal bests as they encounter better
positions than encountered previously. At any point in time, the best of all personal bests
is the swarm’s global best shared freely between particles. Particles eventually converge,
via their communication of the global best and collective movement toward it, to the one
best position they have found. The algorithm can be allowed to run either for a number
of iterations expected to produce a good solution or until a user-specified criterion or
threshold is reached.
Each particle keeps a memory of its personal best, p_i(k), for its own consideration; this is the n-dimensional location, or position, that has produced the best function value over the particle’s search through the current iteration. Each personal best is updated only when the particle’s new position at iteration k + 1 yields a better function value than does the personal best at iteration k, as shown below:

    p_i(k + 1) = x_i(k + 1)   if f(x_i(k + 1)) < f(p_i(k)),
    p_i(k + 1) = p_i(k)       if f(x_i(k + 1)) ≥ f(p_i(k)).   (2.26)
In Gbest PSO the global best, g(k), is iteratively updated according to the same equation (2.24) by which it was initialized. It is then “communicated” via shared computer memory to all particles for consideration.
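Putting the update equations together, one Gbest PSO iteration looks roughly as follows; this is a sketch with hypothetical names, not code from the thesis, and r1, r2 are drawn independently per particle and per dimension.

```python
import numpy as np

rng = np.random.default_rng(2)

def gbest_pso_step(x, v, p, g, f, w=0.72984, c1=1.49618, c2=1.49618, v_max=None):
    """One Gbest PSO iteration: velocity update (2.12) with clamping,
    position update (2.3), personal-best update (2.26), and global-best
    update (2.24)."""
    s, n = x.shape
    r1 = rng.random((s, n))                               # fresh U(0,1) draws
    r2 = rng.random((s, n))
    v = w * v + c1 * r1 * (p - x) + c2 * r2 * (g - x)     # (2.12)
    if v_max is not None:
        v = np.clip(v, -v_max, v_max)                     # velocity clamping
    x = x + v                                             # (2.3)
    f_x = np.apply_along_axis(f, 1, x)
    f_p = np.apply_along_axis(f, 1, p)
    p = np.where((f_x < f_p)[:, None], x, p)              # (2.26)
    g = p[np.argmin(np.apply_along_axis(f, 1, p))]        # (2.24)
    return x, v, p, g

# A few iterations on a 5-D Sphere function: f(g) can only improve or hold.
sphere = lambda z: np.sum(z ** 2)
x = rng.uniform(-100.0, 100.0, (10, 5))
v = np.zeros_like(x)
p = x.copy()
g = p[np.argmin(np.apply_along_axis(sphere, 1, p))]
f0 = sphere(g)
for _ in range(100):
    x, v, p, g = gbest_pso_step(x, v, p, g, sphere, v_max=0.15 * 200.0)
```

Because personal bests never worsen and the global best is always the best personal best, f(g) is non-increasing from iteration to iteration, which is exactly the monotone improvement the text describes.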
The effects of the inertial term, cognitive term, social term, and velocity clamping percentage on particles’ velocities are illustrated in Figure II-3 through Figure II-7 for swarm size s = 10, acceleration constants c1 = c2 = 1.49618, inertia weight ω = 0.72984, and velocity clamping percentage λ = 0.15. The acceleration coefficients and inertia weight were obtained from Clerc’s constriction model (2.17) [27]. The velocity clamping value was selected following the suggestion in [24] and because it worked well with Gbest PSO in Table II-1 and Table II-2, though the benchmarks in these tables were of much higher dimensionality than those used in [24].
Figure II-2: Rastrigin Benchmark Used for 2-D Illustration
Figure II-3: Swarm Initialization (Iteration 0)
Positions and velocities are randomly initialized. Personal bests and the
global best are initialized accordingly. Particles 1 and 3 are selected to
visually illustrate how velocities update and are clamped.
Figure II-4: First Velocity Updates (Iter. 1)
The randomly initialized velocities of iteration 0 decrease via the inertia
weight, and particles accelerate toward the global best of particle 6. There is
no cognitive acceleration since all particles are initially at their personal bests.
The black resultant vectors stem from clamping on each dimension.
Figure II-5: First Position Updates (Iter. 1)
Particles move along their resultant velocity vectors to new positions. Particle
1 found a new personal best. The new position of particle 3 evaluates to a
higher function value, so its previous position is still its personal best.
Figure II-6: Second Velocity Updates (Iter. 2)
Particle 1 continues moving downward to the left according to its inertia and
social acceleration. Particle 3 now experiences cognitive acceleration, which
together with its leftward social acceleration overcomes its inertia to the right; it
experiences a larger acceleration toward the global best, as expected from (2.12),
due to the global best being farther away than its personal best.
The main challenge seen in the literature is that PSO tends to stagnate as illustrated
in the next section.
Illustration of Premature Convergence
The swarm is said to have prematurely converged when the proposed solution is
not a global minimizer and when progress toward better minima has ceased so that
continued activity could only hope to refine the quality of the solution converged upon,
which may or may not be a local minimizer. Stagnation is a result of premature
convergence. Once particles have converged prematurely, they continue converging to
within extremely close proximity of each other so that the global best and all personal
bests are within one miniscule region of the search space. Since particles are continually
attracted to the bests in that same small vicinity, particles stagnate as the momentum
from their previous velocities wears off. While particles are technically always moving, stagnation can be thought of as a lack of movement discernible on the large scale, from which perspective the stagnated swarm will appear as one dot or point.
Figure II-7: Second Position Updates (Iter. 2)
Particles iteratively follow their resultant velocity vectors to new positions.
The multi-modal Rastrigin function is one of the most difficult benchmarks
common in PSO literature because of its many local wells, each of which has a steep rate
of decrease relative to the overall curvature as shown in Figure II-2. These local wells
make the true global minimizer difficult to discover. PSO can successfully traverse many
of the wells containing local minima that would trap a gradient-based method but often
gets stuck in high-quality wells near the true global minimizer.
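The Rastrigin benchmark referred to here has the widely used standard form f(x) = Σ_j [x_j² − 10·cos(2π·x_j) + 10], typically searched over [−5.12, 5.12]^n; the sketch below assumes that standard definition, since the formula itself is not restated in this excerpt.

```python
import math

def rastrigin(x):
    """Standard Rastrigin function: global minimum 0 at the origin,
    with a grid of local wells near integer coordinates."""
    return sum(xj ** 2 - 10.0 * math.cos(2.0 * math.pi * xj) + 10.0 for xj in x)

# The origin is the global minimizer; [2, 0] sits in a nearby local well.
f_global = rastrigin([0.0, 0.0])
f_local = rastrigin([2.0, 0.0])    # about 4, the kind of value a trapped swarm refines
```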
In order to illustrate the stagnation problem that has plagued PSO since its original formulation, Gbest PSO was applied to minimize the two-dimensional Rastrigin function of Figure II-2 using swarm size s = 10, acceleration constants c1 = c2 = 1.49, inertia weight ω = 0.72, and velocity clamping percentage λ = 0.15.
Swarm motion is graphed on the colored contour maps of Figure II-8 through
Figure II-16, where particles can be seen flying from random initialization to eventual
stagnation at the local minimizer near [2,0]. The true global minimizer at [0,0] is not
discovered. A particle finds the relatively high-quality region near local minimizer [2,0] and
communicates this new global best to the rest of the swarm. As other particles fly in its
direction, none finds a better global best, so all converge near position [2,0] as momenta
wane. This is the continuation of the search used to illustrate how velocities update in
Figure II-3 through Figure II-7.
Figure II-8: Swarm Initialization at Iteration 0
Particles are randomly initialized within the search space.
Figure II-9: Converging (Iter. 10)
Particles are converging to local minimizer [2,0] via their
attraction to the global best in the vicinity.
Figure II-10: Exploratory Cognition and Momenta (Iter. 20)
Cognitive accelerations toward personal bests and “momenta”
keep particles searching prior to settling down.
Figure II-11: Convergence Continues (Iter. 30)
As momenta wane and no better global best is found, particles
continue converging to local minimizer [2,0].
Figure II-12: Momenta Wane (Iter. 40)
Momenta continue to wane as particles are repeatedly pulled
toward (a) the global best near [2,0] and (b) their own
personal bests in the same vicinity.
Figure II-13: Premature Convergence (Iter. 102)
The local minimizer near [2,0] is being honed in on, but no
progress is being made toward a better solution as the swarm
has converged prematurely without hope of escape.
Stagnation is clearly the main obstacle for PSO, as little if any progress is made in this state. Chapter III conducts an extensive experiment to search for a set of parameters capable of preventing or postponing stagnation. Chapter IV presents the formulation and pseudo code for PSO with regrouping (RegPSO). Chapter V compares RegPSO using standard parameters to (a) Gbest PSO using the best parameters of Chapter III, (b) the multi-start PSO (MPSO) of Van den Bergh, which escapes from premature convergence once it is detected, and (c) opposition-based PSO (OPSO), which is designed to maintain swarm diversity in the hope of preventing stagnation.
CHAPTER III
EMPIRICAL SEARCH FOR QUALITY PSO PARAMETERS
Rastrigin Experiment Outlined
While the parameters derived from Clerc’s constriction model are commonly used
in the literature, this is largely so that improvements can be compared in a
straightforward manner with those of other articles and papers using the same set of
parameters. It has not been empirically determined that these are actually the best
parameters available, and other parameters have been suggested [5, 28].
Since parameter selection affects solution quality, proper selection can be thought
of as postponing stagnation. Consequently, one becomes curious whether such selection
could prevent stagnation altogether. If simple parameter selection alone could prevent
stagnation, this would be preferable since it would not require any modification to the
standard algorithm. Before developing a novel mechanism, an empirical test was done to
check whether parameter selection itself might adequately prevent stagnation.
To explore this possibility, many different combinations of the social acceleration
coefficient, the cognitive acceleration coefficient, and the inertia weight were tested on
the relatively difficult multi-modal Rastrigin benchmark while holding constant: (i) the
velocity clamping threshold at fifteen percent of each dimension’s range, (ii) the swarm
size at thirty, (iii) the number of iterations at three thousand, and (iv) the number of trials
per parameter combination at fifty. A swarm size of thirty was utilized for the
experiment since the question was not yet whether parameter selection could prevent
stagnation with a small swarm size but whether it could prevent stagnation even with a
modestly large swarm size.
Parameter combinations were tested by building into the PSO Research Toolbox a
mechanical exploratory feature to automatically implement the following rules: (i) for the
starting values of the acceleration coefficients, initialize the inertia weight to a value
expected to work well; (ii) generate fifty trials for this set of parameters and record the
median, mean, minimum, maximum, and standard deviation in one column of a table;
(iii) increment the inertia weight by 0.01 in either direction; (iv) conduct fifty more trials
and record the resulting statistics in a new column; (v) sort the columns from lowest
inertia weight to highest; (vi) evaluate whether increasing or decreasing the inertia weight
appears most promising based on the median values generated; (vii) increment the inertia
weight in the most promising direction except for every fifteenth column, where the
opposite direction is selected in case the apparent direction is wrong; (viii) as long as the
best median or the best mean is in one of the outer six columns, repeat steps iv – vii to
continue testing other values of the inertia weight; (ix) when neither the best median nor
the best mean are in any of the outer six columns, increment both acceleration
coefficients by 0.1, thus preserving the difference between them; (x) until the maximum
number of tables is reached, use simple trends in the best inertia weight of each table to
determine a good starting value of the inertia weight for the next table, or if an
insufficient number of tables have been generated from which to infer a trend, begin with
the best inertia weight of the previous table; (xi) until the user-specified minimum
number of tables is reached, repeat steps ii – x to test various values of the inertia weight
for each new combination of acceleration coefficients. At the conclusion of this process,
tables of data were displayed for human analysis, after which the difference between the
acceleration constants was incremented, and the process was repeated. The linearly
varied inertia weight was not tested here as it would have added a tremendous number of
possible parameter combinations.
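A greatly simplified sketch of the exploratory inertia-weight sweep in steps (iii)–(viii) is shown below; the fifty-trial PSO run is replaced by a stand-in callable, the every-fifteenth-column reversal and acceleration-coefficient loop are omitted, and all names are hypothetical.

```python
def sweep_inertia_weight(median_for, w0, step=0.01, max_columns=100):
    """Move the inertia weight in whichever direction improves the median
    (steps iii-viii, simplified); stop when both neighbours of the best
    weight found so far have been explored.

    median_for(w) stands in for 'run fifty trials at inertia weight w and
    return the median final function value'."""
    results = {w0: median_for(w0)}
    w, direction = w0, step
    for _ in range(max_columns):
        w_next = round(w + direction, 10)
        if w_next in results:
            break                          # both directions explored
        results[w_next] = median_for(w_next)
        if results[w_next] < results[w]:
            w = w_next                     # still improving: keep going
        else:
            direction = -direction         # probe the other direction
    best_w = min(results, key=results.get)
    return best_w, results

# Toy stand-in whose median is best near w = 0.7:
best_w, table = sweep_inertia_weight(lambda w: (w - 0.7) ** 2, w0=0.6)
```

In the actual procedure each "column" of the table corresponds to one evaluated inertia weight, and the sweep terminates once neither the best median nor the best mean sits in an outer column.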
At this point, the best median and mean per table were highlighted, the difference
between the acceleration coefficients was incremented from zero by one-tenth to four and
three-tenths with the social coefficient kept larger than the cognitive coefficient for this
study, and steps (i) through (xi) were repeated. The toolbox allowed for thousands of
parameter combinations to be tested on Rastrigin with fifty trials generated per
combination.
Independent Validation of “Social Only” PSO
The best median over three thousand iterations was produced by the combination c1 = 0, c2 = 3.8, ω = −0.02. Interestingly, this combination constitutes the “social-only” PSO found by Kennedy to quickly train an ANN for solving the XOR problem [29] and should be thought of as an independent validation of that model. The unique difference here is the slightly negative inertia weight.
For the social-only PSO, equation (2.18) simplifies to

    v_i(k + 1) = ω^(k+1)·v_i(0) + Σ_{a=0}^{k} ω^(k−a)·c2·r_{2i}(a) ∘ (g(a) − x_i(a)).   (2.27)
Notice that the slightly negative inertia weight implies that it can be beneficial for the swarm to be somewhat skeptical of new information (i.e., to take information with a grain of salt): the social information of any particular iteration is slightly trusted when the inertia weight exponent k − a is even and slightly distrusted when k − a is odd, so that each iteration’s information is trusted and distrusted in an oscillatory fashion, with past information quickly forgotten due to iterative multiplication by the small inertia weight. On the conceptual level, the proposed parameters in combination with Kennedy’s social-only model place the most importance on new information, which is alternately trusted and distrusted until practically forgotten.
Even though g(a) − x_i(a) is expected to have a larger magnitude per dimension in earlier iterations when the swarm is more spread out, the small inertia weight will dominate the product, causing past global bests as well as past personal bests to have less effect on each particle at iteration k + 1 than more recent bests due to the effect of multiplication at each iteration by the slight inertia weight.
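The oscillatory trust described above can be seen numerically: the contribution of iteration a to the velocity at iteration k + 1 is scaled by ω^(k−a), whose sign alternates with the parity of k − a when ω is negative, while its magnitude decays geometrically. A small numeric check with ω = −0.02 as above:

```python
w = -0.02                              # the slightly negative inertia weight
scales = [w ** e for e in range(5)]    # exponents k - a = 0, 1, 2, 3, 4
signs = [1 if s_ > 0 else -1 for s_ in scales]
magnitudes = [abs(s_) for s_ in scales]
# Signs alternate (+, -, +, -, +) while each magnitude is 50 times smaller
# than the previous one, so only the most recent iterations contribute noticeably.
```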
The curiosity at this point was whether the recommended parameters would
perform well in general or simply be characteristic of the Rastrigin benchmark. The
results of testing this combination across the popular suite of benchmarks are displayed in
Table III-1, where the original swarm size of thirty is tested over the initial ninety thousand function evaluations as well as over a full eight hundred thousand function evaluations in order to test not only the initial convergence rate but also the eventual
solution quality produced. This work is more concerned with eventual solution quality,
though relatively short trials were used to derive the parameters in order to examine many
different parameter combinations. Swarm sizes of twenty-five and twenty were also
tested out of curiosity.
Table III-1: “Social-only” Gbest PSO with Slightly Negative Inertia Weight
c1 = 0, c2 = 3.8, ω = −0.02, λ = 0.15, 50 trials per benchmark per column

Benchmark (n)         Statistic    s = 30         s = 30         s = 25         s = 20
                                   90,000 FE’s    800,010 FE’s   800,000 FE’s   800,000 FE’s
Ackley (30)           Median:      1.8655e-5      3.4728e-13     6.6905e-10     4.0669e-6
                      Mean:        0.014907       0.014887       0.0010873      0.024254
                      Minimum:     3.1492e-6      5.0626e-14     5.0626e-14     7.9048e-14
                      Maximum:     0.73192        0.73192        0.044198       0.64485
                      Std. Dev.:   0.10347        0.10348        0.0062883      0.11786
Griewangk (30)        Median:      0.022134       0.022134       0.040576       0.030896
                      Mean:        0.046633       0.046435       0.047599       0.068903
                      Minimum:     4.347e-10      3.3307e-16     5.5511e-16     1.8874e-15
                      Maximum:     0.76961        0.76961        0.15602        0.70899
                      Std. Dev.:   0.10895        0.10903        0.042542       0.13558
Quadric (30)          Median:      0.7794         2.5499e-7      0.00042113     0.0076941
                      Mean:        1.1431         0.12165        0.610391       1.52907
                      Minimum:     0.23305        8.5254e-23     1.12388e-15    5.27791e-8
                      Maximum:     3.6574         3.452          22.0692        37.6524
                      Std. Dev.:   0.86872        0.55226        3.14867        5.90377
Quartic with noise (30)
                      Median:      0.07581        0.075803       0.10865        0.12476
                      Mean:        0.088061       0.088052       0.11937        0.13695
                      Minimum:     0.01469        0.014682       0.03071        0.040125
                      Maximum:     0.23258        0.23258        0.25844        0.29015
                      Std. Dev.:   0.048645       0.048643       0.051093       0.057916
Rastrigin (30)        Median:      3.3673e-6      1.6875e-14     4.802e-7       0.99496
                      Mean:        0.11948        0.099497       0.54013        0.78361
                      Minimum:     1.1064e-8      0              3.5527e-15     1.7764e-14
                      Maximum:     1.9899         1.9899         4.0506         3.9817
                      Std. Dev.:   0.38348        0.36238        1.0923         0.95187
Rosenbrock (30)       Median:      25.93824       9.84766        15.1304        16.84264
                      Mean:        44.77153       14.0412        22.2481        30.30881
                      Minimum:     0.08502217     0.0244401      0.0647027      0.0547216
                      Maximum:     167.5172       85.1125        79.3204        136.9281
                      Std. Dev.:   38.19886       18.3926        24.0442        34.61948
Schaffer’s f6 (2)     Median:      0.0097159      0.0097159      0.0097159      0.0097159
                      Mean:        0.010428       0.010428       0.010039       0.0098774
                      Minimum:     0              0              0              0
                      Maximum:     0.037224       0.037224       0.037224       0.037224
                      Std. Dev.:   0.0058499      0.0058499      0.0062036      0.0043897
Sphere (30)           Median:      1.3321e-11     6.0994e-35     2.1304e-22     3.0375e-12
                      Mean:        4.318e-9       4.2675e-9      0.00015874     0.0003836
                      Minimum:     3.4333e-13     4.2073e-94     2.2951e-89     1.837e-54
                      Maximum:     2.1134e-7      2.112e-7       0.0061238      0.012502
                      Std. Dev.:   2.9875e-8      2.9863e-8      0.00088397     0.0019265
Weighted Sphere (30)  Median:      2.9151e-10     8.5506e-32     2.6998e-23     2.508e-11
                      Mean:        0.00033694     0.00033694     1.6776e-6      6.4829e-5
                      Minimum:     1.8131e-12     6.7149e-95     1.6519e-95     3.7237e-52
                      Maximum:     0.016814       0.016814       8.241e-5       0.00088496
                      Std. Dev.:   0.0023778      0.0023778      1.1651e-5      0.00017912
It is not surprising that the parameters derived using a swarm size of thirty
generally performed best in conjunction with the same swarm size. Even for the same
thirty particles, however, performance was still lacking on Rosenbrock using the
empirically derived parameter combination.
While Table III-1 evidences that complicated functions such as Rastrigin can be
solved fairly well by optimizing parameters for the problem at hand, comparison with
Table II-1 shows that improved performance on Rastrigin came at the cost of deteriorated
performance on Rosenbrock, such that parameter optimization seems to be problem-
dependent. Meissner et al. attempted to use PSO to optimize its own parameters [20], and
their results also indicate parameter selection to be problem-dependent. It is cautioned,
however, that since standard PSO was used as the master PSO in that paper, the
parameters recommended for each benchmark should not necessarily be viewed as
optimal since stagnation may have been an issue.
Even the best parameters found for Rastrigin reduce its function value on average
only to one-tenth of one unit; furthermore, the fact that practically the same average
performance was usually seen over both short and long trials suggests that parameter
selection, though effective at postponing stagnation, was not able to avoid it.
Socially Refined PSO
In order to develop a comparison basis by which to gauge the success of RegPSO
relative to the approach of optimizing parameters, other parameters found to work well
on Rastrigin were tested to see which perform well across benchmarks. Interestingly, the
Socially Refined PSO with small, negative inertia weights tested in Table III-2 and Table
III-3 outperformed the social-only and predominantly social parameters with small,
positive inertia weights sampled from the same vicinity of the parameter-defined search
space. According to cumulative velocity update equation (2.18), the Socially Refined PSO trusts both the social and cognitive information of any particular iteration when k − a is even and slightly distrusts the same information when k − a is odd while quickly forgetting past information. In other words, particles oscillate between trust and distrust until the information is forgotten in light of new information. According to the data presented in Tables III-2 and III-3, this may be healthier than simply trusting all information.
Table III-2: “Socially Refined” PSO with Slightly Negative Inertia Weight
c1 = 0.1, c2 = 3.7, ω = −0.01, λ = 0.15, 50 trials per benchmark per column

Benchmark (n)         Statistic    s = 30         s = 25         s = 20
                                   800,010 FE’s   800,000 FE’s   800,000 FE’s
Ackley (30)           Median:      3.2863e-14     3.2863e-14     3.9968e-14
                      Mean:        3.5918e-14     3.6984e-14     4.9916e-14
                      Minimum:     1.8652e-14     1.8652e-14     2.931e-14
                      Maximum:     1.1458e-13     8.6153e-14     2.78e-13
                      Std. Dev.:   1.3505e-14     1.2481e-14     3.5686e-14
Griewangk (30)        Median:      0.018469       0.018452       0.020953
                      Mean:        0.029549       0.023528       0.029729
                      Minimum:     0              0              0
                      Maximum:     0.12983        0.14943        0.1662
                      Std. Dev.:   0.031745       0.028115       0.031242
Quadric (30)          Median:      1.0538e-20     4.7631e-23     1.4344e-26
                      Mean:        4.0136e-20     1.3101e-22     2.1305e-25
                      Minimum:     5.2502e-24     5.5505e-25     1.7896e-28
                      Maximum:     6.9434e-19     2.3811e-21     7.1409e-24
                      Std. Dev.:   1.0243e-19     3.4278e-22     1.0083e-24
Quartic with noise (30)
                      Median:      0.0017654      0.0027207      0.003471
                      Mean:        0.0024619      0.0046598      0.0044198
                      Minimum:     0.00063002     0.00068381     0.00090772
                      Maximum:     0.013427       0.041015       0.046851
                      Std. Dev.:   0.0023139      0.0067316      0.0063827
Rastrigin (30)        Median:      8.8818e-16     1.7764e-15     3.5527e-15
                      Mean:        0.23879        0.25869        0.45768
                      Minimum:     0              0              0
                      Maximum:     1.9899         1.9899         3.9798
                      Std. Dev.:   0.47398        0.52456        0.8339
Rosenbrock (30)       Median:      8.16376        2.65706        3.77921
                      Mean:        8.3122         5.3097         5.71585
                      Minimum:     0.012567       0.000600161    1.36618e-7
                      Maximum:     33.2969        14.8977        14.4996
                      Std. Dev.:   6.72267        5.54191        5.43735
Schaffer’s f6 (2)     Median:      0.0097159      0.0097159      0.0097159
                      Mean:        0.00855        0.0093273      0.0093273
                      Minimum:     0              0              0
                      Maximum:     0.0097159      0.0097159      0.0097159
                      Std. Dev.:   0.0031894      0.0019233      0.0019233
Sphere (30)           Median:      2.1808e-108    1.6591e-124    1.6609e-146
                      Mean:        2.4203e-106    1.2844e-122    2.5295e-143
                      Minimum:     1.4095e-112    5.2522e-129    3.5956e-151
                      Maximum:     5.141e-105     3.6725e-121    1.135e-141
                      Std. Dev.:   8.6854e-106    5.2553e-122    1.6036e-142
Weighted Sphere (30)  Median:      6.8498e-107    1.8222e-123    2.2059e-145
                      Mean:        8.8944e-105    6.2982e-121    4.2659e-141
                      Minimum:     3.3557e-111    6.6437e-128    1.335e-149
                      Maximum:     3.8603e-103    1.2251e-119    2.0798e-139
                      Std. Dev.:   5.458e-104     2.202e-120     2.9401e-140
Socially Refined PSO parameters c1 = 0.1, c2 = 3.7, ω = −0.01 were able to improve upon the “social-only” parameter combination by removing one-tenth from the social acceleration coefficient and applying it to the cognitive component. Improved performance on Rastrigin again came at the cost of deteriorated performance on Rosenbrock when compared to Table II-1, so that parameter selection is once again seen to be problem-dependent.
Table III-3: “Socially Refined” PSO with Small, Negative Inertia Weight
c1 = 0.1, c2 = 3.5, ω = −0.1, λ = 0.15, 50 trials per benchmark per column

Benchmark (n)         Statistic    s = 30         s = 25         s = 20
                                   800,010 FE’s   800,000 FE’s   800,000 FE’s
Ackley (30)           Median:      2.931e-14      3.8192e-14     3.9968e-14
                      Mean:        3.1299e-14     3.6557e-14     5.3966e-14
                      Minimum:     1.5099e-14     2.2204e-14     2.2204e-14
                      Maximum:     6.839e-14      7.5495e-14     2.0695e-13
                      Std. Dev.:   9.9237e-15     9.682e-15      3.9636e-14
Griewangk (30)        Median:      0.017226       0.017241       0.014772
                      Mean:        0.025327       0.026762       0.023663
                      Minimum:     0              0              0
                      Maximum:     0.15367        0.10746        0.12269
                      Std. Dev.:   0.030708       0.026636       0.026331
Quadric (30)          Median:      7.3887e-29     1.4001e-25     7.3887e-29
                      Mean:        5.0276e-28     7.6784e-25     5.0276e-28
                      Minimum:     2.0621e-31     2.2748e-27     2.0621e-31
                      Maximum:     1.0916e-26     6.5308e-24     1.0916e-26
                      Std. Dev.:   1.5776e-27     1.3586e-24     1.5776e-27
Quartic with noise (30)
                      Median:      0.0024288      0.0031992      0.0046839
                      Mean:        0.0031599      0.004724       0.0072108
                      Minimum:     0.0012296      0.00085537     0.0015745
                      Maximum:     0.02221        0.039222       0.025236
                      Std. Dev.:   0.0030853      0.0059909      0.0055917
Rastrigin (30)        Median:      0.49748        0.99496        0.99496
                      Mean:        0.71637        0.89546        1.4924
                      Minimum:     0              0              1.7764e-15
                      Maximum:     5.9698         4.9748         5.9697
                      Std. Dev.:   1.1378         1.1237         1.4249
Rosenbrock (30)       Median:      5.63767        1.2464         0.934938
                      Mean:        6.17404        4.2523         3.82356
                      Minimum:     0.00437651     0.000173502    6.75498e-6
                      Maximum:     14.5122        17.1673        11.5996
                      Std. Dev.:   5.87016        5.324          4.44694
Schaffer’s f6 (2)     Median:      0.0097159      0.0097159      0.0097159
                      Mean:        0.011334       0.009133       0.0097159
                      Minimum:     0              0              0.0097159
                      Maximum:     0.037224       0.0097159      0.0097159
                      Std. Dev.:   0.0080548      0.0023308      0
Sphere (30)           Median:      1.9666e-130    5.5713e-148    3.8905e-172
                      Mean:        4.5786e-125    4.2839e-146    2.3075e-166
                      Minimum:     3.5655e-135    4.1483e-152    1.0487e-177
                      Maximum:     2.2883e-123    1.2209e-144    1.1487e-164
                      Std. Dev.:   3.2361e-124    1.7476e-145    0
Weighted Sphere (30)  Median:      4.1491e-129    1.3336e-146    7.1988e-172
                      Mean:        1.1582e-127    3.1883e-143    2.635e-168
                      Minimum:     1.258e-132     4.3628e-151    5.0297e-176
                      Maximum:     2.1567e-126    1.0893e-141    7.7798e-167
                      Std. Dev.:   3.4054e-127    1.6308e-142    0
The case that parameter selection is problem-dependent rests not only on the tradeoff in performance between Rastrigin and Rosenbrock seen by comparing the tables of this chapter with Table II-1: the tables of this chapter alone show performance improving with swarm size on Rastrigin and the noisy Quartic while deteriorating with swarm size on Rosenbrock when the number of function evaluations is held constant. This means that not even one swarm size is most efficient for all problems. Furthermore, for the social-only model, performance on Sphere and Weighted Sphere improved with swarm size, but that trend reversed in the predominantly social model. The need to optimize parameters for the problem at hand is a weakness since it requires an additional optimization process prior to the optimization problem itself.
This chapter has empirically derived quality parameters to serve as a comparison
basis by which to test the proposed regrouping mechanism. The small, negative inertia
weight in conjunction with Socially Refined PSO provided good, general performance
with swarm sizes of twenty and twenty-five. While proper parameter selection can be
seen to postpone stagnation quite effectively, it may be insufficient to prevent stagnation
based on the thousands of parameter combinations tested for this chapter.
The problem-dependence of parameter selection suggests that the ability to escape
from the state of premature convergence via a regrouping mechanism in order to continue
searching for better regions might be a more generally applicable approach to dealing
with stagnation.
CHAPTER IV
REGROUPING PARTICLE SWARM OPTIMIZATION
Regroup: “to reorganize (as after a setback) for renewed activity” [30].
Motivation for Regrouping
Parameter selection is problem-dependent, so postponing stagnation may require an
extensive optimization process prior to the optimization problem itself; and even with
parameters optimized for the problem at hand, the swarm may still stagnate, as seen in
Table III-1. A regrouping mechanism is therefore sought to liberate the swarm from the
state of premature convergence, thus enabling continued progress toward a global
minimizer.
The goal of the proposed Regrouping PSO (RegPSO) is to detect when particles
have prematurely converged and regroup them within a new search space large enough to
escape from the local well in which particles have become trapped but small enough to
provide an efficient search. It is thought that this will provide an efficient means of
escape from the state of premature convergence so that the swarm can continue making
progress rather than restarting; while continually restarting the search requires running
the search an arbitrary number of times, which may or may not suffice to discover a
true global minimizer, RegPSO seeks to improve upon past searches.
Detection of Premature Convergence
As discussed earlier, all particles are pulled on all dimensions toward the global
best via update equations (2.12) and (2.3). If no particle encounters a better global best
over a period of time, the swarm will continually move closer to the unchanged global
best until the entire swarm has converged to one small region of the search space. If
particles actually have happened upon a global minimizer, they may refine that solution
by their tiny movements toward it; but in all other cases, it is undesirable for particles to
remain in this state. Therefore, it is useful to measure how near particles are to each
other so that an effective action can be taken once they have converged to the same
region. Van den Bergh’s Maximum Swarm Radius criterion for detecting premature
convergence is adopted for this purpose. It is proposed herein that when premature
convergence is detected using this maximum swarm radius measurement, the swarm be
regrouped in a new search space centered at the global best as follows.
At each iteration, k, the swarm radius, δ(k), is taken to be the maximum
Euclidean distance, in n-dimensional space, of any particle from the global best:

$$\delta(k) = \max_{i \in \{1,\dots,s\}} \left\lVert x_i(k) - g(k) \right\rVert \tag{4.1}$$

where ‖a‖ is the Euclidean norm of any vector a.
Let Ω^r represent the hypercube making up the search space at regrouping index
r, where r is initialized to zero and incremented by one with each regrouping so that
Ω^0 represents the initial search space within which particles are initialized and Ω^1, Ω^2, ...
are the subsequent search spaces within which particles are regrouped or re-initialized.

Let range^r be the vector containing the side lengths, or range per dimension, of
search space Ω^r as shown in (4.2).

$$\mathrm{range}^r = \left[ \mathrm{range}^r_1, \mathrm{range}^r_2, \dots, \mathrm{range}^r_n \right] \tag{4.2}$$

The n-dimensional hypercube Ω^r then has sides of length range^r_j for j = 1, 2, ..., n.

Let diam(Ω^r) represent the “diameter” of search space Ω^r, calculated as the
Euclidean norm of vector range^r.

$$\mathrm{diam}(\Omega^r) = \left\lVert \mathrm{range}^r \right\rVert \tag{4.3}$$
Particles are considered too close to each other, and regrouping is triggered, when the
normalized swarm radius, δ_norm, defined as the ratio of the maximum Euclidean distance
of any particle from the global best to the diameter of the search space [11], falls below a
user-specified stagnation threshold, ε, satisfying premature convergence condition (4.4).

$$\delta_{\mathrm{norm}} = \frac{\delta(k)}{\mathrm{diam}(\Omega^r)} < \varepsilon \tag{4.4}$$

An empirical study found ε = 1.1×10⁻⁴ to work well with the proposed regrouping
mechanism. Regrouping too early did not allow for the desired degree of solution
refinement, while regrouping too late meant wasting time in a stagnated state prior to
regrouping.
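Condition (4.4) amounts to one reduction over the swarm. The following Python/NumPy sketch is not part of the thesis; the function name and array conventions are illustrative only:

```python
import numpy as np

def premature_convergence(X, g, diam, eps=1.1e-4):
    """Test premature convergence condition (4.4).

    X    -- (s, n) array of particle positions
    g    -- (n,) global best position
    diam -- Euclidean norm of the current search space's range vector, eq. (4.3)
    eps  -- user-specified stagnation threshold
    """
    delta = np.max(np.linalg.norm(X - g, axis=1))  # swarm radius, eq. (4.1)
    return delta / diam < eps                      # normalized radius below threshold?
```

With the empirically recommended ε = 1.1×10⁻⁴, a swarm collapsed to within roughly a hundredth of a percent of the search-space diameter would trigger regrouping.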
Swarm Regrouping
When premature convergence is detected by condition (4.4), the swarm is
regrouped in a new search space centered at the global best. The side lengths, or range
on each dimension, of the hypercube defining the new search space, Ω^r, are determined
by (a) the magnitude of the regrouping factor, ρ, which is inversely proportional to the
stagnation threshold as shown in (4.5),

$$\rho = \frac{6}{5\varepsilon}, \tag{4.5}$$

and (b) the degree of uncertainty inferred on each dimension from the maximum
deviation from the global best. Note that the degree of uncertainty as inferred
computationally in (4.6) differs from the maximum Euclidean distance of any
particle from the global best in (4.1): the former is the maximum deviation per dimension
over all particles, while the latter is the maximum Euclidean deviation of any one particle.

$$\mathrm{range}^r_j = \min\left( \mathrm{range}^0_j,\ \rho \max_{i \in \{1,\dots,s\}} \left| x_{i,j}(k) - g_j(k) \right| \right) \tag{4.6}$$

The hypercube defining the new search space, Ω^r, is proportional on each dimension to
the degree of uncertainty upon detection of premature convergence, except that the range
on each dimension of Ω^r is clamped to a maximum of the range on the same dimension
of the initial search space, Ω^0, as shown in (4.6).
Each particle is then randomly regrouped about the global best within Ω^r
according to

$$x_i(k+1) = g(k) + r_i \circ \mathrm{range}^r - \tfrac{1}{2}\,\mathrm{range}^r \tag{4.7}$$

where r_i = [r_{i1}, r_{i2}, ..., r_{in}] with each r_{ij} ~ U(0,1) randomly selected, and ∘
denotes element-wise multiplication.

This randomizes particles to lie within implicitly defined search space

$$\Omega^r = \left[ x^{L,r}_1, x^{U,r}_1 \right] \times \left[ x^{L,r}_2, x^{U,r}_2 \right] \times \cdots \times \left[ x^{L,r}_n, x^{U,r}_n \right] \tag{4.8}$$

with respective lower and upper bounds

$$x^{L,r}_j = g_j - \tfrac{1}{2}\,\mathrm{range}^r_j, \qquad x^{U,r}_j = g_j + \tfrac{1}{2}\,\mathrm{range}^r_j. \tag{4.9}$$
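The regrouping step of (4.5) through (4.9) reduces to a few array operations. A Python/NumPy sketch follows (illustrative only, not from the thesis; the helper name and argument conventions are invented):

```python
import numpy as np

def regroup(X, g, range0, eps=1.1e-4, rng=None):
    """Regroup the swarm about the global best per eqs. (4.5)-(4.7).

    X      -- (s, n) particle positions at premature convergence
    g      -- (n,) global best
    range0 -- (n,) side lengths of the initial search space Omega^0
    """
    rng = np.random.default_rng() if rng is None else rng
    s, n = X.shape
    rho = 6.0 / (5.0 * eps)                    # regrouping factor, eq. (4.5)
    # per-dimension uncertainty, clamped to the initial range, eq. (4.6)
    new_range = np.minimum(range0, rho * np.max(np.abs(X - g), axis=0))
    # uniform regrouping about g within the new hypercube, eq. (4.7)
    X_new = g + rng.random((s, n)) * new_range - 0.5 * new_range
    return X_new, new_range
```

Each row of `X_new` then lies within the lower and upper bounds of (4.9).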
“Gbest” PSO Continues as Usual
The velocity clamping values are re-calculated based on the dimensions of the
new search space, Ω^r, according to

$$v^{\max,r} = \lambda\, \mathrm{range}^r \tag{4.10}$$

where λ is the velocity clamping percentage and superscript r is again the regrouping index.

Velocities are then re-initialized to lie within new range [−v^{max,r}_j, v^{max,r}_j] per
dimension according to

$$v_i(k) = 2\, r_i \circ v^{\max,r} - v^{\max,r} \tag{4.11}$$

where r_i = [r_{i1}, r_{i2}, ..., r_{in}] with each r_{ij} ~ U(0,1) randomly selected.

Personal bests are re-initialized as originally done such that

$$p_i(k) = x_i(k). \tag{4.12}$$
Rather than being re-initialized, the global best is remembered across regroupings.
This allows the search that was in progress prior to the occurrence of premature
convergence to continue since particles are attracted back to the best point found so far
while combing the search space along the way due to their cognitive pulls. After each
regrouping, velocities and positions continue updating as in Gbest PSO with particles
being regrouped within a new search space according to (4.6) and (4.7) when premature
convergence condition (4.4) is met.
If premature convergence occurs near an edge of the hypercube defining the
original search space, the new search space may not necessarily be a subspace of the
original search space since it may be desirable to search outside the original bounds if the
initial search space was only a guess as to where solutions were likely to be found.
Restricting particles to the original search space is easy to do via position clamping or
velocity reset [31] if it is known for a fact that better solutions do not lie outside the
search space. In practice, it is easier to make an educated guess as to where a solution
will lie than to know for certain that no better solutions can be found elsewhere; for this
reason, particles are not generally required to stay within the search space if they have
good reason to explore outside of it.
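If restricting particles to the original bounds is in fact warranted, the position-clamping and velocity-reset ideas mentioned above can be sketched as follows (a Python/NumPy illustration only; the helper name and the exact velocity-reset policy are assumptions, not the form given in [31]):

```python
import numpy as np

def clamp_positions(X, V, lb, ub, reset_velocity=False):
    """Restrict particles to the original search space, advisable only when
    better solutions are known not to lie outside it. Positions are clamped
    to the bounds; optionally, a clamped component's velocity is zeroed,
    a simple form of the velocity-reset idea."""
    clamped = (X < lb) | (X > ub)       # which components left the space
    X = np.clip(X, lb, ub)              # position clamping
    if reset_velocity:
        V = np.where(clamped, 0.0, V)   # velocity reset on clamped components
    return X, V
```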
Since the PSO algorithm works well prior to premature convergence, the new
RegPSO algorithm does not require changes to the original position and velocity update
equations but merely liberates the swarm from premature convergence via an automatic
regrouping mechanism. The pseudo code for RegPSO is given in Figure IV-1.
Do with Each New Grouping
    For j = 1 to n
        If r = 0
            Calculate range^0_j according to (2.20).
        Else
            Calculate range^r_j according to (4.6).
        End If
    End For
    Calculate v^{max,r} according to (4.10).
    Calculate the diameter, diam(Ω^r), of the current search space using (4.3).
    For i = 1 to s
        Randomly initialize the particle's velocity, v_i(k), according to (4.11).
        Randomly initialize the particle's position, x_i(k), to lie within Ω^r.
        Initialize the personal best: p_i(k) = x_i(k).
    End For
    If r = 0
        Initialize the global best, g(k), according to (2.24).
    End If
    Do Iteratively
        Update velocities according to (2.12).
        Clamp velocities when necessary according to Figure 1.
        Update positions according to (2.3).
        Update personal bests according to (2.26).
        Update the global best according to (2.24).
        Calculate the swarm radius according to (4.1).
        If (i) the premature convergence criterion of (4.4) is met, or (ii) a
        user-defined maximum number of function evaluations per grouping
        is satisfied,
            Then regroup the swarm.
        End If
    Until search termination
Figure IV-1: RegPSO pseudo code
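As a concrete illustration of Figure IV-1, a minimal RegPSO implementation in Python/NumPy follows. This is a sketch only, not the thesis's code: names and defaults are invented, the chapter II update equations are inlined in their common form, boundary handling is omitted, and regrouping is triggered only by condition (4.4), not by the per-grouping evaluation cap.

```python
import numpy as np

def regpso(f, lb, ub, s=20, w=0.72984, c1=1.49618, c2=1.49618,
           lam=0.5, eps=1.1e-4, max_evals=20000, seed=0):
    """Minimal RegPSO sketch with a Gbest PSO core (cf. Figure IV-1)."""
    rng = np.random.default_rng(seed)
    n = len(lb)
    rho = 6.0 / (5.0 * eps)                          # regrouping factor, eq. (4.5)
    g, g_val = None, np.inf
    X = None
    evals = 0
    while evals < max_evals:
        if g is None:                                # first grouping (r = 0)
            range_r = ub - lb
            X = lb + rng.random((s, n)) * range_r
        else:                                        # regroup: eqs. (4.6)-(4.7)
            range_r = np.minimum(ub - lb,
                                 rho * np.max(np.abs(X - g), axis=0))
            X = g + rng.random((s, n)) * range_r - 0.5 * range_r
        vmax = lam * range_r                         # eq. (4.10)
        V = 2.0 * rng.random((s, n)) * vmax - vmax   # eq. (4.11)
        P = X.copy()                                 # eq. (4.12)
        p_vals = np.array([f(x) for x in X])
        evals += s
        if p_vals.min() < g_val:                     # global best is remembered
            g_val = float(p_vals.min())
            g = P[p_vals.argmin()].copy()
        diam = float(np.linalg.norm(range_r))        # eq. (4.3)
        while evals < max_evals:                     # Gbest PSO continues as usual
            r1, r2 = rng.random((s, n)), rng.random((s, n))
            V = w * V + c1 * r1 * (P - X) + c2 * r2 * (g - X)
            V = np.clip(V, -vmax, vmax)              # velocity clamping
            X = X + V
            vals = np.array([f(x) for x in X])
            evals += s
            better = vals < p_vals
            P[better] = X[better]
            p_vals[better] = vals[better]
            if p_vals.min() < g_val:
                g_val = float(p_vals.min())
                g = P[p_vals.argmin()].copy()
            delta = np.max(np.linalg.norm(X - g, axis=1))  # eq. (4.1)
            if delta / diam < eps:                         # eq. (4.4)
                break                                      # trigger regrouping
    return g, g_val
```

Note that the global best carries over from grouping to grouping, so each regrouped search improves upon past searches rather than restarting from scratch.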
Two-Dimensional Demonstration of the Regrouping Mechanism
In chapter two, the swarm behavior of Gbest PSO was observed within the search
space of the two-dimensional Rastrigin benchmark in order to demonstrate the stagnation
problem. At iteration 102 in Figure II-13, premature convergence condition (4.4) was
satisfied, which automatically triggered the regrouping shown in Figure IV-2, by which
to escape the state of premature convergence. Figures IV-2 through IV-17 show how
RegPSO helps the swarm find the global minimizer. Figure IV-18 shows the benefit that
regrouping has on the function value. These figures were generated using the same
parameters used for Figures II-3 through II-13 along with stagnation threshold
ε = 1.1×10⁻⁴ and regrouping factor ρ = 6/(5ε).
Figure IV-2: Swarm Regrouped (Iter. 103)
RegPSO detected premature convergence at iteration 102 of Figure II-13.
The swarm is regrouped above at iteration 103 in order to continue making
progress toward the global minimizer. Personal bests are re-initialized
according to (4.12). The new, smaller search space is shown.
Figure IV-3: PSO in New Search Space (Iter. 113)
“Gbest” PSO continues as usual within the new search space after
regrouping. The swarm is returning cautiously to the global best
with new momenta, personal bests, and perspectives.
Figure IV-4: Swarm Migration (Iter. 123)
The swarm is migrating toward a better position found by one
of the particles near [1, 0].
Figure IV-5: New Well Considered (Iter. 133)
Some particles are refining the approximation to the local minimizer
near [1, 0] while others continue exploring due to their momenta and
cognitive accelerations.
Figure IV-6: Most Bests Relocated (Iter. 143)
Most of the particles’ personal bests now belong to the well
containing the local minimizer near [1, 0]. Notice the uncertainty
on the horizontal dimension.
Figure IV-7: Swarm Collapses (Iter. 153)
Particles collapse on the horizontal dimension to the new improved well.
Figure IV-8: Horizontal Uncertainty (Iter. 163)
Cognitively, the swarm doubts its decision on the horizontal dimension
more so than on the vertical dimension.
Figure IV-9: Uncertainty Remains (Iter. 173)
The relative uncertainty on the horizontal dimension is still evident.
Figure IV-10: Becoming Convinced (Iter. 183)
The entire swarm is converging to the local minimizer near [1, 0],
refining the quality of the solution with small steps toward it.
Figure IV-11: Premature Convergence Detected (Iter. 219)
Premature convergence is again calculated via condition (4.4).
Particles have had enough time to refine solution quality, which is
an important part of the search, and will be regrouped in the
following iteration.
Figure IV-12: Second Regrouping (Iter. 220)
Because particles were cognitively less certain of their solution on the
horizontal dimension than on the vertical, the swarm regrouped
about the global best with a larger horizontal range than vertical.
Figure IV-13: Better Well Discovered (Iter. 230)
Particles return with momenta and cognitive restraint toward the global
best remembered near local minimum [1, 0].
Figure IV-14: Swarm Migration (Iter. 240)
The efficient regrouping mechanism helps particles quickly find the
well containing global minimizer [0, 0].
Figure IV-15: Swarm Collapsing (Iter. 250)
Particles swarm toward the new global best.
Figure IV-16: Particles Swarm to the Newly Found Well (Iter. 260)
Personal bests now lie within the new well, eliminating the cognitive
pull to other locations. Momenta wane.
Figure IV-17: Convergence (Iter. 270)
Solution refinement of the global minimizer is in progress.
Figure IV-18: Effect of Regrouping on Cost Function Value
Performance comparison of Gbest PSO and the proposed RegPSO
on the Rastrigin benchmark with dimension n = 2.
Figure IV-18 shows that shortly after the first regrouping and immediately after
the second regrouping, solution quality was improved as particles escaped premature
convergence in order to continue onward toward the true global minimizer rather than
simply stagnating in place. Having presented the regrouping concept in two dimensions,
its effectiveness is now tested in the much more difficult thirty-dimensional case.
CHAPTER V
TESTING AND COMPARISONS
While the linearly decreasing inertia weight was shown in Table II-2 to improve
solution quality on multi-modal functions by postponing stagnation, this results from a
weighting scheme where early information is forgotten less rapidly than late information
as inferred from cumulative velocity update equation (2.18). Postponing stagnation in a
more cautious search, however, prevents regrouping from being triggered as often. It was
found to be more beneficial to allow the quick convergence of the popular static weight
and regroup at premature convergence than to take a considerably longer time to
converge cautiously and regroup less often. For this reason, the popular static inertia
weight, ω = 0.72984, is proposed for use with RegPSO.
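The linearly decreasing inertia weight referred to above is conventionally scheduled from 0.9 down to 0.4 over the allotted iterations. A one-line Python sketch (the function name and the endpoints-as-defaults are illustrative):

```python
def inertia(k, k_max, w_start=0.9, w_end=0.4):
    """Linearly decreasing inertia weight: w_start at k = 0 down to
    w_end at k = k_max."""
    return w_start - (w_start - w_end) * (k / k_max)
```

RegPSO instead keeps the static value ω = 0.72984 throughout, relying on regrouping rather than a slow weight schedule to manage stagnation.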
While clamping velocities to fifteen percent rather than the more common fifty
percent generally leads both to quicker convergence and higher quality solutions, the
value of fifty percent was found to be most useful with RegPSO, which is surely
attributable to the re-initialization of velocities within [−v^{max,r}_j, v^{max,r}_j] per dimension
according to equation (4.11) with each regrouping. Using larger velocities at regrouping
would certainly help particles escape from entrapping regions in order to find better
minimizers – not only because larger velocities carry particles farther from entrapping
regions but also because larger velocities mean larger momenta by which to overshoot
the global best and explore on the other side of the entrapping region before converging
again upon the global best if no better position is found by which to update it.
Hence, it is with the standard parameters recommended by Clerc, as used in (2.18), that
RegPSO is tested, in conjunction with velocities clamped to the popular fifty percent of
the range of the search space per dimension.
Comparison with Standard “Gbest” & “Lbest” PSO’s
In Table V-1, Gbest PSO, Lbest PSO and RegPSO are compared side by side for
800,000 total function evaluations. The point of selecting this number is to show that
RegPSO is capable of avoiding stagnation and continuing onward to approximate the true
global minimizer if given enough time to do so. The standard algorithms use the linearly
decreasing inertia weight and velocity clamping to fifteen percent demonstrated to work
well for Gbest PSO in Table II-1 and Table II-2 and empirically verified to work well
with Lbest PSO. The results on Rastrigin are especially impressive since this benchmark
generally returns high function values in the literature due to stagnation of the swarm. It
is clear that RegPSO is more consistent across the benchmark suite.
Table V-1: RegPSO Compared to Gbest PSO & Lbest PSO with Neighborhood Size 2
800,000 function evaluations; s = 20; c1 = c2 = 1.49618; 50 trials per row per column.
RegPSO used ε = 1.1×10⁻⁴, ρ = 1.2/ε, and 100,000 function evaluations max per grouping.
Column settings: Gbest PSO (λ = 0.15, ω: 0.9 to 0.4); Lbest PSO with neighborhood
size of 2 (λ = 0.15, ω: 0.9 to 0.4); RegPSO (λ = 0.5, ω = 0.72984).

Benchmark (n)          Stat          Gbest PSO      Lbest PSO      RegPSO
Ackley (30)            Median:       7.9936e-15     7.9936e-15     5.0832e-7
                       Mean:         1.1191e-14     1.0623e-14     5.2345e-7
                       Minimum:      4.4409e-15     7.9936e-15     1.9571e-7
                       Maximum:      4.3521e-14     1.5099e-14     9.7466e-7
                       Std. Dev.:    8.0648e-15     3.428e-15      1.6771e-7
Griewangk (30)         Median:       0.012319       0.009861       0.0098573
                       Mean:         0.022023       0.012538       0.013861
                       Minimum:      0              0              0
                       Maximum:      0.090322       0.075718       0.058867
                       Std. Dev.:    0.024071       0.015404       0.01552
Quadric (30)           Median:       1.2644e-17     5.877e-24      2.5503e-10
                       Mean:         2.3189e-14     5.9577e-22     3.1351e-10
                       Minimum:      2.6219e-22     2.0446e-28     6.0537e-11
                       Maximum:      8.6952e-13     1.5377e-20     9.5804e-10
                       Std. Dev.:    1.2438e-13     2.2534e-21     2.2243e-10
Quartic w/ noise (30)  Median:       0.0015335      0.0024195      0.0006079
                       Mean:         0.0015241      0.0025417      0.00064366
                       Minimum:      0.00033276     0.00084968     0.0002655
                       Maximum:      0.0028314      0.0044732      0.0012383
                       Std. Dev.:    0.00065649     0.00070295     0.00021333
Rastrigin (30)         Median:       24.3765        28.8538        2.3981e-14
                       Mean:         25.252         31.2746        2.6824e-11
                       Minimum:      13.9294        15.9193        0
                       Maximum:      42.7832        73.6268        1.3337e-9
                       Std. Dev.:    7.06661        11.419         1.886e-10
Rosenbrock (30)        Median:       8.63847        0.070101       0.0030726
                       Mean:         18.859         1.0713         0.0039351
                       Minimum:      0.000292151    9.1079e-6      1.7028e-5
                       Maximum:      81.5558        4.0744         0.018039
                       Std. Dev.:    25.9117        1.7196         0.0041375
Schaffer’s f6 (2)      Median:       0              0              0
                       Mean:         0              0              0
                       Minimum:      0              0              0
                       Maximum:      0              0              0
                       Std. Dev.:    0              0              0
Sphere (30)            Median:       2.8331e-106    8.4679e-241    5.8252e-15
                       Mean:         1.0834e-94     2.1967e-215    9.2696e-15
                       Minimum:      6.4959e-117    1.1756e-258    1.2852e-15
                       Maximum:      4.885e-93      1.0983e-213    4.9611e-14
                       Std. Dev.:    6.9319e-94     0              8.6636e-15
Weighted Sphere (30)   Median:       9.6085e-104    3.5402e-240    8.1295e-14
                       Mean:         4.4182e-93     1.2102e-225    9.8177e-14
                       Minimum:      7.6884e-121    7.5531e-252    1.9112e-14
                       Maximum:      1.6078e-91     5.8251e-224    2.5244e-13
                       Std. Dev.:    2.3941e-92     0              5.4364e-14
As was shown in Table II-1 and Table II-2, the decreasing inertia weight
improved the performance of Gbest PSO on most of the benchmark suite at the cost of
deteriorated performance on Rosenbrock. “Lbest” PSO does not respond as adversely as
Gbest PSO on Rosenbrock to the otherwise beneficially decreasing inertia weight, but it
is outperformed by Gbest PSO on Rastrigin. “Lbest” PSO seems to perform well on
simple uni-modal benchmarks but suffers on the more complicated uni-modal
Rosenbrock, multi-modal Rastrigin, and noisy Quartic.
Regrouping was not necessary to successfully traverse the multi-modal Ackley
function since its local wells are minor relative to the overall curvature leading to the
global minimizer; however, only RegPSO consistently solved the more difficult multi-
modal Rastrigin due to its prominent local wells, which have a significant impact relative
to the slight overall curvature leading to the global minimizer. RegPSO also provided the
best performance in the presence of noise. Only RegPSO consistently solved the tricky
Rosenbrock, and only RegPSO was able to consistently solve the multi-modal
benchmarks.
Only RegPSO was able to approximate the true global minimizer for all four
hundred and fifty trials, which can be ascertained from the worst case performance per
benchmark. The two standard PSO algorithms show greater problem dependency such as
on Rastrigin, where the true global minimizer was not approximated with even one trial
by either standard PSO algorithm. Due to its apparently lower problem-dependency,
RegPSO may be more applicable than standard PSO for solving problems about which
little is known in advance since it performed consistently in the presence of noise, on
multi-modal benchmarks, and on uni-modal benchmarks.
Figure V-1: Mean Behavior of RegPSO on 30D Rastrigin
A swarm size of 20 is sufficient to approximate the global minimizer of the
30-D Rastrigin and reduce the cost function to approximately its true minimum.
Comparison with Socially Refined PSO
In Table V-2, RegPSO is compared to the best of the Socially Refined PSO
parameters derived from the Rastrigin experiment of chapter three in order to test
whether the regrouping mechanism provides better general performance than even
parameters painstakingly chosen by trial and error. Each algorithm uses the velocity
clamping percentage empirically found to work well for it.
Table V-2: RegPSO Compared with Socially Refined PSO
RegPSO used ε = 1.1×10⁻⁴, ρ = 1.2/ε, and 100,000 function evaluations max per grouping.
Column settings: Socially Refined PSO (c1 = 0.1, c2 = 3.5, ω = 0.1, λ = 0.15, s = 20);
Socially Refined PSO (c1 = 0.1, c2 = 3.7, ω = 0.01, λ = 0.15, s = 20);
RegPSO (c1 = c2 = 1.49618, ω = 0.72984, λ = 0.5, s = 20).

Benchmark (n)          Stat          SR PSO (c2=3.5)  SR PSO (c2=3.7)  RegPSO
Ackley (30)            Median:       3.9968e-14       3.9968e-14       5.0832e-7
                       Mean:         5.3966e-14       4.9916e-14       5.2345e-7
                       Minimum:      2.2204e-14       2.931e-14        1.9571e-7
                       Maximum:      2.0695e-13       2.78e-13         9.7466e-7
                       Std. Dev.:    3.9636e-14       3.5686e-14       1.6771e-7
Griewangk (30)         Median:       0.014772         0.020953         0.0098573
                       Mean:         0.023663         0.029729         0.013861
                       Minimum:      0                0                0
                       Maximum:      0.12269          0.1662           0.058867
                       Std. Dev.:    0.026331         0.031242         0.01552
Quadric (30)           Median:       7.3887e-29       1.4344e-26       2.5503e-10
                       Mean:         5.0276e-28       2.1305e-25       3.1351e-10
                       Minimum:      2.0621e-31       1.7896e-28       6.0537e-11
                       Maximum:      1.0916e-26       7.1409e-24       9.5804e-10
                       Std. Dev.:    1.5776e-27       1.0083e-24       2.2243e-10
Quartic w/ noise (30)  Median:       0.0046839        0.003471         0.0006079
                       Mean:         0.0072108        0.0044198        0.00064366
                       Minimum:      0.0015745        0.00090772       0.0002655
                       Maximum:      0.025236         0.046851         0.0012383
                       Std. Dev.:    0.0055917        0.0063827        0.00021333
Rastrigin (30)         Median:       0.99496          3.5527e-15       2.3981e-14
                       Mean:         1.4924           0.45768          2.6824e-11
                       Minimum:      1.7764e-15       0                0
                       Maximum:      5.9697           3.9798           1.3337e-9
                       Std. Dev.:    1.4249           0.8339           1.886e-10
Rosenbrock (30)        Median:       0.934938         3.77921          0.0030726
                       Mean:         3.82356          5.71585          0.0039351
                       Minimum:      6.75498e-6       1.36618e-7       1.7028e-5
                       Maximum:      11.5996          14.4996          0.018039
                       Std. Dev.:    4.44694          5.43735          0.0041375
Schaffer’s f6 (2)      Median:       0.0097159        0.0097159        0
                       Mean:         0.0097159        0.0093273        0
                       Minimum:      0.0097159        0                0
                       Maximum:      0.0097159        0.0097159        0
                       Std. Dev.:    0                0.0019233        0
Sphere (30)            Median:       3.8905e-172      1.6609e-146      5.8252e-15
                       Mean:         2.3075e-166      2.5295e-143      9.2696e-15
                       Minimum:      1.0487e-177      3.5956e-151      1.2852e-15
                       Maximum:      1.1487e-164      1.135e-141       4.9611e-14
                       Std. Dev.:    0                1.6036e-142      8.6636e-15
Weighted Sphere (30)   Median:       7.1988e-172      2.2059e-145      8.1295e-14
                       Mean:         2.635e-168       4.2659e-141      9.8177e-14
                       Minimum:      5.0297e-176      1.335e-149       1.9112e-14
                       Maximum:      7.7798e-167      2.0798e-139      2.5244e-13
                       Std. Dev.:    0                2.9401e-140      5.4364e-14
While the Socially Refined PSO improves performance over the standard Gbest
and Lbest PSO’s, RegPSO is still seen to be less problem-dependent and consequently
more consistent across the benchmark suite than the Socially Refined PSO resulting from
the Rastrigin experiment of chapter three. RegPSO demonstrates the versatility to solve
different types of problems.
Comparison with MPSO
In Table V-3, MPSO using the normalized swarm radius convergence detection
criterion was selected for comparison since it was Van den Bergh’s best-performing
restart algorithm: outperforming guaranteed convergence PSO (GCPSO), multi-start PSO
using the cluster analysis convergence detection technique (MPSOcluster), multi-start PSO
using the objective function slope convergence detection technique (MPSOslope), and
random particle swarm optimization (RPSO). MPSO [11] restarts particles on the
original search space when premature convergence is detected, and it uses an improved
local optimizer, GCPSO [11, 12], as its core search algorithm rather than the basic Gbest
PSO for which the regrouping mechanism is currently being tested. RegPSO is compared
to MPSO in order to confirm that the proposed regrouping mechanism is indeed more
efficient than continually restarting on the original search space.
Table V-3: RegPSO Compared with MPSO
50 trials per benchmark per algorithm; 200,000 function evaluations per trial.
s = 20; λ = 0.5; c1 = c2 = 1.49618; ω = 0.72984; and ε = 10⁻⁶ for both algorithms.
RegPSO used ρ = 1.2/ε and 100,000 function evaluations max per grouping.

Benchmark (n)       Stat       MPSO [11] (using GCPSO)   RegPSO (using Gbest PSO)
Ackley (30)         Median:    0.931                     2.0806e-8
                    Mean:      0.751                     2.4194e-8
Griewangk (30)      Median:    1.52e-9                   0.019684
                    Mean:      1.99e-9                   0.030309
Rastrigin (30)      Median:    45.8                      10.9525
                    Mean:      45.8                      11.9726
Mean Performance:              15.517                    4.001
The comparison was not entirely fair to RegPSO because GCPSO is an improved form of
the Gbest PSO at RegPSO's core; the regrouping mechanism was, however, efficient
enough to overcome this handicap and provide greater consistency than continually
restarting GCPSO on the original search space.
Comparison with OPSO
In Table V-4, RegPSO is compared to opposition-based PSO (OPSO) with
Cauchy mutation, which was developed to “accelerate the convergence of PSO and avoid
premature convergence” [14]. In OPSO, each particle has a fifty percent chance of being
selected to have its position opposite the center of the swarm evaluated in addition to
having its own position evaluated. If the opposite position is better, the particle jumps to
that position, leaving the less beneficial position behind. This is done to maintain
diversity with the hope of avoiding premature convergence. The Cauchy mutation
mutates the global best according to a distribution capable of providing large mutations at
times when compared to normal or uniform distributions; when the mutated position is
better than the original global best, the mutation is kept.
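The opposition step just described can be sketched in a few lines of Python/NumPy. This is an illustration of the idea only, not the reference OPSO implementation: the selection bookkeeping and the Cauchy mutation of the global best are omitted, and the helper name is invented.

```python
import numpy as np

def opposition_candidates(X, p=0.5, rng=None):
    """For each particle, with probability p, propose the position mirrored
    through the swarm center; the caller evaluates the proposals and moves
    any particle whose opposite position is better."""
    rng = np.random.default_rng() if rng is None else rng
    center = X.mean(axis=0)             # center of the swarm
    selected = rng.random(len(X)) < p   # about half the particles, on average
    X_opposite = 2.0 * center - X       # reflection through the center
    return X_opposite, selected
```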
OPSO was presented with a swarm size of ten for an expected sixteen function
evaluations per iteration resulting from ten particles: five opposite positions expected
according to probability one-half, and one extra function evaluation to consider mutating
the global best. However, it has been found empirically to work better with larger swarm
sizes. OPSO is compared with RegPSO using twenty particles over eight hundred
thousand function evaluations in order to see how capable the algorithm really is at
avoiding premature convergence. OPSO is given the benefit of the fifteen percent
velocity clamping value found to work well with the Gbest PSO it utilizes and
empirically verified to perform better with OPSO than a clamping value of fifty percent.
Table V-4: OPSO Compared with RegPSO both with and without Cauchy Mutation
800,000 function evaluations; s = 20; c1 = c2 = 1.49618; ω = 0.72984.
Column settings: OPSO (λ = 0.15); OPSO with Cauchy mutation (λ = 0.15); RegPSO (λ = 0.5).

Benchmark (n)          Stat          OPSO           OPSO w/ Cauchy   RegPSO
Ackley (30)            Median:       2.2686         1.8997           5.0832e-7
                       Mean:         2.3144         1.9851           5.2345e-7
                       Minimum:      0.9313         7.9936e-15       1.9571e-7
                       Maximum:      4.3405         3.3449           9.7466e-7
                       Std. Dev.:    0.68001        0.69098          1.6771e-7
Griewangk (30)         Median:       0.04051        0.016007         0.0098573
                       Mean:         0.071839       0.025233         0.013861
                       Minimum:      0              0                0
                       Maximum:      1.1762         0.15272          0.058867
                       Std. Dev.:    0.17207        0.030445         0.01552
Quadric (30)           Median:       1.323e-73      1.7387e-69       2.5503e-10
                       Mean:         7.3917e-71     3.4375e-66       3.1351e-10
                       Minimum:      6.4403e-79     3.0621e-74       6.0537e-11
                       Maximum:      1.6987e-69     1.1251e-64       9.5804e-10
                       Std. Dev.:    2.7855e-70     1.639e-65        2.2243e-10
Quartic w/ noise (30)  Median:       0.0048506      0.0047243        0.0006079
                       Mean:         0.0050701      0.0047242        0.00064366
                       Minimum:      0.0019586      0.0024328        0.0002655
                       Maximum:      0.0086246      0.0072114        0.0012383
                       Std. Dev.:    0.0015018      0.0011654        0.00021333
Rastrigin (30)         Median:       45.7681        49.74788         2.3981e-14
                       Mean:         48.136         51.63829         2.6824e-11
                       Minimum:      19.8992        24.87396         0
                       Maximum:      73.6268        107.4552         1.3337e-9
                       Std. Dev.:    13.0222        16.05482         1.886e-10
Rosenbrock (30)        Median:       5.3552e-7      0.000817947      0.0030726
                       Mean:         1.70137        2.45937          0.0039351
                       Minimum:      5.88319e-18    7.26269e-17      1.7028e-5
                       Maximum:      12.0378        24.5545          0.018039
                       Std. Dev.:    2.6865         3.9952           0.0041375
Schaffer’s f6 (2)      Median:       0              0                0
                       Mean:         0              0                0
                       Minimum:      0              0                0
                       Maximum:      0              0                0
                       Std. Dev.:    0              0                0
Sphere (30)            Median:       0              0                5.8252e-15
                       Mean:         3.4585e-322    0                9.2696e-15
                       Minimum:      0              0                1.2852e-15
                       Maximum:      1.6171e-320    2.4703e-323      4.9611e-14
                       Std. Dev.:    0              0                8.6636e-15
Weighted Sphere (30)   Median:       0              0                8.1295e-14
                       Mean:         9.3872e-323    2.4703e-323      9.8177e-14
                       Minimum:      0              0                1.9112e-14
                       Maximum:      4.0019e-321    4.0019e-322      2.5244e-13
                       Std. Dev.:    0              0                5.4364e-14
OPSO did not meet its goal of avoiding premature convergence as can be seen
from the Rastrigin benchmark over eight hundred thousand function evaluations.
RegPSO again provided the best consistency across benchmarks. The consistency across
the benchmark suite is a result of regrouping, which on very simple functions can
actually prevent particles from continuing to refine solution quality by regrouping when it
is not necessary to do so. This tradeoff is eagerly accepted when it is not known in
advance that a particular function is extremely simple since it is far more important to
approximate the global minimizer than to have better approximations sometimes and
horrible approximations other times; however, if it is known in advance that a problem is
quite simple to solve, a different regrouping mechanism is provided in the following
section specifically for this case so that no tradeoff in performance is necessary.
In all tables, the mean performance of RegPSO across benchmarks was superior
to that of the comparison algorithms.
RegPSO for Simple Uni-Modal Problems
Having demonstrated RegPSO to be less problem-dependent and more consistent
across the benchmark suite, the question became whether RegPSO might be capable of
improving performance on simple, uni-modal functions. Toward this end, a tiny
stagnation threshold of ε = 10⁻²⁵ was combined with a regrouping factor of
ρ = (6/(5×10¹⁹))·(1/ε) = 1.2×10⁶, which is a much smaller fraction of the inverse of the
stagnation threshold than used previously. The results are shown in Table V-5.
Table V-5: A RegPSO Model for Solution Refinement Rather than Exploration
s = 20; ω = 0.72984; c1 = c2 = 1.49618; ε = 10⁻²⁵; ρ = 1.2×10⁶; λ = 0.5.
800,000 function evaluations; 100,000 function evaluations max per grouping.

Benchmark (n)          Stat          RegPSO
Ackley (30)            Median:       3.1206
                       Mean:         3.6524
                       Minimum:      1.5017
                       Maximum:      7.0836
                       Std. Dev.:    1.4975
Griewangk (30)         Median:       0.049122
                       Mean:         0.055008
                       Minimum:      0
                       Maximum:      0.15666
                       Std. Dev.:    0.044639
Quadric (30)           Median:       7.6883e-77
                       Mean:         1.7754e-72
                       Minimum:      1.5117e-82
                       Maximum:      7.4807e-71
                       Std. Dev.:    1.0691e-71
Quartic w/ noise (30)  Median:       0.00058695
                       Mean:         0.00063169
                       Minimum:      0.0002655
                       Maximum:      0.0012383
                       Std. Dev.:    0.00021131
Rastrigin (30)         Median:       70.64194
                       Mean:         71.63686
                       Minimum:      42.78316
                       Maximum:      116.4097
                       Std. Dev.:    17.1532
Rosenbrock (30)        Median:       1.7703e-16
                       Mean:         0.87706
                       Minimum:      9.1369e-21
                       Maximum:      3.9866
                       Std. Dev.:    1.6682
Schaffer’s f6 (2)      Median:       0
                       Mean:         0
                       Minimum:      0
                       Maximum:      0
                       Std. Dev.:    0
Sphere (30)            Median:       0
                       Mean:         0
                       Minimum:      0
                       Maximum:      0
                       Std. Dev.:    0
Weighted Sphere (30)   Median:       0
                       Mean:         0
                       Minimum:      0
                       Maximum:      0
                       Std. Dev.:    0
Note that these results on the simple uni-modal Quadric, Sphere, and Weighted
Sphere are the best of all algorithms tested over a full eight hundred thousand function
evaluations, which suggests that RegPSO is highly scalable.
CHAPTER VI
CONCLUSIONS
An approach for dealing with the stagnation problem in PSO has been tested by
building into the algorithm a mechanism to automatically trigger swarm regrouping when
premature convergence is detected. The regrouping mechanism helps liberate particles
from the state of premature convergence and enables continued progress toward a global
minimizer. RegPSO has been shown to have better mean performance than the
algorithms it was compared with, a result that would have been more pronounced had
only multi-modal benchmarks been used. RegPSO also consistently outperformed in the
presence of noise. Given sufficient function evaluations, RegPSO was able to solve the
stagnation problem for each benchmark tested and approximate the true global minimizer
with each trial conducted.
Though the parameters used for RegPSO worked consistently across the
benchmark suite, it is not claimed that parameters have been fully optimized. While
RegPSO seems capable of reducing the problem-dependency usually seen in the standard
PSO algorithms so that parameter optimization may be less important, parameters such as
the regrouping factor certainly do have some degree of problem dependency. It may be
necessary to change parameters should the problem at hand present unusual difficulty.
For example, should greater precision be necessary, a smaller stagnation threshold could
be selected in order to allow more solution refinement prior to regrouping; conversely,
should less precision be necessary, particles could be regrouped sooner by setting a larger
stagnation threshold in order to achieve a quicker overall search.
RegPSO appears to be a good general-purpose optimizer based on the benchmarks
tested, which is certainly encouraging; however, it is cautioned that the empirical nature
of the experiment is not a theoretical proof that RegPSO will solve every problem well:
certainly, its performance must suffer somewhere. Future work will seek to identify
where the algorithm suffers in order to understand its limitations and apply it in the
proper contexts. One such difficulty already observed was with simple uni-modal functions,
where regrouping is unnecessary since particles quickly and easily approximate the
global minimizer to a high degree of accuracy, and where there is no better minimizer to
be found. However, even in this context, regrouping proved beneficial when the
stagnation threshold and regrouping factor were set small enough to help particles
improve accuracy of approximations to the true global minimizer – often finding it
exactly.
While the regrouping mechanism has been tested in conjunction with standard
Gbest PSO in order to demonstrate the usefulness of the mechanism itself, there does not
seem to be anything to prevent the same regrouping mechanism from being applied with
another search algorithm at its core. Performance may be improved in conjunction with
an improved local minimizer such as GCPSO.
It may be beneficial to consider turning off the regrouping mechanism once
particles have repeatedly converged to the same solution. This would allow eventual
solution refinement of greater precision rather than repeatedly cutting off the local search
in favor of exploration elsewhere.
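This suggestion can be sketched as a simple switch. The helper name, the tolerance, and the repeat count below are hypothetical choices for illustration, not part of the thesis:

```python
import numpy as np

def should_keep_regrouping(gbest_history, tol=1e-6, max_repeats=3):
    """Return False once the last max_repeats regroupings all converged to
    (nearly) the same global best, signalling a switch to pure refinement."""
    if len(gbest_history) < max_repeats:
        return True
    recent = [np.asarray(g, dtype=float) for g in gbest_history[-max_repeats:]]
    # Keep regrouping only if some recent solution differs from the first.
    return any(np.linalg.norm(g - recent[0]) > tol for g in recent[1:])
```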
It has been empirically observed that clamping velocities to fifteen percent of the
range of the search space on each dimension often provides a quicker convergence to
solutions of higher quality in conjunction with standard PSO. RegPSO using standard
Gbest PSO as its core, however, appears to benefit from larger velocities such as those
clamped to fifty percent of the range on each dimension. The larger maximum velocity
facilitates exploration after regrouping by allowing larger step sizes and more significant
momenta by which to resist repeated premature convergence to the remembered global
best. It may be possible to further improve RegPSO via a velocity clamping value that
gradually decreases from fifty percent to fifteen percent with each regrouping, so the
benefits of both values can be reaped.
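The clamping scheme just described can be sketched as follows; the geometric decay schedule and the helper names are illustrative assumptions, not a formulation taken from the thesis:

```python
import numpy as np

def clamp_velocity(v, lo, hi, k):
    """Clamp each velocity component to k times the search range on that
    dimension (e.g. k = 0.15 or k = 0.5)."""
    vmax = k * (hi - lo)
    return np.clip(v, -vmax, vmax)

def clamp_fraction(regroupings, k_start=0.5, k_end=0.15, decay=0.5):
    """Illustrative schedule: decay geometrically from 50% of the range
    toward 15% as the number of regroupings grows."""
    return k_end + (k_start - k_end) * decay ** regroupings
```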
With one set of parameters, RegPSO seems to improve performance consistency by
facilitating escape from potentially deceitful local wells; with another set of parameters,
designed to regroup within a tiny region rather than to escape from it, RegPSO solves
simple uni-modal problems free of entrapping wells quite well. It is suspected that
RegPSO may provide a degree of scalability previously missing in the standard PSO
algorithm.
APPENDIX (BENCHMARKS)

Each benchmark is listed with its dimensionality n, its formula, and its search range.

Ackley (n = 30):
    f(\vec{x}) = 20 + e - 20\exp\left(-0.2\sqrt{\frac{1}{n}\sum_{j=1}^{n} x_j^2}\right) - \exp\left(\frac{1}{n}\sum_{j=1}^{n}\cos(2\pi x_j)\right), \quad -32 \le x_j \le 32

Griewangk (n = 30):
    f(\vec{x}) = 1 + \frac{1}{4000}\sum_{j=1}^{n} x_j^2 - \prod_{j=1}^{n}\cos\left(\frac{x_j}{\sqrt{j}}\right), \quad -600 \le x_j \le 600

Quadric (n = 30):
    f(\vec{x}) = \sum_{j=1}^{n}\left(\sum_{k=1}^{j} x_k\right)^2, \quad -100 \le x_j \le 100

Quartic with noise (n = 30):
    f(\vec{x}) = \sum_{i=1}^{n} i\,x_i^4 + \mathrm{random}[0,1), \quad -1.28 \le x_j \le 1.28

Rastrigin (n = 30):
    f(\vec{x}) = 10n + \sum_{j=1}^{n}\left(x_j^2 - 10\cos(2\pi x_j)\right), \quad -5.12 \le x_j \le 5.12

Rosenbrock (n = 30):
    f(\vec{x}) = \sum_{j=1}^{n-1}\left[100\left(x_{j+1} - x_j^2\right)^2 + \left(1 - x_j\right)^2\right], \quad -30 \le x_j \le 30

Schaffer's f6 (n = 2):
    f(\vec{x}) = 0.5 + \frac{\sin^2\left(\sqrt{x_1^2 + x_2^2}\right) - 0.5}{\left(1.0 + 0.001\left(x_1^2 + x_2^2\right)\right)^2}, \quad -100 \le x_j \le 100

Spherical (n = 30):
    f(\vec{x}) = \sum_{j=1}^{n} x_j^2, \quad -100 \le x_j \le 100

Weighted Sphere (n = 30):
    f(\vec{x}) = \sum_{j=1}^{n} j\,x_j^2, \quad -5.12 \le x_j \le 5.12
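For concreteness, two of the benchmarks can be implemented directly. This Python sketch uses the standard textbook forms of Rastrigin and Schaffer's f6, each with minimum value 0 at the origin:

```python
import numpy as np

def rastrigin(x):
    """Rastrigin: f(x) = 10n + sum(x_j^2 - 10 cos(2 pi x_j))."""
    x = np.asarray(x, dtype=float)
    return 10.0 * x.size + np.sum(x**2 - 10.0 * np.cos(2.0 * np.pi * x))

def schaffer_f6(x):
    """Schaffer's f6 in two dimensions."""
    x1, x2 = float(x[0]), float(x[1])
    s = x1**2 + x2**2
    return 0.5 + (np.sin(np.sqrt(s))**2 - 0.5) / (1.0 + 0.001 * s)**2
```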
BIOGRAPHICAL SKETCH
George I. Evers received the Bachelor of Arts in Mathematics with a Physics
minor, secondary teaching certificate, and Cum Laude honors from Texas A&M
University – Kingsville, where he served as president of the TAMUK chapter of the
Society of Physics Students, taught physics labs from age 18, and enjoyed conversations
with professors.
Since then, he has primarily enjoyed working in other countries, staying long
enough to appreciate the cultures and languages. He enjoys seeing how various cultures
have evolved different approaches to life’s basic challenges and inferring the resulting
strengths and weaknesses of each approach.
Trading stocks at night while keeping his day job, he became interested in
algorithms. He has enjoyed tackling PSO's stagnation problem as well as drawing
parallels between the algorithm and society, such as the observation that an inherently
diverse population and a slow rate of agreement between groups may help avoid
premature convergence to sub-optimal solutions, even though these traits appear
unfruitful in the short term.
At The University of Texas – Pan American, he served as a teaching assistant for
the department of electrical engineering.