
Upload: shekhar-yadav

Post on 17-Sep-2015


  • Optimal Control of Inverted Pendulum using

    Ant Colony System Algorithm

    Shekhar Yadav Dept. of Electrical Engineering,

    IT.BHU, Varanasi (UP), India.

    Email: [email protected]

    J.P.Tiwari

    Dept. of Electrical Engineering,

    IT.BHU, Varanasi (UP), India.

    Email: [email protected]

    S.K.Nagar Dept. of Electrical Engineering,

    IT.BHU, Varanasi (UP), India.

    Email: [email protected]

    Abstract- In this paper, a pole-placement technique is used to design a feedback controller for an inverted pendulum system. A state feedback method is proposed for the control and stabilization of an inverted pendulum-cart system. The parameters of the feedback gain matrix are optimized using the modified ant colony system (ACS) algorithm. The proposed control strategy has been derived by eliminating the internal signals of the feedback control system by executing row operations. The effectiveness of the proposed technique is validated through experimental results obtained on a simple digital inverted pendulum (33-936IC).

    Keywords- Pole Placement design, Ant Colony System (ACS), LQR, ITAE, Inverted Pendulum.

    I. INTRODUCTION

    The inverted pendulum on a cart is a well-suited test-bed for the design of a wide range of classical and contemporary control techniques. Inverted pendulum systems are widely used in robotics, space rocket guidance systems, fast-moving ground vehicles and anti-seismic control of buildings. The inverted pendulum is a multivariable, nonlinear, fast-reacting, unstable and higher-order system [1,2]. A double-rod inverted pendulum system, as shown in Fig.1, is mounted on a cart which is driven by a DC motor. The aim of the control strategy is to oscillate the inverted pendulum from its initial position until it reaches the upright equilibrium point [3,7]. Stabilization of the inverted pendulum system is proposed using the pole-placement technique. The parameters of the state feedback gain matrix are tuned using the ACS algorithm. The ant colony algorithm was first introduced by M. Dorigo [5] and is inspired by the foraging behavior of real ants.

    The basic idea of the ant colony algorithm is that a set of cooperating artificial ants search the solution space in parallel, simulating real ants searching their environment for food [4,5]. In the pheromone updating rule, the main modification is that the pheromone increments of different routes are assigned different weight values. To keep a suitable balance between the two contradictory aims of exploring the search space and accelerating convergence, a modified ant colony algorithm is proposed that adjusts the pheromone evaporation factor when solving continuous optimization problems. A quadratic performance index is selected as the objective function, and all parameters of the state feedback gain matrix are tuned by the ACS algorithm.

    Fig.1 Digital Inverted Pendulum (33-936IC)

  • A Linear Quadratic Regulator (LQR) is designed for optimal control of the inverted pendulum.

    II. POLE-PLACEMENT DESIGN TECHNIQUE

    Consider a linear dynamic system in state-space form

    $$\dot{x} = Ax + Bu \qquad (1)$$

    $$y = Cx \qquad (2)$$

    where

    $x$ = state vector of the plant ($n$-vector)

    $u$ = control signal (scalar)

    $y$ = output signal (scalar)

    $A$ = $n \times n$ constant matrix

    $B$ = $n \times 1$ constant matrix

    and the control signal is given by

    $$u = -Kx \qquad (3)$$

    The $1 \times n$ matrix $K$ is called the state feedback gain matrix. The closed-loop control system, when the state is fed back to the control signal, is given by

    $$\dot{x} = (A - BK)x \qquad (4)$$

    Assuming that the pair (A,B) is completely controllable, there exists a feedback matrix K such that the closed-loop system eigenvalues can be placed in arbitrary locations. The state feedback gain matrix can also be obtained by minimizing the quadratic cost function

    $$J = \frac{1}{2}\int_{0}^{\infty}\left(x^{T}Qx + u^{T}Ru\right)dt \qquad (5)$$

    The errors are minimized as,

    = [ ] (6)

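    where $x$ is the state vector,

    $$x = [x, \dot{x}, \theta, \dot{\theta}]^{T}$$

    The gain vector $K$ is expressed through the solution $P$ of the Riccati equation

    $$A^{T}P + PA + Q - PBR^{-1}B^{T}P = 0 \qquad (7)$$

    $$K = R^{-1}B^{T}P \qquad (8)$$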
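To make the roles of the Riccati equation (7) and the gain formula (8) concrete, the sketch below solves them for a first-order (scalar) plant, where the matrix equation reduces to a quadratic in P. This is only an illustration: the function name `scalar_lqr` and the numbers a, b, q, r are assumptions chosen for the example and are not taken from the paper.

```python
import math

def scalar_lqr(a, b, q, r):
    """Scalar case of Eqs. (7)-(8): solve 2*a*P + q - (b*P)**2 / r = 0
    for the positive root P and return (P, K) with K = b*P / r."""
    if b == 0 or q <= 0 or r <= 0:
        raise ValueError("need b != 0, q > 0 and r > 0")
    # Rearranged as (b^2/r) P^2 - 2 a P - q = 0; the positive root stabilizes.
    P = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    K = b * P / r
    return P, K

# Hypothetical unstable plant dx/dt = x + u with unit weights (not from the paper).
a, b, q, r = 1.0, 1.0, 1.0, 1.0
P, K = scalar_lqr(a, b, q, r)
residual = 2 * a * P + q - (b * P) ** 2 / r   # should vanish if P solves (7)
closed_loop_pole = a - b * K                  # scalar version of A - BK in (4)
print(P, K, residual, closed_loop_pole)
```

In the scalar case the closed-loop pole is -sqrt(a^2 + b^2 q / r), so it is always in the left half-plane; raising q relative to r pushes it further left at the cost of a larger gain, which is the trade-off the weighting matrices Q and R control in the full fourth-order design.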

    III. ANT COLONY SYSTEM

    The ant colony system was developed in the early 1990s by Dorigo et al. [5]. The ACS technique is a metaheuristic optimization method inspired by the ability of real ants to establish the shortest path from a food source to their nest. Ants lay a chemical substance, the pheromone trail, on the ground as they move along paths. Each individual ant decides its direction of movement based on the strength of the pheromone trails. The better path is the one that has the higher amount of pheromone on the ground. As more and more ants travel to the food source, the shorter path accumulates more pheromone. Thus most of the ants are attracted to the shorter path, and this path-selection behavior creates a positive feedback effect. As a result, the ants eventually find the shortest path [6].

    A. GENERATION OF NODES AND PATHS

    Let the state feedback gain matrix parameters $K_1, K_2, K_3, K_4$ be the variables to be optimized, and assume that the value of each of them has four valid digits: two digits before the decimal point and two digits after it. The ACS algorithm needs a discrete solution space because the path selections of an ant in each step are limited. To use the ACS algorithm conveniently, the values of $K_1, K_2, K_3$ and $K_4$ are expressed on the X-Y plane. As shown in Fig.2, we first draw sixteen lines $L_1, L_2, \ldots, L_{16}$ which have equal length and equal separation and are perpendicular to the X axis. $L_1 \sim L_4$, $L_5 \sim L_8$, $L_9 \sim L_{12}$ and $L_{13} \sim L_{16}$ represent the first to fourth digits of $K_1, K_2, K_3$ and $K_4$ respectively. The X coordinates of these lines are the numbers 1~16 respectively. We then divide each of these lines into ten portions, so that eleven nodes are generated on each line. The eleven nodes on each line represent the numbers 0~10 respectively, which are the possible values of the digit corresponding to the line.

    Let an ant depart from the origin O of the X-Y plane. When it moves to any node of line $L_{16}$, it completes a tour. Its moving path can be represented by

    $$Path = \{O,\ Node(x_1, y_{1j}),\ Node(x_2, y_{2j}),\ \ldots,\ Node(x_{16}, y_{16j})\}$$

    The values of $K_1, K_2, K_3$ and $K_4$ represented by the path can be computed by the following formulas:

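  • $K_1 = y_{1j} \times 10^{1} + y_{2j} \times 10^{0} + y_{3j} \times 10^{-1} + y_{4j} \times 10^{-2}$

    $K_2 = y_{5j} \times 10^{1} + y_{6j} \times 10^{0} + y_{7j} \times 10^{-1} + y_{8j} \times 10^{-2}$

    $K_3 = y_{9j} \times 10^{1} + y_{10j} \times 10^{0} + y_{11j} \times 10^{-1} + y_{12j} \times 10^{-2}$

    $K_4 = y_{13j} \times 10^{1} + y_{14j} \times 10^{0} + y_{15j} \times 10^{-1} + y_{16j} \times 10^{-2}$   (9)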
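The digit-to-gain mapping above can be sketched in a few lines of Python. The helper names `decode_path` and `encode_gain` are hypothetical, and the example tour is simply the digit string of the ACS gains reported later in the Results section; `encode_gain` is the inverse mapping one would need to place a known gain onto the sixteen lines and only produces node values 0 to 9.

```python
def decode_path(nodes):
    """Turn the 16 node values of one tour into [K1, K2, K3, K4].

    Each group of four nodes carries weights 10^1, 10^0, 10^-1, 10^-2.
    The sum is formed in integer hundredths to avoid rounding drift.
    """
    if len(nodes) != 16:
        raise ValueError("a tour visits exactly 16 lines")
    weights = (1000, 100, 10, 1)  # the four weights expressed in hundredths
    gains = []
    for g in range(4):
        group = nodes[4 * g: 4 * g + 4]
        hundredths = sum(w * y for w, y in zip(weights, group))
        gains.append(hundredths / 100)
    return gains

def encode_gain(value):
    """Inverse mapping for one gain in the range 00.00 to 99.99."""
    hundredths = round(value * 100)
    if not 0 <= hundredths <= 9999:
        raise ValueError("gain must have two digits before the decimal point")
    return [hundredths // 1000, hundredths // 100 % 10,
            hundredths // 10 % 10, hundredths % 10]

# Example tour: the digits of 48.78, 28.63, 51.65 and 71.76.
tour = [4, 8, 7, 8, 2, 8, 6, 3, 5, 1, 6, 5, 7, 1, 7, 6]
print(decode_path(tour))
print(encode_gain(48.78))
```

With four digits per gain the search space is limited to values with two decimal places, so the resolution of the optimized gains is 0.01 by construction.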

    B. TRANSITION RULE

    When all ants have moved to one line, say line $L_i$, let $a_j$ ($j = 0 \sim 10$) be the number of ants at node $j$ of line $L_i$; the total number of ants is then $m = \sum_{j=0}^{10} a_j$. Let $\tau(x_i, y_{ij})$ be the concentration of pheromone at $Node(x_i, y_{ij})$, and assume that initially all the nodes have the same amount of pheromone $\tau_0$. While moving, an ant $k$ ($k = 1 \sim m$) on line $L_{i-1}$ ($i = 1 \sim 16$) selects a node $j$ from the eleven nodes of the next line $L_i$ according to the following transition rule:

    $$j = \arg\max_{u \in \Omega}\{\tau(x_i, y_{iu}) \cdot \eta(x_i, y_{iu})\}, \quad \text{if } q \le q_0 \qquad (10)$$

    and $j = J$, if $q > q_0$,

    where $q$ is a random variable uniformly distributed over [0,1], $q_0$ is a tunable parameter, $\Omega$ contains all of the nodes on line $L_i$, and $J$ is a node that is randomly selected according to the probability

    $$p(x_i, y_{ij}) = \frac{\tau(x_i, y_{ij}) \cdot \eta(x_i, y_{ij})}{\sum_{u \in \Omega} \tau(x_i, y_{iu}) \cdot \eta(x_i, y_{iu})} \qquad (11)$$

    In Eqns. (10) and (11), $\eta(x_i, y_{ij})$ is the visibility of $Node(x_i, y_{ij})$, computed as

    $$\eta(x_i, y_{ij}) = \frac{11 - |y_{ij} - y^{*}_{ij}|}{11} \qquad (12)$$

    where the values of $y^{*}_{ij}$ ($i = 1 \sim 16$, $j = 0 \sim 10$) are set in the following way. In the first iteration of the ACS algorithm, the values of $y^{*}_{ij}$ are set to the vertical coordinates of the sixteen nodes obtained by mapping the state feedback gain parameters $K_1^{0}, K_2^{0}, K_3^{0}$ and $K_4^{0}$ onto Fig.2, where $K_1^{0}, K_2^{0}, K_3^{0}$ and $K_4^{0}$ are obtained using the pole-placement technique. In each of the following iterations, the values of $y^{*}_{ij}$ are set in the same way by mapping the state feedback gains $K_1^{*}, K_2^{*}, K_3^{*}$ and $K_4^{*}$ onto Fig.2, where $K_1^{*}, K_2^{*}, K_3^{*}$ and $K_4^{*}$ are the state feedback gain parameters corresponding to the best tour generated since the beginning of the trial.
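The transition rule can be sketched for a single line of eleven nodes as follows. This is a minimal illustration, assuming the pseudo-random-proportional form described above (greedy choice with probability q0, roulette-wheel choice otherwise) and a visibility of (11 - |j - y*|)/11; the function names and the test values are hypothetical.

```python
import random

NODES = list(range(11))  # node values 0..10 on one line, as in the text

def visibility(j, y_star):
    """Eq. (12): nodes close to the reference node y_star are more visible."""
    return (11 - abs(j - y_star)) / 11

def selection_probabilities(tau_line, y_star):
    """Eq. (11): normalized pheromone-times-visibility scores."""
    scores = [tau_line[j] * visibility(j, y_star) for j in NODES]
    total = sum(scores)
    return [s / total for s in scores]

def choose_node(tau_line, y_star, q0, rng):
    """Eq. (10): exploit the best node if q <= q0, otherwise sample from (11)."""
    probs = selection_probabilities(tau_line, y_star)
    if rng.random() <= q0:
        return max(NODES, key=lambda j: probs[j])
    return rng.choices(NODES, weights=probs, k=1)[0]

rng = random.Random(0)                   # fixed seed so the sketch is repeatable
tau_line = [1.0] * 11                    # uniform initial pheromone tau_0
greedy = choose_node(tau_line, 7, 1.0, rng)           # q0 = 1: always exploit
explored = [choose_node(tau_line, 7, 0.0, rng) for _ in range(5)]
print(greedy, explored)
```

With uniform pheromone the greedy branch simply returns the reference node, which is why the first iteration stays close to the pole-placement gains; the parameter q0 sets how often the ants deviate from that choice.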

    C. GLOBAL UPDATE OF PHEROMONE

    CONCENTRATION

    When all of the ants in the colony have completed their tours once in the modified ant colony system algorithm, i.e. when they arrive at line $L_{16}$, the pheromone concentration of each node belonging to the best tour since the beginning of the trial is updated by the following formulas:

    $$\tau(x_i, y_{ij}) \leftarrow (1 - \alpha) \cdot \tau(x_i, y_{ij}) + \alpha \cdot \Delta\tau(x_i, y_{ij}) \qquad (13)$$

    $$\Delta\tau(x_i, y_{ij}) = Q / ITAE^{*} \qquad (14)$$

    Fig.2 Diagram of Generating Nodes and Paths

    where the nodes $Node(x_i, y_{ij})$ are those belonging to the best tour since the beginning of the trial; $\alpha$ is the parameter which governs the pheromone decay; ITAE* is the value of the ITAE performance criterion corresponding to the best tour since the beginning of the trial; and Q is a positive constant determined in the following way: for a given control system, we first obtain the state feedback gain matrix through the pole-placement technique, then compute the ITAE performance criterion of the system with these gains, denote the result by ITAE0, and let Q be equal to ITAE0. As the value of ITAE* becomes smaller, the value of Q/ITAE* becomes greater, which increases the pheromone concentration of the nodes on the best tour since the beginning of the trial and helps to find the best solution within the maximum number of iterations allowed.
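A minimal sketch of the global update is given below, assuming the pheromone is stored as a 16 x 11 table and that Q = ITAE0 as described above. The function name, the decay symbol `alpha` and the numerical values are illustrative assumptions, not values reported in the paper.

```python
def global_update(tau, best_tour, itae_best, itae0, alpha):
    """Reinforce only the nodes on the best tour found so far.

    tau       : list of 16 lists, each holding 11 pheromone values
    best_tour : 16 node indices (one per line) of the best tour
    itae_best : ITAE* of that tour; itae0 plays the role of Q
    """
    delta = itae0 / itae_best            # deposit Q / ITAE*
    for i, j in enumerate(best_tour):
        tau[i][j] = (1 - alpha) * tau[i][j] + alpha * delta
    return delta

# Hypothetical numbers: the best tour has halved the initial ITAE.
tau = [[1.0] * 11 for _ in range(16)]
best_tour = [3] * 16
delta = global_update(tau, best_tour, itae_best=1.0, itae0=2.0, alpha=0.1)
print(delta, tau[0][3], tau[0][4])
```

Because the deposit is Q/ITAE*, a tour that only matches the pole-placement design deposits exactly 1, and every improvement over that baseline deposits proportionally more.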

    D. LOCAL UPDATES OF PHEROMONE

    CONCENTRATION

    The local update is performed as follows: while performing a tour, when ant $k$ is on line $L_{i-1}$ and selects node $j$ on line $L_i$, the pheromone concentration of $Node(x_i, y_{ij})$ is updated by the following formula:

    $$\tau(x_i, y_{ij}) \leftarrow (1 - \rho) \cdot \tau(x_i, y_{ij}) + \rho \cdot \tau_0 \qquad (15)$$

    The value $\tau_0$ is the same as the initial value of the pheromone concentration. When an ant visits a node, the local update rule diminishes the pheromone level of the node. This makes the visited nodes less and less attractive to other ants, indirectly favoring the exploration of nodes not yet visited.

    To optimize the performance of the inverted pendulum-cart system, the gains of the state feedback system are adjusted to maximize or minimize a certain performance index. The purpose of the performance index is to capture in a single number a quality measure of the system's performance. Various objective functions have been written based on error performance criteria. The performance index is calculated over a time interval T, normally in the region $0 \le T \le t_s$, where $t_s$ is the settling time of the system. To emphasize the effectiveness of the proposed method, the ITAE performance criterion given below is adopted in this paper.

    $$ITAE = \int_{0}^{\infty} t \cdot |e(t)|\, dt \qquad (16)$$
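Two pieces of this subsection lend themselves to a quick numerical check: the local update and the ITAE integral. The sketch below is an illustration under stated assumptions: `rho` is the name assumed here for the local decay parameter, the integral is approximated by the trapezoidal rule over sampled data, and the test signal e(t) = exp(-t) is hypothetical (its exact ITAE is 1).

```python
import math

def local_update(tau_ij, tau0, rho):
    """Pull the pheromone of a visited node back toward its initial value tau0."""
    return (1 - rho) * tau_ij + rho * tau0

def itae(times, errors):
    """Trapezoidal approximation of the integral of t * |e(t)| over the samples."""
    total = 0.0
    for k in range(1, len(times)):
        f_prev = times[k - 1] * abs(errors[k - 1])
        f_curr = times[k] * abs(errors[k])
        total += 0.5 * (f_prev + f_curr) * (times[k] - times[k - 1])
    return total

# A reinforced node (tau = 2) visited three times with tau0 = 1 and rho = 0.1.
tau = 2.0
history = []
for _ in range(3):
    tau = local_update(tau, 1.0, 0.1)
    history.append(tau)

# Hypothetical error signal e(t) = exp(-t), sampled every millisecond up to 20 s.
dt = 0.001
times = [k * dt for k in range(20001)]
errors = [math.exp(-t) for t in times]
itae_value = itae(times, errors)
print(history, itae_value)
```

The weighting by t means that errors persisting late in the response are penalized more than the unavoidable initial transient, which is why ITAE tends to favor designs with short settling times.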

    IV. MODELING OF INVERTED PENDULUM

    The inverted pendulum-cart system is usually presented as a pole-balancing task. The system to be controlled consists of a cart and a rigid pole hinged to the top of the cart. The cart is moved by a belt pulled in either direction by the DC motor attached at the end of the rail. The force with which the cart is pulled is controlled by the voltage applied to the motor. The cart can move left or right on a one-dimensional bounded track, whereas the pole can swing in the vertical plane determined by the track. The linearized system equations around $\theta = \pi$ in state space are:

    $$\begin{bmatrix}\dot{x}\\ \ddot{x}\\ \dot{\theta}\\ \ddot{\theta}\end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 & 0\\ 0 & \dfrac{-(I+ml^{2})b}{I(M+m)+Mml^{2}} & \dfrac{m^{2}gl^{2}}{I(M+m)+Mml^{2}} & 0\\ 0 & 0 & 0 & 1\\ 0 & \dfrac{-mlb}{I(M+m)+Mml^{2}} & \dfrac{mgl(M+m)}{I(M+m)+Mml^{2}} & 0 \end{bmatrix} \begin{bmatrix}x\\ \dot{x}\\ \theta\\ \dot{\theta}\end{bmatrix} + \begin{bmatrix}0\\ \dfrac{I+ml^{2}}{I(M+m)+Mml^{2}}\\ 0\\ \dfrac{ml}{I(M+m)+Mml^{2}}\end{bmatrix}u \qquad (17)$$

    $$y = \begin{bmatrix}1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0\end{bmatrix}\begin{bmatrix}x\\ \dot{x}\\ \theta\\ \dot{\theta}\end{bmatrix} + \begin{bmatrix}0\\ 0\end{bmatrix}u \qquad (18)$$

    where,

    Symbol   Description                      Value
    M        mass of cart                     2.4 kg
    m        mass of pendulum                 0.23 kg
    b        friction of cart                 0.05 N/m/sec
    I        moment of inertia of pendulum    0.099 kg·m²
    l        length of pendulum               0.4 m
    g        acceleration due to gravity      9.8 m/sec²
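For reference, the entries of the state and input matrices can be evaluated numerically from the parameter values listed above. The sketch assumes the standard linearized cart-pendulum form, with the common denominator I(M+m) + Mml²; the variable names are chosen for this example only.

```python
# Physical parameters as listed in the table above.
M, m, b, I, l, g = 2.4, 0.23, 0.05, 0.099, 0.4, 9.8

# Common denominator of all non-trivial matrix entries.
p = I * (M + m) + M * m * l ** 2

# State order assumed here: cart position, cart velocity,
# pendulum angle, pendulum angular velocity.
A = [
    [0.0, 1.0, 0.0, 0.0],
    [0.0, -(I + m * l ** 2) * b / p, (m ** 2 * g * l ** 2) / p, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, -(m * l * b) / p, m * g * l * (M + m) / p, 0.0],
]
B = [0.0, (I + m * l ** 2) / p, 0.0, m * l / p]

for row in A:
    print(["%.5f" % value for value in row])
print(["%.5f" % value for value in B])
```

The positive entry in the fourth row, third column (gravity acting on the pendulum angle) is what makes the open-loop upright equilibrium unstable and state feedback necessary.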

  • The state of the system is defined by the values of four system variables, $x$, $\dot{x}$, $\theta$ and $\dot{\theta}$: the cart position, cart velocity, pendulum angle and angular velocity of the pendulum pole respectively. A control force is applied to the system to prevent the pole from falling while keeping the cart within the specified limits. The inverted pendulum-cart system used here is the Digital Pendulum (33-936IC).

    V. RESULTS AND DISCUSSION

    The performance of the proposed controller is discussed in this section. The closed-loop poles of the system are located at $s = \mu_i$ ($i = 1,2,3,4$), where $\mu_1 = -2 + j2\sqrt{3}$, $\mu_2 = -2 - j2\sqrt{3}$, $\mu_3 = -10$, $\mu_4 = -10$. The closed-loop poles $\mu_1$ and $\mu_2$ are a pair of dominant closed-loop poles with $\zeta = 0.5$ and $\omega_n = 4$. The LQR method finds the optimal control matrix that results in some balance between system errors and control effort. The performance index matrix (R) and the state-cost matrix (Q) are initially set as R = 1 and Q = [1 0 0 0; 0 0 0 0; 0 0 1 0; 0 0 0 0]. The weighting factors are chosen by trial and error. The state feedback gain matrix found through MATLAB commands is:

    $$K = [0.9701 \quad 3.0259 \quad 70.5683 \quad 27.1358] \qquad (19)$$
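The pole locations quoted at the start of this section fix the desired characteristic polynomial that a pole-placement design has to match. The short check below is an illustration only: it assumes the poles are -2 ± j2√3 together with a double pole at -10, expands the polynomial, and recovers the damping ratio and natural frequency of the dominant pair. The helper `poly_from_roots` is a hypothetical name.

```python
import math

def poly_from_roots(roots):
    """Coefficients (highest power first) of the monic polynomial with given roots."""
    coeffs = [1.0 + 0j]
    for r in roots:
        nxt = coeffs + [0j]              # multiply the current polynomial by s
        for k, c in enumerate(coeffs):   # then subtract r times the polynomial
            nxt[k + 1] -= r * c
        coeffs = nxt
    return coeffs

poles = [complex(-2, 2 * math.sqrt(3)), complex(-2, -2 * math.sqrt(3)), -10, -10]
coeffs = poly_from_roots(poles)
real_coeffs = [round(c.real, 6) for c in coeffs]

omega_n = abs(poles[0])                  # natural frequency of the dominant pair
zeta = -poles[0].real / omega_n          # damping ratio of the dominant pair
print(real_coeffs, omega_n, zeta)
```

Under these assumptions the expansion is s⁴ + 24s³ + 196s² + 720s + 1600, and the dominant pair indeed corresponds to a damping ratio of 0.5 and a natural frequency of 4 rad/s; the two poles at -10 lie far enough to the left that they have little influence on the transient.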

    The overshoot of the pendulum and of the cart appears fine, but their settling times need improvement and the cart's rise time needs to be decreased. The cart has also, in fact, moved in the opposite direction. For now, we concentrate on improving the settling times and the rise times.

    The settling time and rise time can be improved by updating the matrices Q and R by trial and error. The updated matrices Q and R are

    $$Q = \begin{bmatrix}4500 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\\ 0 & 0 & 100 & 0\\ 0 & 0 & 0 & 0\end{bmatrix}, \qquad R = 1$$

    The step response generated with the updated matrices Q and R is shown in Fig.4, and the state feedback gain matrix is

    $$K = [63.05 \quad 66.78 \quad 372.28 \quad 144.12] \qquad (20)$$

    From Fig.4, we see that all design requirements are satisfied except the steady-state error of the cart position (x). With the LQR method, however, the system responds very slowly because the values of the state feedback gains become larger. Therefore, the state feedback gains are tuned through the modified ant colony system algorithm. The performance index (PI) is optimized for the position of the cart because, in the real system, the length of the track on which the cart moves is limited, so care has to be taken to restrict the motion of the cart within these limits. The design is analyzed on the basis of the ITAE so as to maintain the pendulum position at 0° for any disturbance given to the cart. The feedback gain matrix obtained using the ant colony system (ACS) is

    $$K = [48.78 \quad 28.63 \quad 51.65 \quad 71.76] \qquad (21)$$

    Fig.3 Step response of inverted pendulum system

    Fig.4 Step response using LQR method

  • VI. CONCLUSION

    In this paper, an Ant Colony System (ACS) algorithm is used to stabilize an inverted pendulum-cart system. With the modified ACS algorithm, the calculation time can be reduced and the accuracy increased in comparison with the pole-placement design technique. This concept offers a new alternative procedure in time-varying feedback control to improve stability performance. The technique is implemented on an inverted pendulum-cart system, which is a highly nonlinear system.

    VII. REFERENCES

    [1] C. C. Chung and J. Hauser, "Nonlinear control of a swinging pendulum," Automatica, vol. 31, no. 6, pp. 851-862, Jun. 1995.

    [2] Q. Wei, W. P. Dayawansa, and W. S. Levine, "Nonlinear controller for an inverted pendulum having restricted travel," Automatica, vol. 31, no. 6, pp. 841-850, 1995.

    [3] Hamid R. P., M. R. Jaheh-Motlagh, Ali-Akbar J., "Optimal feedback control design using genetic algorithm applied to inverted pendulum," IEEE International Symposium on Industrial Electronics, pp. 263-268, June 2007.

    [4] C. Grosan and A. Abraham, "Hybrid Evolutionary Algorithms: Methodologies, Architectures, and Reviews," Studies in Computational Intelligence (SCI) 75, pp. 1-17, Springer-Verlag Berlin Heidelberg, 2007.

    [5] M. Dorigo, M. Birattari, and T. Stützle, "Ant Colony Optimization: Artificial Ants as a Computational Intelligence Technique," IEEE Computational Intelligence Magazine, November 2006.

    [6] M. Dorigo and L. M. Gambardella, "Ant colony system: a cooperative learning approach to the traveling salesman problem," IEEE Trans. on Evolutionary Computation, vol. 1, no. 1, pp. 53-66, 1997.

    [7] Katsuhiko Ogata, Modern Control Engineering, 3rd edition, Prentice Hall, New Jersey, 1997.

    Fig.5 Step response of ITAE using ACS algorithm