Back Propagation: Variations


  • Slide 1/21: Back Propagation: Variations

    [Figure: a feedforward network. Inputs X1 ... Xn enter at layer 0; layer m
    contains neurons #1 ... #k, whose outputs x'1, x'2, ..., x'i feed layer m+1;
    the output layer m+1 contains neurons #1 ... #p producing the outputs
    Y1(m+1) ... Yp(m+1).]

  • Slide 2/21: BP Improvements

    Second-order derivatives (Parker, 1982)

    Dynamic range modification (Stornetta and Huberman, 1987):
    F(x) = -1/2 + 1/(1 + e^(-x))
    (a C sketch of this activation follows after this list)

    Meta-learning (Jacobs, 1987; Hagiwara, 1990)

    Selective updates (Huang and Huang, 1990)
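
    As a concrete illustration of the dynamic range modification above, here is a
    minimal C sketch (the function names are illustrative, not from the slides) of
    the shifted sigmoid F(x) = -1/2 + 1/(1 + e^(-x)), whose outputs lie in
    (-1/2, 1/2), together with its derivative expressed in terms of the output.

    #include <math.h>

    /* Shifted sigmoid of Stornetta and Huberman (1987):
     * F(x) = -1/2 + 1/(1 + e^(-x)), so outputs lie in (-0.5, 0.5). */
    double range_modified_sigmoid(double x)
    {
        return -0.5 + 1.0 / (1.0 + exp(-x));
    }

    /* Derivative written in terms of the output y = F(x):
     * F'(x) = (0.5 + y) * (0.5 - y). */
    double range_modified_sigmoid_deriv(double y)
    {
        return (0.5 + y) * (0.5 - y);
    }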

  • Slide 3/21: BP Improvements (Cont.)

    Use of momentum weight change (Rumelhart, 1986):

    Δw_kmi(t+1) = η · δ_km · x_i(t) + α · Δw_kmi(t)

    Exponential smoothing (Sejnowski and Rosenberg, 1987):

    Δw_kmi(t+1) = (1 - α) · η · δ_km · x_i(t) + α · Δw_kmi(t)
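
    A minimal C sketch of the two update rules above (the names are illustrative):
    eta is the learning rate, alpha the momentum/smoothing factor, delta the
    back-propagated error term of the receiving neuron, x the signal on the
    connection, and dw_prev the previous weight change.

    /* Momentum: dw(t+1) = eta*delta*x(t) + alpha*dw(t) */
    double momentum_update(double eta, double alpha,
                           double delta, double x, double dw_prev)
    {
        return eta * delta * x + alpha * dw_prev;
    }

    /* Exponential smoothing: dw(t+1) = (1-alpha)*eta*delta*x(t) + alpha*dw(t) */
    double smoothed_update(double eta, double alpha,
                           double delta, double x, double dw_prev)
    {
        return (1.0 - alpha) * eta * delta * x + alpha * dw_prev;
    }

    /* In either case the weight itself is then advanced by w += dw. */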

  • Slide 4/21: BP Improvements (Cont.)

    Accelerating the BP algorithm (Kothari, Klinkhachorn, and Nutter, 1991):

    a gradual increase in learning accuracy, without incurring the disadvantages
    of increased network size or more complex neurons, and without otherwise
    violating the parallel structure of the computation.

  • Slide 5/21: Gradual Increase in Learning Accuracy

    Temporal instability

    Absence of a true direction of descent

    void Acc_BackProp(struct Network *N, struct Train_Set *T)
    {
        Assume_coarse_error();                 /* start with a coarse error tolerance ε */
        while (ε > Eventual_Accuracy) {        /* until the target accuracy is reached  */
            while (not_all_trained) {
                Present_Next_Pattern;
                while (!Trained)
                    Train_Pattern;
            }
            Increase_Accuracy(ε -= Step);      /* tighten the error tolerance */
        }
    }

  • Slide 6/21: Training with Gradual Increase in Accuracy

    [Figure: the direction of steepest descent compared with the differing
    directions of descent suggested individually by exemplar 1, exemplar 2,
    exemplar 3, ..., exemplar M.]

  • Slide 7/21: Error vs. Training Passes

    [Figure: overall error vs. training passes (0 to 60000) for BP, BPGIA,
    BP+Mom, and BPGIA+Mom. Minimization of the error for a 4-bit 1's
    complementor; the graph has been curtailed to show detail.]

  • Slide 8/21: Error vs. Training Passes

    [Figure: overall error vs. training passes (0 to 300000) for BP, BPGIA,
    BP+Mom, and BPGIA+Mom. Minimization of the error for a 3-to-8 decoder.]

  • Slide 9/21: Error vs. Training Passes

    [Figure: overall error vs. training passes (0 to 200000) for BP, BPGIA,
    BP+Mom, and BPGIA+Mom. Minimization of the error for the XOR problem.]

  • Slide 10/21: Error vs. Training Passes

    [Figure: overall error vs. training passes (0 to 120000) for BP, BPGIA,
    BP+Mom, and BPGIA+Mom. Minimization of the error for a simple shape
    recognizer.]

  • Slide 11/21: Error vs. Training Passes

    [Figure: overall error vs. training passes (0 to 50000) for BP, BPGIA,
    BP+Mom, and BPGIA+Mom. Minimization of the error for a 3-bit rotate
    register.]

  • Slide 12/21: Error vs. Training Passes

    Problem (network size)                             BP              BPGIA           BP+Mom.         BPGIA+Mom.
    1's complement (4x8x4)                             9.7 (134922)    6.6 (92567)     2.2 (25574)     1.0 (11863)
    3-to-8 decoder (3x8x8)                             5.4 (347634)    4.2 (268833)    1.1 (61366)     1.0 (53796)
    XOR (2x2x1)                                        4.5 (211093)    1.8 (88207)     2.5 (107337)    1.0 (45916)
    Rotate register (3x6x3)                            4.3 (72477)     2.0 (33909)     1.1 (15929)     1.0 (14987)
    Square/circle/triangle differentiation (16x20x1)   2.3 (71253)     1.3 (33909)     6.11 (145363)   1.0 (25163)

  • Slide 13/21: Training with Gradual Increase in Accuracy

    On average, training with a gradual increase in accuracy doubles the
    convergence rate of back propagation, or of back propagation with a
    momentum weight change, without requiring additional or more complex
    neurons.

  • Slide 14/21: Nonsaturating Activation Functions

    For some applications, where saturation is not especially beneficial, a
    nonsaturating activation function may be used. One suitable example is

    F(x) = log(1 + x)    for x >= 0
         = -log(1 - x)   for x < 0

    with derivative

    F'(x) = 1/(1 + x)    for x >= 0
          = 1/(1 - x)    for x < 0
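
    A minimal C sketch (the function names are illustrative) of this
    nonsaturating logarithmic activation and its derivative, following the
    piecewise form given above:

    #include <math.h>

    /* Nonsaturating activation: F(x) = log(1+x) for x >= 0,
     * and F(x) = -log(1-x) for x < 0. */
    double log_activation(double x)
    {
        return (x >= 0.0) ? log(1.0 + x) : -log(1.0 - x);
    }

    /* Derivative: F'(x) = 1/(1+x) for x >= 0, 1/(1-x) for x < 0. */
    double log_activation_deriv(double x)
    {
        return (x >= 0.0) ? 1.0 / (1.0 + x) : 1.0 / (1.0 - x);
    }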

  • Slide 15/21: Nonsaturating Activation Functions

    Example: BP for the XOR problem

    Problem                                 Logarithmic    Bipolar sigmoid
    Standard bipolar XOR                    144 epochs     387 epochs
    Modified bipolar XOR (+0.8 or -0.8)     77 epochs      264 epochs

    Laurene Fausett, Fundamentals of Neural Networks, Prentice Hall

  • Slide 16/21: Nonsaturating Activation Functions

    Example: Product of sine functions

    (continuous single output)

    Laurene Fausett, Fundamentals of Neural Networks, Prentice Hall

    Y = sin(2·x1) · sin(2·x2); trained for 5000 epochs to a mean squared
    error of 0.024, with a learning rate of 0.05.
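
    For concreteness, a tiny C sketch (illustrative only) of the target function
    from which training pairs for this continuous single-output example would be
    generated:

    #include <math.h>

    /* Continuous single-output target: Y = sin(2*x1) * sin(2*x2). */
    double target_output(double x1, double x2)
    {
        return sin(2.0 * x1) * sin(2.0 * x2);
    }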

  • Slide 17/21: Strictly Local Backpropagation

    Standard BP requires sharing of information among processors (a violation
    of accepted theories on the functioning of biological neurons), and so it
    lacks biological plausibility.

    Strictly local BP (Fausett, 1990) alleviates this deficiency of standard BP.

    Laurene Fausett, Fundamentals of Neural Networks, Prentice Hall

  • Slide 18/21: Strictly Local BP Architecture

    Laurene Fausett, Fundamentals of Neural Networks, Prentice Hall

  • Slide 19/21: Strictly Local BP Architecture

    Cortical unit: sums its inputs and sends the resulting value as a signal
    to the next unit above it.

    Synaptic units: receive a single input signal, apply an activation
    function to the input, multiply the result by a weight, and send the
    result to a single unit above.

    Thalamic unit: compares the computed output with the target value; if
    they do not match, the thalamic unit sends an error signal to the output
    synaptic unit below it.

    Laurene Fausett, Fundamentals of Neural Networks, Prentice Hall
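
    As an illustration of this division of labor (the function names below are
    hypothetical, not taken from Fausett's text), a minimal C sketch of the three
    unit types on the forward/comparison path:

    #include <math.h>

    /* Synaptic unit: one input signal in; apply the activation function
     * (assumed here to be tanh), multiply by the weight, pass one value up. */
    double synaptic_unit(double input, double weight)
    {
        return weight * tanh(input);
    }

    /* Cortical unit: simply sums the values delivered by its n synaptic
     * units and sends the sum to the unit above. */
    double cortical_unit(const double *synaptic_values, int n)
    {
        double sum = 0.0;
        for (int i = 0; i < n; i++)
            sum += synaptic_values[i];
        return sum;
    }

    /* Thalamic unit: compares the computed output with the target; a nonzero
     * return is the error signal sent to the output synaptic unit below. */
    double thalamic_unit(double output, double target)
    {
        return target - output;
    }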

  • Slide 20/21: BP vs. Strictly Local BP

    Laurene Fausett, Fundamentals of Neural Networks, Prentice Hall

  • Slide 21/21: BP vs. Strictly Local BP

    Laurene Fausett, Fundamentals of Neural Networks, Prentice Hall