introduction to artificial neural network

Artificial Neural Network basics

Qingkai Kong 2016-12-02

http://seismo.berkeley.edu/qingkaikong/

[Figure: learning curve over workshop time]

Gentle introduction

Step by step ANN

Real world example

https://github.com/qingkaikong/20161202_ANN_basics


Gentle introduction
• What’s ML
• ANN history
• ANN overview

Step by step ANN

Real world example


What is machine learning?


Self-driving car Voice recognition …


Not always working

1940s Birth

1970s Winter

1980s Rebirth

2006 Deep

ANN in a simple view

ANN jargon

What are the weights?

[Figure: face features feed into the network through weights w1, w2, …, wn]

F(eye×w1+nose×w2+…+mouth×wn)

Sheldon Cooper?

Intuitive Artificial Neural Network

[Figure: the Input passes through weights w1, w2, …, wn to the Output ("Sheldon Cooper?"); the error is fed back to adjust the weights]



Step by step ANN
• Perceptron
• Backpropagation

Real world example

https://arxiv.org/abs/1508.06576v1
https://deepart.io/

Application: Learn arts

http://junkhost.com/2016/03/man-combines-random-peoples-photos-using-neural-networks-and-the-results-are-amazing/

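[Figure: perceptron. The inputs 1 (bias), feature1, feature2, feature3 are weighted by ω0, ω1, ω2, ω3, summed by Σ, and passed through f to produce the output y]

X – input data
y – output target
ωi – weights
Σ – summation
f – activation function
Blue circle – bias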

Σ = ω0x0+ω1x1+ω2x2+ω3x3+…+ωnxn
f = f(ω0x0+ω1x1+ω2x2+ω3x3+…+ωnxn)

z = ω0x0+ω1x1+ω2x2+ω3x3+…+ωnxn

$f(z) = \frac{1}{1 + e^{-z}}$

More activation functions

$f(z) = \frac{1}{1 + e^{-z}}$

$\frac{df(z)}{dz} = f(z)\,(1 - f(z))$
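The sigmoid and its slope can be written in a few lines of Python. This is a minimal sketch; the function names `sigmoid` and `sigmoid_slope` are chosen here for illustration and are not taken from the workshop notebooks.

```python
import math


def sigmoid(z):
    """Logistic activation: f(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_slope(z):
    """Slope of the sigmoid at z: f(z) * (1 - f(z))."""
    fz = sigmoid(z)
    return fz * (1.0 - fz)


print(sigmoid(0.0))        # 0.5, the midpoint of the curve
print(sigmoid_slope(0.0))  # 0.25, the steepest point of the curve
```

The slope is largest at z = 0 and shrinks toward zero as z moves far to either side.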

[Figure: perceptron with inputs 1 (bias), feature1, feature2, feature3, summation Σ, activation function f, and output y]

Σ = ω0x0+ω1x1+ω2x2+ω3x3+…+ωnxn
f = f(ω0x0+ω1x1+ω2x2+ω3x3+…+ωnxn)

y: this is our estimation
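The estimation step can be sketched in Python as a weighted sum followed by the sigmoid. The function name `estimate` and the weight values below are hypothetical and only serve as an illustration; x0 = 1 is the bias input.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def estimate(weights, features):
    """Return f(ω0·x0 + ω1·x1 + ... + ωn·xn), with x0 = 1 as the bias input."""
    inputs = [1.0] + list(features)
    z = sum(w * x for w, x in zip(weights, inputs))
    return sigmoid(z)


# Hypothetical weights: ω0 (bias), ω1, ω2, ω3.
weights = [0.0, 0.5, -0.5, 0.25]
print(estimate(weights, [1, 0, 1]))
```

Because the sigmoid maps any z into the range 0 to 1, the estimation can be compared directly with a 0/1 target.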

Perceptron


Error = Target − Estimation



How the ANN learns



Learning: Update weights to reduce error next time!

Weights Delta = Error × slope × input

Weight update rule

How much we will update the weights for next time


Look at errors closer

[Figure: target axis from 0 to 1]

Three cases:
• Error < 0: Target is 0, estimation is not 0
• Error > 0: Target is 1, estimation is not 1
• Error = 0: Estimation correct

Case 1:
• Error < 0: Target is 0, estimation is not 0

Look at errors closer (assume inputs are positive)

Case 1:
• Target is 0
• Estimation is 0.3

Error = 0 − 0.3 = −0.3



[Figure: sigmoid curve $f(z) = \frac{1}{1 + e^{-z}}$, marking the current z and the updated z]

Error = 0 − 0.3 = −0.3

z = ω0x0+ω1x1+ω2x2+ω3x3

We need to reduce the weights! If we add the (negative) error to the weights, we will reduce them, and z with them.

But what if the inputs are negative? Multiplying the error by the input flips the sign of the update for a negative input, so z still moves in the right direction.

Weights Delta = Error × input


[Figure: sigmoid curve over z, with a flat slope at both ends and a steep slope in the middle]

Weights Delta = Error × slope × input
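As a rough sketch, the update rule can be written as one small Python function. The name `weight_deltas` and the input vectors are hypothetical; the target of 0 and the estimation of 0.3 come from Case 1 above.

```python
def weight_deltas(error, estimation, inputs):
    """Weights Delta = Error × slope × input, one delta per weight."""
    slope = estimation * (1.0 - estimation)  # sigmoid slope at the current estimation
    return [error * slope * x for x in inputs]


# Case 1: target is 0, estimation is 0.3, so the error is negative.
error = 0 - 0.3

# Positive inputs: every weight is reduced.
print(weight_deltas(error, 0.3, [1, 1, 1, 1]))

# Hypothetical negative inputs: the sign of those deltas flips,
# so z = Σ ωi·xi still goes down.
print(weight_deltas(error, 0.3, [1, -1, 1, -1]))
```

The slope factor makes the update small where the sigmoid is flat and large where it is steep.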


Learn from example

Sample     Feature 1   Feature 2   Feature 3   Target
Sample 1   0           0           1           0
Sample 2   1           1           1           1
Sample 3   1           0           1           1
Sample 4   0           1           1           0
Sample 5   1           0           0           1

How to deal with errors

[Figure: perceptron with inputs 1 (bias), 1, 1, 1 and weights ω0 = -0.166, ω1 = 0.441, ω2 = -1.000, ω3 = -0.395]

Σ = (-0.166)×1 + 0.441×1 + (-1)×1 + (-0.395)×1 = -1.12

f(-1.12) = 1/(1+e^1.12) = 0.246

Error = 1 – 0.246 = 0.754
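The numbers on this slide can be checked with a short Python sketch. The weights, inputs, and target are the ones shown above; the variable names are chosen here for illustration.

```python
import math

weights = [-0.166, 0.441, -1.000, -0.395]  # ω0 (bias), ω1, ω2, ω3
inputs = [1, 1, 1, 1]                      # bias input, then the three features
target = 1

z = sum(w * x for w, x in zip(weights, inputs))
estimation = 1.0 / (1.0 + math.exp(-z))
error = target - estimation

print(round(z, 3), round(estimation, 3), round(error, 3))  # -1.12 0.246 0.754
```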

Sample     Feature 1   Feature 2   Feature 3   Target
Sample 1   1           1           1           1

[Figure: sigmoid curve $f(z) = \frac{1}{1 + e^{-z}}$, marking the current z and the updated z]

• Target: 1
• Estimation: 0.246

Error: 0.754

We want to increase the weights next time to have a larger z

Weights delta = 0.754 × slope × input

$\frac{df(z)}{dz} = f(z)\,(1 - f(z))$

z = -1.12
f(z) = 0.246
slope = 0.246 × (1 − 0.246) = 0.185

Change item = 0.754 × 0.185 × [1, 1, 1, 1] = [0.139, 0.139, 0.139, 0.139]

Updated weights = [-0.166, 0.441, -1.000, -0.395] + [0.139, 0.139, 0.139, 0.139] = [-0.027, 0.580, -0.861, -0.256]
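The update step can be reproduced in Python as a sketch. The error and slope are entered as the rounded slide values (0.754 and 0.185), which is an assumption made here so that the results line up with the three-decimal numbers shown.

```python
weights = [-0.166, 0.441, -1.000, -0.395]  # original weights
inputs = [1, 1, 1, 1]                      # bias input, then the three features
error = 0.754                              # rounded value from the slide
slope = 0.185                              # rounded value from the slide

# Weights delta = error × slope × input, one delta per weight.
deltas = [error * slope * x for x in inputs]
updated = [w + d for w, d in zip(weights, deltas)]

print([round(d, 3) for d in deltas])   # each delta is about 0.139
print([round(w, 3) for w in updated])  # about -0.027, 0.58, -0.861, -0.256
```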

Changes of the error

Original weights + updates = updated weights

[-0.166, 0.441, -1.000, -0.395] + [0.139, 0.139, 0.139, 0.139] = [-0.027, 0.580, -0.861, -0.256]

z = (-0.027)×1 + 0.580×1 + (-0.861)×1 + (-0.256)×1 = -0.564

f(z) = 0.363

Error = 1 − 0.363 = 0.637
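To see how the error changes, the forward pass can be repeated with the updated weights. This is a sketch using the updated weight values listed above; the variable names are illustrative.

```python
import math

updated_weights = [-0.027, 0.580, -0.861, -0.256]
inputs = [1, 1, 1, 1]  # bias input, then the three features
target = 1

z = sum(w * x for w, x in zip(updated_weights, inputs))
estimation = 1.0 / (1.0 + math.exp(-z))
error = target - estimation

print(round(z, 3), round(estimation, 3), round(error, 3))  # -0.564 0.363 0.637
```

The estimation moves from 0.246 toward the target of 1, so the error shrinks from 0.754 after a single update.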


Error of next iteration

0.754 → 0.637

Iterate many times
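Iterating the same update over the five samples of the example table could look like the sketch below. The zero starting weights, the 1000 passes, and the helper name `estimate` are assumptions made for illustration; the workshop notebook may do this differently.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def estimate(weights, features):
    """Perceptron output for one sample; the first weight is the bias weight."""
    inputs = [1] + list(features)
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)))


# (Feature 1, Feature 2, Feature 3) -> Target, copied from the example table.
samples = [
    ([0, 0, 1], 0),
    ([1, 1, 1], 1),
    ([1, 0, 1], 1),
    ([0, 1, 1], 0),
    ([1, 0, 0], 1),
]

weights = [0.0, 0.0, 0.0, 0.0]  # assumption: start from zeros

for _ in range(1000):  # assumption: 1000 passes over the data
    for features, target in samples:
        inputs = [1] + features
        estimation = estimate(weights, features)
        error = target - estimation
        slope = estimation * (1.0 - estimation)
        # Weights Delta = Error × slope × input
        weights = [w + error * slope * x for w, x in zip(weights, inputs)]

for features, target in samples:
    print(features, target, round(estimate(weights, features), 3))
```

In this table the target equals Feature 1, so a single perceptron is enough to separate the two classes.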

Go to notebook 01

Application: DeepDrumpf

https://twitter.com/DeepDrumpf

Perceptron limitations

Winter of ANN

Multi-Layer Perceptron

[Figure: multi-layer perceptron. The inputs 1 (bias), feature1, feature2, feature3 feed four hidden units (Hidden1 to Hidden4, each Σ | f), which feed Output1 (Σ, f) to produce y]

X – input data
y – output target
Σ – summation
f – activation function
Blue circle – bias
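A forward pass through a network of this shape (three features, four hidden units, one output) can be sketched as follows. The random weights, the bias input on the output unit, and the names `layer` and `forward` are assumptions for illustration only; training the weights with backpropagation is not shown.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weight_rows):
    """Apply Σ then f for every unit; each row is [bias weight, w1, w2, ...]."""
    with_bias = [1.0] + list(inputs)
    return [
        sigmoid(sum(w * x for w, x in zip(row, with_bias)))
        for row in weight_rows
    ]


random.seed(0)  # fixed seed so the sketch is repeatable
# 4 hidden units, each with a bias weight plus 3 feature weights.
hidden_weights = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(4)]
# 1 output unit with a bias weight plus 4 hidden-unit weights.
output_weights = [[random.uniform(-1, 1) for _ in range(5)]]


def forward(features):
    hidden = layer(features, hidden_weights)
    return layer(hidden, output_weights)[0]


print(forward([1, 0, 1]))
```

Each hidden unit is a perceptron of the kind built earlier, and the output unit treats the hidden activations as its inputs.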

Go to notebook 02



Step by step ANN
• Perceptron
• Backpropagation

Real world example
• Sklearn example

http://richzhang.github.io/colorization/
http://whattogive.com/videoColourization/

Application: Colourization

Go to notebook 03

If you want to learn more …

http://dlab.berkeley.edu/training

Some useful resources
• http://iamtrask.github.io/2015/07/12/basic-python-network/
• https://seat.massey.ac.nz/personal/s.r.marsland/MLBook.html
• http://sebastianraschka.com/Articles/2015_singlelayer_neurons.html
• http://www.emergentmind.com/neural-network
• http://neuralnetworksanddeeplearning.com/
• https://www.coursera.org/learn/neural-networks

You can also find most of today’s workshop material on my blog: http://qingkaikong.blogspot.com/2016/10/machine-learning-1-what-is-machine.html

I thank all the authors of the links above, as well as the sources of the many images I got from the internet.
