Learning Algorithms for Life Scientists

Artificial Intelligence and Learning Algorithms Presented By Brian M. Frezza 12/1/05


DESCRIPTION

This was a very brief introduction to the basics of learning algorithms for life scientists that I was asked to give to the incoming first-year students at TSRI in the fall of 2005. It covers the very basics of how the algorithms work (sans the complex math) and, more importantly, how they can be appropriately understood and applied by chemists and biologists.

TRANSCRIPT

Page 1: Learning Algorithms For Life Scientists

Artificial Intelligence and Learning Algorithms

Presented By Brian M. Frezza 12/1/05

Page 2: Learning Algorithms For Life Scientists

Game Plan

• What’s a Learning Algorithm?
• Why should I care?
  – Biological parallels
• Real World Examples
• Getting our hands dirty with the algorithms
  – Bayesian Networks
  – Hidden Markov Models
  – Genetic Algorithms
  – Neural Networks
• Artificial Neural Networks vs. Neuron Biology
  – “Fraser’s Rules”
• Frontiers in AI

• Frontiers in AI

Page 3: Learning Algorithms For Life Scientists

Hard Math

Page 4: Learning Algorithms For Life Scientists

What’s a Learning Algorithm?

• “An algorithm which predicts data’s future behavior based on its past performance.”
  – The programmer can be ignorant of the data’s trends.
• Not rationally designed!
  – Training data
  – Test data

Page 5: Learning Algorithms For Life Scientists

Why do I care?

• Use in informatics
  – Predict trends in “fuzzy” data
    • Subtle patterns in data
    • Complex patterns in data
    • Noisy data
  – Network inference
  – Classification inference
• Analogies to chemical biology
  – Evolution
  – Immunological response
  – Neurology
• Fundamental theories of intelligence
  – That’s heavy, dude

Page 6: Learning Algorithms For Life Scientists

Street Smarts

• CMU’s Navlab-5 (“No Hands Across America”)
  – 1995 neural-network-driven car
  – Pittsburgh to San Diego: 2,797 miles (98.2% driven autonomously)
  – Single-hidden-layer backpropagation network!
• Subcellular location through fluorescence
  – “A neural network classifier capable of recognizing the patterns of all major subcellular structures in fluorescence microscope images of HeLa cells,” M. V. Boland and R. F. Murphy, Bioinformatics (2001) 17(12), 1213–1223
• Protein secondary structure prediction
• Intron/exon predictions
• Protein/gene network inference
• Speech recognition
• Face recognition

Page 7: Learning Algorithms For Life Scientists

The Algorithms

• Bayesian Networks
• Hidden Markov Models
• Genetic Algorithms
• Neural Networks

Page 8: Learning Algorithms For Life Scientists

Bayesian Networks: Basics

• Requires models of how the data behave
  – A set of hypotheses: {H}
• Keeps track of the likelihood of each model being accurate as data become available
  – P(H)
• Predicts as a weighted average
  – P(E) = Σ over all H of P(H) × P(E|H)

Page 9: Learning Algorithms For Life Scientists

Bayesian Network Example

• What color hair will Paul Schaffer’s kids have if he marries a redhead?
  – Hypotheses
    • Ha (rr): rr × rr → 100% redhead
    • Hb (Rr): rr × Rr → 50% redhead, 50% not
    • Hc (RR): rr × RR → 100% not
• Initially clueless:
  – So P(Ha) = P(Hb) = P(Hc) = 1/3

Page 10: Learning Algorithms For Life Scientists

Bayesian Network: Trace

Hypotheses:
  Ha: 100% redhead
  Hb: 50% redhead, 50% not
  Hc: 100% not

History: Redhead 0, Not 0

Likelihoods: P(Ha) = 1/3, P(Hb) = 1/3, P(Hc) = 1/3

Prediction: Will their next kid be a redhead?
P(red) = P(red|Ha)×P(Ha) + P(red|Hb)×P(Hb) + P(red|Hc)×P(Hc)
       = (1)×(1/3) + (1/2)×(1/3) + (0)×(1/3)
       = 1/2

Page 11: Learning Algorithms For Life Scientists

Bayesian Network: Trace

Hypotheses:
  Ha: 100% redhead
  Hb: 50% redhead, 50% not
  Hc: 100% not

History: Redhead 1, Not 0

Likelihoods: P(Ha) = 2/3, P(Hb) = 1/3, P(Hc) = 0

Prediction: Will their next kid be a redhead?
P(red) = P(red|Ha)×P(Ha) + P(red|Hb)×P(Hb) + P(red|Hc)×P(Hc)
       = (1)×(2/3) + (1/2)×(1/3) + (0)×(0)
       = 5/6

Page 12: Learning Algorithms For Life Scientists

Bayesian Network: Trace

Hypotheses:
  Ha: 100% redhead
  Hb: 50% redhead, 50% not
  Hc: 100% not

History: Redhead 2, Not 0

Likelihoods: P(Ha) = 4/5, P(Hb) = 1/5, P(Hc) = 0

Prediction: Will their next kid be a redhead?
P(red) = P(red|Ha)×P(Ha) + P(red|Hb)×P(Hb) + P(red|Hc)×P(Hc)
       = (1)×(4/5) + (1/2)×(1/5) + (0)×(0)
       = 9/10

Page 13: Learning Algorithms For Life Scientists

Bayesian Network: Trace

Hypotheses:
  Ha: 100% redhead
  Hb: 50% redhead, 50% not
  Hc: 100% not

History: Redhead 3, Not 0

Likelihoods: P(Ha) = 8/9, P(Hb) = 1/9, P(Hc) = 0

Prediction: Will their next kid be a redhead?
P(red) = P(red|Ha)×P(Ha) + P(red|Hb)×P(Hb) + P(red|Hc)×P(Hc)
       = (1)×(8/9) + (1/2)×(1/9) + (0)×(0)
       = 17/18
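The update traced above takes only a few lines of code. The sketch below is hypothetical illustration code, not from the slides: each hypothesis’s weight is multiplied by the probability it assigns to the new observation, the weights are renormalized, and the prediction is the weighted average over the hypotheses.

    # Sketch of the Bayesian-network trace above (hypothetical illustration code).
    # Each hypothesis is summarized by the probability it assigns to "next child is a redhead".
    hypotheses = {"Ha (rr x rr)": 1.0, "Hb (rr x Rr)": 0.5, "Hc (rr x RR)": 0.0}
    weights = {h: 1.0 / len(hypotheses) for h in hypotheses}   # initially clueless

    def predict(weights):
        """Weighted-average prediction: P(red) = sum over H of P(H) * P(red|H)."""
        return sum(weights[h] * hypotheses[h] for h in hypotheses)

    def update(weights, observed_redhead):
        """Bayes' rule: scale each weight by the likelihood of the new observation."""
        scaled = {}
        for h, p_red in hypotheses.items():
            likelihood = p_red if observed_redhead else 1.0 - p_red
            scaled[h] = weights[h] * likelihood
        total = sum(scaled.values())
        return {h: w / total for h, w in scaled.items()}

    print(predict(weights))                  # 0.5 before any children are observed
    for _ in range(3):                       # observe three redheaded kids in a row
        weights = update(weights, observed_redhead=True)
        print(weights, predict(weights))     # predictions climb toward 1: 5/6, 9/10, 17/18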

Page 14: Learning Algorithms For Life Scientists

Bayesian Networks Notes

• Never rejects a hypothesis unless it is directly disproved
• Learns based on rational models of behavior
  – Models can be extracted!
• The programmer needs to form the hypotheses beforehand.

Page 15: Learning Algorithms For Life Scientists

The Algorithms

• Bayesian Networks
• Hidden Markov Models
• Genetic Algorithms
• Neural Networks

Page 16: Learning Algorithms For Life Scientists

Hidden Markov Models (HMMs)

• Discrete learning algorithm
  – The programmer must be able to categorize predictions
• HMMs also assume a model of the world working behind the data
• Models are also extractable
• Common uses
  – Speech recognition
  – Secondary structure prediction
  – Intron/exon predictions
  – Categorization of data

Page 17: Learning Algorithms For Life Scientists

Hidden Markov Models: Take a Step Back

• 1st-order Markov models:
  – Q: {States}
  – Pr: {Transition probabilities}
  – The transition probabilities out of each state sum to 1

[State-transition diagram: four states Q1–Q4, with outgoing transitions labeled P1, P2, 1-P1-P2, P3, 1-P3, 1, P4, and 1-P4]

Page 18: Learning Algorithms For Life Scientists

1st order Markov Model Setup

• Pick an initial state: Q1
• Pick transition probabilities:

  P1    P2    P3    P4
  0.6   0.2   0.9   0.4

• For each time step:
  – Pick a random number between 0.0 and 1.0

[State-transition diagram as on the previous slide]

Page 19: Learning Algorithms For Life Scientists

1st order Markov Model Trace

• Current state: Q1, time step = 1
• Transition probabilities: P1 = 0.6, P2 = 0.2, P3 = 0.9, P4 = 0.4
• Random number: 0.22341
• So next state:
  – 0.22341 < P1, so take the P1 transition
  – Q2

[State-transition diagram as on the previous slides]

Page 20: Learning Algorithms For Life Scientists

1st order Markov Model Trace

• Current state: Q2, time step = 2
• Transition probabilities: P1 = 0.6, P2 = 0.2, P3 = 0.9, P4 = 0.4
• Random number: 0.64357
• So next state:
  – No choice, P = 1
  – Q3

[State-transition diagram as on the previous slides]

Page 21: Learning Algorithms For Life Scientists

1st order Markov Model Trace

• Current state: Q3, time step = 3
• Transition probabilities: P1 = 0.6, P2 = 0.2, P3 = 0.9, P4 = 0.4
• Random number: 0.97412
• So next state:
  – 0.97412 > 0.9 (P3), so take the 1-P3 transition
  – Q4

[State-transition diagram as on the previous slides]

Page 22: Learning Algorithms For Life Scientists

1st order Markov Model Trace

• Current state: Q4, time step = 4
• Transition probabilities: P1 = 0.6, P2 = 0.2, P3 = 0.9, P4 = 0.4
• I’m going to stop here.
• Markov chain so far: Q1, Q2, Q3, Q4

[State-transition diagram as on the previous slides]
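A first-order Markov chain like the one traced above can be simulated with one random number per time step. The sketch below is hypothetical code: the probabilities P1–P4 come from the slides, but the exact wiring of the state diagram (which arrow carries which probability) is an assumption where the figure is ambiguous.

    # Sketch of the 1st-order Markov model traced above (hypothetical code).
    # P1-P4 follow the slides; the wiring of the remaining arrows is assumed.
    P1, P2, P3, P4 = 0.6, 0.2, 0.9, 0.4
    transitions = {
        "Q1": [("Q2", P1), ("Q3", P2), ("Q4", 1 - P1 - P2)],
        "Q2": [("Q3", 1.0)],
        "Q3": [("Q1", P3), ("Q4", 1 - P3)],
        "Q4": [("Q1", P4), ("Q3", 1 - P4)],
    }

    def step(state, r):
        """Use a single random number in [0, 1) to pick the next state."""
        cumulative = 0.0
        for next_state, p in transitions[state]:
            cumulative += p
            if r < cumulative:
                return next_state
        return transitions[state][-1][0]   # guard against floating-point rounding

    chain = ["Q1"]
    for r in (0.22341, 0.64357, 0.97412):  # the random numbers used in the trace
        chain.append(step(chain[-1], r))
    print(chain)                           # ['Q1', 'Q2', 'Q3', 'Q4'], as in the trace
    # For a fresh simulation, draw r with random.random() (import random) at each step.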

Page 23: Learning Algorithms For Life Scientists

What else can Markov do?

• Higher-order models
  – Kth order
• Metropolis–Hastings
  – Determining thermodynamic equilibrium
• Continuous Markov models
  – The time step varies according to a continuous distribution
• Hidden Markov Models
  – Discrete model learning

Page 24: Learning Algorithms For Life Scientists

Hidden Markov Models (HMMs)

• A Markov model drives the world, but it is hidden from direct observation and its status must be inferred from a set of observables.
  – Voice recognition
    • Observable: sound waves
    • Hidden states: words
  – Intron/exon prediction
    • Observable: nucleotide sequence
    • Hidden states: exon, intron, non-coding
  – Secondary structure prediction for proteins
    • Observable: amino acid sequence
    • Hidden states: alpha helix, beta sheet, unstructured

Page 25: Learning Algorithms For Life Scientists

Hidden Markov Models: Example

• Secondary structure prediction

[Diagram: hidden states (Alpha Helix, Beta Sheet, Unstructured) emit observable states drawn from the 20 amino acids (His, Asp, Arg, Phe, Ala, Cys, Ser, Gln, Glu, Lys, Leu, Met, Asn, Tyr, Thr, Ile, Trp, Pro, Val, Gly)]

Page 26: Learning Algorithms For Life Scientists

Hidden Markov Models: Smaller Example

• Exon/intron mapping

[Diagram: hidden states Exon (Ex), Intron (It), and Intergenic (Ig), connected by transition probabilities P(Ex|Ex), P(It|Ex), P(It|It), P(Ig|It), P(Ex|It), P(Ig|Ig), P(It|Ig), and P(Ex|Ig); each hidden state emits the observable nucleotides A, C, G, and T with emission probabilities P(A|·), P(C|·), P(G|·), and P(T|·)]

Page 27: Learning Algorithms For Life Scientists

Hidden Markov Models: Smaller Example

• Exon/intron mapping

Hidden-state transition probabilities (row = from, column = to):

      Ex     Ig     It
Ex    0.7    0.1    0.2
Ig    0.49   0.5    0.01
It    0.18   0.02   0.8

Observable-state (emission) probabilities (row = hidden state, column = observable):

      A      T      G      C
Ex    0.33   0.42   0.11   0.14
Ig    0.25   0.25   0.25   0.25
It    0.14   0.16   0.5    0.2

Starting distribution:

Ex 0.1    Ig 0.89    It 0.01

Page 28: Learning Algorithms For Life Scientists

Hidden Markov Model

• How to predict outcomes from an HMM
• Brute force:
  – Try every possible Markov chain
  – Which chain has the greatest probability of generating the observed data?
• Viterbi algorithm:
  – Dynamic programming approach

Page 29: Learning Algorithms For Life Scientists

Viterbi Algorithm: Trace

[Hidden-state transition, emission, and starting-probability tables as on Page 27]

Example sequence: ATAATGGCGAGATG

Exon       = P(A|Ex) × Start(Ex) = 0.33 × 0.1  = 3.3×10⁻²
Intergenic = P(A|Ig) × Start(Ig) = 0.25 × 0.89 = 2.2×10⁻¹
Intron     = P(A|It) × Start(It) = 0.14 × 0.01 = 1.4×10⁻³

     Exon        Intergenic   Intron
A    3.3×10⁻²    2.2×10⁻¹     1.4×10⁻³

Page 30: Learning Algorithms For Life Scientists

Viterbi Algorithm: Trace

[Probability tables as on Page 27]

Example sequence: ATAATGGCGAGATG

Exon       = Max( P(Ex|Ex)×Pn-1(Ex), P(Ex|Ig)×Pn-1(Ig), P(Ex|It)×Pn-1(It) ) × P(T|Ex) = 4.6×10⁻²
Intergenic = Max( P(Ig|Ex)×Pn-1(Ex), P(Ig|Ig)×Pn-1(Ig), P(Ig|It)×Pn-1(It) ) × P(T|Ig) = 2.8×10⁻²
Intron     = Max( P(It|Ex)×Pn-1(Ex), P(It|Ig)×Pn-1(Ig), P(It|It)×Pn-1(It) ) × P(T|It) = 1.1×10⁻³

     Exon        Intergenic   Intron
A    3.3×10⁻²    2.2×10⁻¹     1.4×10⁻³
T    4.6×10⁻²    2.8×10⁻²     1.1×10⁻³

Page 31: Learning Algorithms For Life Scientists

Viterbi Algorithm: Trace

[Probability tables as on Page 27]

Example sequence: ATAATGGCGAGATG

Exon       = Max( P(Ex|Ex)×Pn-1(Ex), P(Ex|Ig)×Pn-1(Ig), P(Ex|It)×Pn-1(It) ) × P(A|Ex) = 1.1×10⁻²
Intergenic = Max( P(Ig|Ex)×Pn-1(Ex), P(Ig|Ig)×Pn-1(Ig), P(Ig|It)×Pn-1(It) ) × P(A|Ig) = 3.5×10⁻³
Intron     = Max( P(It|Ex)×Pn-1(Ex), P(It|Ig)×Pn-1(Ig), P(It|It)×Pn-1(It) ) × P(A|It) = 1.3×10⁻³

     Exon        Intergenic   Intron
A    3.3×10⁻²    2.2×10⁻¹     1.4×10⁻³
T    4.6×10⁻²    2.8×10⁻²     1.1×10⁻³
A    1.1×10⁻²    3.5×10⁻³     1.3×10⁻³

Page 32: Learning Algorithms For Life Scientists

Viterbi Algorithm: Trace

[Probability tables as on Page 27]

Example sequence: ATAATGGCGAGATG

Exon       = Max( P(Ex|Ex)×Pn-1(Ex), P(Ex|Ig)×Pn-1(Ig), P(Ex|It)×Pn-1(It) ) × P(obs|Ex)
Intergenic = Max( P(Ig|Ex)×Pn-1(Ex), P(Ig|Ig)×Pn-1(Ig), P(Ig|It)×Pn-1(It) ) × P(obs|Ig)
Intron     = Max( P(It|Ex)×Pn-1(Ex), P(It|Ig)×Pn-1(Ig), P(It|It)×Pn-1(It) ) × P(obs|It)

     Exon        Intergenic   Intron
A    3.3×10⁻²    2.2×10⁻¹     1.4×10⁻³
T    4.6×10⁻²    2.8×10⁻²     1.1×10⁻³
A    1.1×10⁻²    3.5×10⁻³     1.3×10⁻³
A    2.4×10⁻³    4.3×10⁻⁴     2.9×10⁻⁴

Page 33: Learning Algorithms For Life Scientists

Viterbi Algorithm: Trace

[Probability tables as on Page 27]

Example sequence: ATAATGGCGAGATG

Exon       = Max( P(Ex|Ex)×Pn-1(Ex), P(Ex|Ig)×Pn-1(Ig), P(Ex|It)×Pn-1(It) ) × P(obs|Ex)
Intergenic = Max( P(Ig|Ex)×Pn-1(Ex), P(Ig|Ig)×Pn-1(Ig), P(Ig|It)×Pn-1(It) ) × P(obs|Ig)
Intron     = Max( P(It|Ex)×Pn-1(Ex), P(It|Ig)×Pn-1(Ig), P(It|It)×Pn-1(It) ) × P(obs|It)

     Exon        Intergenic   Intron
A    3.3×10⁻²    2.2×10⁻¹     1.4×10⁻³
T    4.6×10⁻²    2.8×10⁻²     1.1×10⁻³
A    1.1×10⁻²    3.5×10⁻³     1.3×10⁻³
A    2.4×10⁻³    4.3×10⁻⁴     2.9×10⁻⁴
T    7.2×10⁻⁴    6.1×10⁻⁵     7.8×10⁻⁵

Page 34: Learning Algorithms For Life Scientists

Viterbi Algorithm: Trace

[Probability tables as on Page 27]

Example sequence: ATAATGGCGAGATG

Exon       = Max( P(Ex|Ex)×Pn-1(Ex), P(Ex|Ig)×Pn-1(Ig), P(Ex|It)×Pn-1(It) ) × P(obs|Ex)
Intergenic = Max( P(Ig|Ex)×Pn-1(Ex), P(Ig|Ig)×Pn-1(Ig), P(Ig|It)×Pn-1(It) ) × P(obs|Ig)
Intron     = Max( P(It|Ex)×Pn-1(Ex), P(It|Ig)×Pn-1(Ig), P(It|It)×Pn-1(It) ) × P(obs|It)

     Exon        Intergenic   Intron
A    3.3×10⁻²    2.2×10⁻¹     1.4×10⁻³
T    4.6×10⁻²    2.8×10⁻²     1.1×10⁻³
A    1.1×10⁻²    3.5×10⁻³     1.3×10⁻³
A    2.4×10⁻³    4.3×10⁻⁴     2.9×10⁻⁴
T    7.2×10⁻⁴    6.1×10⁻⁵     7.8×10⁻⁵
G    5.5×10⁻⁵    1.8×10⁻⁵     7.2×10⁻⁵

Page 35: Learning Algorithms For Life Scientists

Viterbi Algorithm: Trace

[Probability tables as on Page 27]

Example sequence: ATAATGGCGAGATG

Exon       = Max( P(Ex|Ex)×Pn-1(Ex), P(Ex|Ig)×Pn-1(Ig), P(Ex|It)×Pn-1(It) ) × P(obs|Ex)
Intergenic = Max( P(Ig|Ex)×Pn-1(Ex), P(Ig|Ig)×Pn-1(Ig), P(Ig|It)×Pn-1(It) ) × P(obs|Ig)
Intron     = Max( P(It|Ex)×Pn-1(Ex), P(It|Ig)×Pn-1(Ig), P(It|It)×Pn-1(It) ) × P(obs|It)

     Exon        Intergenic   Intron
A    3.3×10⁻²    2.2×10⁻¹     1.4×10⁻³
T    4.6×10⁻²    2.8×10⁻²     1.1×10⁻³
A    1.1×10⁻²    3.5×10⁻³     1.3×10⁻³
A    2.4×10⁻³    4.3×10⁻⁴     2.9×10⁻⁴
T    7.2×10⁻⁴    6.1×10⁻⁵     7.8×10⁻⁵
G    5.5×10⁻⁵    1.8×10⁻⁵     7.2×10⁻⁵
G    4.3×10⁻⁶    2.2×10⁻⁶     2.9×10⁻⁵

Page 36: Learning Algorithms For Life Scientists

Viterbi Algorithm: Trace

[Probability tables as on Page 27]

Example sequence: ATAATGGCGAGATG

Exon       = Max( P(Ex|Ex)×Pn-1(Ex), P(Ex|Ig)×Pn-1(Ig), P(Ex|It)×Pn-1(It) ) × P(obs|Ex)
Intergenic = Max( P(Ig|Ex)×Pn-1(Ex), P(Ig|Ig)×Pn-1(Ig), P(Ig|It)×Pn-1(It) ) × P(obs|Ig)
Intron     = Max( P(It|Ex)×Pn-1(Ex), P(It|Ig)×Pn-1(Ig), P(It|It)×Pn-1(It) ) × P(obs|It)

     Exon        Intergenic   Intron
A    3.3×10⁻²    2.2×10⁻¹     1.4×10⁻³
T    4.6×10⁻²    2.8×10⁻²     1.1×10⁻³
A    1.1×10⁻²    3.5×10⁻³     1.3×10⁻³
A    2.4×10⁻³    4.3×10⁻⁴     2.9×10⁻⁴
T    7.2×10⁻⁴    6.1×10⁻⁵     7.8×10⁻⁵
G    5.5×10⁻⁵    1.8×10⁻⁵     7.2×10⁻⁵
G    4.3×10⁻⁶    2.2×10⁻⁶     2.9×10⁻⁵
C    7.2×10⁻⁷    2.8×10⁻⁷     4.6×10⁻⁶
G    9.1×10⁻⁸    3.5×10⁻⁸     1.8×10⁻⁶
A    1.1×10⁻⁷    9.1×10⁻⁹     2.0×10⁻⁷
G    8.4×10⁻⁹    2.7×10⁻⁹     8.2×10⁻⁸
A    4.9×10⁻⁹    4.1×10⁻¹⁰    9.2×10⁻⁹
T    1.4×10⁻⁹    1.2×10⁻¹⁰    1.2×10⁻⁹
G    1.1×10⁻¹⁰   3.6×10⁻¹¹    4.7×10⁻¹⁰
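The table above is exactly what the Viterbi recursion fills in: one row of “best path so far” scores per observed base. The sketch below is hypothetical code that uses the transition, emission, and starting probabilities from Page 27 and returns the most probable hidden-state path for the example sequence.

    # Viterbi sketch for the exon/intergenic/intron HMM above (hypothetical code).
    # The probabilities are taken from the tables on Page 27.
    states = ["Ex", "Ig", "It"]
    start = {"Ex": 0.10, "Ig": 0.89, "It": 0.01}
    trans = {   # trans[frm][to]
        "Ex": {"Ex": 0.70, "Ig": 0.10, "It": 0.20},
        "Ig": {"Ex": 0.49, "Ig": 0.50, "It": 0.01},
        "It": {"Ex": 0.18, "Ig": 0.02, "It": 0.80},
    }
    emit = {
        "Ex": {"A": 0.33, "T": 0.42, "G": 0.11, "C": 0.14},
        "Ig": {"A": 0.25, "T": 0.25, "G": 0.25, "C": 0.25},
        "It": {"A": 0.14, "T": 0.16, "G": 0.50, "C": 0.20},
    }

    def viterbi(sequence):
        """Return (most probable hidden-state path, its probability)."""
        score = {s: start[s] * emit[s][sequence[0]] for s in states}
        back = []                                   # best predecessor at each step
        for symbol in sequence[1:]:
            prev, score, choice = score, {}, {}
            for s in states:
                best_prev = max(states, key=lambda p: prev[p] * trans[p][s])
                choice[s] = best_prev
                score[s] = prev[best_prev] * trans[best_prev][s] * emit[s][symbol]
            back.append(choice)
        last = max(states, key=lambda s: score[s])  # best final state
        path = [last]
        for choice in reversed(back):               # trace the path backwards
            path.append(choice[path[-1]])
        return list(reversed(path)), score[last]

    path, p = viterbi("ATAATGGCGAGATG")
    print(path, p)   # most probable hidden-state labeling of the example sequence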

Page 37: Learning Algorithms For Life Scientists

Hidden Markov Models

• How to train an HMM
  – The forward-backward algorithm
• Ugly probability-theory math:
  – Start with an initial guess of the parameters
  – Refine the parameters by attempting to reduce the errors the model makes when fitted to the data
  – The update uses the normalized product of the “forward probability” of arriving at each state given the observables and the backward probability of generating the remaining observables given the parameters

CENSORED

Page 38: Learning Algorithms For Life Scientists

The Algorithms

• Bayesian Networks
• Hidden Markov Models
• Genetic Algorithms
• Neural Networks

Page 39: Learning Algorithms For Life Scientists

Genetic Algorithms

• Individuals are series of bits that represent candidate solutions
  – Functions
  – Structures
  – Images
  – Code
• Based on Darwinian evolution
  – Individuals mate, mutate, and are selected based on a fitness function (see the sketch below)
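As a concrete illustration of this loop, the sketch below is hypothetical code: individuals are bit strings, and the fitness function (count of 1 bits), population size, mutation rate, and selection scheme are arbitrary choices for the example rather than anything specified in the slides.

    import random

    # Toy genetic-algorithm sketch (hypothetical example).
    GENOME_LENGTH, POP_SIZE, MUTATION_RATE, GENERATIONS = 20, 30, 0.02, 50

    def fitness(individual):
        return sum(individual)                     # stand-in fitness function: count the 1s

    def mate(mom, dad):
        cut = random.randrange(1, GENOME_LENGTH)   # one-point recombination
        child = mom[:cut] + dad[cut:]
        # mutation: flip each bit with a small probability
        return [bit ^ 1 if random.random() < MUTATION_RATE else bit for bit in child]

    population = [[random.randint(0, 1) for _ in range(GENOME_LENGTH)]
                  for _ in range(POP_SIZE)]
    for generation in range(GENERATIONS):
        # selection: keep the fitter half, then refill the population by mating survivors
        population.sort(key=fitness, reverse=True)
        survivors = population[: POP_SIZE // 2]
        children = [mate(*random.sample(survivors, 2))
                    for _ in range(POP_SIZE - len(survivors))]
        population = survivors + children

    print(max(fitness(ind) for ind in population))  # climbs toward GENOME_LENGTH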

Page 40: Learning Algorithms For Life Scientists

Genetic Algorithms

• Encoding rules
  – “Gray” bit encoding: bit distance is proportional to value distance
• Selection rules
  – Digital / analog threshold
  – Linear amplification vs. weighted amplification
• Mating rules
  – Mutation parameters
  – Recombination parameters

Page 41: Learning Algorithms For Life Scientists

Genetic Algorithms

• When are they useful?
  – When movements in sequence space are funnel-shaped with respect to the fitness function
  – Systems where evolution actually applies!
• Examples
  – Medicinal chemistry
  – Protein folding
  – Amino acid substitutions
  – Membrane trafficking modeling
  – Ecological simulations
  – Linear programming
  – Traveling salesman

Page 42: Learning Algorithms For Life Scientists

The Algorithms

• Bayesian Networks
• Hidden Markov Models
• Genetic Algorithms
• Neural Networks

Page 43: Learning Algorithms For Life Scientists

Neural Networks

• 1943: McCulloch and Pitts model of how neurons process information
  – The field immediately splits:
• Studying the brain
  – Neurology
• Studying artificial intelligence
  – Neural networks

Page 44: Learning Algorithms For Life Scientists

Neural Networks: A Neuron, Node, or Unit

[Diagram: a unit c receives inputs a and b through weighted connections Wa,c and Wb,c, plus a bias weight W0,c; it computes Σ(W) - W0,c, passes the result through an activation function, and sends its output z along weight Wc,n to the next unit n]

Page 45: Learning Algorithms For Life Scientists

Neural Networks: Activation Functions

[Plots: a threshold (step) function and a sigmoid (logistic) function, each mapping the input to an output between 0 and +1; the zero point is set by the bias]

Page 46: Learning Algorithms For Life Scientists

Threshold Functions can make Logic Gates with Neurons!

Logical AND
[Diagram: inputs A and B feed a single threshold unit with weights Wa,c = 1, Wb,c = 1 and bias W0,c = 1.5]

If ( Σ(w) - W0,c > 0 )
  then FIRE
else
  don’t

Truth table (A ∩ B):
       B=0   B=1
A=0     0     0
A=1     0     1

Page 47: Learning Algorithms For Life Scientists

And Gate: Trace

W0,c = 1.5, Wa,c = 1, Wb,c = 1
Inputs: A = Off, B = Off
Σ(w) - W0,c = 0 - 1.5 = -1.5; -1.5 < 0, so the output is Off

Page 48: Learning Algorithms For Life Scientists

And Gate: Trace

W0,c = 1.5, Wa,c = 1, Wb,c = 1
Inputs: A = On, B = Off
Σ(w) - W0,c = 1 - 1.5 = -0.5; -0.5 < 0, so the output is Off

Page 49: Learning Algorithms For Life Scientists

And Gate: Trace

W0,c = 1.5, Wa,c = 1, Wb,c = 1
Inputs: A = Off, B = On
Σ(w) - W0,c = 1 - 1.5 = -0.5; -0.5 < 0, so the output is Off

Page 50: Learning Algorithms For Life Scientists

And Gate: Trace

W0,c = 1.5, Wa,c = 1, Wb,c = 1
Inputs: A = On, B = On
Σ(w) - W0,c = 2 - 1.5 = 0.5; 0.5 > 0, so the output is On

Page 51: Learning Algorithms For Life Scientists

Threshold Functions can make Logic Gates with Neurons!

Logical OR
[Diagram: inputs A and B feed a single threshold unit with weights Wa,c = 1, Wb,c = 1 and bias W0,c = 0.5]

If ( Σ(w) - W0,c > 0 )
  then FIRE
else
  don’t

Truth table (A ∪ B):
       B=0   B=1
A=0     0     1
A=1     1     1

Page 52: Learning Algorithms For Life Scientists

Or Gate: Trace

W0,c = 0.5, Wa,c = 1, Wb,c = 1
Inputs: A = Off, B = Off
Σ(w) - W0,c = 0 - 0.5 = -0.5; -0.5 < 0, so the output is Off

Page 53: Learning Algorithms For Life Scientists

Or Gate: Trace

W0,c = 0.5, Wa,c = 1, Wb,c = 1
Inputs: A = On, B = Off
Σ(w) - W0,c = 1 - 0.5 = 0.5; 0.5 > 0, so the output is On

Page 54: Learning Algorithms For Life Scientists

Or Gate: Trace

W0,c = 0.5, Wa,c = 1, Wb,c = 1
Inputs: A = Off, B = On
Σ(w) - W0,c = 1 - 0.5 = 0.5; 0.5 > 0, so the output is On

Page 55: Learning Algorithms For Life Scientists

Or Gate: Trace

W0,c = 0.5, Wa,c = 1, Wb,c = 1
Inputs: A = On, B = On
Σ(w) - W0,c = 2 - 0.5 = 1.5; 1.5 > 0, so the output is On

Page 56: Learning Algorithms For Life Scientists

Threshold Functions can make Logic Gates with Neurons!

Logical NOT
[Diagram: a single input A feeds a threshold unit with weight Wa,c = -1 and bias W0,c = -0.5]

If ( Σ(w) - W0,c > 0 )
  then FIRE
else
  don’t

Truth table (!A):
A=0 → 1
A=1 → 0

Page 57: Learning Algorithms For Life Scientists

Not Gate: Trace

W0,c = -0.5, Wa,c = -1
Input: A = Off
Σ(w) - W0,c = 0 - (-0.5) = 0.5; 0.5 > 0, so the output is On

Page 58: Learning Algorithms For Life Scientists

Not Gate: Trace

W0,c = -0.5, Wa,c = -1
Input: A = On
Σ(w) - W0,c = -1 - (-0.5) = -0.5; -0.5 < 0, so the output is Off
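All three gates traced above use the same rule: fire when the weighted sum of the inputs exceeds the bias. A minimal sketch in code (hypothetical example, using the weights and biases from the slides):

    # Threshold units as logic gates, with the weights and biases from the slides.
    def unit(inputs, weights, bias):
        """Fire (1) if the weighted input sum exceeds the bias, otherwise 0."""
        total = sum(w * x for w, x in zip(weights, inputs))
        return 1 if total - bias > 0 else 0

    def AND(a, b):
        return unit([a, b], [1, 1], 1.5)

    def OR(a, b):
        return unit([a, b], [1, 1], 0.5)

    def NOT(a):
        return unit([a], [-1], -0.5)

    for a in (0, 1):
        for b in (0, 1):
            print(a, b, AND(a, b), OR(a, b))   # reproduces the AND and OR truth tables
    print(NOT(0), NOT(1))                      # 1 0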

Page 59: Learning Algorithms For Life Scientists

Feed-Forward Vs. Recurrent Networks

• Feed-forward
  – No cyclic connections
  – A function of its current inputs
  – No internal state other than the weights of its connections
  – “Out of time”
• Recurrent
  – Cyclic connections
  – Dynamic behavior: stable, oscillatory, or chaotic
  – Response depends on the current state
  – “In time”
  – Short-term memory!

Page 60: Learning Algorithms For Life Scientists

Feed-Forward Networks

• “Knowledge” is represented by the weights on the edges
  – Model-less!
• “Learning” consists of adjusting the weights
• Customary arrangements
  – One Boolean output for each value
  – Arranged in layers
    • Layer 1 = inputs
    • Layers 2 to (n-1) = hidden
    • Layer n = outputs
  – “Perceptron”: a 2-layer feed-forward network

Page 61: Learning Algorithms For Life Scientists

Layers

[Diagram: a layered feed-forward network with an input layer, a hidden layer, and an output layer]

Page 62: Learning Algorithms For Life Scientists

Perceptron Learning

• Gradient descent is used to reduce the error
• Essentially (see the sketch below):
  – New weight = old weight + adjustment
  – Adjustment = α × error × input × d(activation function)
    • α = learning rate

CENSORED
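A minimal sketch of this update rule (hypothetical code: the logistic activation, the learning rate, and the OR-gate training task are illustrative choices, not from the slides). Each weight moves by α × error × input × the derivative of the activation function.

    import math, random

    # Perceptron learning sketch (hypothetical example): train one logistic unit
    # to behave like an OR gate with the rule new_w = old_w + alpha*error*input*d(activation).
    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    alpha = 0.5                                    # learning rate
    weights = [random.uniform(-1, 1) for _ in range(2)]
    bias = random.uniform(-1, 1)
    data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]   # OR truth table

    for epoch in range(2000):
        for inputs, target in data:
            out = sigmoid(sum(w * x for w, x in zip(weights, inputs)) - bias)
            error = target - out
            slope = out * (1 - out)                # derivative of the sigmoid
            weights = [w + alpha * error * x * slope for w, x in zip(weights, inputs)]
            bias = bias - alpha * error * slope    # bias is subtracted inside the unit

    for inputs, target in data:
        out = sigmoid(sum(w * x for w, x in zip(weights, inputs)) - bias)
        print(inputs, target, round(out, 2))       # outputs should approach the OR targets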

Page 63: Learning Algorithms For Life Scientists

Hidden Network Learning

• Back-Propagation

• Essentially (see the sketch below):
  – Start with gradient descent from the output
  – Assign “blame” to the neurons feeding in, proportional to their weights
  – Adjust the weights at the previous level using gradient descent based on the “blame”

CENSORED
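A minimal sketch of that blame-passing step for a single hidden layer (hypothetical code; the XOR task, layer sizes, and learning rate are illustrative choices). The output error is handed back to each hidden unit in proportion to its outgoing weight, and both layers are then adjusted by gradient descent.

    import math, random

    # Backpropagation sketch with one hidden layer (hypothetical example: XOR).
    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    n_in, n_hidden, alpha = 2, 4, 0.5
    w_hidden = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    b_hidden = [random.uniform(-1, 1) for _ in range(n_hidden)]
    w_out = [random.uniform(-1, 1) for _ in range(n_hidden)]
    b_out = random.uniform(-1, 1)
    data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]   # XOR truth table

    def forward(x):
        hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
                  for ws, b in zip(w_hidden, b_hidden)]
        out = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)
        return hidden, out

    for epoch in range(5000):
        for x, target in data:
            hidden, out = forward(x)
            delta_out = (target - out) * out * (1 - out)       # gradient descent at the output
            # "blame" for each hidden unit, proportional to its outgoing weight
            delta_hidden = [delta_out * w_out[j] * hidden[j] * (1 - hidden[j])
                            for j in range(n_hidden)]
            for j in range(n_hidden):                          # adjust output-layer weights
                w_out[j] += alpha * delta_out * hidden[j]
            b_out += alpha * delta_out
            for j in range(n_hidden):                          # adjust hidden-layer weights
                for i in range(n_in):
                    w_hidden[j][i] += alpha * delta_hidden[j] * x[i]
                b_hidden[j] += alpha * delta_hidden[j]

    for x, target in data:
        print(x, target, round(forward(x)[1], 2))   # outputs typically approach the XOR targets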

Page 64: Learning Algorithms For Life Scientists

They don’t get it either: Issues that aren’t well understood

• α (Learning Rate)

• Depth of network (number of layers)

• Size of hidden layers
  – Overfitting
  – Cross-validation
• Minimum connectivity
  – Optimal Brain Damage algorithm

• No extractable model!

Page 65: Learning Algorithms For Life Scientists

How Are Neural Nets Different From My Brain?

1. Neural nets are feed-forward
   – Brains can be recurrent, with feedback loops
2. Neural nets do not distinguish between + and - connections
   – In brains, excitatory and inhibitory neurons have different properties
     • Inhibitory neurons are short-distance
3. Neural nets exist “out of time”
   – Our brains clearly do exist “in time”
4. Neural nets learn VERY differently
   – We have very little idea how our brains learn

“Fraser’s” Rules

“In theory one can, of course, implement biologically realistic neural networks, but this is a mammoth task. All kinds of details have to be gotten right, or you end up with a network that completely decays to unconnectedness, or one that ramps up its connections until it basically has a seizure.”

Page 66: Learning Algorithms For Life Scientists

Frontiers in AI

• Applications of current algorithms
• New algorithms for determining parameters from training data
  – Backward-Forward
  – Backpropagation
• Better classification of the mysteries of neural networks
• Pathology modeling in neural networks
• Evolutionary modeling

Page 67: Learning Algorithms For Life Scientists