Deep Learning and TensorFlow
Episode 2: The Quest for Deeper Networks
(Deep Learning: a theoretical introduction, Episode 2)
Università degli Studi di Pavia

Post on 20-May-2020

Feed-Forward Neural Network

Training Feed-Forward Neural Networks

The Quest for Deeper Networks

Shallow vs. Deep Feed-Forward Neural Networks

Parity Circuits
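A minimal sketch of the depth argument usually attached to parity circuits, assuming 2-input XOR gates for the deep circuit and an OR-of-ANDs form for the depth-2 circuit. The names `xor_tree` and `dnf_terms` are illustrative and do not come from the slides. The deep circuit needs n - 1 gates and logarithmic depth, while the depth-2 form needs one AND term per input pattern with an odd number of ones, that is 2^(n-1) terms.

```python
from itertools import product


def xor_tree(bits):
    """Parity via a balanced tree of 2-input XOR gates.

    Returns (parity, gates_used, depth).
    """
    level, gates, depth = list(bits), 0, 0
    while len(level) > 1:
        # Pair up neighbours; each pair costs one XOR gate.
        nxt = [level[i] ^ level[i + 1] for i in range(0, len(level) - 1, 2)]
        gates += len(nxt)
        if len(level) % 2:  # an unpaired element passes through unchanged
            nxt.append(level[-1])
        level, depth = nxt, depth + 1
    return level[0], gates, depth


def dnf_terms(n):
    """AND terms in the OR-of-ANDs form of n-bit parity:
    one term per input pattern with an odd number of ones."""
    return sum(1 for x in product((0, 1), repeat=n) if sum(x) % 2 == 1)


if __name__ == "__main__":
    for n in (4, 8, 16):
        _, gates, depth = xor_tree([1] * n)
        print(f"n={n}: deep gates={gates}, depth={depth}, shallow terms={dnf_terms(n)}")
```

The gate count of the tree grows linearly with n, while the shallow term count doubles with every extra input.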

Depth and piecewise linear functions

About why they did not use Deep Networks from the beginning

Problem: vanishing or exploding Gradients
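A minimal sketch of why gradients can vanish or explode with depth, assuming a chain of k scalar layers that all share one weight `w` (a simplification chosen for illustration, not the slides' setting). By the chain rule the gradient is a product of k factors, each a weight times an activation derivative. The sigmoid derivative never exceeds 0.25, so with `w = 1` the product shrinks at least as fast as 0.25^k. With an identity activation the product is w^k, which grows without bound when |w| > 1.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_chain_gradient(k, w=1.0, x=0.5):
    """d h_k / d x for the scalar chain h_i = sigmoid(w * h_{i-1}), h_0 = x."""
    h, grad = x, 1.0
    for _ in range(k):
        s = sigmoid(w * h)
        grad *= w * s * (1.0 - s)  # chain rule: weight times sigmoid'(z)
        h = s
    return grad


def linear_chain_gradient(k, w):
    """Same chain with identity activation: the gradient is w**k."""
    return w ** k


if __name__ == "__main__":
    for k in (2, 5, 10, 20):
        print(k, sigmoid_chain_gradient(k), linear_chain_gradient(k, 2.0))
```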

Problem: initial values of the parameters

A bag of wonderful tricks

Why ReLU is better (sometimes)
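A minimal sketch of the usual argument, assuming the standard definition ReLU(z) = max(0, z); the helper `chain_factor` is a hypothetical name used only here. The derivative is exactly 1 for positive inputs, so a path of active units passes the gradient through unchanged. It is 0 for negative inputs, so a single inactive unit blocks the gradient along that path, which is one reason for the "sometimes".

```python
def relu(z):
    return z if z > 0 else 0.0


def relu_grad(z):
    """Derivative of ReLU (taken as 0 at z == 0 by convention)."""
    return 1.0 if z > 0 else 0.0


def chain_factor(pre_activations):
    """Product of ReLU derivatives along one path through the network."""
    factor = 1.0
    for z in pre_activations:
        factor *= relu_grad(z)
    return factor


if __name__ == "__main__":
    print(chain_factor([0.3, 2.0, 5.0]))   # all units active
    print(chain_factor([0.3, -1.0, 5.0]))  # one inactive unit on the path
```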

Overfitting

Dropout
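A minimal sketch of dropout on a list of activations, assuming the "inverted" variant in which kept units are rescaled by 1 / keep_prob during training so that nothing changes at test time. The slides may instead present the formulation that rescales at test time. The function name and parameters are illustrative.

```python
import random


def dropout(activations, keep_prob, training=True, rng=random):
    """Zero each activation with probability 1 - keep_prob (training only).

    Kept activations are scaled by 1 / keep_prob so the expected value
    of each unit is unchanged.
    """
    if not training:
        return list(activations)
    scale = 1.0 / keep_prob
    return [a * scale if rng.random() < keep_prob else 0.0 for a in activations]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    hidden = [1.0] * 10
    print(dropout(hidden, keep_prob=0.8, rng=rng))
    print(dropout(hidden, keep_prob=0.8, training=False))
```

A fresh random mask is drawn on every call, so each training step effectively trains a different thinned network.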

Contrasting Overfitting

Improving on MBGD

AdaGrad
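A minimal sketch of the standard AdaGrad update, assuming a full-gradient function rather than mini-batches and a hypothetical quadratic objective chosen for illustration. Each parameter keeps a running sum of its squared gradients, and its step is the learning rate divided by the square root of that sum. Parameters with large past gradients therefore take smaller steps.

```python
import math


def adagrad(grad_fn, theta, lr=0.5, eps=1e-8, steps=500):
    """Minimise a function given its gradient, using per-parameter step sizes."""
    theta = list(theta)
    accum = [0.0] * len(theta)  # running sum of squared gradients
    for _ in range(steps):
        grads = grad_fn(theta)
        for i, g in enumerate(grads):
            accum[i] += g * g
            theta[i] -= lr * g / (math.sqrt(accum[i]) + eps)
    return theta


def quadratic_grad(theta):
    """Gradient of f(x, y) = 0.5 * (100 * x**2 + y**2) (illustrative objective)."""
    x, y = theta
    return [100.0 * x, y]


if __name__ == "__main__":
    print(adagrad(quadratic_grad, [1.0, 1.0]))
```

In this example the two coordinates differ in curvature by a factor of 100, yet the normalisation gives them nearly the same trajectory, which is the behaviour a single global learning rate cannot provide.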

AdaDelta

Improving on MBGD

An aside: function approximation vs. classification

Classification: Softmax
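A minimal sketch of a softmax output layer and its cross-entropy loss, assuming the standard definitions softmax(z)_i = exp(z_i) / sum_j exp(z_j) and loss = -log softmax(z)_target. Subtracting the maximum logit before exponentiating is a common numerical-stability trick added here; it does not change the result. Function names are illustrative.

```python
import math


def softmax(logits):
    """Turn a list of real-valued scores into probabilities that sum to 1."""
    m = max(logits)  # shift by the maximum to avoid overflow in exp
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability that softmax assigns to class index `target`."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum - logits[target]


if __name__ == "__main__":
    scores = [2.0, 1.0, 0.1]
    print(softmax(scores))
    print(cross_entropy(scores, target=0))
```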

Another aside: autoencoders

Auto-encoders
