pycon2017

Edward 2017-09-09@ PyConJP 2017

Upload: yuta-kashino

Post on 21-Jan-2018


TRANSCRIPT

Page 1: Pycon2017

Edward

2017-09-09 @ PyConJP 2017

Page 2: Pycon2017

Yuta Kashino

BakFoo, Inc. CEO
Astrophysics / Observational Cosmology
Zope / Python
Realtime Data Platform for Enterprise / Prototyping

Page 3: Pycon2017

Yuta Kashino

arXiv

PyCon2015

Python

PyCon2016

PyCon2017 DNN PPL Edward @yutakashino

Page 4: Pycon2017

-

- Edward

Edward

Page 5: Pycon2017
Page 6: Pycon2017

http://bayesiandeeplearning.org/

Page 7: Pycon2017

Shakir Mohamed

http://blog.shakirm.com/wp-content/uploads/2015/11/CSML_BayesDeep.pdf

Page 8: Pycon2017

-

- Denker, Schwartz, Wittner, Solla, Howard, Jackel, Hopfield (1987)
- Denker and LeCun (1991)
- MacKay (1992)
- Hinton and van Camp (1993)
- Neal (1995)
- Barber and Bishop (1998)
- Graves (2011)
- Blundell, Cornebise, Kavukcuoglu, and Wierstra (2015)
- Hernández-Lobato and Adams (2015)

Page 9: Pycon2017

-

Yarin Gal, Zoubin Ghahramani, Shakir Mohamed, Dustin Tran, Rajesh Ranganath, David Blei, Ian Goodfellow

Columbia U

U of Cambridge

Page 10: Pycon2017

-

- :

- :

- :

- :

- : SGD + BackProp

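[Figure: network diagram with inputs x_1, x_2, …, x_d, weights \theta^{(1)} and \theta^{(2)}, input x, output y]

y^{(n)} = \sum_j \theta^{(2)}_j \sigma\left(\sum_i \theta^{(1)}_{ji} x^{(n)}_i\right) + \epsilon^{(n)}

p(y^{(n)} \mid x^{(n)}, \theta) = \sigma\left(\sum_i \theta_i x^{(n)}_i\right)

D = \{x^{(n)}, y^{(n)}\}_{n=1}^{N} = (X, y)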
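The two-layer formula on this slide can be sketched in plain Python. This is a minimal illustration, not code from the talk: the layer sizes, the random weights, and the noise scale 0.1 are all hypothetical choices.

```python
import math
import random


def sigmoid(z):
    """Logistic function sigma(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, theta1, theta2):
    """Noise-free part of y = sum_j theta2[j] * sigma(sum_i theta1[j][i] * x[i])."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in theta1]
    return sum(t * h for t, h in zip(theta2, hidden))


rng = random.Random(0)
d, n_hidden = 3, 4  # hypothetical sizes
theta1 = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n_hidden)]
theta2 = [rng.gauss(0.0, 1.0) for _ in range(n_hidden)]

x = [0.5, -1.0, 2.0]
# Add the observation noise epsilon (standard deviation 0.1 is an assumption).
y = forward(x, theta1, theta2) + rng.gauss(0.0, 0.1)
```

In ordinary deep learning the weights `theta1` and `theta2` are point estimates fitted with SGD and backpropagation; the Bayesian view discussed later puts a distribution over them instead.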

Page 11: Pycon2017

: - +

- 2012 ILSVRC

→ 2015

-

-

-

-

Page 12: Pycon2017

: -

- ReLU, Dropout, mini-batch, SGD (Adam), LSTM…
-
- ImageNet, MS COCO…
- : GPU,
- :
- Theano, Torch, Caffe, TensorFlow, Chainer, MXNet, PyTorch…

Page 13: Pycon2017

: -

-

-

-

-

https://lossfunctions.tumblr.com/

Page 14: Pycon2017

: -

-

-

- Adversarial examples -

Page 15: Pycon2017

-

=

=

-

Page 16: Pycon2017

-

- :

- :

- :

- :

- : SGD + BackProp

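[Figure: network diagram with inputs x_1, x_2, …, x_d, weights \theta^{(1)} and \theta^{(2)}, input x, output y]

y^{(n)} = \sum_j \theta^{(2)}_j \sigma\left(\sum_i \theta^{(1)}_{ji} x^{(n)}_i\right) + \epsilon^{(n)}

p(y^{(n)} \mid x^{(n)}, \theta) = \sigma\left(\sum_i \theta_i x^{(n)}_i\right)

D = \{x^{(n)}, y^{(n)}\}_{n=1}^{N} = (X, y)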

Page 17: Pycon2017

1. →

2. → DropOut

Page 18: Pycon2017

1.

- data hypothesis( )

- :

-

-

P(H \mid D) = \frac{P(H) P(D \mid H)}{\sum_H P(H) P(D \mid H)}

posterior = prior × likelihood / evidence

P(x) = \sum_y P(x, y)

P(x, y) = P(x) P(y \mid x)
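Bayes' rule over a discrete set of hypotheses can be computed directly. The sketch below is illustrative only: the two hypotheses ("fair" and "biased" coin) and their probabilities are made-up numbers, not taken from the slides.

```python
def posterior(prior, likelihood):
    """Return (P(H | D), evidence) for discrete hypotheses.

    P(H | D) = P(H) P(D | H) / sum_H P(H) P(D | H)
    """
    joint = {h: prior[h] * likelihood[h] for h in prior}  # product rule
    evidence = sum(joint.values())                        # sum rule
    return {h: p / evidence for h, p in joint.items()}, evidence


# Hypothetical example: one head was observed; which coin produced it?
prior = {"fair": 0.5, "biased": 0.5}
likelihood = {"fair": 0.5, "biased": 0.9}  # P(head | H), assumed values
post, evidence = posterior(prior, likelihood)
```

The denominator is the same sum for every hypothesis, which is why it is cheap here and becomes the hard part once the hypothesis space is continuous and high-dimensional.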

Page 19: Pycon2017

1. - :

- :

-

-

- :

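P(H \mid D) = \frac{P(H) P(D \mid H)}{\sum_H P(H) P(D \mid H)}

posterior = prior × likelihood / evidence

P(\theta \mid D, m) = \frac{P(D \mid \theta, m) P(\theta \mid m)}{P(D \mid m)}

m: model

P(x \mid D, m) = \int P(x \mid \theta, D, m) P(\theta \mid D, m) d\theta

P(m \mid D) = \frac{P(D \mid m) P(m)}{P(D)}

\theta \sim Beta(\theta \mid 2, 2)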
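For a Beta prior with Bernoulli observations, the posterior and the predictive integral on this slide have closed forms. The sketch below assumes coin-flip data (the eight flips are hypothetical) with the Beta(2, 2) prior shown above.

```python
def beta_bernoulli_update(a, b, data):
    """Posterior of theta is Beta(a + heads, b + tails) under a Beta(a, b) prior."""
    heads = sum(data)
    tails = len(data) - heads
    return a + heads, b + tails


def predictive_prob_one(a, b):
    """P(x = 1 | D) = integral of theta * Beta(theta | a, b) d theta = a / (a + b)."""
    return a / (a + b)


data = [1, 1, 0, 1, 1, 1, 0, 1]  # hypothetical coin flips (1 = head)
a_post, b_post = beta_bernoulli_update(2, 2, data)
p_next = predictive_prob_one(a_post, b_post)
```

With six heads and two tails the posterior is Beta(8, 4), so the predicted probability of a head is 8/12 rather than the raw frequency 6/8; the prior pulls the estimate toward 0.5.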

Page 20: Pycon2017

1.

-

- :

- :

- :

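[Figure: network diagram with inputs x_1, x_2, …, x_d, weights \theta^{(1)} and \theta^{(2)}, input x, output y]

D = \{x^{(n)}, y^{(n)}\}_{n=1}^{N} = (X, y)

P(\theta \mid D, m) = \frac{P(D \mid \theta, m) P(\theta \mid m)}{P(D \mid m)}

m: model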

Page 21: Pycon2017

1.

- (MCMC)

- (Variational Inference)

P(\theta \mid D, m) = \frac{P(D \mid \theta, m) P(\theta \mid m)}{P(D \mid m)}

evidence: P(D \mid m) = \int P(D \mid \theta, m) P(\theta \mid m) d\theta
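The evidence integral is tractable only in special cases. As a rough illustration, the sketch below evaluates it for a one-parameter Beta-Bernoulli model, once by a midpoint-rule sum over a grid and once in closed form. The counts (6 heads, 2 tails), the Beta(2, 2) prior, and the grid size are assumptions for the example.

```python
import math


def log_beta_fn(a, b):
    """log B(a, b) = log Gamma(a) + log Gamma(b) - log Gamma(a + b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_pdf(theta, a, b):
    return math.exp((a - 1) * math.log(theta) + (b - 1) * math.log(1 - theta)
                    - log_beta_fn(a, b))


def evidence_numeric(heads, tails, a, b, n_grid=10000):
    """Midpoint-rule approximation of int P(D | theta) P(theta) d theta on (0, 1)."""
    width = 1.0 / n_grid
    total = 0.0
    for k in range(n_grid):
        theta = (k + 0.5) * width
        likelihood = theta ** heads * (1 - theta) ** tails
        total += likelihood * beta_pdf(theta, a, b) * width
    return total


def evidence_exact(heads, tails, a, b):
    """Closed form B(a + heads, b + tails) / B(a, b), available by conjugacy."""
    return math.exp(log_beta_fn(a + heads, b + tails) - log_beta_fn(a, b))


numeric = evidence_numeric(6, 2, 2, 2)
exact = evidence_exact(6, 2, 2, 2)
```

A grid works for one parameter, but the number of grid points grows exponentially with the dimension of theta, which is why neural-network posteriors call for MCMC or variational inference.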

Page 22: Pycon2017

1.

-

-

P(\theta \mid D, m) \propto P(D \mid \theta, m) P(\theta \mid m)

posterior ∝ likelihood × prior

https://github.com/dfm/corner.py

Page 23: Pycon2017

1.

http://twiecki.github.io/blog/2014/01/02/visualizing-mcmc/

NUTS (HMC) / Metropolis-Hastings
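A random-walk Metropolis-Hastings sampler needs only the unnormalized posterior, so the evidence never has to be computed. The sketch below is a minimal version for a single parameter; the target (Beta(2, 2) prior, 6 heads, 2 tails), the step size, and the burn-in length are assumptions chosen for illustration. It is not NUTS or HMC.

```python
import math
import random


def log_unnorm_posterior(theta, heads=6, tails=2, a=2, b=2):
    """log of likelihood * prior, up to a constant; -inf outside (0, 1)."""
    if not 0.0 < theta < 1.0:
        return -math.inf
    return (a - 1 + heads) * math.log(theta) + (b - 1 + tails) * math.log(1 - theta)


def metropolis_hastings(log_target, n_samples, step=0.2, init=0.5, seed=0):
    rng = random.Random(seed)
    theta, log_p = init, log_target(init)
    samples = []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)  # symmetric random-walk proposal
        log_p_new = log_target(proposal)
        # Accept with probability min(1, p_new / p_old).
        if rng.random() < math.exp(min(0.0, log_p_new - log_p)):
            theta, log_p = proposal, log_p_new
        samples.append(theta)
    return samples


samples = metropolis_hastings(log_unnorm_posterior, 20000)
kept = samples[2000:]  # discard burn-in
posterior_mean = sum(kept) / len(kept)
```

The exact posterior here is Beta(8, 4) with mean 2/3, so the sample mean gives a quick sanity check of the chain.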

Page 24: Pycon2017

1.

Page 25: Pycon2017

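P(θ|D,m) KL q(θ)

ELBO

1.

\lambda^* = \arg\min_\lambda KL(q(\theta; \lambda) \,\|\, p(\theta \mid D))

= \arg\min_\lambda E_{q(\theta; \lambda)}[\log q(\theta; \lambda) - \log p(\theta \mid D)]

ELBO(\lambda) = E_{q(\theta; \lambda)}[\log p(\theta, D) - \log q(\theta; \lambda)]

\lambda^* = \arg\max_\lambda ELBO(\lambda)

P(\theta \mid D, m) = \frac{P(D \mid \theta, m) P(\theta \mid m)}{P(D \mid m)}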
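The step from minimizing the KL divergence to maximizing the ELBO follows from expanding the posterior as p(θ | D) = p(θ, D) / p(D). A short worked derivation, using the same symbols as this slide:

```latex
\begin{aligned}
\mathrm{KL}\big(q(\theta;\lambda)\,\|\,p(\theta \mid D)\big)
  &= \mathbb{E}_{q(\theta;\lambda)}\big[\log q(\theta;\lambda) - \log p(\theta \mid D)\big] \\
  &= \mathbb{E}_{q(\theta;\lambda)}\big[\log q(\theta;\lambda) - \log p(\theta, D)\big] + \log p(D) \\
  &= -\mathrm{ELBO}(\lambda) + \log p(D), \\[4pt]
\log p(D) &= \mathrm{ELBO}(\lambda) + \mathrm{KL}\big(q(\theta;\lambda)\,\|\,p(\theta \mid D)\big)
  \;\ge\; \mathrm{ELBO}(\lambda).
\end{aligned}
```

Since log p(D) does not depend on λ, lowering the KL term and raising the ELBO are the same optimization, and the ELBO needs only the joint p(θ, D), never the intractable evidence.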

Page 26: Pycon2017

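1.
- Minimizing KL = maximizing ELBO

- P q

[Figure: q(\theta; \lambda_1) and q(\theta; \lambda_5) plotted against p(\theta, D) over \theta]

\lambda^* = \arg\max_\lambda ELBO(\lambda)

ELBO(\lambda) = E_{q(\theta; \lambda)}[\log p(\theta, D) - \log q(\theta; \lambda)]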
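To make the ELBO maximization concrete, the sketch below fits a Gaussian q(θ; λ) = N(m, s²) by gradient ascent. It assumes a toy conjugate model (prior θ ~ N(0, 1), likelihood x_i ~ N(θ, 1)) so that the ELBO and its gradients are available in closed form; the data, learning rate, and iteration count are hypothetical. This is neither ADVI nor BBVI, which estimate the gradients stochastically for general models.

```python
import math


def elbo(m, log_s, xs):
    """ELBO for prior theta ~ N(0, 1), likelihood x_i ~ N(theta, 1), q = N(m, s^2)."""
    s2 = math.exp(2.0 * log_s)
    log_norm = -0.5 * math.log(2.0 * math.pi)
    expected_log_prior = log_norm - 0.5 * (m * m + s2)
    expected_log_lik = sum(log_norm - 0.5 * ((x - m) ** 2 + s2) for x in xs)
    entropy = 0.5 * math.log(2.0 * math.pi * math.e) + log_s
    return expected_log_prior + expected_log_lik + entropy


def fit(xs, lr=0.05, n_iter=500):
    """Gradient ascent on lambda = (m, log s) using the analytic ELBO gradients."""
    n = len(xs)
    m, log_s = 0.0, 0.0
    trace = [elbo(m, log_s, xs)]
    for _ in range(n_iter):
        grad_m = sum(xs) - (n + 1) * m
        grad_log_s = 1.0 - (n + 1) * math.exp(2.0 * log_s)
        m += lr * grad_m
        log_s += lr * grad_log_s
        trace.append(elbo(m, log_s, xs))
    return m, math.exp(log_s), trace


xs = [1.2, 0.8, 1.5, 0.9, 1.1]  # hypothetical observations
m, s, trace = fit(xs)
```

For this model the exact posterior is N(sum(x) / (n + 1), 1 / (n + 1)), and the fitted `m` and `s` converge to it because the Gaussian family contains the true posterior. With a neural-network likelihood that is no longer the case, and q is only an approximation.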

Page 27: Pycon2017

1. - P q

- :

- ADVI: Automatic Differentiation Variational Inference (arXiv:1603.00788)
- BBVI: Blackbox Variational Inference (arXiv:1401.0118)

https://github.com/HIPS/autograd/blob/master/examples/bayesian_neural_net.py

Page 28: Pycon2017

1. - VI

-

- David MacKay, “Lecture 14 of the Cambridge Course”: http://www.inference.org.uk/itprnn_lectures/
- PRML, Chapter 10

Page 29: Pycon2017

1. Reference
- Zoubin Ghahramani, “History of Bayesian neural networks”, NIPS 2016 Workshop on Bayesian Deep Learning
- Yarin Gal, “Bayesian Deep Learning”, O'Reilly Artificial Intelligence in New York, 2017

Page 30: Pycon2017

2.

-

- :

- :

- :

- :

- : SGD + BackProp

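[Figure: network diagram with inputs x_1, x_2, …, x_d, weights \theta^{(1)} and \theta^{(2)}, input x, output y]

y^{(n)} = \sum_j \theta^{(2)}_j \sigma\left(\sum_i \theta^{(1)}_{ji} x^{(n)}_i\right) + \epsilon^{(n)}

p(y^{(n)} \mid x^{(n)}, \theta) = \sigma\left(\sum_i \theta_i x^{(n)}_i\right)

D = \{x^{(n)}, y^{(n)}\}_{n=1}^{N} = (X, y)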

Dropout

Page 31: Pycon2017

2. Dropout
- Yarin Gal, “Uncertainty in Deep Learning”
- Dropout

- Dropout : conv

- LeNet with Dropout

http://mlg.eng.cam.ac.uk/yarin/blog_2248.html
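One common way to use dropout for uncertainty, in the spirit of Gal's work cited on this slide, is to keep dropout active at prediction time and summarize many stochastic forward passes. The sketch below is a toy illustration with a hypothetical two-input, three-unit network and hand-picked weights; it is not the LeNet experiment from the blog post.

```python
import random


def dropout_forward(x, w1, w2, p_drop, rng):
    """One stochastic pass: ReLU hidden layer with dropout left on."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w1]
    # Drop each unit with probability p_drop; rescale kept units by 1 / (1 - p_drop).
    kept = [h / (1.0 - p_drop) if rng.random() >= p_drop else 0.0 for h in hidden]
    return sum(w * h for w, h in zip(w2, kept))


def mc_dropout_predict(x, w1, w2, p_drop=0.5, n_passes=200, seed=0):
    """Predictive mean and variance over repeated stochastic passes."""
    rng = random.Random(seed)
    outputs = [dropout_forward(x, w1, w2, p_drop, rng) for _ in range(n_passes)]
    mean = sum(outputs) / n_passes
    variance = sum((o - mean) ** 2 for o in outputs) / n_passes
    return mean, variance


# Hypothetical weights and input.
w1 = [[1.0, 0.5], [0.3, 0.8], [0.6, 0.1]]
w2 = [0.5, -0.4, 1.0]
x = [1.0, 2.0]

mean, variance = mc_dropout_predict(x, w1, w2)
mean_det, variance_det = mc_dropout_predict(x, w1, w2, p_drop=0.0)
```

The spread of the outputs acts as a rough uncertainty estimate; with `p_drop=0.0` every pass is identical and the variance collapses to zero, which is the usual deterministic prediction.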

Page 32: Pycon2017

2. Dropout
- LeNet DNN

- conv Dropout MNIST

Page 33: Pycon2017

2. Dropout
- CO2

Page 34: Pycon2017

- :

- :

- :

- :

- (MCMC)

- (Variational Inference)

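[Figure: network diagram with inputs x_1, x_2, …, x_d, weights \theta^{(1)} and \theta^{(2)}, input x, output y]

D = \{x^{(n)}, y^{(n)}\}_{n=1}^{N} = (X, y)

P(\theta \mid D, m) = \frac{P(D \mid \theta, m) P(\theta \mid m)}{P(D \mid m)}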

Page 35: Pycon2017

Edward

Page 36: Pycon2017

Edward
- Dustin Tran (OpenAI)

- Blei Lab

- (PPL)

- 2016 2 PPL

- / TensorFlow

- George Edward Pelham Box
Box-Cox transformation, Box-Jenkins, Ljung-Box test
box plot: Tukey

3 2 RA Fisher

Page 37: Pycon2017

- Probabilistic Programming Library/Language
- Stan, PyMC3, Anglican, Church, Venture, Figaro, WebPPL, Edward

- : Edward / PyMC3

- (VI)

Metropolis-Hastings, Hamiltonian Monte Carlo, Stochastic Gradient Langevin Dynamics, No-U-Turn Sampler

Blackbox Variational Inference, Automatic Differentiation Variational Inference

Page 38: Pycon2017

PPL

Edward

TensorFlow(TF) + (PPL)

TF:

PPL: + +

Page 39: Pycon2017

PPL

Edward

Edward TensorFlow

Page 40: Pycon2017

1. TF: -

- :

Page 41: Pycon2017

1. TF:

Page 42: Pycon2017

1. TF: -

-

- GPU / TPU

Inception v3 Inception v4

# of parameters: 42,679,816

# of layers: 48

Page 43: Pycon2017

1. TF: - Keras, Slim

- TensorBoard

Page 44: Pycon2017

1. TF: -

- tf.contrib.distributions

Page 45: Pycon2017

2. x:

edward

x^* \sim P(x \mid \alpha)

\theta^* \sim Beta(\theta \mid 1, 1)

Page 46: Pycon2017

2. - ( )

Edward

p(x, \theta) = Beta(\theta \mid 1, 1) \prod_{n=1}^{50} Bernoulli(x_n \mid \theta)
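The joint distribution on this slide describes a generative process: draw θ from the Beta prior, then draw 50 Bernoulli observations given θ. The sketch below writes that process and the corresponding log joint density with the Python standard library only. It is a stand-in for illustration and does not use Edward's own random-variable classes; the function names are hypothetical.

```python
import math
import random


def sample_joint(n=50, a=1.0, b=1.0, seed=0):
    """Draw theta ~ Beta(a, b), then x_1..x_n ~ Bernoulli(theta)."""
    rng = random.Random(seed)
    theta = rng.betavariate(a, b)
    x = [1 if rng.random() < theta else 0 for _ in range(n)]
    return theta, x


def log_joint(theta, x, a=1.0, b=1.0):
    """log p(x, theta) = log Beta(theta | a, b) + sum_n log Bernoulli(x_n | theta)."""
    log_beta_norm = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    log_prior = ((a - 1) * math.log(theta) + (b - 1) * math.log(1 - theta)
                 - log_beta_norm)
    log_lik = sum(math.log(theta) if xn == 1 else math.log(1 - theta) for xn in x)
    return log_prior + log_lik


theta, x = sample_joint()
```

A probabilistic programming library lets the model be written once in this generative style and then reuses the same definition for both sampling and density evaluation during inference.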

Page 47: Pycon2017

2.

-

log_prob()

-

mean()

-

sample()

- tf.contrib.distributions 51 : https://www.tensorflow.org/api_docs/python/tf/contrib/distributions
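The three methods listed on this slide (`log_prob()`, `mean()`, `sample()`) can be illustrated with a small pure-Python class. This `Bernoulli` class is a hypothetical stand-in written for this example; it only mimics the method names and is not the Edward or `tf.contrib.distributions` implementation, whose signatures may differ.

```python
import math
import random


class Bernoulli:
    """Hypothetical stand-in for a random variable with three basic methods."""

    def __init__(self, probs):
        if not 0.0 < probs < 1.0:
            raise ValueError("probs must lie strictly between 0 and 1")
        self.probs = probs

    def log_prob(self, x):
        """Log probability of a single outcome x in {0, 1}."""
        return math.log(self.probs) if x == 1 else math.log(1.0 - self.probs)

    def mean(self):
        """Expected value of the variable."""
        return self.probs

    def sample(self, n, rng):
        """Draw n independent outcomes using the given random.Random instance."""
        return [1 if rng.random() < self.probs else 0 for _ in range(n)]


coin = Bernoulli(0.3)
draws = coin.sample(1000, random.Random(0))
empirical_mean = sum(draws) / len(draws)
```

The empirical mean of the draws should sit close to `coin.mean()`, which is a simple way to check that sampling and the analytic summary agree.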

Page 48: Pycon2017

3. Edward TF

Page 49: Pycon2017

3.

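[Figure: RNN unrolled over steps t-1 and t, with inputs x_{t-1}, x_t, hidden states h_{t-1}, h_t, outputs y_{t-1}, y_t, and parameters W_h, W_x, b_h, W_y, b_y]

h_t = \tanh(W_h h_{t-1} + W_x x_t + b_h)

y_t \sim Normal(W_y h_t + b_y, 1)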
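The RNN recurrence on this slide can be unrolled in a few lines of plain Python. The sketch below assumes a hidden size of 2, scalar inputs and outputs, a zero initial state h_0, and hand-picked weights; all of those are hypothetical choices for illustration, and the code does not use Edward or TensorFlow.

```python
import math
import random


def matvec(W, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(w * vi for w, vi in zip(row, v)) for row in W]


def rnn_sample(xs, Wh, Wx, bh, Wy, by, rng):
    """Unroll h_t = tanh(Wh h_{t-1} + Wx x_t + bh) and draw y_t ~ Normal(Wy h_t + by, 1)."""
    h = [0.0] * len(bh)  # assumed initial state h_0 = 0
    ys = []
    for x in xs:
        pre = [a + b + c for a, b, c in zip(matvec(Wh, h), matvec(Wx, x), bh)]
        h = [math.tanh(p) for p in pre]
        mean = sum(w * hi for w, hi in zip(Wy, h)) + by
        ys.append(rng.gauss(mean, 1.0))  # unit observation noise, as in the formula
    return ys, h


# Hypothetical parameters: hidden size 2, input size 1.
Wh = [[0.5, -0.2], [0.1, 0.4]]
Wx = [[1.0], [-0.5]]
bh = [0.0, 0.1]
Wy = [0.7, -0.3]
by = 0.2

xs = [[0.1], [0.4], [-0.3], [0.8]]
ys, h_last = rnn_sample(xs, Wh, Wx, bh, Wy, by, random.Random(0))
```

In a Bayesian treatment the weights themselves would also get priors, so that each of `Wh`, `Wx`, `bh`, `Wy`, and `by` becomes a random variable rather than a fixed number.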

Page 50: Pycon2017

3. http://edwardlib.org/tutorials/

Page 51: Pycon2017

4.

- :

- :

- :

- :

- (MCMC)

- (Variational Inference)

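[Figure: network diagram with inputs x_1, x_2, …, x_d, weights \theta^{(1)} and \theta^{(2)}, input x, output y]

D = \{x^{(n)}, y^{(n)}\}_{n=1}^{N} = (X, y)

P(\theta \mid D, m) = \frac{P(D \mid \theta, m) P(\theta \mid m)}{P(D \mid m)}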

Page 52: Pycon2017

4.

Page 53: Pycon2017

Edward MCMC

4. : MCMC

Page 54: Pycon2017

Edward : KLqp

4. :

Page 55: Pycon2017

5. Box’s loop

George Edward Pelham Box

Blei 2014

Page 56: Pycon2017

5. Box’s loop

Page 57: Pycon2017

Edward
- Edward = TensorFlow +

+ +

- TensorFlow

-

- TF GPU, TPU, TensorBoard, Keras

-

- TensorFlow

Page 58: Pycon2017
Page 59: Pycon2017

References

• D. Tran, A. Kucukelbir, A. Dieng, M. Rudolph, D. Liang, and D.M. Blei. Edward: A library for probabilistic modeling, inference, and criticism. arXiv preprint arXiv:1610.09787.

• D. Tran, M.D. Hoffman, R.A. Saurous, E. Brevdo, K. Murphy, and D.M. Blei. Deep probabilistic programming. arXiv preprint arXiv:1701.03757.

• G.E. Box (1976). Science and statistics. Journal of the American Statistical Association, 71(356), 791–799.

• D.M. Blei (2014). Build, Compute, Critique, Repeat: Data Analysis with Latent Variable Models. Annual Review of Statistics and Its Application, Volume 1.

Page 60: Pycon2017

-

- Edward

Edward

Page 62: Pycon2017

BakFoo, Inc.
NHK NMAPS: +

Page 63: Pycon2017

BakFoo, Inc.
PyConJP 2015

Python

Page 65: Pycon2017

BakFoo, Inc.
: SNS +

Page 66: Pycon2017

3.

256 28*28