pycon2017

Post on 21-Jan-2018


Edward

2017-09-09 @ PyCon JP 2017

Yuta Kashino

- BakFoo, Inc. CEO
- Astrophysics / Observational Cosmology
- Zope / Python
- Realtime Data Platform for Enterprise / Prototyping

Yuta Kashino

arXiv

PyCon2015

Python

PyCon2016

PyCon2017 DNN PPL Edward @yutakashino

-

- Edward

Edward

http://bayesiandeeplearning.org/

Shakir Mohamed

http://blog.shakirm.com/wp-content/uploads/2015/11/CSML_BayesDeep.pdf

-

Denker, Schwartz, Wittner, Solla, Howard, Jackel, Hopfield (1987); Denker and LeCun (1991); MacKay (1992); Hinton and van Camp (1993); Neal (1995); Barber and Bishop (1998); Graves (2011); Blundell, Cornebise, Kavukcuoglu, and Wierstra (2015); Hernandez-Lobato and Adams (2015)

-

Yarin Gal, Zoubin Ghahramani, Shakir Mohamed, Dustin Tran, Rajesh Ranganath, David Blei, Ian Goodfellow

Columbia U

U of Cambridge

-

- :

- :

- :

- :

- : SGD + BackProp

[Figure: two-layer feedforward network with inputs x_1, x_2, …, x_d, weight layers \theta^{(1)} and \theta^{(2)}, mapping input x to output y]

y^{(n)} = \sum_j \theta^{(2)}_j \, \sigma\Big( \sum_i \theta^{(1)}_{ji} x^{(n)}_i \Big) + \epsilon^{(n)}

p(y^{(n)} \mid x^{(n)}, \theta) = \sigma\Big( \sum_i \theta_i x^{(n)}_i \Big)

D = \{ x^{(n)}, y^{(n)} \}_{n=1}^{N} = (X, y)
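The two-layer model on this slide, y(n) = Σ_j θ(2)_j σ(Σ_i θ(1)_ji x(n)_i) + ε(n), can be sketched in plain Python; the weights below are hypothetical toy values, not anything from the talk:

```python
import math
import random

def sigma(z):
    # logistic activation
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, theta1, theta2, noise_sd=0.0):
    # hidden layer: h_j = sigma(sum_i theta1[j][i] * x[i])
    hidden = [sigma(sum(w * xi for w, xi in zip(row, x))) for row in theta1]
    # output: y = sum_j theta2[j] * h_j + eps
    eps = random.gauss(0.0, noise_sd) if noise_sd > 0 else 0.0
    return sum(w * h for w, h in zip(theta2, hidden)) + eps

# toy example: d = 2 inputs, 3 hidden units
theta1 = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]
theta2 = [1.0, -0.5, 0.25]
y = forward([1.0, 2.0], theta1, theta2)
```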

: - +

- 2012: deep CNNs win ILSVRC → 2015: surpass human-level performance

-

-

-

-

: -

- ReLU, DropOut, Mini Batch, SGD (Adam), LSTM…

- ImageNet, MSCoCo…

- GPU

- Frameworks: Theano, Torch, Caffe, TensorFlow, Chainer, MxNet, PyTorch…

: -

-

-

-

-

https://lossfunctions.tumblr.com/

: -

-

-

- Adversarial examples -

-

=

=

-

-

- :

- :

- :

- :

- : SGD + BackProp

[Figure: two-layer feedforward network with inputs x_1, x_2, …, x_d, weight layers \theta^{(1)} and \theta^{(2)}, mapping input x to output y]

y^{(n)} = \sum_j \theta^{(2)}_j \, \sigma\Big( \sum_i \theta^{(1)}_{ji} x^{(n)}_i \Big) + \epsilon^{(n)}

p(y^{(n)} \mid x^{(n)}, \theta) = \sigma\Big( \sum_i \theta_i x^{(n)}_i \Big)

D = \{ x^{(n)}, y^{(n)} \}_{n=1}^{N} = (X, y)

1. →

2. → DropOut

1.

- data hypothesis( )

- :

-

-

P(H \mid D) = \frac{P(H) \, P(D \mid H)}{\sum_H P(H) \, P(D \mid H)}

Sum rule: P(x) = \sum_y P(x, y)

Product rule: P(x, y) = P(x) \, P(y \mid x)

posterior = prior × likelihood / evidence
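Bayes' rule over a discrete hypothesis space can be computed directly; the two-coin numbers below are an invented illustration:

```python
def posterior(priors, likelihoods):
    # P(H | D) = P(H) P(D | H) / sum_H P(H) P(D | H)
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(joint.values())  # P(D)
    return {h: j / evidence for h, j in joint.items()}

# two hypotheses: fair coin vs biased coin, after observing one "heads"
priors = {"fair": 0.5, "biased": 0.5}
likelihoods = {"fair": 0.5, "biased": 0.9}  # P(heads | H)
post = posterior(priors, likelihoods)
```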

1. - :

- :

-

-

- :

posterior = likelihood × prior / evidence:

P(H \mid D) = \frac{P(H) \, P(D \mid H)}{\sum_H P(H) \, P(D \mid H)}

Parameter posterior (m: model):

P(\theta \mid D, m) = \frac{P(D \mid \theta, m) \, P(\theta \mid m)}{P(D \mid m)}

Posterior predictive:

P(x \mid D, m) = \int P(x \mid \theta, D, m) \, P(\theta \mid D, m) \, d\theta

Model posterior (with evidence P(D \mid m)):

P(m \mid D) = \frac{P(D \mid m) \, P(m)}{P(D)}

\theta \sim Beta(\theta \mid 2, 2)
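Because the Beta prior is conjugate to Bernoulli data, a posterior like the Beta(2, 2) example above has a closed form, Beta(a + #successes, b + #failures); a minimal sketch with invented data:

```python
def beta_bernoulli_posterior(a, b, data):
    # conjugate update: Beta(a, b) prior + Bernoulli observations
    heads = sum(data)
    tails = len(data) - heads
    return a + heads, b + tails

# prior theta ~ Beta(2, 2); observe seven 1s and three 0s
a_post, b_post = beta_bernoulli_posterior(2, 2, [1] * 7 + [0] * 3)
post_mean = a_post / (a_post + b_post)  # posterior mean of theta
```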

1.

-

- :

- :

- :

[Figure: two-layer feedforward network, inputs x_1, x_2, …, x_d, weights \theta^{(1)}, \theta^{(2)}, mapping x to y]

D = \{ x^{(n)}, y^{(n)} \}_{n=1}^{N} = (X, y)

P(\theta \mid D, m) = \frac{P(D \mid \theta, m) \, P(\theta \mid m)}{P(D \mid m)}   (m: model)

1.

- Sampling (MCMC)

- Variational Inference

P(\theta \mid D, m) = \frac{P(D \mid \theta, m) \, P(\theta \mid m)}{P(D \mid m)}

evidence: P(D \mid m) = \int P(D \mid \theta, m) \, P(\theta) \, d\theta

1.

-

-

P(\theta \mid D, m) \propto P(D \mid \theta, m) \, P(\theta \mid m)

posterior ∝ likelihood × prior

https://github.com/dfm/corner.py

1.

http://twiecki.github.io/blog/2014/01/02/visualizing-mcmc/

Metropolis-Hastings, NUTS (HMC)
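A Metropolis-Hastings step accepts a proposal x' with probability min(1, p(x')/p(x)). A toy pure-Python sampler targeting a standard normal, as a sketch only (not Edward's or PyMC3's implementation):

```python
import math
import random

def metropolis_hastings(log_p, x0, n_samples, step=1.0, seed=42):
    # random-walk MH with a symmetric Gaussian proposal
    rng = random.Random(seed)
    samples, x = [], x0
    for _ in range(n_samples):
        x_prop = x + rng.gauss(0.0, step)
        log_accept = log_p(x_prop) - log_p(x)
        if log_accept >= 0 or rng.random() < math.exp(log_accept):
            x = x_prop  # accept; otherwise keep the current state
        samples.append(x)
    return samples

# target: standard normal, log-density up to a constant
samples = metropolis_hastings(lambda x: -0.5 * x * x, x0=0.0, n_samples=20000)
mean = sum(samples) / len(samples)
```

With enough samples the chain's mean and variance approach the target's 0 and 1.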

1.

Approximate the intractable posterior P(θ | D, m) with a tractable q(θ) by minimizing the KL divergence between them; equivalently, by maximizing the ELBO.

1.

\phi^* = \arg\min_\phi KL(q(\theta;\phi) \,\|\, p(\theta \mid D))
       = \arg\min_\phi E_{q(\theta;\phi)}[\log q(\theta;\phi) - \log p(\theta \mid D)]

ELBO(\phi) = E_{q(\theta;\phi)}[\log p(\theta, D) - \log q(\theta;\phi)]

\phi^* = \arg\max_\phi ELBO(\phi)

P(\theta \mid D, m) = \frac{P(D \mid \theta, m) \, P(\theta \mid m)}{P(D \mid m)}

1. Minimizing KL = maximizing ELBO
- fit q to p

[Figure: variational approximations q(\theta;\phi_1), …, q(\theta;\phi_5) successively approaching p(\theta, D)]

\phi^* = \arg\max_\phi ELBO(\phi)

ELBO(\phi) = E_{q(\theta;\phi)}[\log p(\theta, D) - \log q(\theta;\phi)]
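The decomposition log p(D) = ELBO(φ) + KL(q(θ;φ) ‖ p(θ|D)) can be checked numerically on a two-point discrete toy model (all numbers invented):

```python
import math

# toy joint p(theta, D) over two parameter values, for fixed data D
p_joint = {"t1": 0.3, "t2": 0.1}                      # p(theta, D)
p_D = sum(p_joint.values())                           # evidence p(D)
p_post = {t: v / p_D for t, v in p_joint.items()}     # p(theta | D)

q = {"t1": 0.6, "t2": 0.4}                            # variational approximation

# ELBO(phi) = E_q[log p(theta, D) - log q(theta)]
elbo = sum(q[t] * (math.log(p_joint[t]) - math.log(q[t])) for t in q)
# KL(q || p(theta | D))
kl = sum(q[t] * (math.log(q[t]) - math.log(p_post[t])) for t in q)
# identity: log p(D) = ELBO + KL, so maximizing ELBO minimizes KL
```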

1. - P q

- :

- ADVI: Automatic Differentiation Variational Inference - BBVI: Blackbox Variational Inference

arxiv:1603.00788

arxiv:1401.0118

https://github.com/HIPS/autograd/blob/master/examples/bayesian_neural_net.py

1. - VI

-

- David MacKay, “Lecture 14 of the Cambridge Course”: http://www.inference.org.uk/itprnn_lectures/ - PRML Chapter 10

1. Reference - Zoubin Ghahramani, “History of Bayesian neural networks”, NIPS 2016 Workshop on Bayesian Deep Learning - Yarin Gal, “Bayesian Deep Learning”, O'Reilly Artificial Intelligence in New York, 2017

2.

-

- :

- :

- :

- :

- : SGD + BackProp

[Figure: two-layer feedforward network with inputs x_1, x_2, …, x_d, weight layers \theta^{(1)} and \theta^{(2)}, mapping input x to output y]

y^{(n)} = \sum_j \theta^{(2)}_j \, \sigma\Big( \sum_i \theta^{(1)}_{ji} x^{(n)}_i \Big) + \epsilon^{(n)}

p(y^{(n)} \mid x^{(n)}, \theta) = \sigma\Big( \sum_i \theta_i x^{(n)}_i \Big)

D = \{ x^{(n)}, y^{(n)} \}_{n=1}^{N} = (X, y)

Dropout

2. Dropout - Yarin Gal, “Uncertainty in Deep Learning” - Dropout

- Dropout : conv

- LeNet with Dropout

http://mlg.eng.cam.ac.uk/yarin/blog_2248.html

2. Dropout - LeNet DNN

- conv Dropout MNIST
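Gal's MC-dropout recipe keeps dropout active at prediction time and averages T stochastic forward passes; the spread of those passes is the uncertainty estimate. A pure-Python sketch with hypothetical toy weights (not the LeNet/MNIST experiment itself):

```python
import math
import random

rng = random.Random(0)

def dropout_forward(x, w1, w2, p_drop=0.5):
    # hidden ReLU layer, with dropout kept ON at prediction time
    h = [max(0.0, sum(wi * xi for wi, xi in zip(row, x))) for row in w1]
    # inverted dropout: zero each unit w.p. p_drop, rescale survivors
    h = [hi * (rng.random() >= p_drop) / (1.0 - p_drop) for hi in h]
    return sum(wi * hi for wi, hi in zip(w2, h))

w1 = [[0.4, -0.1], [0.2, 0.3], [-0.5, 0.7]]
w2 = [0.8, -0.3, 0.5]
x = [1.0, 2.0]

# T stochastic passes -> predictive mean and uncertainty (std)
T = 1000
preds = [dropout_forward(x, w1, w2) for _ in range(T)]
mean = sum(preds) / T
std = math.sqrt(sum((p - mean) ** 2 for p in preds) / T)
```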

2. Dropout - CO2

- :

- :

- :

- :

- Sampling (MCMC)

- Variational Inference

[Figure: two-layer feedforward network, inputs x_1, x_2, …, x_d, weights \theta^{(1)}, \theta^{(2)}, mapping x to y]

D = \{ x^{(n)}, y^{(n)} \}_{n=1}^{N} = (X, y)

P(\theta \mid D, m) = \frac{P(D \mid \theta, m) \, P(\theta \mid m)}{P(D \mid m)}

Edward

Edward - Dustin Tran (OpenAI)

- Blei Lab

- (PPL)

- 2016 2 PPL

- / TensorFlow

- Named after George Edward Pelham Box: Box-Cox transformation, Box-Jenkins models, Ljung-Box test, the box plot (with Tukey)

- R.A. Fisher (Box's father-in-law)

- Probabilistic Programming Library/Language - Stan, PyMC3, Anglican, Church, Venture, Figaro, WebPPL, Edward

- : Edward / PyMC3

- (VI)

Metropolis-Hastings, Hamiltonian Monte Carlo, Stochastic Gradient Langevin Dynamics, No-U-Turn Sampler

Blackbox Variational Inference, Automatic Differentiation Variational Inference

PPL

Edward

TensorFlow(TF) + (PPL)

TF:

PPL: + +

PPL

Edward

Edward TensorFlow

1. TF: -

- :

1. TF:

1. TF: -

-

- GPU / TPU

Inception v3 Inception v4

# of parameters: 42,679,816

# of layers: 48

1. TF: - Keras, Slim

- TensorBoard

1. TF: -

- tf.contrib.distributions

2. x:

edward

x^* \sim P(x \mid \alpha)

\theta^* \sim Beta(\theta \mid 1, 1)

2. - ( )

Edward

p(x, \theta) = Beta(\theta \mid 1, 1) \prod_{n=1}^{50} Bernoulli(x_n \mid \theta)
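The generative model p(x, θ) = Beta(θ|1,1) Π_n Bernoulli(x_n|θ) can be simulated without Edward, using only the Python standard library:

```python
import random

rng = random.Random(7)

# theta ~ Beta(1, 1), i.e. uniform on [0, 1]
theta = rng.betavariate(1, 1)

# x_n ~ Bernoulli(theta), n = 1..50
x = [1 if rng.random() < theta else 0 for _ in range(50)]
```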

2.

-

log_prob()

-

mean()

-

sample()

- tf.contrib.distributions (51 distributions): https://www.tensorflow.org/api_docs/python/tf/contrib/distributions
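The three operations listed above (log_prob(), mean(), sample()) form the core distribution interface; a minimal pure-Python Bernoulli with the same method names, a sketch of the interface rather than the tf.contrib.distributions implementation:

```python
import math
import random

class Bernoulli:
    """Toy distribution object exposing log_prob / mean / sample."""
    def __init__(self, p, seed=None):
        self.p = p
        self._rng = random.Random(seed)

    def log_prob(self, x):
        # log P(X = x) for x in {0, 1}
        return math.log(self.p if x == 1 else 1.0 - self.p)

    def mean(self):
        return self.p

    def sample(self, n=1):
        return [1 if self._rng.random() < self.p else 0 for _ in range(n)]

b = Bernoulli(0.3, seed=0)
lp = b.log_prob(1)  # log 0.3
```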

3. Edward TF

3.

[Figure: RNN unrolled over time steps t-1 and t; weights W_h, W_x, W_y and biases b_h, b_y are shared across steps]

h_t = \tanh(W_h h_{t-1} + W_x x_t + b_h)

y_t \sim Normal(W_y h_t + b_y, 1)

3. http://edwardlib.org/tutorials/
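The recurrence h_t = tanh(W_h h_{t-1} + W_x x_t + b_h), with output y_t ~ Normal(W_y h_t + b_y, 1), sketched for scalar weights in plain Python (toy values; the Normal output is represented by its mean):

```python
import math

def rnn_step(h_prev, x_t, Wh, Wx, bh):
    # h_t = tanh(Wh * h_{t-1} + Wx * x_t + bh)
    return math.tanh(Wh * h_prev + Wx * x_t + bh)

def rnn_output_mean(h_t, Wy, by):
    # y_t ~ Normal(Wy * h_t + by, 1) -> mean of the output distribution
    return Wy * h_t + by

Wh, Wx, bh, Wy, by = 0.5, 1.0, 0.0, 2.0, 0.1
h = 0.0
ys = []
for x_t in [1.0, -1.0, 0.5]:
    h = rnn_step(h, x_t, Wh, Wx, bh)
    ys.append(rnn_output_mean(h, Wy, by))
```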

4.

- :

- :

- :

- :

- Sampling (MCMC)

- Variational Inference

[Figure: two-layer feedforward network, inputs x_1, x_2, …, x_d, weights \theta^{(1)}, \theta^{(2)}, mapping x to y]

D = \{ x^{(n)}, y^{(n)} \}_{n=1}^{N} = (X, y)

P(\theta \mid D, m) = \frac{P(D \mid \theta, m) \, P(\theta \mid m)}{P(D \mid m)}

4.

Edward MCMC

4. : MCMC

Edward : KLqp

4. :

5. Box’s loop - George Edward Pelham Box

Blei 2014

5. Box’s loop

Edward- Edward = TensorFlow +

+ +

- TensorFlow

-

- TF GPU, TPU, TensorBoard, Keras

-

- TensorFlow

Reference
•D. Tran, A. Kucukelbir, A. Dieng, M. Rudolph, D. Liang, and D.M. Blei. Edward: A library for probabilistic modeling, inference, and criticism. (arXiv preprint arXiv:1610.09787)

•D. Tran, M.D. Hoffman, R.A. Saurous, E. Brevdo, K. Murphy, and D.M. Blei. Deep probabilistic programming.(arXiv preprint arXiv:1701.03757)

•Box, G. E. (1976). Science and statistics. (Journal of the American Statistical Association, 71(356), 791–799.)

•D.M. Blei. Build, Compute, Critique, Repeat: Data Analysis with Latent Variable Models. (Annual Review of Statistics and Its Application Volume 1, 2014)

-

- Edward

Edward

Questions

kashino@bakfoo.com / @yutakashino

BakFoo, Inc.NHK NMAPS: +

BakFoo, Inc.PyConJP 2015

Python

BakFoo, Inc.: SNS +

3.

256 28*28
