pycon2017

Post on 21-Jan-2018


Edward

2017-09-09 @ PyCon JP 2017

Yuta Kashino

- BakFoo, Inc. CEO
- Astrophysics / Observational Cosmology
- Zope / Python
- Realtime Data Platform for Enterprise / Prototyping

Yuta Kashino

arXiv

PyCon2015

Python

PyCon2016

PyCon2017 DNN PPL Edward @yutakashino

-

- Edward

Edward

http://bayesiandeeplearning.org/

Shakir Mohamed

http://blog.shakirm.com/wp-content/uploads/2015/11/CSML_BayesDeep.pdf

-

Denker, Schwartz, Wittner, Solla, Howard, Jackel, Hopfield (1987); Denker and LeCun (1991); MacKay (1992); Hinton and van Camp (1993); Neal (1995); Barber and Bishop (1998); Graves (2011); Blundell, Cornebise, Kavukcuoglu, and Wierstra (2015); Hernandez-Lobato and Adams (2015)

-

Yarin Gal, Zoubin Ghahramani, Shakir Mohamed, Dustin Tran, Rajesh Ranganath, David Blei, Ian Goodfellow

Columbia U

U of Cambridge

-

- :

- :

- :

- :

- : SGD + BackProp

[Figure: two-layer feedforward network with inputs x_1, x_2, …, x_d, weight layers \theta^{(1)} and \theta^{(2)}, mapping input x to output y]

y^{(n)} = \sum_j \theta^{(2)}_j \, \sigma\Big( \sum_i \theta^{(1)}_{ji} x^{(n)}_i \Big) + \epsilon^{(n)}

p(y^{(n)} \mid x^{(n)}, \theta) = \sigma\Big( \sum_i \theta_i x^{(n)}_i \Big)

D = \{ x^{(n)}, y^{(n)} \}_{n=1}^{N} = (X, y)
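The two-layer model on this slide, y(n) = Σ_j θ(2)_j σ(Σ_i θ(1)_ji x(n)_i) + ε(n), can be sketched in plain Python; the weights below are hypothetical toy values, not anything from the talk:

```python
import math
import random

def sigma(z):
    # logistic activation
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, theta1, theta2, noise_sd=0.0):
    # hidden layer: h_j = sigma(sum_i theta1[j][i] * x[i])
    hidden = [sigma(sum(w * xi for w, xi in zip(row, x))) for row in theta1]
    # output: y = sum_j theta2[j] * h_j + eps
    eps = random.gauss(0.0, noise_sd) if noise_sd > 0 else 0.0
    return sum(w * h for w, h in zip(theta2, hidden)) + eps

# toy example: d = 2 inputs, 3 hidden units
theta1 = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]
theta2 = [1.0, -0.5, 0.25]
y = forward([1.0, 2.0], theta1, theta2)
```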

: - +

- 2012: deep CNNs win ILSVRC → 2015: surpass human-level performance

-

-

-

-

: -

- ReLU, DropOut, Mini Batch, SGD (Adam), LSTM…

- ImageNet, MSCoCo…

- GPU

- Frameworks: Theano, Torch, Caffe, TensorFlow, Chainer, MxNet, PyTorch…

: -

-

-

-

-

https://lossfunctions.tumblr.com/

: -

-

-

- Adversarial examples -

-

=

=

-

-

- :

- :

- :

- :

- : SGD + BackProp

[Figure: two-layer feedforward network with inputs x_1, x_2, …, x_d, weight layers \theta^{(1)} and \theta^{(2)}, mapping input x to output y]

y^{(n)} = \sum_j \theta^{(2)}_j \, \sigma\Big( \sum_i \theta^{(1)}_{ji} x^{(n)}_i \Big) + \epsilon^{(n)}

p(y^{(n)} \mid x^{(n)}, \theta) = \sigma\Big( \sum_i \theta_i x^{(n)}_i \Big)

D = \{ x^{(n)}, y^{(n)} \}_{n=1}^{N} = (X, y)

1. →

2. → DropOut

1.

- data hypothesis( )

- :

-

-

P(H \mid D) = \frac{P(H) \, P(D \mid H)}{\sum_H P(H) \, P(D \mid H)}

Sum rule: P(x) = \sum_y P(x, y)

Product rule: P(x, y) = P(x) \, P(y \mid x)

posterior = prior × likelihood / evidence
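Bayes' rule over a discrete hypothesis space can be computed directly; the two-coin numbers below are an invented illustration:

```python
def posterior(priors, likelihoods):
    # P(H | D) = P(H) P(D | H) / sum_H P(H) P(D | H)
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(joint.values())  # P(D)
    return {h: j / evidence for h, j in joint.items()}

# two hypotheses: fair coin vs biased coin, after observing one "heads"
priors = {"fair": 0.5, "biased": 0.5}
likelihoods = {"fair": 0.5, "biased": 0.9}  # P(heads | H)
post = posterior(priors, likelihoods)
```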

1. - :

- :

-

-

- :

posterior = likelihood × prior / evidence:

P(H \mid D) = \frac{P(H) \, P(D \mid H)}{\sum_H P(H) \, P(D \mid H)}

Parameter posterior (m: model):

P(\theta \mid D, m) = \frac{P(D \mid \theta, m) \, P(\theta \mid m)}{P(D \mid m)}

Posterior predictive:

P(x \mid D, m) = \int P(x \mid \theta, D, m) \, P(\theta \mid D, m) \, d\theta

Model posterior (with evidence P(D \mid m)):

P(m \mid D) = \frac{P(D \mid m) \, P(m)}{P(D)}

\theta \sim Beta(\theta \mid 2, 2)
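Because the Beta prior is conjugate to Bernoulli data, a posterior like the Beta(2, 2) example above has a closed form, Beta(a + #successes, b + #failures); a minimal sketch with invented data:

```python
def beta_bernoulli_posterior(a, b, data):
    # conjugate update: Beta(a, b) prior + Bernoulli observations
    heads = sum(data)
    tails = len(data) - heads
    return a + heads, b + tails

# prior theta ~ Beta(2, 2); observe seven 1s and three 0s
a_post, b_post = beta_bernoulli_posterior(2, 2, [1] * 7 + [0] * 3)
post_mean = a_post / (a_post + b_post)  # posterior mean of theta
```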

1.

-

- :

- :

- :

[Figure: two-layer feedforward network, inputs x_1, x_2, …, x_d, weights \theta^{(1)}, \theta^{(2)}, mapping x to y]

D = \{ x^{(n)}, y^{(n)} \}_{n=1}^{N} = (X, y)

P(\theta \mid D, m) = \frac{P(D \mid \theta, m) \, P(\theta \mid m)}{P(D \mid m)}   (m: model)

1.

- Sampling (MCMC)

- Variational Inference

P(\theta \mid D, m) = \frac{P(D \mid \theta, m) \, P(\theta \mid m)}{P(D \mid m)}

evidence: P(D \mid m) = \int P(D \mid \theta, m) \, P(\theta) \, d\theta

1.

-

-

P(\theta \mid D, m) \propto P(D \mid \theta, m) \, P(\theta \mid m)

posterior ∝ likelihood × prior

https://github.com/dfm/corner.py

1.

http://twiecki.github.io/blog/2014/01/02/visualizing-mcmc/

Metropolis-Hastings, NUTS (HMC)
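A Metropolis-Hastings step accepts a proposal x' with probability min(1, p(x')/p(x)). A toy pure-Python sampler targeting a standard normal, as a sketch only (not Edward's or PyMC3's implementation):

```python
import math
import random

def metropolis_hastings(log_p, x0, n_samples, step=1.0, seed=42):
    # random-walk MH with a symmetric Gaussian proposal
    rng = random.Random(seed)
    samples, x = [], x0
    for _ in range(n_samples):
        x_prop = x + rng.gauss(0.0, step)
        log_accept = log_p(x_prop) - log_p(x)
        if log_accept >= 0 or rng.random() < math.exp(log_accept):
            x = x_prop  # accept; otherwise keep the current state
        samples.append(x)
    return samples

# target: standard normal, log-density up to a constant
samples = metropolis_hastings(lambda x: -0.5 * x * x, x0=0.0, n_samples=20000)
mean = sum(samples) / len(samples)
```

With enough samples the chain's mean and variance approach the target's 0 and 1.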

1.

Approximate the intractable posterior P(θ | D, m) with a tractable q(θ) by minimizing the KL divergence between them; equivalently, by maximizing the ELBO.

1.

\phi^* = \arg\min_\phi KL(q(\theta;\phi) \,\|\, p(\theta \mid D))
       = \arg\min_\phi E_{q(\theta;\phi)}[\log q(\theta;\phi) - \log p(\theta \mid D)]

ELBO(\phi) = E_{q(\theta;\phi)}[\log p(\theta, D) - \log q(\theta;\phi)]

\phi^* = \arg\max_\phi ELBO(\phi)

P(\theta \mid D, m) = \frac{P(D \mid \theta, m) \, P(\theta \mid m)}{P(D \mid m)}

1. Minimizing KL = maximizing ELBO
- fit q to p

[Figure: variational approximations q(\theta;\phi_1), …, q(\theta;\phi_5) successively approaching p(\theta, D)]

\phi^* = \arg\max_\phi ELBO(\phi)

ELBO(\phi) = E_{q(\theta;\phi)}[\log p(\theta, D) - \log q(\theta;\phi)]
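The decomposition log p(D) = ELBO(φ) + KL(q(θ;φ) ‖ p(θ|D)) can be checked numerically on a two-point discrete toy model (all numbers invented):

```python
import math

# toy joint p(theta, D) over two parameter values, for fixed data D
p_joint = {"t1": 0.3, "t2": 0.1}                      # p(theta, D)
p_D = sum(p_joint.values())                           # evidence p(D)
p_post = {t: v / p_D for t, v in p_joint.items()}     # p(theta | D)

q = {"t1": 0.6, "t2": 0.4}                            # variational approximation

# ELBO(phi) = E_q[log p(theta, D) - log q(theta)]
elbo = sum(q[t] * (math.log(p_joint[t]) - math.log(q[t])) for t in q)
# KL(q || p(theta | D))
kl = sum(q[t] * (math.log(q[t]) - math.log(p_post[t])) for t in q)
# identity: log p(D) = ELBO + KL, so maximizing ELBO minimizes KL
```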

1. - P q

- :

- ADVI: Automatic Differentiation Variational Inference - BBVI: Blackbox Variational Inference

arxiv:1603.00788

arxiv:1401.0118

https://github.com/HIPS/autograd/blob/master/examples/bayesian_neural_net.py

1. - VI

-

- David MacKay, “Lecture 14 of the Cambridge Course”: http://www.inference.org.uk/itprnn_lectures/ - PRML Chapter 10

1. Reference - Zoubin Ghahramani, “History of Bayesian neural networks”, NIPS 2016 Workshop on Bayesian Deep Learning - Yarin Gal, “Bayesian Deep Learning”, O'Reilly Artificial Intelligence in New York, 2017

2.

-

- :

- :

- :

- :

- : SGD + BackProp

[Figure: two-layer feedforward network with inputs x_1, x_2, …, x_d, weight layers \theta^{(1)} and \theta^{(2)}, mapping input x to output y]

y^{(n)} = \sum_j \theta^{(2)}_j \, \sigma\Big( \sum_i \theta^{(1)}_{ji} x^{(n)}_i \Big) + \epsilon^{(n)}

p(y^{(n)} \mid x^{(n)}, \theta) = \sigma\Big( \sum_i \theta_i x^{(n)}_i \Big)

D = \{ x^{(n)}, y^{(n)} \}_{n=1}^{N} = (X, y)

Dropout

2. Dropout - Yarin Gal, “Uncertainty in Deep Learning” - Dropout

- Dropout : conv

- LeNet with Dropout

http://mlg.eng.cam.ac.uk/yarin/blog_2248.html

2. Dropout - LeNet DNN

- conv Dropout MNIST
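Gal's MC-dropout recipe keeps dropout active at prediction time and averages T stochastic forward passes; the spread of those passes is the uncertainty estimate. A pure-Python sketch with hypothetical toy weights (not the LeNet/MNIST experiment itself):

```python
import math
import random

rng = random.Random(0)

def dropout_forward(x, w1, w2, p_drop=0.5):
    # hidden ReLU layer, with dropout kept ON at prediction time
    h = [max(0.0, sum(wi * xi for wi, xi in zip(row, x))) for row in w1]
    # inverted dropout: zero each unit w.p. p_drop, rescale survivors
    h = [hi * (rng.random() >= p_drop) / (1.0 - p_drop) for hi in h]
    return sum(wi * hi for wi, hi in zip(w2, h))

w1 = [[0.4, -0.1], [0.2, 0.3], [-0.5, 0.7]]
w2 = [0.8, -0.3, 0.5]
x = [1.0, 2.0]

# T stochastic passes -> predictive mean and uncertainty (std)
T = 1000
preds = [dropout_forward(x, w1, w2) for _ in range(T)]
mean = sum(preds) / T
std = math.sqrt(sum((p - mean) ** 2 for p in preds) / T)
```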

2. Dropout - CO2

- :

- :

- :

- :

- Sampling (MCMC)

- Variational Inference

[Figure: two-layer feedforward network, inputs x_1, x_2, …, x_d, weights \theta^{(1)}, \theta^{(2)}, mapping x to y]

D = \{ x^{(n)}, y^{(n)} \}_{n=1}^{N} = (X, y)

P(\theta \mid D, m) = \frac{P(D \mid \theta, m) \, P(\theta \mid m)}{P(D \mid m)}

Edward

Edward - Dustin Tran (OpenAI)

- Blei Lab

- (PPL)

- 2016 2 PPL

- / TensorFlow

- Named after George Edward Pelham Box: Box-Cox transformation, Box-Jenkins models, Ljung-Box test, the box plot (with Tukey)

- R.A. Fisher (Box's father-in-law)

- Probabilistic Programming Library/Language - Stan, PyMC3, Anglican, Church, Venture, Figaro, WebPPL, Edward

- : Edward / PyMC3

- (VI)

Metropolis-Hastings, Hamiltonian Monte Carlo, Stochastic Gradient Langevin Dynamics, No-U-Turn Sampler

Blackbox Variational Inference, Automatic Differentiation Variational Inference

PPL

Edward

TensorFlow(TF) + (PPL)

TF:

PPL: + +

PPL

Edward

Edward TensorFlow

1. TF: -

- :

1. TF:

1. TF: -

-

- GPU / TPU

Inception v3 Inception v4

# of parameters: 42,679,816

# of layers: 48

1. TF: - Keras, Slim

- TensorBoard

1. TF: -

- tf.contrib.distributions

2. x:

edward

x^* \sim P(x \mid \alpha)

\theta^* \sim Beta(\theta \mid 1, 1)

2. - ( )

Edward

p(x, \theta) = Beta(\theta \mid 1, 1) \prod_{n=1}^{50} Bernoulli(x_n \mid \theta)
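The generative model p(x, θ) = Beta(θ|1,1) Π_n Bernoulli(x_n|θ) can be simulated without Edward, using only the Python standard library:

```python
import random

rng = random.Random(7)

# theta ~ Beta(1, 1), i.e. uniform on [0, 1]
theta = rng.betavariate(1, 1)

# x_n ~ Bernoulli(theta), n = 1..50
x = [1 if rng.random() < theta else 0 for _ in range(50)]
```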

2.

-

log_prob()

-

mean()

-

sample()

- tf.contrib.distributions (51 distributions): https://www.tensorflow.org/api_docs/python/tf/contrib/distributions
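The three operations listed above (log_prob(), mean(), sample()) form the core distribution interface; a minimal pure-Python Bernoulli with the same method names, a sketch of the interface rather than the tf.contrib.distributions implementation:

```python
import math
import random

class Bernoulli:
    """Toy distribution object exposing log_prob / mean / sample."""
    def __init__(self, p, seed=None):
        self.p = p
        self._rng = random.Random(seed)

    def log_prob(self, x):
        # log P(X = x) for x in {0, 1}
        return math.log(self.p if x == 1 else 1.0 - self.p)

    def mean(self):
        return self.p

    def sample(self, n=1):
        return [1 if self._rng.random() < self.p else 0 for _ in range(n)]

b = Bernoulli(0.3, seed=0)
lp = b.log_prob(1)  # log 0.3
```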

3. Edward TF

3.

[Figure: RNN unrolled over time steps t-1 and t; weights W_h, W_x, W_y and biases b_h, b_y are shared across steps]

h_t = \tanh(W_h h_{t-1} + W_x x_t + b_h)

y_t \sim Normal(W_y h_t + b_y, 1)

3. http://edwardlib.org/tutorials/
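The recurrence h_t = tanh(W_h h_{t-1} + W_x x_t + b_h), with output y_t ~ Normal(W_y h_t + b_y, 1), sketched for scalar weights in plain Python (toy values; the Normal output is represented by its mean):

```python
import math

def rnn_step(h_prev, x_t, Wh, Wx, bh):
    # h_t = tanh(Wh * h_{t-1} + Wx * x_t + bh)
    return math.tanh(Wh * h_prev + Wx * x_t + bh)

def rnn_output_mean(h_t, Wy, by):
    # y_t ~ Normal(Wy * h_t + by, 1) -> mean of the output distribution
    return Wy * h_t + by

Wh, Wx, bh, Wy, by = 0.5, 1.0, 0.0, 2.0, 0.1
h = 0.0
ys = []
for x_t in [1.0, -1.0, 0.5]:
    h = rnn_step(h, x_t, Wh, Wx, bh)
    ys.append(rnn_output_mean(h, Wy, by))
```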

4.

- :

- :

- :

- :

- Sampling (MCMC)

- Variational Inference

[Figure: two-layer feedforward network, inputs x_1, x_2, …, x_d, weights \theta^{(1)}, \theta^{(2)}, mapping x to y]

D = \{ x^{(n)}, y^{(n)} \}_{n=1}^{N} = (X, y)

P(\theta \mid D, m) = \frac{P(D \mid \theta, m) \, P(\theta \mid m)}{P(D \mid m)}

4.

Edward MCMC

4. : MCMC

Edward : KLqp

4. :

5. Box’s loop - George Edward Pelham Box

Blei 2014

5. Box’s loop

Edward- Edward = TensorFlow +

+ +

- TensorFlow

-

- TF GPU, TPU, TensorBoard, Keras

-

- TensorFlow

Reference
•D. Tran, A. Kucukelbir, A. Dieng, M. Rudolph, D. Liang, and D.M. Blei. Edward: A library for probabilistic modeling, inference, and criticism. (arXiv preprint arXiv:1610.09787)

•D. Tran, M.D. Hoffman, R.A. Saurous, E. Brevdo, K. Murphy, and D.M. Blei. Deep probabilistic programming.(arXiv preprint arXiv:1701.03757)

•Box, G. E. (1976). Science and statistics. (Journal of the American Statistical Association, 71(356), 791–799.)

•D.M. Blei. Build, Compute, Critique, Repeat: Data Analysis with Latent Variable Models. (Annual Review of Statistics and Its Application Volume 1, 2014)

-

- Edward

Edward

Questions

kashino@bakfoo.com / @yutakashino

BakFoo, Inc.NHK NMAPS: +

BakFoo, Inc.PyConJP 2015

Python

BakFoo, Inc.: SNS +

3.

256 28*28
