Download - Pycon2017
![Page 1: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/1.jpg)
Edward
2017-09-09@ PyConJP 2017
![Page 2: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/2.jpg)
Yuta Kashino ( )
BakFoo, Inc. CEO Astro Physics /Observational Cosmology Zope / Python Realtime Data Platform for Enterprise / Prototyping
![Page 3: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/3.jpg)
Yuta Kashino ( )
arXiv
PyCon2015
Python
PyCon2016
PyCon2017 DNN PPL Edward @yutakashino
![Page 4: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/4.jpg)
-
- Edward
Edward
![Page 5: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/5.jpg)
![Page 6: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/6.jpg)
http://bayesiandeeplearning.org/
![Page 7: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/7.jpg)
Shakir Mohamed
http://blog.shakirm.com/wp-content/uploads/2015/11/CSML_BayesDeep.pdf
![Page 8: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/8.jpg)
-
Denker, Schwartz, Wittner, Solla, Howard, Jackel, Hopfield (1987) Denker and LeCun (1991) MacKay (1992) Hinton and van Camp (1993) Neal (1995) Barber and Bishop (1998) Graves (2011) Blundell, Cornebise, Kavukcuoglu, and Wierstra (2015) Hernandez-Lobato and Adam (2015)
![Page 9: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/9.jpg)
-
Yarin Gal Zoubin Ghahramani Shakir Mohamed Dastin Tran Rajesh Ranganath David Blei Ian Goodfellow
Columbia U
U of Cambridge
![Page 10: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/10.jpg)
-
- :
- :
- :
- :
- : SGD + BackProp
…
…x1 x2 xd
✓(2)
✓(1)
x
y
y
(n) =X
j
✓
(2)j �(
X
i
✓
(1)ji x
(n)i ) + ✏
(n)
p(y(n) | x(n),✓) = �(
X
i
✓
(n)i x
(n)i )
✓
D = {x(n), y(n)}Nn=1 = (X,y)
![Page 11: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/11.jpg)
: - +
- 2012 ILSVRC
→ 2015
-
-
-
-
![Page 12: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/12.jpg)
: -
- ReLU, DropOut, Mini Batch, SGD(Adam), LSTM… -
- ImageNet, MSCoCo… - : GPU,
- : - Theano, Torch, Caffe, TensorFlow, Chainer, MxNet, PyTorch…
![Page 13: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/13.jpg)
: -
-
-
-
-
https://lossfunctions.tumblr.com/
![Page 14: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/14.jpg)
: -
-
-
- Adversarial examples -
![Page 15: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/15.jpg)
-
=
=
-
![Page 16: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/16.jpg)
-
- :
- :
- :
- :
- : SGD + BackProp
…
…x1 x2 xd
✓(2)
✓(1)
x
y
y
(n) =X
j
✓
(2)j �(
X
i
✓
(1)ji x
(n)i ) + ✏
(n)
p(y(n) | x(n),✓) = �(
X
i
✓
(n)i x
(n)i )
✓
D = {x(n), y(n)}Nn=1 = (X,y)
![Page 17: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/17.jpg)
1. →
2. → DropOut
✓
![Page 18: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/18.jpg)
1.
- data hypothesis( )
- :
-
-
P (H | D) =P (H)P (D | H)PH P (H)P (D|H)
P (x) =X
y
P (x, y)
P (x, y) = P (x)P (y | x)
posterior likelihoodprior
evidence
![Page 19: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/19.jpg)
1. - :
- :
-
-
- :
P (H | D) =P (H)P (D | H)PH P (H)P (D|H)
likelihood priorposterior
P (✓ | D,m) =P (D | ✓,m)P (✓ | m)
P (D | m)m:
P (x | D,m) =
ZP (x | ✓,D,m)P (✓ | D,m)d✓
P (m | D) =P (D | m)P (m)
P (D)
evidence
✓ ⇠ Beta(✓ | 2, 2)
![Page 20: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/20.jpg)
1.
-
- :
- :
- :
…
…x1 x2 xd
✓(2)
✓(1)
x
y
✓
D = {x(n), y(n)}Nn=1 = (X,y)
P (✓ | D,m) =P (D | ✓,m)P (✓ | m)
P (D | m)m:
![Page 21: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/21.jpg)
1.
- (MCMC)
- (Variational Inference)
P (✓ | D,m) =P (D | ✓,m)P (✓ | m)
P (D | m) ZP (D | ✓,m)P (✓)d✓
evidence
![Page 22: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/22.jpg)
1.
-
-
P (✓ | D,m) = P (D | ✓,m)P (✓ | m)
liklihood priorposterior
✓
https://github.com/dfm/corner.py
![Page 23: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/23.jpg)
1.
✓
http://twiecki.github.io/blog/2014/01/02/visualizing-mcmc/
NUTS (HMC)Metropolis -Hastings
![Page 24: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/24.jpg)
1.
![Page 25: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/25.jpg)
P(θ|D,m) KL q(θ)
ELBO
1.
�⇤= argmin�KL(q(✓;�) || p(✓ | D))
= argmin�Eq(✓;�)[logq(✓;�)� p(✓ | D)]
ELBO(�) = Eq(✓;�)[p(✓,D)� logq(✓;�)]
�⇤ = argmax�ELBO(�)
P (✓ | D,m) =P (D | ✓,m)P (✓ | m)
P (D | m)
![Page 26: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/26.jpg)
1. - KL =ELBO
- P q
q(✓;�1)
q(✓;�5)
p(✓,D) p(✓,D)
✓✓
�⇤ = argmax�ELBO(�)
ELBO(�) = Eq(✓;�)[p(✓,D)� logq(✓;�)]
![Page 27: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/27.jpg)
1. - P q
- :
- ADVI: Automatic Differentiation Variational Inference - BBVI: Blackbox Variational Inference
arxiv:1603.00788
arxiv:1401.0118
https://github.com/HIPS/autograd/blob/master/examples/bayesian_neural_net.py
![Page 28: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/28.jpg)
1. - VI
-
- David MacKay “Lecture 14 of the Cambridge Course” - PRML 10 http://www.inference.org.uk/itprnn_lectures/
![Page 29: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/29.jpg)
1. Reference- Zoubin Ghahramani “History of Bayesian neural networks” NIPS 2016 Workshop Bayesian Deep Learning - Yarin Gal “Bayesian Deep Learning"O'Reilly Artificial Intelligence in New York, 2017
![Page 30: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/30.jpg)
2.
-
- :
- :
- :
- :
- : SGD + BackProp
…
…x1 x2 xd
✓(2)
✓(1)
x
y
y
(n) =X
j
✓
(2)j �(
X
i
✓
(1)ji x
(n)i ) + ✏
(n)
p(y(n) | x(n),✓) = �(
X
i
✓
(n)i x
(n)i )
✓
D = {x(n), y(n)}Nn=1 = (X,y)
Dropout
![Page 31: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/31.jpg)
2.Dropout- Yarin Gal ”Uncertainty in Deep Learning” - Dropout
- Dropout : conv
- LeNet with Dropout
http://mlg.eng.cam.ac.uk/yarin/blog_2248.html
![Page 32: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/32.jpg)
2.Dropout- LeNet DNN
- conv Dropout MNIST
![Page 33: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/33.jpg)
2.Dropout- CO2
![Page 34: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/34.jpg)
- :
- :
- :
- :
- (MCMC)
- (Variational Inference)
…
…x1 x2 xd
✓(2)
✓(1)
x
y
✓
D = {x(n), y(n)}Nn=1 = (X,y)
P (✓ | D,m) =P (D | ✓,m)P (✓ | m)
P (D | m)
![Page 35: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/35.jpg)
Edward
![Page 36: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/36.jpg)
Edward- Dustin Tran (Open AI)
- Blei Lab
- (PPL)
- 2016 2 PPL
- / TensorFlow
- George Edward Pelham BoxBox-Cox Trans., Box-Jenkins, Ljung-Box test box plot Tukey,
3 2 RA Fisher
![Page 37: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/37.jpg)
- Probabilistic Programing Library/Langage - Stan, PyMC3, Anglican, Church, Venture,Figaro, WebPPL, Edward
- : Edward / PyMC3
- (VI)
Metropolis Hastings Hamilton Monte Carlo Stochastic Gradient Langevin Dynamics No-U-Turn Sampler
Blackbox Variational Inference Automatic Differentiation Variational Inference
![Page 38: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/38.jpg)
PPL
Edward
TensorFlow(TF) + (PPL)
TF:
PPL: + +
![Page 39: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/39.jpg)
PPL
Edward
Edward TensorFlow
![Page 40: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/40.jpg)
1. TF: -
- :
![Page 41: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/41.jpg)
1. TF:
![Page 42: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/42.jpg)
1. TF: -
-
- GPU / TPU
Inception v3 Inception v4
# of parameters: 42,679,816
# of layers: 48
![Page 43: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/43.jpg)
1. TF: - Keras, Slim
- TensorBoard
![Page 44: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/44.jpg)
1. TF: -
- tf.contrib.distributions
![Page 45: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/45.jpg)
2. x:
edward
x
⇤ s P (x | ↵)
✓⇤ ⇠ Beta(✓ | 1, 1)
![Page 46: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/46.jpg)
2. - ( )
Edward
p(x, ✓) = Beta(✓ | 1, 1)50Y
n=1
Bernoulli(xn | ✓),
![Page 47: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/47.jpg)
2.
-
log_prob()
-
mean()
-
sample()
- tf.contrib.distributions 51 : https://www.tensorflow.org/api_docs/python/tf/contrib/distributions
![Page 48: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/48.jpg)
3. Edward TF
![Page 49: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/49.jpg)
3.
Wh Wh
Wx
Wx
bh bh
xtxt�1
ht�1 ht
Wy Wy
by byyt�1 yt
h
t
= tanh(W
h
h
t�1 +W
x
x
t
+ b
h
)
y
t
⇠ Normal(W
y
h
t
+ b
y
, 1).
![Page 50: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/50.jpg)
3. http://edwardlib.org/tutorials/
![Page 51: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/51.jpg)
4.
- :
- :
- :
- :
- (MCMC)
- (Variational Inference)
…
…x1 x2 xd
✓(2)
✓(1)
x
y
✓
D = {x(n), y(n)}Nn=1 = (X,y)
P (✓ | D,m) =P (D | ✓,m)P (✓ | m)
P (D | m)
![Page 52: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/52.jpg)
4.
![Page 53: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/53.jpg)
Edward MCMC
4. : MCMC
![Page 54: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/54.jpg)
Edward : KLqp
4. :
![Page 55: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/55.jpg)
5. Box’s loopGeorge Edward Pelham Box
Blei 2014
![Page 56: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/56.jpg)
5. Box’s loop
![Page 57: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/57.jpg)
Edward- Edward = TensorFlow +
+ +
- TensorFlow
-
- TF GPU, TPU, TensorBoard, Keras
-
- TensorFlow
![Page 58: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/58.jpg)
![Page 59: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/59.jpg)
Refrence•D. Tran, A. Kucukelbir, A. Dieng, M. Rudolph, D. Liang, and
D.M. Blei. Edward: A library for probabilistic modeling, inference, and criticism.(arXiv preprint arXiv:1610.09787)
•D. Tran, M.D. Hoffman, R.A. Saurous, E. Brevdo, K. Murphy, and D.M. Blei. Deep probabilistic programming.(arXiv preprint arXiv:1701.03757)
•Box, G. E. (1976). Science and statistics. (Journal of the American Statistical Association, 71(356), 791–799.)
•D.M. Blei. Build, Compute, Critique, Repeat: Data Analysis with Latent Variable Models. (Annual Review of Statistics and Its Application Volume 1, 2014)
![Page 60: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/60.jpg)
-
- Edward
Edward
![Page 62: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/62.jpg)
BakFoo, Inc.NHK NMAPS: +
![Page 63: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/63.jpg)
BakFoo, Inc.PyConJP 2015
Python
![Page 65: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/65.jpg)
BakFoo, Inc.: SNS +
![Page 66: Pycon2017](https://reader033.vdocuments.us/reader033/viewer/2022050613/5a6478fa7f8b9a40568b4647/html5/thumbnails/66.jpg)
3.
256 28*28