lecture 15: neural style transfer - fall97.class.visionfall97.class.vision/slides/15.pdf · srttu...

22
Lecture 15 SRTTU A.Akhavan 1 شنبه،۲۴ آذر۱۳۹۷ Lecture 15: Neural style transfer Alireza Akhavan Pour CLASS.VISION

Upload: others

Post on 24-Jul-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Lecture 15: Neural style transfer - fall97.class.visionfall97.class.vision/slides/15.pdf · SRTTU –A.Akhavan Lecture 15 Find the generated image G 15 ۱۳۹۷ رذآ۲۴ ،هبنش

Lecture 15SRTTU – A.Akhavan 1 ۱۳۹۷آذر ۲۴شنبه،

Lecture 15: Neural style transfer

Alireza Akhavan Pour

CLASS.VISION

Page 2: Lecture 15: Neural style transfer - fall97.class.visionfall97.class.vision/slides/15.pdf · SRTTU –A.Akhavan Lecture 15 Find the generated image G 15 ۱۳۹۷ رذآ۲۴ ،هبنش

Lecture 15SRTTU – A.Akhavan

Neural style transfer

2 ۱۳۹۷آذر ۲۴شنبه،

Page 3: Lecture 15: Neural style transfer - fall97.class.visionfall97.class.vision/slides/15.pdf · SRTTU –A.Akhavan Lecture 15 Find the generated image G 15 ۱۳۹۷ رذآ۲۴ ،هبنش

Lecture 15SRTTU – A.Akhavan 3 ۱۳۹۷آذر ۲۴شنبه،

[Images generated by Justin Johnson]

Content Style Style Content

Generated image Generated image

Neural style transfer

Page 4: Lecture 15: Neural style transfer - fall97.class.visionfall97.class.vision/slides/15.pdf · SRTTU –A.Akhavan Lecture 15 Find the generated image G 15 ۱۳۹۷ رذآ۲۴ ،هبنش

Lecture 15SRTTU – A.Akhavan

What are deepConvNets learning?

4 ۱۳۹۷آذر ۲۴شنبه،

Page 5: Lecture 15: Neural style transfer - fall97.class.visionfall97.class.vision/slides/15.pdf · SRTTU –A.Akhavan Lecture 15 Find the generated image G 15 ۱۳۹۷ رذآ۲۴ ،هبنش

Lecture 15SRTTU – A.Akhavan

Visualizing what a deep network is learning

5 ۱۳۹۷آذر ۲۴شنبه،

224 × 224 × 3

⋮ ⋮ ො𝑦

110 × 110 × 96

55 × 55 × 9626 × 26× 256

13 × 13× 256

13 × 13× 384

13 × 13× 384

6 × 6× 256

FC

4096

FC

4096

Pick a unit in layer 1. Find the nine

image patches that maximize the unit’s

activation.

Repeat for other units.

[Zeiler and Fergus., 2013, Visualizing and understanding convolutional networks]

Page 6: Lecture 15: Neural style transfer - fall97.class.visionfall97.class.vision/slides/15.pdf · SRTTU –A.Akhavan Lecture 15 Find the generated image G 15 ۱۳۹۷ رذآ۲۴ ،هبنش

Lecture 15SRTTU – A.Akhavan

Visualizing deep layers

6 ۱۳۹۷آذر ۲۴شنبه،

Visualizing deep layers

Layer 2 Layer 4Layer 3 Layer 5Layer 1

Page 7: Lecture 15: Neural style transfer - fall97.class.visionfall97.class.vision/slides/15.pdf · SRTTU –A.Akhavan Lecture 15 Find the generated image G 15 ۱۳۹۷ رذآ۲۴ ،هبنش

Lecture 15SRTTU – A.Akhavan

Visualizing deep layers: Layer 1

7 ۱۳۹۷آذر ۲۴شنبه،

Layer 2 Layer 4Layer 3 Layer 5Layer 1

Page 8: Lecture 15: Neural style transfer - fall97.class.visionfall97.class.vision/slides/15.pdf · SRTTU –A.Akhavan Lecture 15 Find the generated image G 15 ۱۳۹۷ رذآ۲۴ ،هبنش

Lecture 15SRTTU – A.Akhavan

Visualizing deep layers: Layer 2

8 ۱۳۹۷آذر ۲۴شنبه،

Layer 2 Layer 4Layer 3 Layer 5Layer 1

Page 9: Lecture 15: Neural style transfer - fall97.class.visionfall97.class.vision/slides/15.pdf · SRTTU –A.Akhavan Lecture 15 Find the generated image G 15 ۱۳۹۷ رذآ۲۴ ،هبنش

Lecture 15SRTTU – A.Akhavan

Visualizing deep layers: Layer 3

9 ۱۳۹۷آذر ۲۴شنبه،

Layer 2 Layer 4Layer 3 Layer 5Layer 1

Page 10: Lecture 15: Neural style transfer - fall97.class.visionfall97.class.vision/slides/15.pdf · SRTTU –A.Akhavan Lecture 15 Find the generated image G 15 ۱۳۹۷ رذآ۲۴ ،هبنش

Lecture 15SRTTU – A.Akhavan

Visualizing deep layers: Layer 3

10 ۱۳۹۷آذر ۲۴شنبه،

Layer 2 Layer 4Layer 3 Layer 5Layer 1

Page 11: Lecture 15: Neural style transfer - fall97.class.visionfall97.class.vision/slides/15.pdf · SRTTU –A.Akhavan Lecture 15 Find the generated image G 15 ۱۳۹۷ رذآ۲۴ ،هبنش

Lecture 15SRTTU – A.Akhavan

Visualizing deep layers: Layer 4

11 ۱۳۹۷آذر ۲۴شنبه،

Layer 2 Layer 4Layer 3 Layer 5Layer 1

Page 12: Lecture 15: Neural style transfer - fall97.class.visionfall97.class.vision/slides/15.pdf · SRTTU –A.Akhavan Lecture 15 Find the generated image G 15 ۱۳۹۷ رذآ۲۴ ،هبنش

Lecture 15SRTTU – A.Akhavan

Visualizing deep layers: Layer 5

12 ۱۳۹۷آذر ۲۴شنبه،

Layer 2 Layer 4Layer 3 Layer 5Layer 1

Page 13: Lecture 15: Neural style transfer - fall97.class.visionfall97.class.vision/slides/15.pdf · SRTTU –A.Akhavan Lecture 15 Find the generated image G 15 ۱۳۹۷ رذآ۲۴ ،هبنش

Lecture 15SRTTU – A.Akhavan

Cost function

13 ۱۳۹۷آذر ۲۴شنبه،

Page 14: Lecture 15: Neural style transfer - fall97.class.visionfall97.class.vision/slides/15.pdf · SRTTU –A.Akhavan Lecture 15 Find the generated image G 15 ۱۳۹۷ رذآ۲۴ ،هبنش

Lecture 15SRTTU – A.Akhavan

Neural style transfer cost function

14 ۱۳۹۷آذر ۲۴شنبه،

Content C Style S

Generated image G

[Gatys et al., 2015. A neural algorithm of artistic style. Images on slide generated by Justin Johnson]

𝐽 𝐺 = 𝛼 𝐽𝑐𝑜𝑛𝑡𝑒𝑛𝑡 𝐶, 𝐺 + 𝛽 𝐽𝑠𝑡𝑦𝑙𝑒(𝑆, 𝐺)

Page 15: Lecture 15: Neural style transfer - fall97.class.visionfall97.class.vision/slides/15.pdf · SRTTU –A.Akhavan Lecture 15 Find the generated image G 15 ۱۳۹۷ رذآ۲۴ ،هبنش

Lecture 15SRTTU – A.Akhavan

Find the generated image G

15 ۱۳۹۷آذر ۲۴شنبه،

Find the generated image G

1. Initiate G randomly

G: 100 × 100 × 3

2. Use gradient descent to minimize 𝐽(𝐺)

[Gatys et al., 2015. A neural algorithm of artistic style]

Page 16: Lecture 15: Neural style transfer - fall97.class.visionfall97.class.vision/slides/15.pdf · SRTTU –A.Akhavan Lecture 15 Find the generated image G 15 ۱۳۹۷ رذآ۲۴ ،هبنش

Lecture 15SRTTU – A.Akhavan

Cost function

Content cost function

Style cost function

16 ۱۳۹۷آذر ۲۴شنبه،

Page 17: Lecture 15: Neural style transfer - fall97.class.visionfall97.class.vision/slides/15.pdf · SRTTU –A.Akhavan Lecture 15 Find the generated image G 15 ۱۳۹۷ رذآ۲۴ ،هبنش

Lecture 15SRTTU – A.Akhavan

Content cost function

17 ۱۳۹۷آذر ۲۴شنبه،

• Say you use hidden layer 𝑙 to compute content cost.

𝐽 𝐺 = 𝛼 𝐽𝑐𝑜𝑛𝑡𝑒𝑛𝑡 𝐶, 𝐺 + 𝛽 𝐽𝑠𝑡𝑦𝑙𝑒 (𝑆, 𝐺)

• Use pre-trained ConvNet. (E.g., VGG network)

• Let 𝑎[𝑙](𝐶) and 𝑎[𝑙](𝐺) be the activation of layer 𝑙on the images

• If 𝑎[𝑙](𝐶) and 𝑎[𝑙](𝐺) are similar, both images have

similar content

[Gatys et al., 2015. A neural algorithm of artistic style]

𝐽𝑐𝑜𝑛𝑡𝑒𝑛𝑡 𝐶, 𝐺 =1

2𝑎[𝑙](𝑐) − 𝑎[𝑙](𝐺)

2

Page 18: Lecture 15: Neural style transfer - fall97.class.visionfall97.class.vision/slides/15.pdf · SRTTU –A.Akhavan Lecture 15 Find the generated image G 15 ۱۳۹۷ رذآ۲۴ ،هبنش

Lecture 15SRTTU – A.Akhavan

Meaning of the “style” of an image

18 ۱۳۹۷آذر ۲۴شنبه،

⋮ ො𝑦

Say you are using layer 𝑙’s activation to measure “style.”

Define style as correlation between activations across channels.

How correlated are the activations

across different channels?

𝑛𝑊

𝑛𝐻𝑛𝐶

[Gatys et al., 2015. A neural algorithm of artistic style]

Page 19: Lecture 15: Neural style transfer - fall97.class.visionfall97.class.vision/slides/15.pdf · SRTTU –A.Akhavan Lecture 15 Find the generated image G 15 ۱۳۹۷ رذآ۲۴ ،هبنش

Lecture 15SRTTU – A.Akhavan

Intuition about style of an image

19 ۱۳۹۷آذر ۲۴شنبه،

Style image Generated Image

𝑛𝑊

𝑛𝐻𝑛𝐶

𝑛𝑊

𝑛𝐻𝑛𝐶

[Gatys et al., 2015. A neural algorithm of artistic style]

Page 20: Lecture 15: Neural style transfer - fall97.class.visionfall97.class.vision/slides/15.pdf · SRTTU –A.Akhavan Lecture 15 Find the generated image G 15 ۱۳۹۷ رذآ۲۴ ،هبنش

Lecture 15SRTTU – A.Akhavan

Style matrix

20 ۱۳۹۷آذر ۲۴شنبه،

Let a𝑖,𝑗,𝑘[𝑙]

= activation at 𝑖, 𝑗, 𝑘 . 𝐺[𝑙] is nc[𝑙]× nc

[𝑙]

[Gatys et al., 2015. A neural algorithm of artistic style]

𝐺𝑘 𝑘′[𝑙](𝑠)

=

𝑖=1

𝑛𝐻[𝑙]

𝑗=1

𝑛𝑊[𝑙]

𝑎𝑖𝑗𝑘[𝑙](𝑠)

𝑎𝑖𝑗𝑘′[𝑙](𝑠)

𝐺𝑘 𝑘′[𝑙](𝐺)

=

𝑖=1

𝑛𝐻[𝑙]

𝑗=1

𝑛𝑊[𝑙]

𝑎𝑖𝑗𝑘[𝑙](𝐺)

𝑎𝑖𝑗𝑘′[𝑙](𝐺)

Gram matrix

𝐽𝑠𝑡𝑦𝑙𝑒𝑙

𝑆, 𝐺 =1

(2𝑛𝐻[𝑙]𝑛𝑊[𝑙]𝑛𝑐[𝑙])2

𝐺[𝑙](𝑠) − 𝐺[𝑙](𝐺) 2

=1

(2𝑛𝐻[𝑙]𝑛𝑊[𝑙]𝑛𝑐[𝑙])2

𝑘

𝑘′

(𝐺𝑘 𝑘′𝑙 𝑆

− 𝐺𝑘 𝑘′[𝑙](𝐺)

)2

𝐽𝑠𝑡𝑦𝑙𝑒𝑙

𝑆, 𝐺 =1

(2𝑛𝐻[𝑙]𝑛𝑊[𝑙]𝑛𝑐[𝑙])2

𝐺[𝑙](𝑠) − 𝐺[𝑙](𝐺) 2

=1

(2𝑛𝐻[𝑙]𝑛𝑊[𝑙]𝑛𝑐[𝑙])2

𝑘

𝑘′

(𝐺𝑘 𝑘′𝑙 𝑆

− 𝐺𝑘 𝑘′[𝑙](𝐺)

)2

Page 21: Lecture 15: Neural style transfer - fall97.class.visionfall97.class.vision/slides/15.pdf · SRTTU –A.Akhavan Lecture 15 Find the generated image G 15 ۱۳۹۷ رذآ۲۴ ،هبنش

Lecture 15SRTTU – A.Akhavan

Style cost function

21 ۱۳۹۷آذر ۲۴شنبه،

[Gatys et al., 2015. A neural algorithm of artistic style]

𝐽𝑠𝑡𝑦𝑙𝑒𝑙

𝑆, 𝐺 =1

(2𝑛𝐻[𝑙]𝑛𝑊[𝑙]𝑛𝑐[𝑙])2

𝑘

𝑘′

(𝐺𝑘 𝑘′𝑙 𝑆

− 𝐺𝑘 𝑘′[𝑙](𝐺)

)2

𝐽𝑠𝑡𝑦𝑙𝑒𝑙

𝑆, 𝐺 =

𝑙

𝜆[𝑙] 𝐽𝑠𝑡𝑦𝑙𝑒𝑙

𝑆, 𝐺

𝐽 𝐺 = 𝛼 𝐽𝑐𝑜𝑛𝑡𝑒𝑛𝑡 𝐶, 𝐺 + 𝛽 𝐽𝑠𝑡𝑦𝑙𝑒(𝑆, 𝐺)

Page 22: Lecture 15: Neural style transfer - fall97.class.visionfall97.class.vision/slides/15.pdf · SRTTU –A.Akhavan Lecture 15 Find the generated image G 15 ۱۳۹۷ رذآ۲۴ ،هبنش

Lecture 15SRTTU – A.Akhavan 22 ۱۳۹۷آذر ۲۴شنبه،

منابع

• https://www.coursera.org/specializations/deep-learning

• https://blog.paperspace.com/art-with-neural-networks/