

Deep Learning: More Than Classification

Calvin Seward

4 September 2017

AI Summit Vienna


OVERVIEW

Introduction

Why Neural Networks Are Powerful

Semi-Supervised Localization

Generative Adversarial Networks

The Most Important Part of the Talk


SHAMELESS SALES PITCH COMPANY INFORMATION

• Europe’s leading online fashion platform
• Operating in 15 countries
• €3.6 billion net sales 2016
• ∼13,000 employees from 100+ countries
• ∼1,800 employees in technology
• ∼250,000 fashion items offered
• ∼21 million active customers

Interested? Visit https://tech.zalando.de and https://jobs.zalando.de


SHAMELESS SALES PITCH ZALANDO RESEARCH

• Understand fashion & style
• Revolutionize the online shopping experience
• Product impact, papers, patents, prestige
• Autonomous researchers
• Big Data, NVIDIA GPUs, TensorFlow, ...
• We’re hiring!!!

Contact: [email protected]



Why Neural Networks Are Powerful


BASIC FEED FORWARD NEURAL NETWORKS

• In mathematical notation:

h = σ(W1 · input + b1)

output = W2 · h + b2

f̂W(input) = W2 · σ(W1 · input + b1) + b2

• σ, the activation function, is a non-linear function such as (a) ReLU, (b) ELU, (c) sigmoid

[Figure: activation function plots. Source: Wikipedia]
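As a concrete sketch in NumPy (the toy weights here are illustrative choices, not from the talk), the two-layer network f̂W above can be written directly from the formulas:

```python
import numpy as np

def relu(x):
    # ReLU activation: sigma(x) = max(0, x)
    return np.maximum(0.0, x)

def feed_forward(x, W1, b1, W2, b2):
    # h = sigma(W1 x + b1); output = W2 h + b2
    h = relu(W1 @ x + b1)
    return W2 @ h + b2

# Hand-checkable toy weights: relu(x) + relu(-x) = |x|
W1 = np.array([[1.0], [-1.0]]); b1 = np.zeros(2)
W2 = np.array([[1.0, 1.0]]);    b2 = np.zeros(1)
print(feed_forward(np.array([3.0]), W1, b1, W2, b2))   # [3.]
print(feed_forward(np.array([-2.0]), W1, b1, W2, b2))  # [2.]
```

Even this tiny network computes a function (the absolute value) that no purely linear map could.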


MACHINE LEARNING AS LEARNING FUNCTIONS

Most machine learning problems boil down to estimating a function f : Rn → Rm based on observations (x1, f(x1)), ..., (xn, f(xn)). Some examples include:

• Image classification: xi is an image, f(xi) is its label
• Self-driving cars: xi is a video sequence, f(xi) is the next action the car should take
• Recommender systems: xi is a customer’s shopping history, f(xi) is a product recommendation

The true function f will never be known; we try to find an estimate f̂.



BASIC FEED FORWARD NEURAL NETWORKS

• Neural networks are universal function approximators

• For any function f : Rn → Rm fulfilling certain smoothness criteria, there exists a neural network f̂W that is arbitrarily close to f

• Finding the correct weights W would give us

f̂W : Image → Label

Source: Wikipedia
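A quick numerical illustration of this claim (not a proof, and the unit count and target function are arbitrary choices for the sketch): fix a random ReLU hidden layer and fit only the output weights by least squares; even this crude one-hidden-layer network tracks f(x) = sin(x) closely.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 500)

# Random hidden layer: 300 ReLU units with fixed random weights and biases
W1 = rng.normal(size=300)
b1 = rng.uniform(-3, 3, size=300)
H = np.maximum(0.0, np.outer(x, W1) + b1)  # hidden activations, shape (500, 300)

# Fit only the output weights W2 by least squares to f(x) = sin(x)
W2, *_ = np.linalg.lstsq(H, np.sin(x), rcond=None)
err = np.max(np.abs(H @ W2 - np.sin(x)))
print(err)  # maximum absolute approximation error on the grid
```

More hidden units shrink the error further, which is exactly what the universal approximation result promises.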



BASIC FEED FORWARD NEURAL NETWORKS – BACKPROPAGATION

• Many useful networks have millions of weights

• A basic grid search over 1 million weights would take years

• For a differentiable loss L, we can use the chain rule to efficiently calculate

∂/∂wi L(f̂W(x) − f(x))

for all i (this is known as backpropagation)

• The negative gradient

−( ∂/∂w1 L(f̂W(x) − f(x)), ..., ∂/∂wn L(f̂W(x) − f(x)) )ᵀ

points in the direction that decreases the loss

Source: Wikipedia
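A minimal check of this idea (the tiny network, tanh activation, and squared-error loss are illustrative choices): compute ∂L/∂w1 by the chain rule by hand, then compare against a finite-difference estimate of the same derivative.

```python
import numpy as np

def loss(w1, w2, x, y):
    # f_hat(x) = w2 * tanh(w1 * x); L = (f_hat(x) - y)^2
    h = np.tanh(w1 * x)
    return (w2 * h - y) ** 2

w1, w2, x, y = 0.5, -1.3, 0.8, 0.2

# Chain rule (backpropagation by hand) for dL/dw1
h = np.tanh(w1 * x)
dL_dw1 = 2 * (w2 * h - y) * w2 * (1 - h ** 2) * x

# Finite-difference check of the same derivative
eps = 1e-6
fd = (loss(w1 + eps, w2, x, y) - loss(w1 - eps, w2, x, y)) / (2 * eps)
print(dL_dw1, fd)  # the two values should agree closely
```

The chain-rule version costs one extra pass through the network, instead of one full evaluation per weight as finite differences would.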


BASIC FEED FORWARD NEURAL NETWORKS – STOCHASTIC GRADIENT DESCENT

Algorithm 1: stochastic gradient descent neural network training

1: λ ← learning rate
2: while you haven’t lost patience do
3:   x ← example from training set
4:   # Calculate the error
5:   E ← loss(f(x), f̂W(x))
6:   # Update the weights
7:   for weights wi in network do
8:     wi ← wi − λ ∂E/∂wi
9:   end for
10: end while
11: Return: trained neural network f̂W

Source: Wikipedia
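Algorithm 1 in runnable form, on a deliberately tiny problem (learning rate, step count, and the target f(x) = 3x are arbitrary choices for the sketch): learn a single weight w so that f̂w(x) = w·x matches f.

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: 3.0 * x          # the unknown "true" function
w = 0.0                        # single network weight
lam = 0.1                      # learning rate (lambda in the algorithm)

for _ in range(500):           # "while you haven't lost patience"
    x = rng.uniform(0.2, 1.0)  # example from the training set
    E = (w * x - f(x)) ** 2    # squared-error loss
    dE_dw = 2 * (w * x - f(x)) * x
    w = w - lam * dE_dw        # gradient step: w <- w - lam * dE/dw
print(w)  # close to 3
```

Each step uses one example; the gradient is noisy, but on average it points downhill, which is what makes the "stochastic" part work.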


PUTTING IT ALL TOGETHER

• Neural networks can approximate arbitrary (sufficiently smooth) functions f : Rn → Rm

• Given a value x ∈ Rn and a loss function L, the gradients

∂/∂wi L(f̂W(x), f(x))

can be efficiently calculated for all weights wi

• The weights W for the network f̂W can be learned on an arbitrarily large training set using stochastic gradient descent

• All this can be done in parallel on the GPU

Source: Wikipedia


Semi-Supervised Localization


EXCITING SHOP THE LOOK APPLICATION: LOCALIZATION
NO PRIOR INFORMATION ABOUT ARTICLE LOCATION WAS USED!

Pullover Ankle Boots Trouser Shirt


“SHOP THE LOOK FEATURE”
VIEW THE PRODUCT DETAIL PAGE


“SHOP THE LOOK FEATURE”
SAY YOU LIKE THE OTHER ITEMS


“SHOP THE LOOK FEATURE”
YOU CAN BUY THE OTHER ITEMS


PRETTY COOL DATASET

Shop the Look image | Individual articles with meta-data


DEEP FEATURES FOR DISCRIMINATIVE LOCALIZATION

source: [4]

The method we used is based on “Learning Deep Features for Discriminative Localization” by Zhou, Khosla et al. [4]. This image was also shamelessly pilfered from their paper.


DEEP FEATURES FOR DISCRIMINATIVE LOCALIZATION

source: [4]

We start by training a classifier with a specific architecture (more on this later).

For example, with this image as input, the output should be near

(0, 1, 0, 0, ..., 0, 1)ᵀ

where the ones codify the articles that are present (pullover, ankle boots, trouser, shirt) and the zeros codify the articles that are missing (dress, sunglasses, ...).
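Building such a multi-hot target vector is straightforward; a sketch with a hypothetical ordered catalogue of article types (the real catalogue is far larger):

```python
# Hypothetical ordered list of article types
CLASSES = ["dress", "pullover", "sunglasses", "ankle boots", "trouser", "shirt"]

def multi_hot(present):
    # 1 for each article type present in the image, 0 otherwise
    return [1 if c in present else 0 for c in CLASSES]

print(multi_hot({"pullover", "ankle boots", "trouser", "shirt"}))
# [0, 1, 0, 1, 1, 1]
```

Unlike one-hot classification, several entries can be 1 at once, since an outfit photo contains several articles.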



DEEP FEATURES FOR DISCRIMINATIVE LOCALIZATION

[Chart: classification performance]



DEEP FEATURES FOR DISCRIMINATIVE LOCALIZATION

source: [4]

• Image is fed through most of pre-trained Google inception neural network


DEEP FEATURES FOR DISCRIMINATIVE LOCALIZATION

source: Deepmind


DEEP FEATURES FOR DISCRIMINATIVE LOCALIZATION

source: [4]

• Image fed through convolutional part of pre-trained Inception network
• Results pooled with average pooling
• Pooled vector is transformed with a last fully connected matrix, giving us a classification
• Cross-entropy loss is backpropagated:

∑_{i=1}^{n} ∑_{j=1}^{m} −π_ij log(σ(c_ij)) − (1 − π_ij) log(1 − σ(c_ij))
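A NumPy sketch of this pipeline for a single image (the slide sums the loss over a batch of n images; here random numbers stand in for the convolutional features, and all shapes are illustrative): global average pooling, a final linear layer, then the per-class cross-entropy with targets π and scores c.

```python
import numpy as np

rng = np.random.default_rng(0)
C, H, W, n_classes = 8, 7, 7, 5

F = rng.normal(size=(C, H, W))          # convolutional feature maps
pooled = F.mean(axis=(1, 2))            # global average pooling -> (C,)

Wfc = rng.normal(size=(n_classes, C))   # last fully connected matrix
c = Wfc @ pooled                        # class scores c_j

sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))
pi = np.array([0, 1, 0, 1, 1.0])        # multi-hot targets pi_j

# Cross-entropy loss from the slide, for one image
L = np.sum(-pi * np.log(sigmoid(c)) - (1 - pi) * np.log(1 - sigmoid(c)))
print(L)
```

With all scores at zero, every term contributes log 2, which is a handy sanity check when implementing this loss.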



DEEP FEATURES FOR DISCRIMINATIVE LOCALIZATION

source: [4]

• Since the final bit is entirely linear, it’s equivalent to an ensemble of classifiers
• Each classifier takes as input all channels of one spatial position of the convolutional output
• Each individual classification depends on only a local patch of the image
• The final classifier is the average of the individual classifiers
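This equivalence is easy to verify numerically: applying the final linear layer at every spatial position yields a per-class heatmap whose spatial average is exactly the pooled class score. A sketch with hypothetical shapes and random features:

```python
import numpy as np

rng = np.random.default_rng(1)
C, H, W, n_classes = 8, 7, 7, 5
F = rng.normal(size=(C, H, W))          # convolutional feature maps
Wfc = rng.normal(size=(n_classes, C))   # final linear classifier

# Score via global average pooling, then the linear layer
score = Wfc @ F.mean(axis=(1, 2))                 # (n_classes,)

# Class activation maps: the same linear layer at every spatial position
cam = np.tensordot(Wfc, F, axes=([1], [0]))       # (n_classes, H, W)

# Averaging each map over space recovers the pooled score exactly
print(np.allclose(cam.mean(axis=(1, 2)), score))  # True
```

The heatmap `cam[k]` shows which image regions drive class k, which is exactly what produces the localizations on the following slides.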


DEEP FEATURES FOR DISCRIMINATIVE LOCALIZATION

source: [4]

And that’s how we manage to learn the location of fashion articles without any prior location information.



DEEP FEATURES FOR DISCRIMINATIVE LOCALIZATION

A few more images from the test set

Bag Low Shoe Shirt Trouser


DEEP FEATURES FOR DISCRIMINATIVE LOCALIZATION

A few more images from the test set

Bag Pullover Sneaker Trouser


DEEP FEATURES FOR DISCRIMINATIVE LOCALIZATION

A few more images from the test set

Ankle Boots Coat Shirt Trouser


Generative Adversarial Networks


GENERATIVE ADVERSARIAL NETWORK MOTIVATION

• A neural network G : Rn → Rm is a universal function approximator

• There must exist weights Wg such that if Z is some multivariate random noise,

G : Z → {cat pictures}

• Two measures of quality:
  • Quality of generated images
  • Diversity of generated images

• Big challenge: how do you evaluate G with respect to these measures?

[Image: MNIST digits generated with a restricted Boltzmann machine. Source: https://deeplearning4j.org/rbm-mnist-tutorial.html]


GENERATIVE ADVERSARIAL NETWORK MOTIVATION

• A neural network D : Rn → Rm is a universal function approximator
• There must exist weights Wd such that

D : {pictures} → { 1 if it’s a real cat picture; 0 if it’s a generated cat picture }

• If G creates poor-quality images, D will detect it
• If G fails to create diverse images, D will learn which images are generated
• D(G(z)) is a neural network, so we can use gradients from D to update G


CLASSIC GENERATIVE ADVERSARIAL NETWORK FORMULATION

min_G max_D V(D, G) = E_{x∼p_data(x)}[log D(x)] + E_{z∼p_z(z)}[log(1 − D(G(z)))]

Algorithm 2: GAN training algorithm from [2]

1: for number of training iterations do
2:   for k steps do
3:     Sample noise vector z ∼ p_g(z) and real data x ∼ p_data(x)
4:     Update the discriminator by using its gradient:

∇_Wd [log D(x) + log(1 − D(G(z)))]

5:   end for
6:   Sample noise vector z ∼ p_g(z)
7:   Update the generator by using its gradient:

∇_Wg log(1 − D(G(z)))

8: end for
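In practice both expectations in V(D, G) are estimated by Monte Carlo from mini-batches. A sketch (the 1-D Gaussian "data" and "generated" samples are placeholders, and D is a deliberately trivial discriminator that always outputs 1/2, for which V = 2·log(1/2) regardless of the inputs):

```python
import numpy as np

def gan_value(D, x_real, x_fake):
    # Monte Carlo estimate of V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))]
    return np.mean(np.log(D(x_real))) + np.mean(np.log(1.0 - D(x_fake)))

rng = np.random.default_rng(0)
x_real = rng.normal(loc=4.0, size=1000)   # stand-in for real data
x_fake = rng.normal(loc=0.0, size=1000)   # stand-in for generated samples

D = lambda x: np.full_like(x, 0.5)        # maximally uncertain discriminator
print(gan_value(D, x_real, x_fake))       # -2 log 2, about -1.386
```

During training, the inner loop pushes this estimate up by changing D, and the outer loop pushes the second term down by changing G.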


GANS FOR CREEPY FACES

Faces generated by a DCGAN trained on the CelebA dataset [3]


GANS FOR FASHION

Interpolation between random items in fashion DNA space


GANS FOR TEXTURE GENERATION

Generated texture where each corner is a specific snake-skin texture and the interior is a linear combination of the corner textures; see [1]


GAN OPEN QUESTIONS / PROBLEMS

• Convergence is the exception, not the rule
• Many successful GANs are highly tuned
• Finding exactly the right parameters seems to be trial and error
• Sudden divergence

GAN convergence is sensitive to:
• Learning rates
• Activation functions
• Generator / discriminator architectures
• Alignment of the stars ;)


The Most Important Part of the Talk


THE SEVEN STAGES OF TECHNOLOGICAL ADAPTATION

This text is from The War on Science by Shawn Otto

1. Discovery
2. Application
3. Development
4. Boomerang
5. Battle
6. Crisis
7. Adaptation

In his book, Shawn Otto outlines how the regulatory environment adapts to new technologies. Think things like pesticides and insecticides, fossil fuel consumption, opioid painkillers.


1. Discovery: A new process or tool (for example, a chemical or, today, a nanotechnology or genetic technology) is discovered that vastly expands utility, power, convenience, or efficiency.


2. Application: Industrial applications are quickly developed and commercialized, often increasing productivity and lowering costs. But the science of biocomplexity and ecology—of how the process or tool will affect and be affected by its broader context, from the human body to the environment—lags behind.


3. Development: Industries grow up around the new application. Major capital investments are made and its use intensifies.


4. Boomerang: A tipping point is reached at which the application has noticeable negative effects on health or the environment. Fueled by growing public outcry, scientists study the degree of the systemic effect to determine what to do.


5. Battle: Regulations are proposed to minimize the negative effects, but vested economic interests sense a potentially lethal blow to their production systems and fight the proposed changes by denying the environmental effects, maligning and impeaching witnesses, questioning the science, attacking or impugning the scientists, and/or arguing that other factors are causing the mounting disaster. A battle ensues between the adherents of old science and those of new science.


6. Crisis: Evidence continues to accumulate from the emerging science until the causation becomes irrefutable, often through dramatic deaths or disasters (or, in the case of climate disruption, extreme weather events) that draw increased public scrutiny and outrage, finally tipping the politics in the direction of reform.


7. Adaptation: Regulations are passed or laws are changed to stop or modify use and to mitigate the effects. The industrial approach grudgingly shifts to take into account the relationships between the application and its environmental and/or physiological context. Or this does not occur, in which case the process returns to stage 3.



Where do you see artificial intelligence here?

How can artificial intelligence become part of society in a beneficial way?

The answers are both political and scientific.

How can you bring science into the political discourse?


SCIENTISTS IN THE BUNDESTAG

[Chart: percent of subjects studied by MPs of the 18th German Bundestag]

*These numbers are hard to calculate exactly, see backup for full methodology


Thanks for listening


REFERENCES

[1] Urs Bergmann, Nikolay Jetchev, and Roland Vollgraf. Learning texture manifolds with the periodic spatial GAN. arXiv preprint arXiv:1705.06566, 2017.

[2] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In Advances in Neural Information Processing Systems, pages 2672–2680, 2014.

[3] Alec Radford, Luke Metz, and Soumith Chintala. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434, 2015.

[4] Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva, and Antonio Torralba. Learning deep features for discriminative localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2921–2929, 2016.


Backup: subjects studied by MPs of the 18th German Bundestag (source: http://www.bundestag.de/abgeordnete18/mdb_zahlen), grouped into the chart’s categories:

natural science + medicine (78 MPs): Biologie 13, Chemie 6, Geografie/Geologie 7, Ernährungs- und Haushaltswissenschaften 1, Informatik 5, Mathematik 10, Medizin 8, Pharmazie 1, Physik 8, Psychologie 7, Sportwissenschaften 7, Umweltwissenschaften 1, Veterinärmedizin 4

liberal arts + arts (273 MPs): Anglistik 11, Germanistik 19, Geschichte 44, Gesellschaftswissenschaften 2, Islamwissenschaften 1, Journalistik/Publizistik 3, Kulturwissenschaften 4, Kunstgeschichte 5, Literaturwissenschaften 5, Medien- und Kommunikationswissenschaften 12, Musikwissenschaften 5, Orientalistik 1, Philologie/Philosophie 15, Politikwissenschaften 79, Romanistik 6, Sozialwissenschaften 14, Soziologie 31, Sprachwissenschaften 1, Theologie 13, Volkskunde 2

teaching (76 MPs): Lehramt/Dipl.-Lehrer(in) 38, Pädagogik 31, Sozialarbeit 7

Law / governance (161 MPs): Rechts- und Staatswissenschaften 149, Verwaltungswissenschaften 12

Business / finance (103 MPs): Landwirtschaft/Forstwirtschaft 8, Volkswirtschaft 33, Wirtschafts- und Sozialwissenschaften/Betriebswirtschaft 62

engineering (25 MPs): Architektur 3, Ingenieurwesen 22

Total: 716 MPs

Shares of all 716 MPs: natural science + medicine 10.89%, liberal arts + arts 38.13%, teaching 10.61%, Law / governance 22.49%, Business / finance 14.39%

Chart percentages: natural science + medicine 11.29%, liberal arts + arts 39.51%, teaching 11.00%, Law / governance 23.30%, Business / finance 14.91%
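The shares can be recomputed from the category totals. Note that the raw fractions (e.g. 78/716 ≈ 10.9%) differ from the chart’s percentages, which appear to be taken over the 691 MPs outside engineering (78/691 ≈ 11.29%); this is an inference from the numbers, not stated on the slide.

```python
counts = {
    "natural science + medicine": 78,
    "liberal arts + arts": 273,
    "teaching": 76,
    "Law / governance": 161,
    "Business / finance": 103,
    "engineering": 25,
}
total = sum(counts.values())             # 716 MPs in total
non_eng = total - counts["engineering"]  # 691 MPs outside engineering

for name, n in counts.items():
    print(f"{name}: {n/total:.2%} of all MPs, {n/non_eng:.2%} excluding engineering")
```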