Deeply Active Learning: Approximating Human Learning with Smaller Datasets Combined with Human Assistance

Upload: arimo-inc

Post on 16-Apr-2017

Category: Data & Analytics

TRANSCRIPT

Page 1: Deeply active learning: Approximating human learning with smaller datasets combined with human assistance

DEEPLY ACTIVE LEARNING:

Approximating Human Learning with Smaller Datasets Combined with Human Assistance

TOP 10 WORLD’S MOST INNOVATIVE COMPANIES IN DATA SCIENCE

Page 2

Sep 26, 2016 @ Arimo Inc

Agenda

1. Motivation
2. Deep Active Learning Approach
3. Experiment Results
4. Lessons Learned

Page 3

DEEP LEARNING REQUIRES HUGE Amount of Labeled Data + Computations

[Chart: Classification Error (y-axis, 0% to 30%) by year (x-axis, 2010 to Today), with points labeled AlexNet, ZF, VGG, ResNet, and GoogLeNet-v4. Annotations: Human Performance, GPU-Based DNNs.]

50K people worked on Amazon Mechanical Turk, 2007–2009

Page 4

Active Learning

1. Start with unlabeled data
2. Label a subset
3. Train a learner on labeled data
4. Pick the best next points to label (most uncertain)
5. Re-train the learner on labeled data
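The loop on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the presenters' code: the slide only says "most uncertain", so predictive entropy is an assumed uncertainty measure, and `train`, `predict_proba`, and `oracle` are hypothetical callables standing in for the learner and the human labeler.

```python
import math


def entropy(probs):
    """Shannon entropy of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def pick_most_uncertain(pool_probs, k):
    """Return indices of the k pool samples with the highest entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]


def active_learning_loop(train, predict_proba, oracle, labeled, pool, rounds, k):
    """Train, query the most uncertain points, label them, re-train.

    train, predict_proba and oracle are hypothetical callables:
      train(labeled) -> model
      predict_proba(model, x) -> list of class probabilities
      oracle(x) -> label supplied by a human annotator
    """
    model = train(labeled)
    for _ in range(rounds):
        if not pool:
            break
        probs = [predict_proba(model, x) for x in pool]
        picked = set(pick_most_uncertain(probs, k))
        # Ask the oracle for labels on the picked points only.
        labeled = labeled + [(pool[i], oracle(pool[i])) for i in sorted(picked)]
        pool = [x for i, x in enumerate(pool) if i not in picked]
        model = train(labeled)
    return model, labeled, pool
```

Each round spends the labeling budget `k` on the points the current model is least sure about, which is the step the slide labels "pick the best next points to label".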

Page 5

Deep Active Learning framework

[Diagram labels: Learned Embeddings; Cluster-based Query of Next Samples to Label; Labeling; Labeled Data]

$\mathrm{Silhouette}(x_i) = 1 - a/b$

a = avg distance from $x_i$ to the samples in its own cluster

b = min, over the other clusters, of the avg distance from $x_i$ to that cluster's samples

Samples w/ Lowest Silhouette Scores are queried for labeling
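As a rough sketch of the query step, the following Python computes the slide's score 1 - a/b for each embedded sample and returns the lowest-scoring ones. It assumes Euclidean distance, at least two clusters, and b > 0 for every sample; none of these details are stated on the slide, and the function names are hypothetical. Cluster assignments are taken as given (the clustering method is not named in the deck).

```python
import math


def silhouette_scores(points, labels):
    """Per-sample score 1 - a/b, following the formula on the slide."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    scores = []
    for i, p in enumerate(points):
        own = [j for j in clusters[labels[i]] if j != i]
        # a: average distance to the other samples in the same cluster
        # (assumed 0 when the sample is alone in its cluster).
        a = sum(math.dist(p, points[j]) for j in own) / len(own) if own else 0.0
        # b: smallest average distance to the samples of any other cluster.
        b = min(
            sum(math.dist(p, points[j]) for j in members) / len(members)
            for lab, members in clusters.items()
            if lab != labels[i]
        )
        scores.append(1.0 - a / b)
    return scores


def query_lowest_silhouette(points, labels, k):
    """Indices of the k samples with the lowest scores, lowest first."""
    scores = silhouette_scores(points, labels)
    return sorted(range(len(points)), key=lambda i: scores[i])[:k]
```

A low score means a sample sits nearly as close to a neighboring cluster as to its own, so it is a natural candidate to send to a human labeler.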

Page 6

Generate Online Product Titles

Page 7

Deep Active Learning framework (same diagram as Page 5)

Page 8

LSTM-based Caption Generator

[Diagram labels: CNN Encoder; Embedding Vector (v1); RNN/LSTM Decoder, drawn as a chain of LSTM cells; word2vec inputs x0, x1, ..., xn-1; a softmax after each LSTM cell producing outputs y1, y2, ..., yn]

Example caption: Impress guests with lemon pattern fresh look stoneware ensures lasting use dishwasher safe
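To make the decoder side of the diagram concrete, here is a minimal greedy-decoding sketch in plain Python. It is an illustration under assumptions, not the model from the talk: `step` is a hypothetical function standing in for one LSTM step plus its output layer, the decoder state is assumed to be initialized from the image embedding, and greedy (argmax) decoding is assumed because the slides do not say how words were picked.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_decode(step, image_embedding, start_id, end_id, max_len=20):
    """Generate token ids one at a time, always taking the most likely word.

    step is a hypothetical callable: step(state, token_id) -> (logits, new_state).
    The initial state is the image embedding produced by the encoder.
    """
    state = image_embedding
    token = start_id
    output = []
    for _ in range(max_len):
        logits, state = step(state, token)
        probs = softmax(logits)
        token = max(range(len(probs)), key=probs.__getitem__)
        if token == end_id:
            break
        output.append(token)
    return output
```

The returned ids would be mapped back to words through the vocabulary to form a product title.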

Page 9

EXPERIMENT

Page 10

The Setup

Train Data: 2000 Product Images
Test Data: 100 Product Image-Caption Pairs

PRODUCT TITLE GENERATOR, trained three ways:

Deep Active Learning: Train on 500 Samples Labeled Actively
Random Labeling: Train on 500 Samples Labeled Randomly
Fully Supervised: Train on 2000 Labeled Samples

Page 11

The Result

Page 12

The Result

Page 13

Generated Captions Examples

Slide columns: Product Image, Reconstructed Image, 100 Samples, 300 Samples, 500 Samples

Example 1
100 Samples: Hand cream stirs senses refreshing seashore breeze scent moisturize soften skin
300 Samples: Hand cream stirs senses refreshing eucalyptus and mint scent moisturize and soften your skin
500 Samples: Hand cream stirs your senses with a refreshing seashore breeze scent moisturize and soften your skin

Example 2
100 Samples: You’ll always wake beautiful day cute dishwasher microwave safe mug
300 Samples: Impress guests with lemon pattern fresh look stoneware ensures lasting use dishwasher safe
500 Samples: Impress guests with lemon pattern fresh look stoneware ensures lasting use dishwasher safe

Example 3
100 Samples: Little girls dress featuring trendy drop waist silhouette crewneck short tulip fully lined knit bodice lends
300 Samples: Soft sophisticated essential bath towel quick dry holds lovely look rich color use benzoyl peroxide
500 Samples: Soft sophisticated hand towel quick dry highly absorbent benzoyl peroxide friendly

Page 14

Summary & Future Work

1. Successfully applied active learning in our DL framework
2. Demonstrated an improvement of 3x in perplexity score between DAL and DRL
3. It turns out DAL works for text generation & not just classification!
   • It’s effective when your data has clustering structure
4. Future work: Work on a larger-scale dataset
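For reference, perplexity can be computed as below. The slides do not say how the score was calculated, so this sketch assumes the standard definition: the exponential of the average negative log-probability the model assigns to each reference token. The function name and input format are hypothetical.

```python
import math


def perplexity(token_probs):
    """Perplexity from the probabilities assigned to each reference token.

    token_probs: one probability per token of the reference caption(s),
    each in (0, 1]. Lower perplexity means the model is less surprised.
    """
    n = len(token_probs)
    avg_neg_log_prob = -sum(math.log(p) for p in token_probs) / n
    return math.exp(avg_neg_log_prob)
```

Under this definition, a model that spreads its probability evenly over four candidate words at every step has a perplexity of 4, and a model that always predicts the reference word with certainty has a perplexity of 1.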

Page 15

DEEPLY ACTIVE LEARNING:

Approximating Human Learning with Smaller Datasets Combined with Human Assistance

TOP 10 WORLD’S MOST INNOVATIVE COMPANIES IN DATA SCIENCE