TRANSCRIPT
Kfir Bar, Chief Scientist
Custom Models in No Time:
How Active Learning is Changing AI
HLTCon2018
Supervised machine learning
➔ Teaching a machine to classify items by showing it pairs of items and their true labels
[Figure: example fruit images with their labels (Apple, Carambola, Grapefruit, Cherry, Peach) feeding a fruit detector]
Supervised machine learning
[Figure: a training set of labeled fruit images (Grapes, Banana, Apple, Orange) is used for training]
What is this?
Supervised machine learning
[Figure: the trained classifier receives an unlabeled sample ("?") and outputs the prediction "It's a cherry"]
How do training sets get created?
➔ We need domain experts
➔ and we need to pay them...
Standard (static) annotation process
1. Collect unlabeled data (e.g., fruits)
2. Select an item, randomly
3. Assign a label
4. Train and evaluate a classifier
Not happy with the results?
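The static process above can be sketched as a simple loop. This is a hypothetical illustration, not the speaker's code: `label_fn` stands in for the human annotator and `train_fn` for whatever training routine is used.

```python
import random

def static_annotation(unlabeled, label_fn, train_fn, budget):
    # 1) collect unlabeled data  2) select items at random
    # 3) ask the annotator (label_fn) for each label
    # 4) train a classifier on the resulting (item, label) pairs
    labeled = [(item, label_fn(item)) for item in random.sample(unlabeled, budget)]
    return train_fn(labeled)
```

If the results are not good enough, the only remedy in this static setup is to go back and annotate more randomly chosen items.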
Did I see this item before?
➔ Randomly selecting an item for annotation might be suboptimal
➔ The dataset could be unbalanced
➔ For example, looking for tweets that mention the word "move" in the sense of "moving to a new apartment"
Unbalanced datasets - document classification
[Figure: pie chart of tweets mentioning the word "move": 90% "Other", 10% "New apartment"]
New apartment: "mood: wanna move to another city and start a new life"; "Might move to another house next month!"
Other: "Nice/bad move"; "Move on"
Introducing Active Learning
➔ We train a classifier as we annotate
➔ We keep training the classifier with every new annotated item
➔ The classifier helps us select the most important item to annotate next: the one that has the biggest impact on the classifier
Introducing Active Learning
➔ Instead of picking many apples, the classifier will select the most difficult items
Introducing Active Learning
➔ Given 400 random points distributed in a 2-D space: half are green squares and half are red triangles
➔ We want to train a classifier to distinguish between the two groups
➔ Let's say we train the classifier with only 30 points (out of the 400)
Introducing Active Learning
[Figure: two scatter plots. Random selection of 30 points yields 70% accuracy; letting the classifier select the 30 most difficult points yields 90% accuracy]
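The toy experiment can be reproduced with a few lines of NumPy. This is a sketch under assumptions the slides don't state (Gaussian clusters, a plain logistic-regression classifier, "probability near 0.5" as the definition of a difficult point); the exact 70%/90% numbers depend on those choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# 400 points in 2-D: two overlapping Gaussian clusters (assumed shapes)
n = 200
greens = rng.normal([-1.0, 0.0], 1.0, size=(n, 2))
reds = rng.normal([1.0, 0.0], 1.0, size=(n, 2))
X = np.vstack([greens, reds])
y = np.array([0] * n + [1] * n)

def train_logreg(X, y, steps=500, lr=0.5):
    # plain logistic regression by gradient descent (with a bias term)
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1 / (1 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def predict_proba(w, X):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return 1 / (1 + np.exp(-Xb @ w))

def accuracy(w, X, y):
    return float(np.mean((predict_proba(w, X) >= 0.5) == y))

# Baseline: train on 30 points chosen at random
idx = rng.choice(len(X), size=30, replace=False)
w_rand = train_logreg(X[idx], y[idx])

# Active learning: seed with a few random points, then repeatedly
# annotate the point whose predicted probability is closest to 0.5
labeled = list(rng.choice(len(X), size=4, replace=False))
while len(labeled) < 30:
    w = train_logreg(X[labeled], y[labeled])
    probs = predict_proba(w, X)
    pool = [i for i in range(len(X)) if i not in labeled]
    labeled.append(min(pool, key=lambda i: abs(probs[i] - 0.5)))
w_active = train_logreg(X[labeled], y[labeled])

print(f"random: {accuracy(w_rand, X, y):.2f}")
print(f"active: {accuracy(w_active, X, y):.2f}")
```

The active learner spends its 30-point budget near the decision boundary instead of deep inside either cluster, which is the intuition behind the two plots above.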
New Active-Learning annotation tool
➔ Builds a classifier behind the scenes
➔ Selects the most difficult items for annotation
➔ Serves items with suggested labels, so you only need to correct them, when needed, and submit
➔ Provides a dashboard where you can see performance metrics change as you annotate more items
For now, it supports Named Entity Recognition (NER).
Next on the list: Document Classification and Sentiment Analysis
First NLP task: Named entity recognition (NER)
Automatically find names of people, organizations, locations, and more in text across many languages.
According to Elon Musk, Mars rocket will fly 'short flights' next year.
Example
NER Experiment w/ and w/o Active Learning
[Figure: accuracy vs. number of annotated words. The active-learning curve reaches 80% accuracy at roughly 200K words, while random selection needs roughly 700K words to get there]
How does it work?
1. Upload unlabeled documents
2. Annotate a few sentences. When there are enough annotated sentences, the system trains a classifier
3. Then, the classifier predicts labels for the remaining unlabeled sentences. Each sentence is assigned a confidence score
[Figure: a pool of sentences with confidence scores ranging from 0.0 to 0.9]
4. The next sentence to annotate will be the one with minimum confidence
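The selection step boils down to an argmin over confidence scores. A minimal sketch, assuming the scores are kept in a dict from sentence to confidence:

```python
def next_to_annotate(confidences):
    # pick the sentence the classifier is least confident about
    return min(confidences, key=confidences.get)

scores = {"sent A": 0.7, "sent B": 0.1, "sent C": 0.5}
print(next_to_annotate(scores))  # prints "sent B"
```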
Active deep learning
[Figure: a bidirectional LSTM tagger. Each word (Washington, said, in, Chicago, last, ...) passes through forward and backward LSTM cells, producing a probability distribution over tags (e.g., 0.1, 0.5, 0.01, …, 0.08) and a predicted label: Washington → B-PER, said/in/last → OTHER, Chicago → B-LOC]
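The slides don't say how the per-token tag distributions are turned into the sentence-level confidence used for selection; one common choice (an assumption here, not the speaker's stated method) is to score each token by its best tag's probability and let the weakest token set the sentence score:

```python
def sentence_confidence(token_distributions):
    # each token has a probability distribution over tags, e.g. [0.1, 0.5, 0.01, 0.08];
    # the token's confidence is its best tag's probability, and the sentence
    # score is the minimum over tokens (the least certain word dominates)
    return min(max(dist) for dist in token_distributions)

print(sentence_confidence([[0.1, 0.5, 0.01, 0.08], [0.6, 0.2, 0.2]]))  # prints 0.5
```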
Adjusting existing models
➔ We can use the new annotation tool to adjust an existing model for a specific task
➔ For example, start with our English NER model and adjust it to work on legal contracts