
Kfir Bar, Chief Scientist

Custom Models in No Time:

How Active Learning is Changing AI

HLTCon2018

Supervised machine learning

➔ Teaching a machine to classify items by showing it pairs of items and their true labels

[Figures: example fruits with their labels (Apple, Karambola, Grapefruit, Cherry); a Peach presented to a “fruit detector”]

[Figure: a training set of labeled fruits (Grapes, Banana, Apple, Orange) is used to train the model]

[Figure: a new, unlabeled sample (“What is this?”) is shown to the trained model, which predicts “It’s a cherry”]
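
In code, this train-then-predict flow might look like the following sketch, using scikit-learn; the numeric fruit features (weight in grams, a “redness” value) are invented stand-ins for the fruit images and are not from the talk:

```python
# Minimal supervised-learning sketch: train on labeled items, predict a new one.
# The features below are made up purely for illustration.
from sklearn.neighbors import KNeighborsClassifier

X_train = [
    [150, 0.90],   # apple
    [120, 0.15],   # banana
    [8,   0.80],   # cherry
    [200, 0.40],   # orange
]
y_train = ["apple", "banana", "cherry", "orange"]   # the true labels shown to the model

fruit_detector = KNeighborsClassifier(n_neighbors=1)
fruit_detector.fit(X_train, y_train)                # the "Training" step

sample = [[9, 0.85]]                                # a small, red item: "What is this?"
print(fruit_detector.predict(sample))               # -> ['cherry']  (the "Prediction" step)
```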

How do training sets get created?

➔ We need domain experts
➔ and we need to pay them...

Standard (static) annotation process

1. Collect unlabeled data (e.g., fruits)
2. Select an item at random
3. Assign a label
4. Train and evaluate a classifier

Not happy with the results?
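
A rough sketch of this static process; `unlabeled_pool`, `ask_expert`, and `train_classifier` are hypothetical placeholders, not anything from the talk:

```python
import random

def static_annotation(unlabeled_pool, ask_expert, train_classifier, budget=100):
    """Standard (static) annotation: pick items at random, label them all,
    then train and evaluate once at the end."""
    labeled = []
    for _ in range(budget):
        item = random.choice(unlabeled_pool)       # 2. select an item at random
        unlabeled_pool.remove(item)
        labeled.append((item, ask_expert(item)))   # 3. a domain expert assigns a label
    return train_classifier(labeled)               # 4. train and evaluate a classifier
```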

Did I see this item before?

➔ Randomly selecting an item for annotation might be suboptimal
➔ The dataset could be unbalanced
➔ For example, looking for tweets that mention the word “move” in the sense of “moving to a new apartment”

Unbalanced datasets - document classification

[Pie chart: tweets mentioning “move”, split 10% / 90% between “New apartment” and “Other”]

Example tweets:
mood: wanna move to another city and start a new life
Might move to another house next month!
Nice/bad move
Move on

Introducing Active Learning

➔ We train a classifier as we annotate
➔ We keep training the classifier with every new annotated item
➔ The classifier helps us select the most important item to annotate next, the one that has the biggest impact on the classifier

➔ Instead of picking up many apples, the classifier will select the most difficult items

➔ Given 400 random points distributed in a 2-D space; half are green squares and half are red triangles
➔ We want to train a classifier to distinguish between the two groups
➔ Let’s say we train the classifier with only 30 points (out of the 400)

[Figure: two classifiers trained on 30 points each: random selection of 30 points yields 70% accuracy, while selecting the 30 most difficult points yields 90% accuracy]
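
A minimal way to reproduce an experiment of this flavor with scikit-learn, using synthetic 2-D data and smallest-margin selection as the notion of “most difficult”; the dataset, the model, and the exact accuracies are illustrative assumptions rather than the original experiment:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# 400 points in 2-D, two classes (the slide's green squares and red triangles).
X, y = make_classification(n_samples=400, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)

def accuracy_with(selected):
    """Train on the selected (labeled) points and score on all 400."""
    clf = LogisticRegression().fit(X[selected], y[selected])
    return clf.score(X, y)

rng = np.random.default_rng(0)

# (a) Random selection of 30 points.
random_idx = rng.choice(len(X), size=30, replace=False)

# (b) Uncertainty sampling: seed with one point per class, then repeatedly label
#     the point the current classifier finds most difficult (smallest margin).
active_idx = [int(np.flatnonzero(y == 0)[0]), int(np.flatnonzero(y == 1)[0])]
while len(active_idx) < 30:
    clf = LogisticRegression().fit(X[active_idx], y[active_idx])
    proba = clf.predict_proba(X)
    margin = np.abs(proba[:, 0] - proba[:, 1])   # near 0 = hardest to classify
    margin[active_idx] = np.inf                  # never re-select labeled points
    active_idx.append(int(np.argmin(margin)))

print("random 30 :", accuracy_with(random_idx))
print("hardest 30:", accuracy_with(active_idx))
```

With this kind of setup, the 30 hardest points usually give noticeably higher accuracy than the 30 random ones, which is the effect the 70% vs. 90% comparison above illustrates.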

New Active-Learning annotation tool

➔ Builds a classifier behind the scenes
➔ Selects the most difficult items for annotation
➔ Serves items with suggested labels, so you only need to correct them, when needed, and submit
➔ Provides a dashboard where you can see performance metrics change as you annotate more items

For now, it supports Named Entity Recognition (NER).

Next on the list: Document Classification and Sentiment Analysis

First NLP task: Named entity recognition (NER)

Automatically find names of people, organizations, locations, and more in text across many languages.

According to Elon Musk, Mars rocket will fly ‘short flights’ next year.

Example

NER Experiment w/ and w/o Active Learning

[Chart: accuracy vs. number of annotated words for Random Selection and Active Learning; axis marks at 80% accuracy and at 200K and 700K words]

How does it work?

1. Upload unlabeled documents.

2. Annotate a few sentences. When there are enough annotated sentences, the system trains a classifier.

3. The classifier then predicts labels for the remaining unlabeled sentences; each sentence is assigned a confidence score.

[Figure: the pool of sentences, each with a predicted confidence score between 0.0 and 0.9]

4. The next sentence to annotate will be the one with the minimum confidence score.

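A sketch of that selection step; `predict_with_confidence` is a hypothetical stand-in for the tool's internal model, returning suggested labels and a sentence-level confidence:

```python
# Score every still-unlabeled sentence with the current model and pick the one
# the model is least sure about.
def pick_next_sentence(unlabeled_sentences, predict_with_confidence):
    scored = []
    for sentence in unlabeled_sentences:
        labels, confidence = predict_with_confidence(sentence)
        scored.append((confidence, sentence, labels))
    confidence, sentence, suggested_labels = min(scored, key=lambda t: t[0])
    # The annotator is shown `sentence` pre-annotated with `suggested_labels`
    # and only needs to correct the suggestions and submit.
    return sentence, suggested_labels
```

After the corrected sentence is submitted, the classifier is retrained and steps 3 and 4 repeat.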

Active deep learning

[Figure: an LSTM-based tagger over the sentence “Washington said in Chicago last ...”: each word is fed through a pair of LSTM layers whose outputs are combined, and the model outputs a probability distribution over tags for every word, e.g. 0.1, 0.5, 0.01, …, 0.08, with Washington tagged B-PER, Chicago tagged B-LOC, and the remaining words tagged OTHER]
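
A minimal PyTorch sketch of a bidirectional LSTM tagger of the kind shown in the figure, plus one possible way to turn the per-word tag distributions into a sentence-level confidence for active learning; the vocabulary, dimensions, and the confidence formula are illustrative assumptions, not the tool's actual implementation:

```python
import torch
import torch.nn as nn

# Toy vocabulary and tag set; a real system would use a large vocabulary,
# pretrained embeddings, and typically a CRF layer on top.
WORDS = {"<unk>": 0, "Washington": 1, "said": 2, "in": 3, "Chicago": 4, "last": 5, "year": 6}
TAGS = ["OTHER", "B-PER", "B-LOC"]

class BiLSTMTagger(nn.Module):
    def __init__(self, vocab_size, n_tags, emb_dim=32, hidden=64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_tags)     # forward + backward states combined

    def forward(self, token_ids):                    # token_ids: (batch, seq_len)
        states, _ = self.lstm(self.emb(token_ids))   # (batch, seq_len, 2 * hidden)
        return self.out(states)                      # one score per tag, per word

model = BiLSTMTagger(len(WORDS), len(TAGS))
sentence = torch.tensor([[WORDS[w] for w in ["Washington", "said", "in", "Chicago", "last", "year"]]])
probs = model(sentence).softmax(dim=-1)              # (1, 6, 3): a tag distribution per word

# One possible sentence-level confidence: the probability of the best tag at the
# least certain word (weights are untrained here, so the value is arbitrary).
confidence = probs.max(dim=-1).values.min().item()
print(confidence)
```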

Adjusting existing models

➔ We can use the new annotation tool to adjust an existing model for a specific task
➔ For example, start with our English NER model and adjust it to work on legal contracts
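
One plausible reading of “adjusting” is fine-tuning: continue from the pretrained model's weights and train for a few more steps on newly annotated in-domain sentences. The sketch below reuses the BiLSTMTagger from the previous block; the checkpoint name and the toy “legal” example are hypothetical:

```python
import torch
import torch.nn as nn

# Reuses BiLSTMTagger, WORDS and TAGS from the previous sketch.
model = BiLSTMTagger(len(WORDS), len(TAGS))
# model.load_state_dict(torch.load("english_ner.pt"))  # hypothetical pretrained English NER checkpoint

# One toy in-domain sentence with gold tag ids, purely illustrative.
tokens = torch.tensor([[WORDS.get(w, WORDS["<unk>"]) for w in ["Seller", "signed", "in", "Chicago"]]])
gold = torch.tensor([[0, 0, 0, TAGS.index("B-LOC")]])       # OTHER OTHER OTHER B-LOC

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)   # small learning rate: adjust, don't retrain
loss_fn = nn.CrossEntropyLoss()

for _ in range(3):                                          # a few passes over the new annotations
    loss = loss_fn(model(tokens).view(-1, len(TAGS)), gold.view(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```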

Questions?

Thank you!

kfir@basistech.com

@kfirbar
