TRANSCRIPT
Kfir Bar, Chief Scientist
Custom Models in No Time:
How Active Learning is Changing AI
HLTCon2018
Supervised machine learning
➔ Teaching a machine to classify items by showing it pairs of items and their true labels
[Figure: example fruit images with their labels (Apple, Carambola, Grapefruit, Cherry, Peach) feeding a fruit detector]
Supervised machine learning
[Figure: a training set of labeled fruit images (Grapes, Banana, Apple, Orange) is used for training]
What is this?
Supervised machine learning
[Figure: the trained classifier receives an unlabeled sample ("?") and outputs the prediction "It's a cherry"]
How do training sets get created?
➔ We need domain experts
➔ and we need to pay them...
Standard (static) annotation process
1. Collect unlabeled data (e.g., fruits)
2. Select an item, randomly
3. Assign a label
4. Train and evaluate a classifier
Not happy with the results?
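The static process above can be sketched as a simple loop. This is a hypothetical illustration, not the speaker's code: `label_fn` stands in for the human annotator and `train_fn` for whatever training routine is used.

```python
import random

def static_annotation(unlabeled, label_fn, train_fn, budget):
    # 1) collect unlabeled data  2) select items at random
    # 3) ask the annotator (label_fn) for each label
    # 4) train a classifier on the resulting (item, label) pairs
    labeled = [(item, label_fn(item)) for item in random.sample(unlabeled, budget)]
    return train_fn(labeled)
```

If the results are not good enough, the only remedy in this static setup is to go back and annotate more randomly chosen items.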
Did I see this item before?
➔ Randomly selecting an item for annotation might be suboptimal
➔ The dataset could be unbalanced
➔ For example, looking for tweets that mention the word "move" in the sense of "moving to a new apartment"
Unbalanced datasets - document classification
[Figure: pie chart of tweets mentioning the word "move": 90% "Other", 10% "New apartment"]
New apartment: "mood: wanna move to another city and start a new life"; "Might move to another house next month!"
Other: "Nice/bad move"; "Move on"
Introducing Active Learning
➔ We train a classifier as we annotate
➔ We keep training the classifier with every new annotated item
➔ The classifier helps us select the most important item to annotate next: the one that has the biggest impact on the classifier
Introducing Active Learning
➔ Instead of picking many apples, the classifier will select the most difficult items
Introducing Active Learning
➔ Given 400 random points distributed in a 2-D space: half are green squares and half are red triangles
➔ We want to train a classifier to distinguish between the two groups
➔ Let's say we train the classifier with only 30 points (out of the 400)
Introducing Active Learning
[Figure: two scatter plots. Random selection of 30 points yields 70% accuracy; letting the classifier select the 30 most difficult points yields 90% accuracy]
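The toy experiment can be reproduced with a few lines of NumPy. This is a sketch under assumptions the slides don't state (Gaussian clusters, a plain logistic-regression classifier, "probability near 0.5" as the definition of a difficult point); the exact 70%/90% numbers depend on those choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# 400 points in 2-D: two overlapping Gaussian clusters (assumed shapes)
n = 200
greens = rng.normal([-1.0, 0.0], 1.0, size=(n, 2))
reds = rng.normal([1.0, 0.0], 1.0, size=(n, 2))
X = np.vstack([greens, reds])
y = np.array([0] * n + [1] * n)

def train_logreg(X, y, steps=500, lr=0.5):
    # plain logistic regression by gradient descent (with a bias term)
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1 / (1 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def predict_proba(w, X):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return 1 / (1 + np.exp(-Xb @ w))

def accuracy(w, X, y):
    return float(np.mean((predict_proba(w, X) >= 0.5) == y))

# Baseline: train on 30 points chosen at random
idx = rng.choice(len(X), size=30, replace=False)
w_rand = train_logreg(X[idx], y[idx])

# Active learning: seed with a few random points, then repeatedly
# annotate the point whose predicted probability is closest to 0.5
labeled = list(rng.choice(len(X), size=4, replace=False))
while len(labeled) < 30:
    w = train_logreg(X[labeled], y[labeled])
    probs = predict_proba(w, X)
    pool = [i for i in range(len(X)) if i not in labeled]
    labeled.append(min(pool, key=lambda i: abs(probs[i] - 0.5)))
w_active = train_logreg(X[labeled], y[labeled])

print(f"random: {accuracy(w_rand, X, y):.2f}")
print(f"active: {accuracy(w_active, X, y):.2f}")
```

The active learner spends its 30-point budget near the decision boundary instead of deep inside either cluster, which is the intuition behind the two plots above.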
New Active-Learning annotation tool
➔ Builds a classifier behind the scenes
➔ Selects the most difficult items for annotation
➔ Serves items with suggested labels, so you only need to correct them, when needed, and submit
➔ Provides a dashboard where you can see performance metrics change as you annotate more items
For now, it supports Named Entity Recognition (NER).
Next on the list: Document Classification and Sentiment Analysis
First NLP task: Named entity recognition (NER)
Automatically find names of people, organizations, locations, and more in text across many languages.
According to Elon Musk, Mars rocket will fly 'short flights' next year.
Example
NER Experiment w/ and w/o Active Learning
[Figure: accuracy vs. number of annotated words. The active-learning curve reaches 80% accuracy at roughly 200K words, while random selection needs roughly 700K words to get there]
How does it work?
1. Upload unlabeled documents
2. Annotate a few sentences. When there are enough annotated sentences, the system trains a classifier
3. Then, the classifier predicts labels for the remaining unlabeled sentences. Each sentence is assigned a confidence score
[Figure: a pool of sentences with confidence scores ranging from 0.0 to 0.9]
4. The next sentence to annotate will be the one with minimum confidence
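The selection step boils down to an argmin over confidence scores. A minimal sketch, assuming the scores are kept in a dict from sentence to confidence:

```python
def next_to_annotate(confidences):
    # pick the sentence the classifier is least confident about
    return min(confidences, key=confidences.get)

scores = {"sent A": 0.7, "sent B": 0.1, "sent C": 0.5}
print(next_to_annotate(scores))  # prints "sent B"
```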
Active deep learning
[Figure: a bidirectional LSTM tagger. Each word (Washington, said, in, Chicago, last, ...) passes through forward and backward LSTM cells, producing a probability distribution over tags (e.g., 0.1, 0.5, 0.01, …, 0.08) and a predicted label: Washington → B-PER, said/in/last → OTHER, Chicago → B-LOC]
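The slides don't say how the per-token tag distributions are turned into the sentence-level confidence used for selection; one common choice (an assumption here, not the speaker's stated method) is to score each token by its best tag's probability and let the weakest token set the sentence score:

```python
def sentence_confidence(token_distributions):
    # each token has a probability distribution over tags, e.g. [0.1, 0.5, 0.01, 0.08];
    # the token's confidence is its best tag's probability, and the sentence
    # score is the minimum over tokens (the least certain word dominates)
    return min(max(dist) for dist in token_distributions)

print(sentence_confidence([[0.1, 0.5, 0.01, 0.08], [0.6, 0.2, 0.2]]))  # prints 0.5
```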
Adjusting existing models
➔ We can use the new annotation tool to adjust an existing model for a specific task
➔ For example, start with our English NER model and adjust it to work on legal contracts