ling 570 day 9: text classification and sentiment analysis 1
![Page 1: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/1.jpg)
1
Ling 570 Day 9: Text Classification and Sentiment Analysis
![Page 2: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/2.jpg)
2
Outline
- Questions on HW #3
- Discussion of Project #1
- Text Classification
- Sentiment Analysis
![Page 3: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/3.jpg)
3
Project #1
![Page 4: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/4.jpg)
4
Your goal: political text analysis
Take a document, predict whether it is more Republican or Democratic
We have harvested blog posts from:
- The Democratic National Committee
- The Republican National Committee
- Fox News
- The Huffington Post
![Page 5: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/5.jpg)
5
First task
Can you reconstruct the party affiliation of a given document?
We will gather some novel posts, held out from your training data
You predict the political party of each of these posts to the best of your ability
![Page 6: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/6.jpg)
6
Second task
Is the media biased? Is a particular news source biased?
Using the classifier that you’ve learned, see whether documents from a particular news source seem to be left- or right-leaning.
What features are most indicative of the party of a given document?
Do you think your classifier is effective in detecting media bias? Why or why not?
![Page 7: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/7.jpg)
7
Text Classification
![Page 8: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/8.jpg)
8
Text classification
Also known as “text categorization”
Often an instance of supervised learning
- Start with a large body of pre-classified data
- Try to map new documents into one of these classes
![Page 9: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/9.jpg)
9
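Text classification
[Figure: a training set organized into classes, which are often hierarchical, for example linguistics > phonology (“acoustics”, “IPA”, …), linguistics > morphology (“morpheme”, “template”, …), and brewing > varieties (“IPA”, “hefeweizen”, …). The test document to classify is: “We transcribed the samples of this unusual language in IPA…”]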
![Page 10: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/10.jpg)
10
Classification methods
Manual
- Yahoo, back in the day, had a manually curated hierarchy of useful web content
- Can be very accurate, consistent…
- …but it’s very expensive
Need to move to automatic methods
![Page 11: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/11.jpg)
11
Text categorization
Given:
- A document $d \in D$
  - $D$ is the set of all possible documents
  - But we need to represent them usefully somehow!
  - Oftentimes we have a high-dimensional representation
- A fixed set of categories $C = \{c_1, c_2, \dots, c_n\}$
Determine:
- The category $c(d) \in C$ of some new document $d$
![Page 12: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/12.jpg)
12
Machine learning: Supervised classification
Given:
- Instance descriptions
- A set of outcomes
- A training set
Determine:
- A classifier
Classification is a clear instance of this problem
![Page 13: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/13.jpg)
13
Bayesian methods
Learning based on probability theory
- Bayes’ theorem plays a big role
Build a generative model that approximates how data is produced
- Prior probability of each class
- Model gives a posterior probability of output given inputs
Naïve Bayes:
- Bag of features (generally words)
- Assumes each feature is independent
![Page 14: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/14.jpg)
14
Bag of words representation
$f($ According to a study published in the October issue of Current Biology entitled 'Spontaneous human speech mimicry by a cetacean,' whales can talk. Not to burst your bubble ring or anything, but now that we've suckered you in, let's clarify what we mean by 'talk.' A beluga whale named 'NOC' (he was named for an incredibly annoying sort of Canadian gnat), that lived at the National Marine Mammal Foundation (NMMF) in San Diego up until his death five years ago, had been heard making some weird kinds of vocalizations. At first, nobody was sure that it was him: divers hearing what sounded like 'two people were conversing in the distance just out of range for our understanding.' But then one day, a diver in NOC's tank left the water after clearly hearing someone tell him to get out. It wasn't someone, though: it was some whale, and that some whale was NOC. $)$
![Page 15: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/15.jpg)
15
Bag of words representation
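$$
f\left(
\begin{bmatrix}
\text{whale} & 3 \\
\text{Noc} & 3 \\
\text{talk} & 2 \\
\text{named} & 2 \\
\text{hearing} & 2 \\
\text{whales} & 1 \\
\text{water} & 1 \\
\text{vocalizations} & 1 \\
\vdots & \vdots
\end{bmatrix}
\right)
$$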
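As a rough illustration of how a document can be turned into word counts like these, here is a minimal Python sketch. The tokenizer (lowercasing plus a simple regular expression) and the function name `bag_of_words` are assumptions made for this example; the slide does not say how the text was tokenized, and its counts appear to leave out function words, which this sketch does not do.

```python
import re
from collections import Counter

def bag_of_words(text):
    """Lowercase the text, split it into word tokens, and count each token."""
    # Assumption: a token is a run of letters or apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)

snippet = "It wasn't someone, though: it was some whale, and that some whale was NOC."
bag = bag_of_words(snippet)

# Word order is discarded; only the counts remain.
for word, count in bag.most_common():
    print(word, count)
```

Note that "whale" and "whales" stay separate entries under this scheme, just as they do on the slide.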
![Page 19: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/19.jpg)
19
Bayes’ Rule for text classification
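For a document $d$ and a class $c$:
$$\Pr(c, d) = \Pr(c \mid d)\,\Pr(d) = \Pr(d \mid c)\,\Pr(c)$$
So…
$$\Pr(c \mid d)\,\Pr(d) = \Pr(d \mid c)\,\Pr(c)$$
Divide by $\Pr(d)$ to get:
$$\Pr(c \mid d) = \frac{\Pr(d \mid c)\,\Pr(c)}{\Pr(d)}$$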
![Page 20: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/20.jpg)
20
Back to text classification
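$$
\Pr\left(\text{Science} \,\middle|\,
\begin{bmatrix}
\text{whale} & 3 \\
\text{Noc} & 3 \\
\text{vocalizations} & 1 \\
\vdots & \vdots
\end{bmatrix}
\right)
=
\frac{
\Pr\left(
\begin{bmatrix}
\text{whale} & 3 \\
\text{Noc} & 3 \\
\text{vocalizations} & 1 \\
\vdots & \vdots
\end{bmatrix}
\,\middle|\, \text{Science}\right)
\Pr(\text{Science})
}{
\Pr\left(
\begin{bmatrix}
\text{whale} & 3 \\
\text{Noc} & 3 \\
\text{vocalizations} & 1 \\
\vdots & \vdots
\end{bmatrix}
\right)
}
$$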
![Page 23: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/23.jpg)
23
Back to text classification
$\Pr(\text{Science})$ is just the count of science docs / total docs
But how do we model the probability of the whole matrix given the class, $\Pr(\text{matrix} \mid \text{Science})$?
![Page 24: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/24.jpg)
24
The “Naïve” part of Naïve Bayes
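Assume that everything is conditionally independent given the class. For the words $w_1, \dots, w_n$ of a document and a class $c$:
$$\Pr(w_1, w_2, \dots, w_n \mid c) = \prod_{i=1}^{n} \Pr(w_i \mid c)$$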
![Page 29: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/29.jpg)
29
Return of smoothing…
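$\Pr(\text{whale} \mid \text{Science})$ is…
- The number of science documents containing whale
- Divided by the number of science documents
What is $\Pr(w \mid \text{Science})$ for a word $w$ that never occurs in a science document?
- 0! Need to smooth…
- What would Add-One (Laplace) smoothing look like?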
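One possible answer to the add-one question, written as a sketch and not necessarily the form used in class: if each word is treated as either present or absent in a document (an assumption that matches the document-count estimate above), there are two outcomes per word, so add-one smoothing adds 1 to the numerator and 2 to the denominator.

```latex
\Pr(w \mid \text{Science}) =
  \frac{\text{count}(\text{science documents containing } w) + 1}
       {\text{count}(\text{science documents}) + 2}
```

If the estimate were instead based on word tokens (the number of times $w$ occurs across all science documents), the denominator would grow by the vocabulary size $|V|$ rather than by 2.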
![Page 30: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/30.jpg)
30
Exercise

| | document | label |
|---|---|---|
| TRAIN | Apple poised to unveil iPad Mini | TECH |
| TRAIN | Apple product leaks | TECH |
| TRAIN | Researchers test apple, cherry trees | SCIENCE |
| TEST | Dangerous apple, cherry pesticides | ? |
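The exercise can be worked through with a small Naïve Bayes sketch. Everything beyond the four documents is an assumption made for illustration: the tokenizer, the function names, the choice to estimate each word probability from document counts with add-one smoothing (+1 in the numerator, +2 in the denominator), and the choice to score only the words that appear in the test document. A different smoothing or event model could give different numbers.

```python
import math
import re
from collections import Counter, defaultdict

TRAIN = [
    ("Apple poised to unveil iPad Mini", "TECH"),
    ("Apple product leaks", "TECH"),
    ("Researchers test apple, cherry trees", "SCIENCE"),
]
TEST = "Dangerous apple, cherry pesticides"

def words(text):
    """Return the set of lowercased word types in a document."""
    return set(re.findall(r"[a-z]+", text.lower()))

def train(examples):
    """Count documents per class and, per class, documents containing each word."""
    doc_counts = Counter()
    word_doc_counts = defaultdict(Counter)
    for text, label in examples:
        doc_counts[label] += 1
        for w in words(text):
            word_doc_counts[label][w] += 1
    return doc_counts, word_doc_counts

def log_score(text, label, doc_counts, word_doc_counts):
    """log Pr(label) plus the sum of log Pr(word | label) over the words present."""
    total_docs = sum(doc_counts.values())
    score = math.log(doc_counts[label] / total_docs)
    for w in words(text):
        # Add-one smoothing over two outcomes (word present or absent).
        p = (word_doc_counts[label][w] + 1) / (doc_counts[label] + 2)
        score += math.log(p)
    return score

def classify(text, doc_counts, word_doc_counts):
    """Pick the class with the highest smoothed log score."""
    return max(doc_counts,
               key=lambda c: log_score(text, c, doc_counts, word_doc_counts))

doc_counts, word_doc_counts = train(TRAIN)
prediction = classify(TEST, doc_counts, word_doc_counts)
print(prediction)  # SCIENCE
```

Under these assumptions "cherry" outweighs the larger TECH prior: the unnormalized scores are 2/3 · 3/256 for TECH and 1/3 · 4/81 for SCIENCE.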
![Page 31: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/31.jpg)
31
Benchmark dataset #1: 20 newsgroups
18,000 documents from 20 distinct newsgroups
- A now mostly unused technology for sharing textual information, with hierarchical topical groups
- comp.graphics, comp.os.ms-windows.misc, comp.sys.ibm.pc.hardware, comp.sys.mac.hardware, comp.windows.x
- rec.autos, rec.motorcycles, rec.sport.baseball, rec.sport.hockey
- sci.crypt, sci.electronics, sci.med, sci.space
- misc.forsale
- talk.politics.misc, talk.politics.guns, talk.politics.mideast
- talk.religion.misc, alt.atheism, soc.religion.christian
![Page 32: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/32.jpg)
32
Results:
![Page 33: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/33.jpg)
33
Evaluation methods
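“macro”-averaging:
- Compute Precision and Recall for each category
- Take average of per-category precision and recall values

Rows are the predicted category; columns are the gold category.

| predicted \ gold | news | sports | arts | science | totals |
|---|---|---|---|---|---|
| news | 15 | 7 | 0 | 1 | 23 |
| sports | 6 | 17 | 0 | 0 | 23 |
| arts | 0 | 0 | 4 | 0 | 4 |
| science | 1 | 0 | 0 | 7 | 8 |
| totals | 22 | 24 | 4 | 8 | |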
![Page 35: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/35.jpg)
35
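Rows are the predicted category; columns are the gold category.

| predicted \ gold | news | sports | arts | science | prec |
|---|---|---|---|---|---|
| news | 15 | 7 | 0 | 1 | 0.65 |
| sports | 6 | 17 | 0 | 0 | 0.74 |
| arts | 0 | 0 | 4 | 0 | 1.00 |
| science | 1 | 0 | 0 | 7 | 0.88 |
| recall | 0.68 | 0.71 | 1.00 | 0.88 | |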
![Page 36: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/36.jpg)
36
Evaluation methods
What is the analogue of precision and recall for multiclass classification?
- We can still compute the correct and incorrect counts as usual for each category
- Then add up these counts across categories to compute precision and recall
- This is called “micro-averaging”, and focuses on document-level accuracy

[Table: a 2×2 contingency table for a single category. Rows are the classifier output (the category vs. all other categories); columns are the gold standard (the category vs. all other categories).]
![Page 37: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/37.jpg)
37
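Rows are the predicted category; columns are the gold category.

| predicted \ gold | news | sports | arts | science | prec |
|---|---|---|---|---|---|
| news | 15 | 7 | 0 | 1 | 0.65 |
| sports | 6 | 17 | 0 | 0 | 0.74 |
| arts | 0 | 0 | 4 | 0 | 1.00 |
| science | 1 | 0 | 0 | 7 | 0.88 |
| recall | 0.68 | 0.71 | 1.00 | 0.88 | |

Macro-averaged precision: 0.82
Macro-averaged recall: 0.82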
![Page 38: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/38.jpg)
38
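Per-category 2×2 tables (classifier output vs. gold standard), one row per category:

| category | output: category, gold: category | output: category, gold: other | output: other, gold: category |
|---|---|---|---|
| news | 15 | 8 | 7 |
| sports | 17 | 6 | 7 |
| science | 7 | 1 | 1 |
| arts | 4 | 0 | 0 |

Full table (rows are the predicted category; columns are the gold category):

| predicted \ gold | news | sports | arts | science | totals |
|---|---|---|---|---|---|
| news | 15 | 7 | 0 | 1 | 23 |
| sports | 6 | 17 | 0 | 0 | 23 |
| arts | 0 | 0 | 4 | 0 | 4 |
| science | 1 | 0 | 0 | 7 | 8 |
| totals | 22 | 24 | 4 | 8 | |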
![Page 39: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/39.jpg)
39
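Total, pooled over all four categories:

| classifier output \ gold standard | correct | other |
|---|---|---|
| correct | 43 | 15 |
| other | 15 | |

Micro-averaged precision: 43 / (43 + 15) = 0.74
Micro-averaged recall: 43 / (43 + 15) = 0.74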
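The macro- and micro-averaged figures can be reproduced from the confusion matrix with a short sketch. The variable and function names are assumptions made for this example, and the code assumes every category has at least one predicted and one gold document, so no division by zero can occur.

```python
LABELS = ["news", "sports", "arts", "science"]

# CONFUSION[i][j] = documents predicted as LABELS[i] whose gold label is LABELS[j].
CONFUSION = [
    [15, 7, 0, 1],
    [6, 17, 0, 0],
    [0, 0, 4, 0],
    [1, 0, 0, 7],
]

def per_class_counts(matrix):
    """Return (true positives, false positives, false negatives) per category."""
    n = len(matrix)
    counts = []
    for k in range(n):
        tp = matrix[k][k]
        fp = sum(matrix[k]) - tp                       # rest of the predicted row
        fn = sum(matrix[i][k] for i in range(n)) - tp  # rest of the gold column
        counts.append((tp, fp, fn))
    return counts

def macro_average(counts):
    """Average the per-category precision and recall values."""
    precisions = [tp / (tp + fp) for tp, fp, fn in counts]
    recalls = [tp / (tp + fn) for tp, fp, fn in counts]
    return sum(precisions) / len(counts), sum(recalls) / len(counts)

def micro_average(counts):
    """Pool the counts over categories, then compute precision and recall once."""
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    return tp / (tp + fp), tp / (tp + fn)

counts = per_class_counts(CONFUSION)
macro_p, macro_r = macro_average(counts)
micro_p, micro_r = micro_average(counts)
print(f"macro: precision {macro_p:.2f}, recall {macro_r:.2f}")
print(f"micro: precision {micro_p:.2f}, recall {micro_r:.2f}")
```

Macro-averaging gives the small arts category the same weight as news, while micro-averaging weights each document equally, which is why the two sets of numbers differ.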
![Page 40: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/40.jpg)
40
Feature selection
![Page 41: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/41.jpg)
41
Sentiment Analysis
![Page 42: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/42.jpg)
42
Sentiment Analysis
Consider movie reviews:
- Given a review from a site like Rotten Tomatoes, try to detect if the reviewer liked it
Some observations:
- Humans can quickly and easily identify sentiment
- Often easier than performing topic classification
- Suspicion: Certain words may be indicative of sentiment
![Page 44: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/44.jpg)
44
Simple Experiment
[Pang, Lee, Vaithyanathan, EMNLP 2002]
- Ask two grad students to come up with a list of words charged with sentiment
- Create a very simple, deterministic classifier based on this:
  - Count number of positive and negative hits
  - Break ties to increase accuracy
- Compare to automatically extracted lists
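A word-counting classifier of the kind described above might look like the following sketch. The two word lists are hypothetical placeholders, not the lists produced in the experiment, and the `tie_label` parameter is an assumed way of making the tie-breaking rule explicit.

```python
import re

# Hypothetical word lists, for illustration only.
POSITIVE = {"dazzling", "brilliant", "gripping", "moving"}
NEGATIVE = {"bad", "boring", "cliched", "awful"}

def classify_review(text, tie_label="positive"):
    """Count positive and negative hits; fall back to tie_label on a tie."""
    tokens = re.findall(r"[a-z']+", text.lower())
    positive_hits = sum(token in POSITIVE for token in tokens)
    negative_hits = sum(token in NEGATIVE for token in tokens)
    if positive_hits > negative_hits:
        return "positive"
    if negative_hits > positive_hits:
        return "negative"
    return tie_label

print(classify_review("A gripping, brilliant film"))
print(classify_review("Boring and awful."))
print(classify_review("It was a movie.", tie_label="negative"))
```

A review with no hits at all ends up as a tie, so the tie-breaking choice affects every review the word lists fail to cover.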
![Page 46: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/46.jpg)
46
Toward more solid machine learning
Prior decision rule was very heuristic
- Just count the number of charged words
- Ties are a significant issue
What happens when we shift to something more complex?
- Naïve Bayes
- Maximum Entropy (aka logistic regression, aka log-linear models)
- Support Vector Machines
![Page 47: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/47.jpg)
47
Experimental results
Baseline was 69% accuracy.
Here we get just under 79% with all words, just using frequency.
What happens when we use binary features instead?
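To make the difference between frequency and binary features concrete, here is a small sketch. The function names and the example sentence are assumptions for illustration; the experiment's actual feature extraction is not described on the slide.

```python
from collections import Counter

def frequency_features(tokens):
    """Map each word to the number of times it occurs in the document."""
    return dict(Counter(tokens))

def presence_features(tokens):
    """Map each word to 1 if it occurs at all, however many times."""
    return {token: 1 for token in set(tokens)}

tokens = "great great great plot but bad acting".split()
freq = frequency_features(tokens)
binary = presence_features(tokens)

print(freq["great"], binary["great"])
```

Both representations cover the same vocabulary; they differ only in whether a repeated word counts more than once.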
![Page 48: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/48.jpg)
48
Experimental results
Unigrams are pretty good – what happens when we add bigrams?
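Adding bigrams means adding each pair of adjacent words as an extra feature. The helper names below are assumptions for this sketch, and real systems often add sentence-boundary markers, which are left out here.

```python
def ngrams(tokens, n):
    """Return every run of n adjacent tokens, joined by a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def unigram_bigram_features(tokens):
    """Combine single words and adjacent word pairs into one feature set."""
    return set(ngrams(tokens, 1)) | set(ngrams(tokens, 2))

tokens = "not very good".split()
print(ngrams(tokens, 2))
print(sorted(unigram_bigram_features(tokens)))
```

A bigram such as "not very" carries context that the separate unigrams lose, at the cost of a much larger and sparser feature space.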
![Page 49: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/49.jpg)
49
Experimental results
Why are just bigrams worse than unigrams and bigrams together?
![Page 50: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/50.jpg)
50
Experimental results
![Page 51: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/51.jpg)
51
Experimental results
![Page 52: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/52.jpg)
52
Domain Adaptation
![Page 54: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/54.jpg)
54
What are we learning?
Primary features are unigrams.
For a movie, “unpredictable” is a good thing – likely to be an interesting thriller.
For a dishwasher, “unpredictable” is not so great.
![Page 55: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/55.jpg)
55
Domain shift
[Blitzer, Dredze, Pereira, 2007]
What happens when we move to another domain?
- Gather Amazon reviews from four domains: Books, DVDs, Electronics, Kitchen appliances
- Each review has
  - Rating (0-5 stars)
  - Reviewer name and location
  - Product name
  - Review (title, date, and body)
- Ratings <3 become negative, >3 become positive; remainder considered ambiguous and discarded
- 1000 positive and 1000 negative in each domain
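The labeling rule above is simple enough to state as code. The function name and the use of `None` for discarded reviews are assumptions made for this sketch.

```python
def rating_to_label(stars):
    """Map a star rating to a sentiment label, or None if it is ambiguous."""
    if stars < 3:
        return "negative"
    if stars > 3:
        return "positive"
    return None  # exactly 3 stars: ambiguous, so the review is discarded

ratings = [1, 3, 5, 4, 2]
labels = [rating_to_label(r) for r in ratings]
kept = [label for label in labels if label is not None]
print(kept)
```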
![Page 58: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/58.jpg)
58
Domain adaptation effects
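[Figure: bar chart with the groups Books, DVDs, Electronics, and Kitchen on the horizontal axis, a vertical axis running from 50 to 90, and a legend with the series Books, DVDs, Electronics, and Kitchen]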
![Page 60: Ling 570 Day 9: Text Classification and Sentiment Analysis 1](https://reader030.vdocuments.us/reader030/viewer/2022032703/56649cff5503460f949d01e9/html5/thumbnails/60.jpg)
60
Lessons learned
Be careful with your classifier:
- Just because you get high accuracy on one test set doesn’t guarantee high accuracy on another test set
- Domain adaptation can be a major hit
What can we do about this?
- Supervised approaches: say we have a little bit of training data in the NEW domain and a lot in the OLD domain; learn features from both (“Frustratingly Easy”, Daumé 2007)
- Unsupervised approaches (Structural Correspondence Learning)
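The core idea of the “Frustratingly Easy” approach is feature augmentation: every feature is copied into a shared version and a domain-specific version, so a standard classifier can learn which words behave the same across domains and which do not. The sketch below shows only that copying step; the function name, the `general:` prefix, and the dictionary representation are assumptions for illustration.

```python
def augment(features, domain):
    """Copy each feature into a shared version and a domain-specific version."""
    augmented = {}
    for name, value in features.items():
        augmented["general:" + name] = value     # shared across all domains
        augmented[domain + ":" + name] = value   # active only for this domain
    return augmented

movie_review = augment({"unpredictable": 1, "great": 1}, "movies")
appliance_review = augment({"unpredictable": 1}, "kitchen")

print(sorted(movie_review))
print(sorted(appliance_review))
```

With this representation a classifier trained on both domains can give "movies:unpredictable" a positive weight and "kitchen:unpredictable" a negative one, while a word like "great" can be handled mostly through its shared copy.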