
Classifying Tweets Based on Climate Change Stance

Chuma Kabaghe, Jason Qin

Department of Computer Science, Stanford University; Department of Chemical Engineering, Stanford University

Motivation

Climate change has become an increasingly polarizing topic in popular and political media. Using Twitter, we seek to develop a classifier that can discriminate between text that shows belief vs. disbelief in human-caused climate change. We classify tweets into positive, neutral, and negative categories using semisupervised methods: a Quasi-Newton Semi-Supervised SVM (QN-S3VM) and Multinomial Naive Bayes with Expectation Maximization (MNB-EM). Our results show that semisupervised learning does not offer significant improvements over supervised learning.

Data Set

- Three categories: positive, neutral, negative belief in human-caused climate change
- Manually labeled training/validation/test set of ~14K tweets [1]
- Unlabeled set of ~38K tweets related to climate change/global warming, collected using Tweepy
- Data split 70/20/10 into train/test/validation sets (sketched after the examples below)

Tweet                                                                     Class
"climate change, a factor in texas floods, largely ignored"                  1
"global warming on mars...."                                                 0
"scientists were attempting the same global warming scam 60 years ago"      -1
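To make the split concrete, here is a minimal sketch of the 70/20/10 partition, assuming the labeled tweets and their -1/0/1 labels are already loaded into parallel lists (`tweets` and `labels` are illustrative names, not the authors' code):

```python
# Minimal sketch of the 70/20/10 train/test/validation split.
# Assumes `tweets` and `labels` are parallel lists already loaded from the
# labeled set (~14K tweets); names are illustrative, not the authors' code.
from sklearn.model_selection import train_test_split

# Carve off 70% for training, then split the remaining 30% into 20% test / 10% validation.
X_train, X_rest, y_train, y_rest = train_test_split(
    tweets, labels, train_size=0.7, stratify=labels, random_state=0)
X_test, X_val, y_test, y_val = train_test_split(
    X_rest, y_rest, train_size=2/3, stratify=y_rest, random_state=0)
```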

Methods

Supervised Learning: Multinomial Naive Bayes

- Unigram: each feature $x_i^{(j)}$ is a single word
- Bigram: each feature $x_i^{(j)} = (w_i^{(j)}, w_{i+1}^{(j)})$ is a pair of adjacent words
- Assume independence of all words/pairs within and between each example
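As an illustration of these two baselines (not the authors' exact code), both feature sets can be expressed with scikit-learn's `CountVectorizer`:

```python
# Unigram and bigram Multinomial Naive Bayes baselines with scikit-learn.
# Assumes X_train/y_train and X_val/y_val from the split sketched above.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

for name, ngram_range in [("unigram", (1, 1)), ("bigram", (2, 2))]:
    model = make_pipeline(CountVectorizer(ngram_range=ngram_range), MultinomialNB())
    model.fit(X_train, y_train)
    print(name, "train acc:", model.score(X_train, y_train),
          "val acc:", model.score(X_val, y_val))
```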

Semisupervised Learning

Self-Training

- Train a model on the labeled training data, predict labels for the unlabeled data, add examples with >50% confidence in the predicted label, and retrain to obtain the final model (sketched below)
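A minimal sketch of this loop, assuming a Naive Bayes base model and a single augmentation pass (the threshold matches the >50% rule above; `unlabeled_tweets` is an illustrative name):

```python
# Self-training sketch: train on labeled data, pseudo-label confident
# unlabeled tweets, and retrain. Assumes X_train/y_train from the split
# above and `unlabeled_tweets` as the ~38K unlabeled tweets.
import numpy as np
from scipy.sparse import vstack
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

vec = CountVectorizer()
Xl = vec.fit_transform(X_train)
Xu = vec.transform(unlabeled_tweets)

model = MultinomialNB().fit(Xl, y_train)

# Keep unlabeled tweets whose predicted label has >50% confidence.
proba = model.predict_proba(Xu)
keep = proba.max(axis=1) > 0.5
pseudo_y = model.classes_[proba.argmax(axis=1)][keep]

# Retrain the final model on labeled + confidently pseudo-labeled data.
final_model = MultinomialNB().fit(
    vstack([Xl, Xu[keep]]), np.concatenate([y_train, pseudo_y]))
```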

QN-S3VM [2]

- Maximum-margin classification algorithm
- Finds the optimal labels y for the unlabeled data

MNB-EM [3]

- Maximize the joint log-likelihood of the labeled and unlabeled tweets, weighting the labeled term by α:

$$\ell(\theta) = \alpha \sum_{j=1}^{m} \Big[ \sum_{i=1}^{n_j} \log P\big(x_i^{(j)} \mid y^{(j)}; \theta_{x|y}\big) + \log P\big(y^{(j)}; \theta_y\big) \Big] + \sum_{u=1}^{U} \log \sum_{y_u} P(y_u; \theta_y) \prod_{k=1}^{n_u} P\big(x_k^{(u)} \mid y_u; \theta_{x|y}\big)$$

- EM maximizes a Jensen's-inequality lower bound on the unlabeled term, alternating two steps:

- E-step: compute the posterior class distribution of each unlabeled tweet,

$$q_u(y_u) = \frac{P(y_u; \theta_y) \prod_{k=1}^{n_u} P\big(x_k^{(u)} \mid y_u; \theta_{x|y}\big)}{\sum_{y'} P(y'; \theta_y) \prod_{k=1}^{n_u} P\big(x_k^{(u)} \mid y'; \theta_{x|y}\big)}$$

- M-step: re-estimate the Laplace-smoothed parameters from the α-weighted labeled counts plus the expected unlabeled counts,

$$\theta_{w|c} = \frac{1 + \alpha \sum_{j=1}^{m} \sum_{i=1}^{n_j} 1\{x_i^{(j)} = w \wedge y^{(j)} = c\} + \sum_{u=1}^{U} q_u(c) \sum_{k=1}^{n_u} 1\{x_k^{(u)} = w\}}{V + \alpha \sum_{j=1}^{m} n_j \, 1\{y^{(j)} = c\} + \sum_{u=1}^{U} q_u(c) \, n_u}$$

$$\theta_{c} = \frac{1 + \alpha \sum_{j=1}^{m} 1\{y^{(j)} = c\} + \sum_{u=1}^{U} q_u(c)}{K + \alpha m + U}$$

where m and U are the numbers of labeled and unlabeled tweets, n_j is the length of tweet j, V is the vocabulary size, and K is the number of classes.
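The updates above can be implemented compactly by letting scikit-learn's `MultinomialNB` (which has Laplace smoothing built in) perform the weighted M-step. This is a sketch under that assumption, with α weighting the labeled examples as in the equations:

```python
# MNB-EM sketch following the E-step/M-step above. Assumes scikit-learn's
# MultinomialNB (Laplace smoothing built in) and sparse count matrices
# Xl (labeled), Xu (unlabeled) as in the self-training sketch.
import numpy as np
from scipy.sparse import vstack
from sklearn.naive_bayes import MultinomialNB

def mnb_em(Xl, yl, Xu, alpha=20.0, n_iter=10):
    yl = np.asarray(yl)
    model = MultinomialNB().fit(Xl, yl)
    classes = model.classes_
    for _ in range(n_iter):
        # E-step: posterior q_u(c) over classes for each unlabeled tweet.
        q = model.predict_proba(Xu)
        # M-step: refit on labeled data (weight alpha) plus one copy of the
        # unlabeled data per class c, each row weighted by q_u(c).
        X = vstack([Xl] + [Xu] * len(classes))
        y = np.concatenate([yl] + [np.full(Xu.shape[0], c) for c in classes])
        w = np.concatenate([np.full(Xl.shape[0], alpha)] +
                           [q[:, i] for i in range(len(classes))])
        model = MultinomialNB().fit(X, y, sample_weight=w)
    return model
```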

Results

Comparing Algorithms

Model                        Training Acc (%)   Validation Acc (%)   Test Acc (%)
MNB Unigram                  76.8               67.6                 66.4
MNB Bigram                   72.8               62.5                 60.6
Naive Bayes Self-Training    66.7               60.9                 61.6
QN-S3VM                      59.2               59.0                 55.0
MNB-EM                       81.9               62.5                 63.9


Accuracy Per Class

[Figure: per-class accuracy for the MNB Unigram and MNB-EM models]

Model Tuning

Kernel (QN-S3VM)   Training Acc (%)   Validation Acc (%)
RBF                59.2               59.0
Linear             52.9               49.2

Alpha (MNB-EM)   Training Acc (%)   Validation Acc (%)
1                67.3               60.7
5                76.8               62.3
20               81.9               65.2
50               82.6               65.1
100              82.9               64.5

Data Downsampling

Question: how much would unlabeled data help if we had less labeled data? (experiment sketched below)
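A sketch of this experiment, under the assumption that it stratifies a shrinking fraction of the labeled training set and re-evaluates each model (the fractions shown are illustrative, not the authors' exact settings):

```python
# Downsampling sketch: shrink the labeled training set and compare models.
# Assumes X_train/y_train, X_val/y_val from the split above; the listed
# fractions are illustrative assumptions.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

for frac in [0.1, 0.25, 0.5, 1.0]:
    if frac < 1.0:
        X_sub, _, y_sub, _ = train_test_split(
            X_train, y_train, train_size=frac, stratify=y_train, random_state=0)
    else:
        X_sub, y_sub = X_train, y_train
    vec = CountVectorizer().fit(X_sub)
    nb = MultinomialNB().fit(vec.transform(X_sub), y_sub)
    # Compare against self-training / MNB-EM trained on the same labeled subset.
    print(frac, nb.score(vec.transform(X_val), y_val))
```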

Conclusions

- Unigram Multinomial Naive Bayes performs best
- Semisupervised learning with Naive Bayes performs well, likely because the underlying NB model fits the data well
- More data would improve predictions

Future Work

- Incorporate more recently labeled data
- Attempt classification with deep learning models

References

[1] Edward C. Qian. "edwardcqian/climate_change_sentiment". GitHub, Mar. 2019. URL: https://github.com/edwardcqian/climate_change_sentiment

[2] Fabian Gieseke et al. "Sparse Quasi-Newton Optimization for Semi-Supervised Support Vector Machines". Vol. 1, Jan. 2012, pp. 45–54.

[3] Kamal Nigam et al. "Text Classification from Labeled and Unlabeled Documents using EM". Machine Learning 39.2 (May 2000), pp. 103–134. ISSN: 1573-0565. DOI: 10.1023/A:1007692713085. URL: https://doi.org/10.1023/A:1007692713085
