topic-factorized ideal point estimation model for legislative voting network
DESCRIPTION
Topic-Factorized Ideal Point Estimation Model for Legislative Voting Network. Yupeng Gu † , Yizhou Sun † , Ning Jiang ‡ , Bingyu Wang † , Ting Chen † † Northeastern University ‡ University of Illinois at Urbana-Champaign. Background. Federal Legislation (bill). The House Senate. Law. - PowerPoint PPT PresentationTRANSCRIPT
Topic-Factorized Ideal Point Estimation Model for Legislative Voting Network
Yupeng Gu†, Yizhou Sun†, Ning Jiang‡, Bingyu Wang†, Ting Chen†
†Northeastern University‡University of Illinois at Urbana-Champaign
2 /27
Background Federal Legislation
(bill)Law
The HouseSenate
Ronald Paul
Bill 1 Bill 2 ……
Barack Obama
Ronald Paul
liberal conservative
Politician
Republican Democrat
Barack Obama
United States Congress
The House Senate
3 /27
Outline
• Background and Problem Definition• Topic Factorized Ideal Point Model and Learning Algorithm• Experimental Results• Conclusion
4 /27
Legislative Voting Network
5 /27
Problem Definition
Input: Legislative Network
Output:: Ideal Points for Politician : Ideal Points for Bill
’s on different topics
6 /27
Existing Work
• 1 dimensional ideal point model (Poole and Rosenthal, 1985; Gerrish and Blei, 2011) • High-dimensional ideal point model (Poole and Rosenthal, 1997)• Issue-adjusted ideal point model (Gerrish and Blei, 2012)
7 /27
Motivations
Topic 1Topic 2Topic 3Topic 4
• Voters have different attitudes on different topics.
• Traditional matrix factorization method cannot give the meanings for each dimension.
𝑀 𝑈≈ ⋅𝑉 𝑇
latent factor
• Topics of bills can influence politician’s voting, and the voting behavior can better interpret the topics of bills as well.
Topic Model:• Health• Public Transport• …
Voting-guided Topic Model:• Health Service• Health Expenses• Public Transport• …
8 /27
Outline
• Background and Problem Definition• Topic Factorized Ideal Point Model and Learning Algorithm• Experimental Results• Conclusion
9 /27
Topic Factorized IPM
𝑢 𝑑𝑤
PoliticiansBills
Terms
Heterogeneous Voting Network
𝑛(𝑑 ,𝑤)𝑣𝑢𝑑
Entities:• Politicians• Bills• Terms
Links:• (, )• ()
10 /27
Text Part
PoliticiansBills
Terms
11 /27
Text Part
• We model the probability of each word in each document as a mixture of multinomial distributions, as in PLSA (Hofmann, 1999) and LDA (Blei et al., 2003)
𝑑 𝑘 𝑤𝜃𝑑𝑘=𝑝 (𝑘∨𝑑) 𝛽𝑘𝑤=𝑝 (𝑤∨𝑘)
Bill Topic Word
12 /27
Voting Part
PoliticiansBills
Terms
13 /27
Voting Part
YEA𝑝 (𝑣𝑢𝑑=1 )=𝜎 (∑𝑘
𝜃𝑑𝑘 𝑥𝑢𝑘𝑎𝑑𝑘+𝑏𝑑)
𝒙𝑢
𝒂𝑑
Topic1
Topic2
…… Topic Topic……
0 1 -1 1 1 1 1 1
0 0 -1 1 1 1 -1 1
1 1 1 1 -1 1 0 0
𝑢1𝑢2
𝑢𝑁𝑈
𝑑1𝑑2 𝑑𝑁 𝐷
……
……
User-Bill voting matrix
�̂�𝑢𝑑=∑𝑘=1
𝐾
𝑥𝑢𝑘𝑎𝑑𝑘
�̂�𝑢𝑑=∑𝑘=1
𝐾
𝜃𝑑𝑘𝑥𝑢𝑘𝑎𝑑𝑘
𝑝 (𝑣𝑢𝑑=−1 )=1−𝜎 (∑𝑘
𝜃𝑑𝑘𝑥𝑢𝑘𝑎𝑑𝑘+𝑏𝑑)NAY
Voter
Bill
𝑝 (𝑽|𝜽 ,𝑿 ,𝑨 ,𝒃 )= ∏(𝑢 ,𝑑 ) :𝑣𝑢𝑑≠ 0
(𝑝 (𝑣𝑢𝑑=1 )1+𝑣𝑢𝑑2 𝑝 (𝑣𝑢𝑑=−1 )
1−𝑣𝑢𝑑2 ¿)¿
𝑥𝑢 1
𝑥𝑢𝑘
𝑥𝑢𝐾
𝑎𝑑1
𝑎𝑑𝑘
𝑎𝑑𝐾
……
……
𝜃𝑑𝑘
𝜃𝑑1
𝜃𝑑𝐾
𝑥𝑢𝑘∈𝑹
𝑎𝑑 𝑘∈𝑹
𝐼 {𝑣𝑢𝑑=1}𝐼 {𝑣𝑢𝑑=−1 }
14 /27
Combining Two Parts Together• The final objective function is a linear combination of the two average
log-likelihood functions over the word links and voting links.
• We also add an regularization term to and to reduce over-fitting.
s.t.and
0≤ 𝜃𝑑𝑘≤1 ,∑𝑘
𝜃𝑑𝑘=10≤ 𝛽𝑘𝑤≤1 ,∑𝑤
𝛽𝑘𝑤=1
𝐽 (𝜽 ,𝜷 ,𝑿 ,𝑨 ,𝒃 )
¿ (1− 𝜆 )∑𝑑 ,𝑤
𝑛 (𝑑 ,𝑤 ) log ¿¿¿𝐽 (𝜽 ,𝜷 ,𝑿 ,𝑨 ,𝒃 )
15 /27
Learning Algorithm
• An iterative algorithm where ideal points related parameters () and topic model related parameters () enhance each other.• Step 1: Update given
• Gradient descent• Step 2: Update given
• Follow the idea of expectation-maximization (EM) algorithm: maximize a lower bound of the objective function in each iteration
16 /27
Learning Algorithm
• Update : A nonlinear constrained optimization problem. Remove the constraints by a logistic function based transformation:
and update using gradient descent.• Update : Since only appears in the topic model part, we use the same updating rule as in PLSA:
where
𝑒𝜇𝑑𝑘
1+∑𝑘′=1
𝐾− 1
𝑒𝜇𝑑𝑘 ′
1
1+∑𝑘′=1
𝐾− 1
𝑒𝜇𝑑𝑘 ′
if
if
𝜃𝑑𝑘=¿
17 /27
Outline
• Background and Problem Definition• Topic Factorized Ideal Point Model and Learning Algorithm• Experimental Results• Conclusion
18 /27
Data Description
• Dataset:• U.S. House and Senate roll call data in the years between 1990 and 2013
• 1,540 legislators• 7,162 bills• 2,780,453 votes (80% are “YEA”)
• Keep the latest version of a bill if there are multiple versions.• Randomly select 90% of the votes as training and 10% as testing.
Downloaded from http://thomas.loc.gov/home/rollcallvotes.html
19 /27
Evaluation Measures
• Root mean square error (RMSE) between the predicted vote score and the ground truth
RMSE =
• Accuracy of correctly predicted votes (using 0.5 as a threshold for the predicted accuracy)
Accuracy =
• Average log-likelihood of the voting linkAvelogL =
20 /27
Experimental Results
Training Data set Testing Data set
21 /27
Parameter Study
Parameter study on Parameter study on (regularization coefficient)
¿ (1− 𝜆 )⋅ 𝑎𝑣𝑒𝑙𝑜𝑔𝐿 (𝑡𝑒𝑥𝑡 )+𝜆⋅𝑎𝑣𝑒𝑙𝑜𝑔𝐿(𝑣𝑜𝑡𝑖𝑛𝑔)−1
2𝜎2 ¿𝐽 (𝜽 ,𝜷 ,𝑿 ,𝑨 ,𝒃 )
22 /27
ForeignEducation
Individual Property
MilitaryFinancial Institution
LawHealth Service
Health Expenses
Funds
Public Transportation
Ronald Paul Barack Obama Joe Lieberman
Case Studies
• Ideal points for three famous politicians: (Republican, Democrat)• Ronald Paul (R), Barack Obama (D), Joe Lieberman (D)
23 /27
Case Studies
• Pick three topics: (Republican, Democrat)
24 /27
Case Studies
𝑝 (𝑣𝑢𝑑=1 )=𝜎 (∑𝑘
𝜃𝑑𝑘 𝑥𝑢𝑘𝑎𝑑𝑘+𝑏𝑑)
Bill: H_R_4548 (in 2004)
NayR. Paul H_R_4548
Topic Model
TF-IPM
Experts
𝜽𝑑
𝒙𝑢
𝒂𝑑 ,𝑏𝑑
𝑝 (𝑣𝑢𝑑=1 )
For Unseen Bill :
𝑎𝑑𝑘 𝜃𝑑𝑘
𝑥𝑢𝑘
25 /27
Outline
• Background and Problem Definition• Topic Factorized Ideal Point Model and Learning Algorithm• Experimental Results• Conclusion
26 /27
Conclusion
• We estimate the ideal points of legislators and bills on multiple dimensions instead of global ones. • The generation of topics are guided by two types of links in the
heterogeneous network.• We present a unified model that combines voting behavior and topic
modeling, and propose an iterative learning algorithm to learn the parameters.• The experimental results on real-world roll call data show our
advantage over the state-of-the-art methods.
27 /27
Thanks
Q & A