Measuring and Explaining Political Sophistication Through Textual Complexity
Kenneth Benoit, Kevin Munger, Arthur Spirling
New Directions in Analyzing Text as Data Conference, Northeastern University, October 14-15, 2016
Political sophistication in the public mind
Source: The Guardian, February 2013
Political Communication and Textual Complexity
Citizen comprehension of political speech
Changes over time, differences between speakers
Problems with existing measures of textual complexity
Preview of our solution:
Crowdsource comparisons of relevant political text
Scale those texts and learn what features best predict easiness
Fit a model that can be applied to other texts
Other measures of reading ease
| Name of Method | Author | Year | Citations |
|---|---|---|---|
| Flesch Reading Ease | Flesch | 1948/49 | 3,793 |
| SMOG | McLaughlin | 1969 | 1,402 |
| Dale-Chall | Dale and Chall | 1948 | 1,389 |
| Gunning Fog Index | Gunning | 1952 | 1,232 |
| Flesch-Kincaid Level | Kincaid et al | 1975 | 1,093 |
| Fry Graph | Fry | 1968 | 1,007 |
| Spache Formula | Spache | 1953 | 355 |
| Coleman-Liau | Coleman and Liau | 1975 | 261 |

Commonly used ‘reading ease’ measures in order of citation via Google Scholar at the time of writing.
Reading Ease
Flesch Reading Ease (FRE) Score
Developed to measure average grade level of students based on ability to answer multiple-choice questions after reading a text
In 1948
$$\mathrm{FRE} = 206.835 - 1.015\left(\frac{\#\text{ of words}}{\#\text{ of sentences}}\right) - 84.6\left(\frac{\#\text{ of syllables}}{\#\text{ of words}}\right)$$
Ostensibly bounded between 0 and 100
Updated by Kincaid et al. 1975 as a linear rescaling to US grade-school level
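The formula can be sketched in a few lines of Python. This is only an illustration, assuming the word, sentence, and syllable counts are already available; the function names are hypothetical, and the Flesch-Kincaid coefficients are the commonly published ones rather than anything taken from these slides.

```python
def flesch_reading_ease(n_words, n_sentences, n_syllables):
    """Flesch Reading Ease from raw counts (higher means easier)."""
    return (206.835
            - 1.015 * (n_words / n_sentences)
            - 84.6 * (n_syllables / n_words))


def flesch_kincaid_grade(n_words, n_sentences, n_syllables):
    """Flesch-Kincaid grade level: a linear function of the same two
    ratios, using the commonly published coefficients (an assumption
    here, not stated on the slide)."""
    return (0.39 * (n_words / n_sentences)
            + 11.8 * (n_syllables / n_words)
            - 15.59)


# A 100-word passage with 5 sentences and 150 syllables.
print(flesch_reading_ease(100, 5, 150))
print(flesch_kincaid_grade(100, 5, 150))

# "Ostensibly bounded": a one-word, one-syllable sentence scores above 100.
print(flesch_reading_ease(1, 1, 1) > 100)
```

Nothing in the arithmetic enforces the 0 to 100 range, which is why very short or very long-worded texts can fall outside it.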
Breaking the FRE Score
Consider this sentence
Indeed, the shoemaker was frightened.
FRE = 16.23
Forsooth, the cordwainer was afeared.
FRE = 16.23
No measure of the difficulty of the words (or any other grammatical challenges)
Is this really the quantity we’re interested in?
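A rough sketch of why the two sentences tie: FRE sees only counts of words, sentences, and syllables. The syllable counter below is a crude vowel-group heuristic chosen for illustration (an assumption, not the counter used for the slide), so the absolute score it produces will generally differ from the figure shown above; the point is only that both sentences yield identical counts and therefore identical scores.

```python
import re


def crude_counts(text):
    """Return (words, sentences, syllables) using simple heuristics.
    Syllables are approximated as runs of vowels, which is an assumption."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(
        max(1, len(re.findall(r"[aeiouy]+", w.lower()))) for w in words
    )
    return len(words), len(sentences), syllables


def fre(text):
    """Flesch Reading Ease computed from the crude counts above."""
    n_words, n_sentences, n_syllables = crude_counts(text)
    return 206.835 - 1.015 * n_words / n_sentences - 84.6 * n_syllables / n_words


modern = "Indeed, the shoemaker was frightened."
archaic = "Forsooth, the cordwainer was afeared."

# Same counts in, same score out, regardless of how familiar the words are.
print(crude_counts(modern) == crude_counts(archaic))
print(fre(modern) == fre(archaic))
```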
The “Out-of-Domain” prediction problem
We want to measure how well adult citizens are able to understand political texts. Previous measures were:
designed for educational research, not political texts;
validated on schoolchildren, not adults; and
mostly designed in the 1940s and 50s, which is a long time ago.
These problems are straightforward to fix.
A modern solution: crowdsourcing binary comparisons
Crowdflower specifics
1. Formed all possible 1- and 2-sentence snippets from the SOTU corpus
2. Discarded those with extreme FRE scores, and those containing large numbers
3. Created 10,000 pairwise comparisons between 2,000 randomly sampled snippets, with coarse matching on snippet length and FRE score
   - sufficient connectivity that we could scale all of them
4. Coded 2,000 of these comparisons three separate times, so 6,000 total data points
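Step 3 can be sketched as rejection sampling followed by a connectivity check. This is a minimal illustration under stated assumptions: the tolerances `len_tol` and `fre_tol`, the function names, and the use of a union-find structure are all hypothetical choices, not the procedure actually used for the Crowdflower task.

```python
import random


def sample_matched_pairs(snippets, n_pairs, len_tol, fre_tol,
                         seed=0, max_tries=100_000):
    """snippets: list of (length, fre) tuples, one per snippet.
    Draw distinct unordered pairs whose length and FRE differ by at most
    the given tolerances (a stand-in for 'coarse matching')."""
    rng = random.Random(seed)
    pairs = set()
    tries = 0
    while len(pairs) < n_pairs and tries < max_tries:
        tries += 1
        i, j = rng.sample(range(len(snippets)), 2)
        (len_i, fre_i), (len_j, fre_j) = snippets[i], snippets[j]
        if abs(len_i - len_j) <= len_tol and abs(fre_i - fre_j) <= fre_tol:
            pairs.add((min(i, j), max(i, j)))
    return sorted(pairs)


def is_connected(n_items, pairs):
    """True if the comparison graph links every item to every other,
    which is needed to place all snippets on one common scale."""
    parent = list(range(n_items))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in pairs:
        parent[find(i)] = find(j)
    return len({find(x) for x in range(n_items)}) == 1
```

If the matching is too strict, the comparison graph splits into islands that cannot be scaled against each other, so a check like `is_connected` is worth running before sending pairs to coders.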
Problems for inference
Extant measures have undesirable statistical properties.
1. No way of evaluating “model fit” for measures applied to a new context
   - No way of comparing different measures in a given context
2. No natural interpretation of fine-grained differences in document scores
   - e.g. FRE of 75 vs 70 vs 80
3. No standard errors or other measure of uncertainty
We can model this!
Our approach: Bradley-Terry Regression
1. Consider determining which of two texts, $i$ and $j$, is “easier”
2. If the ‘easiness’ of $i$ is $\alpha_i$, and the ‘easiness’ of $j$ is $\alpha_j$, then the odds that snippet $i$ is deemed easier than $j$ may be written as $\alpha_i / \alpha_j$
3. Defining $\lambda_i = \log \alpha_i$, the regression model can be rewritten:
$$\operatorname{logit}[\Pr(i \text{ easier than } j)] = \lambda_i - \lambda_j$$
4. Using only the labels from crowdsourcing, we fit an unstructured Bradley-Terry model to scale the snippets and generate a rank ordering and $\lambda$ score for each
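A minimal sketch of fitting an unstructured Bradley-Terry model from pairwise labels, using the standard minorization-maximization (MM) updates in plain Python. This is an illustrative stand-in, not the software used for the presentation; the function names are hypothetical, and the sketch assumes every snippet wins and loses at least once so that the estimates stay finite.

```python
import math


def fit_bradley_terry(n_items, comparisons, n_iter=200):
    """comparisons: list of (winner, loser) index pairs, where the
    'winner' is the snippet judged easier. Returns lambda scores
    (log easiness), centered to have mean zero."""
    wins = [0] * n_items
    n_ij = {}  # number of comparisons for each unordered pair
    for w, l in comparisons:
        wins[w] += 1
        key = (min(w, l), max(w, l))
        n_ij[key] = n_ij.get(key, 0) + 1

    alpha = [1.0] * n_items
    for _ in range(n_iter):
        denom = [0.0] * n_items
        for (i, j), n in n_ij.items():
            share = n / (alpha[i] + alpha[j])
            denom[i] += share
            denom[j] += share
        # MM update: wins divided by the expected-comparison weight.
        alpha = [wins[i] / denom[i] for i in range(n_items)]
        total = sum(alpha)
        alpha = [a * n_items / total for a in alpha]  # fix the scale

    lam = [math.log(a) for a in alpha]
    mean = sum(lam) / n_items
    return [x - mean for x in lam]


def prob_easier(lam_i, lam_j):
    """Inverse logit of lambda_i - lambda_j."""
    return 1.0 / (1.0 + math.exp(-(lam_i - lam_j)))


# Snippet 0 is judged easier than snippet 1 in 3 of 4 comparisons.
lam = fit_bradley_terry(2, [(0, 1), (0, 1), (0, 1), (1, 0)])
print(prob_easier(lam[0], lam[1]))
```

Only differences between $\lambda$ scores are identified, which is why the sketch centers them; any other normalization would give the same predicted probabilities.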
Variable selection
There are millions of potential variables to consider
We begin with all constituent variables of the traditional models, and add in some new ones
29 possible variables
Use a variant of random forests to select the variables that best fit the snippets scaled through unstructured Bradley-Terry regression
VSURF package developed by Genuer, Poggi and Tuleau-Malot
Google nGram
A collection of word counts in the Google Books corpus
1790 onward, filter out tokens that appeared fewer than five times or did not match a dictionary of 133,000 English words/wordforms
615,362,456,717 token counts from 85,623 word types
Normalize each word frequency to its frequency relative to the word “the” in that year, smoothing by decade
Use word frequency in the 2000s (the closest decade to the present) to measure the presence of words that are rare from the perspective of our coders
When “plugging in” values of covariates to evaluate older texts, we will use the word frequency from the decade in which they originate
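A minimal sketch of the normalization step, assuming raw counts keyed by word and year. The function names, the toy counts, and the choice of a plain within-decade average as the "smoothing" are assumptions for illustration; the slides do not specify the exact smoothing used.

```python
from collections import defaultdict

def relative_frequency_by_decade(counts, anchor="the"):
    """counts maps (word, year) -> raw count.

    Returns {(word, decade): mean yearly ratio of the word's count to the
    anchor word's count}. Averaging within a decade is an assumed stand-in
    for the smoothing described on the slide.
    """
    ratios = defaultdict(list)
    for (word, year), n in counts.items():
        anchor_n = counts.get((anchor, year))
        if not anchor_n:
            continue  # no anchor count for this year, so no ratio
        decade = (year // 10) * 10
        ratios[(word, decade)].append(n / anchor_n)
    return {key: sum(vals) / len(vals) for key, vals in ratios.items()}

def mean_word_frequency(tokens, decade, freq):
    """Average relative frequency of the tokens found in the given decade."""
    vals = [freq[(t, decade)] for t in tokens if (t, decade) in freq]
    return sum(vals) / len(vals) if vals else 0.0

# Hypothetical counts, for illustration only.
toy_counts = {
    ("the", 2001): 1000, ("the", 2005): 2000,
    ("liberty", 2001): 10, ("liberty", 2005): 30,
    ("the", 1861): 500, ("liberty", 1861): 20,
}
freq = relative_frequency_by_decade(toy_counts)
modern = mean_word_frequency(["the", "liberty"], 2000, freq)
older = mean_word_frequency(["the", "liberty"], 1860, freq)
```

Passing the decade explicitly mirrors the two uses on the slide: the 2000s for text read by present-day coders, and the decade of origin when scoring older texts.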
![Page 56: Measuring and Explaining Political Sophistication Through ...kmunger.github.io/pdfs/TAD_2016_BMS_presentation_updated.pdfForsooth, the cordwainer was afeared. FRE = 16.23 No measure](https://reader036.vdocuments.us/reader036/viewer/2022071505/6125afee6cbcef6e541363fd/html5/thumbnails/56.jpg)
Structured Bradley-Terry Model
We have our covariates
We still model this comparison:
$$\operatorname{logit}\left[\Pr(i \text{ easier than } j)\right] = \lambda_i - \lambda_j$$
We can model $\lambda_i$ as a function of the covariates $r$ that we selected using a structured Bradley-Terry model:
$$\lambda_i = \sum_{r=1}^{p} \beta_r x_{ir}$$
We thus estimate the relevant $\hat{\beta}_r$'s and can then “plug in” covariates to evaluate other texts
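Because λi − λj is linear in the covariate differences, the structured model is a logistic regression without an intercept on x_i − x_j. The sketch below fits it by plain gradient ascent on the log-likelihood. The function names, step size, iteration count, and toy data are all assumptions for illustration, not the estimation routine used for the reported results.

```python
import math

def fit_structured_bt(X, comparisons, steps=2000, lr=0.1):
    """X: one covariate vector per snippet.
    comparisons: (i, j) pairs meaning snippet i was judged easier than j.
    Returns the estimated coefficient vector beta.
    """
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(steps):
        grad = [0.0] * p
        for i, j in comparisons:
            d = [X[i][r] - X[j][r] for r in range(p)]
            eta = sum(b * dr for b, dr in zip(beta, d))  # lambda_i - lambda_j
            prob = 1.0 / (1.0 + math.exp(-eta))
            for r in range(p):
                grad[r] += (1.0 - prob) * d[r]  # gradient of the log-likelihood
        beta = [b + lr * g / len(comparisons) for b, g in zip(beta, grad)]
    return beta

def prob_easier(beta, x_i, x_j):
    """Plug in covariates for two texts: Pr(i easier than j)."""
    eta = sum(b * (a - c) for b, a, c in zip(beta, x_i, x_j))
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical data: one covariate, where lower values are mostly judged easier.
X = [[1.0], [2.0], [3.0]]
comparisons = [(0, 1), (0, 1), (1, 0), (1, 2), (1, 2), (2, 1),
               (0, 2), (0, 2), (0, 2)]
beta_hat = fit_structured_bt(X, comparisons)
p_new = prob_easier(beta_hat, [1.0], [3.0])  # texts not in the training set
```

Once `beta_hat` is estimated, `prob_easier` needs only covariates, which is what allows the model to be applied to texts that were never shown to coders.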
![Page 60: Measuring and Explaining Political Sophistication Through ...kmunger.github.io/pdfs/TAD_2016_BMS_presentation_updated.pdfForsooth, the cordwainer was afeared. FRE = 16.23 No measure](https://reader036.vdocuments.us/reader036/viewer/2022071505/6125afee6cbcef6e541363fd/html5/thumbnails/60.jpg)
Results
| | All variables | Simple Model |
|---|---|---|
| Characters per sentence | −0.01∗ (0.00) | −0.01∗ (0.00) |
| Proportion of 3-syllable words | −1.04∗ (0.34) | −1.31∗ (0.28) |
| Proportion of words from Dale-Chall | −0.41 (0.28) | |
| Proportion of adpositions (such as to, with, from, under) | −0.99∗ (0.48) | −1.11∗ (0.46) |
| Mean word frequency (/’the’) | −1.74∗ (0.35) | −1.68∗ (0.35) |
| Proportion of conjunctions | 0.70 (0.71) | |
| PCP | 0.663 | 0.662 |
| AIC | 7419.90 | 7419.09 |

Standard errors in parentheses. ∗ indicates significance at p < 0.05
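As a worked illustration of “plugging in”, the sketch below applies the Simple Model coefficients from the table to two snippets. The covariate values and the dictionary keys are hypothetical, and the coefficients are the rounded values shown on the slide, so the resulting probability is only indicative.

```python
import math

# Simple Model coefficients as reported in the table (rounded).
SIMPLE_MODEL = {
    "chars_per_sentence": -0.01,
    "prop_3_syllable_words": -1.31,
    "prop_adpositions": -1.11,
    "mean_word_frequency": -1.68,
}

def easiness(x, coefs=SIMPLE_MODEL):
    """lambda_i = sum_r beta_r * x_ir for one snippet."""
    return sum(coefs[name] * x[name] for name in coefs)

def prob_easier(x_i, x_j):
    """Pr(i easier than j) = inverse logit of lambda_i - lambda_j."""
    return 1.0 / (1.0 + math.exp(-(easiness(x_i) - easiness(x_j))))

# Hypothetical covariate values for two snippets.
short_plain = {
    "chars_per_sentence": 60,
    "prop_3_syllable_words": 0.05,
    "prop_adpositions": 0.10,
    "mean_word_frequency": 0.02,
}
long_dense = {
    "chars_per_sentence": 180,
    "prop_3_syllable_words": 0.20,
    "prop_adpositions": 0.15,
    "mean_word_frequency": 0.01,
}

p = prob_easier(short_plain, long_dense)
```

With these made-up inputs the sentence-length term dominates the difference in λ, so the shorter snippet is predicted to be easier with probability well above one half.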
![Page 61: Measuring and Explaining Political Sophistication Through ...kmunger.github.io/pdfs/TAD_2016_BMS_presentation_updated.pdfForsooth, the cordwainer was afeared. FRE = 16.23 No measure](https://reader036.vdocuments.us/reader036/viewer/2022071505/6125afee6cbcef6e541363fd/html5/thumbnails/61.jpg)
Speeches in 2016 Campaign Debates
[Figure: Pr(easier than 5th grade) for speeches in the 2016 campaign debates; x-axis: Sep, Nov, Jan, Mar; y-axis: 0.35 to 0.60; legend: GOP, Dem, Trump, Clinton]
![Page 62: Measuring and Explaining Political Sophistication Through ...kmunger.github.io/pdfs/TAD_2016_BMS_presentation_updated.pdfForsooth, the cordwainer was afeared. FRE = 16.23 No measure](https://reader036.vdocuments.us/reader036/viewer/2022071505/6125afee6cbcef6e541363fd/html5/thumbnails/62.jpg)
SOTU Re-evaluated
[Figure: FRE (axis 20 to 80) and BT model (axis 0.1 to 0.5) scores by Year, 1800 to 2000; legend: FRE; Bradley-Terry, Machine Learned]
![Page 63: Measuring and Explaining Political Sophistication Through ...kmunger.github.io/pdfs/TAD_2016_BMS_presentation_updated.pdfForsooth, the cordwainer was afeared. FRE = 16.23 No measure](https://reader036.vdocuments.us/reader036/viewer/2022071505/6125afee6cbcef6e541363fd/html5/thumbnails/63.jpg)
Evaluating traditional measures
We can check the predictive ability of extant measures on our ranked snippets

| Measure | AIC | Proportion correct |
|---|---|---|
| FRE | 7,893 | 0.602 |
| Dale-Chall | 7,895 | 0.603 |
| FOG | 7,619 | 0.638 |
| SMOG | 7,726 | 0.574 |
| Spache | 7,665 | 0.635 |
| Coleman-Liau | 8,219 | 0.552 |
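One simple way to score an existing measure against pairwise judgments is to count how often it ranks the judged-easier snippet as easier. The sketch below does this for FRE using the standard Flesch formula. The snippet counts and judgments are hypothetical, and this direct comparison is an assumed simplification: the figures in the table may instead come from a Bradley-Terry fit with each measure as the covariate.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Standard Flesch Reading Ease; higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def proportion_correct(scores, comparisons):
    """comparisons: (i, j) pairs where snippet i was judged easier than j.
    A pair counts as correct when the measure also scores i as easier.
    """
    hits = sum(1 for i, j in comparisons if scores[i] > scores[j])
    return hits / len(comparisons)

# Hypothetical word, sentence, and syllable counts for three snippets.
snippets = [
    {"words": 20, "sentences": 2, "syllables": 26},
    {"words": 30, "sentences": 1, "syllables": 54},
    {"words": 24, "sentences": 2, "syllables": 36},
]
scores = [flesch_reading_ease(**s) for s in snippets]

# Hypothetical coder judgments, including one that disagrees with FRE.
judged = [(0, 1), (0, 2), (2, 1), (1, 2)]
accuracy = proportion_correct(scores, judged)
```

For grade-level measures such as FOG or SMOG, where higher scores mean harder text, the comparison inside `proportion_correct` would need to be reversed.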
![Page 64: Measuring and Explaining Political Sophistication Through ...kmunger.github.io/pdfs/TAD_2016_BMS_presentation_updated.pdfForsooth, the cordwainer was afeared. FRE = 16.23 No measure](https://reader036.vdocuments.us/reader036/viewer/2022071505/6125afee6cbcef6e541363fd/html5/thumbnails/64.jpg)
Results–Refit FRE Model
| | FRE, refit | Sentence only | Syllables only |
|---|---|---|---|
| Mean syllables/word | −1.34∗ (0.12) | | −0.71∗ (0.11) |
| Mean words/sentence | −0.07∗ (0.00) | −0.06∗ (0.00) | |
| PCP | 0.66 | 0.64 | 0.53 |
| AIC | 7494.81 | 7625.82 | 8275.97 |

Standard errors in parentheses. ∗ indicates significance at p < 0.05
![Page 65: Measuring and Explaining Political Sophistication Through ...kmunger.github.io/pdfs/TAD_2016_BMS_presentation_updated.pdfForsooth, the cordwainer was afeared. FRE = 16.23 No measure](https://reader036.vdocuments.us/reader036/viewer/2022071505/6125afee6cbcef6e541363fd/html5/thumbnails/65.jpg)
Moving forward
Sophistication is a normatively important component of political speech
We have demonstrated the insufficiency of existing measures, and developed a framework for creating a better one
However, the predictive accuracy of our best model is underwhelming; improvements include:
Calculating word rarity for different parts of speech
Performing comparisons between longer snippets of text
Incorporating more syntactic information
![Page 71: Measuring and Explaining Political Sophistication Through ...kmunger.github.io/pdfs/TAD_2016_BMS_presentation_updated.pdfForsooth, the cordwainer was afeared. FRE = 16.23 No measure](https://reader036.vdocuments.us/reader036/viewer/2022071505/6125afee6cbcef6e541363fd/html5/thumbnails/71.jpg)
R package