Computational Models of Discourse Analysis

Carolyn Penstein Rosé

Language Technologies Institute / Human-Computer Interaction Institute

Warm Up Discussion

Look at the Bitter Lemons entries in the handout. What stands out to you as evidence for the Israeli versus Palestinian perspective?

Do you think they’re just picking up on writing style?

What would it mean to pick up on perspective versus writing style?

Examples from the paper:

Student Comment:

If we were indeed looking for patterns of language-usage, some sort of template-finding approach (as we mused over in the last sentiment paper) might be interesting - are there rhetorical techniques or sentence structures favored by one perspective over the other (or out of many)? Are there structures that indicate perspective-heavy sentences shared by both (or many) sides, or that couch/leverage the "opponents'" rhetoric within opposite-perspective texts?

Positioning the paper

* Note: It’s true that it was unique at the time. Since then, there has been a lot of follow-up work modeling perspectives using word distributions like this, which shows that the work was valued by the community.

Student Comment

It seems like the discourse-level problem that the authors are trying to solve is that of bias. I differentiate between bias and perspectives because people can have the same bias but have different perspectives (a casual movie goer's positive review vs. a weathered movie critic's positive review); people can also have different biases but the same perspective (a democrat who believes that taxes should be increased because the money will go towards social programs that they support versus a republican who believes that taxes should be increased because constituents voted democrat and that's what you get when you vote democrat).

Bias is?: Positive versus negative or reasons for being for or against something

Perspective is?: Level of expertise or position on a bill

Student Comment

There's also a weird thing going on in the comments about saying "perspective" and really meaning "background, upbringing, culture, and innate biases of the writers." Personally, I'd be much happier about being able to model that kind of detail about a writer than I would about some one-off measure of "perspective."

Imagine an environmentalist commercial

Discourse: Environmentalism

Conversation: Global Warming

Discourse: Status Quo

Socially Situated Identity: Environmentalist

Social Language: Liberal rhetoric

Figured World: Expected structure of Conservationist Commercial

Form-Function Correspondence: Range of meanings for the word “sustainability”

Situated Meaning: Meaning of “sustainability” in the commercial

Where does perspective fit in this picture?

One take on Perspective in SFL

Does this suggest an answer to this student question:

“are there rhetorical techniques or sentence structures favored by one perspective over the other (or out of many)? Are there structures that indicate perspective-heavy sentences shared by both (or many) sides, or that couch/leverage the "opponents'" rhetoric within opposite-perspective texts?”

Perspective from Rhetoric

Implied author: Communication style is a projection of identity
  Impression management, not necessarily the ground truth

Implied reader: What we assume about who is listening
  Real assumptions, possibly incorrect
  What we want recipients or overhearers to think are our assumptions

Reader: may or may not understand the text the way it was intended

[Diagram labels: Author, Implied Author, Implied Reader, Text, Effect, Reader]

A good example of perspective…

3 Views on Perspective

Unit 3 Connection: perspective is kind of like sentiment

Unit 4 Connection: perspective is kind of like Personality and Identity
  Presentation of self models
  We’ll look at the Blog corpus

Unit 5 Connection: perspective is kind of like positioning
  Get back to Appraisal: Engagement metafunction
  We’ll come back to the Bitter Lemons Corpus

Would you prefer to swap Units 4 and 5 so we do Bitter Lemons next?

Revisiting Tips for Monday’s Reading Assignment

Skip Section 4 and the Appendix the first time you read the paper

Then skim through Section 4, skipping over any sentences you don’t understand

Focus on the initial paragraphs in sections/subsections, as these tend to give a high-level idea of what the message is

Keep in mind that their Latent Sentence Perspective Model is just Naïve Bayes with one twist – can you find what that one twist is?

Statistical Model

[Diagram labels: The document, Perspective, Strength of Bias]

In the original model, each word contributes to the likelihood depending on its own strength. In the revised model, polarized words within sentences that are on the whole less polarizing count less than the same words in sentences that are on the whole more polarizing.

Or: Increase certainty by de-emphasizing sentences that appear to lean towards the minority view. Will this work? It would work if the kinds of things that you mention but don’t take responsibility for are consistent within perspectives.
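To make the contrast concrete, here is a minimal Python sketch of the idea described above. It is not the paper's Latent Sentence Perspective Model or its inference procedure. The function names (`train_word_log_odds`, `flat_score`, `weighted_score`) and the `tanh` weighting are assumptions chosen only for illustration. Documents are assumed to be lists of sentences, and sentences are lists of word tokens.

```python
import math
from collections import Counter


def train_word_log_odds(docs_a, docs_b, alpha=1.0):
    """Per-word log P(w|A) - log P(w|B), with add-alpha smoothing."""
    counts_a = Counter(w for doc in docs_a for sent in doc for w in sent)
    counts_b = Counter(w for doc in docs_b for sent in doc for w in sent)
    vocab = set(counts_a) | set(counts_b)
    total_a = sum(counts_a.values()) + alpha * len(vocab)
    total_b = sum(counts_b.values()) + alpha * len(vocab)
    return {
        w: math.log((counts_a[w] + alpha) / total_a)
        - math.log((counts_b[w] + alpha) / total_b)
        for w in vocab
    }


def sentence_scores(doc, log_odds):
    """Summed word log-odds for each sentence; unseen words contribute 0."""
    return [sum(log_odds.get(w, 0.0) for w in sent) for sent in doc]


def flat_score(doc, log_odds):
    """Naive Bayes style evidence: every word counts at its own strength."""
    return sum(sentence_scores(doc, log_odds))


def weighted_score(doc, log_odds):
    """Illustrative variant: a sentence's words count less when the
    sentence as a whole is weakly polarized (hypothetical tanh weight)."""
    scores = sentence_scores(doc, log_odds)
    weights = [math.tanh(abs(s)) for s in scores]  # each weight is in [0, 1)
    return sum(w * s for w, s in zip(weights, scores))
```

A positive score favors perspective A and a negative score favors B. In this sketch, a document with one strongly A-leaning sentence and one mildly B-leaning sentence gets a larger margin for A from `weighted_score` than from `flat_score`, because the mildly polarized sentence is down-weighted more than the strongly polarized one.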

Student Comment

Their model did show a slight improvement over the other models at document classification, probably because treating sentence evidence as a latent variable is almost like smoothing, in a sense.

Evaluation

Note that both the words themselves and their ranking will influence the model.

* Bigger difference on experts (always the same two people, who might be more consistent about what they mention as ancillary details).

Student Comment

It really does not seem to me that "the small but positive improvement due to sentence-level modeling in LSPM is encouraging." Their specialized model is very slightly better than Naive Bayes.

Student Comment

...and does anyone in this community ever run statistics to see if their accuracy is actually statistically significantly different from other models' accuracy?
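One common way to answer this kind of question, when two classifiers are evaluated on the same documents, is McNemar's test on the documents where they disagree. The sketch below is an exact two-sided version using only the Python standard library. The function name `mcnemar_exact` and its argument layout are assumptions for illustration, and nothing here is taken from the paper's own evaluation.

```python
from math import comb


def mcnemar_exact(gold, pred_a, pred_b):
    """Exact two-sided McNemar test on paired predictions.

    Returns (only_a, only_b, p_value), where only_a counts items that
    model A got right and model B got wrong, and only_b the reverse.
    """
    triples = list(zip(gold, pred_a, pred_b))
    only_a = sum(1 for g, a, b in triples if a == g and b != g)
    only_b = sum(1 for g, a, b in triples if a != g and b == g)
    n = only_a + only_b
    if n == 0:
        # The two models never disagree on correctness: no evidence either way.
        return only_a, only_b, 1.0
    k = min(only_a, only_b)
    # Under the null, each disagreement favors A or B with probability 1/2.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return only_a, only_b, min(1.0, 2 * tail)
```

The test ignores documents that both models classify correctly or both misclassify, so a small gap in overall accuracy can easily fail to reach significance when the number of disagreements is small.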

Questions?
