Variation and Genrefication in Blogs (presented at DGfS 2007, Siegen, Germany)
DESCRIPTION
These are the slides used for my presentation on syntactic variation in blogs, held as part of the workshop "Syntactic Variation and Emerging Genres".
TRANSCRIPT
Variation and “Genrefication” in Blogs
Cornelius Puschmann
AG 3: Syntactic Variation and Emerging Genres
DGfS 29, Siegen
28.02.2007
Thesis project
The corporate blog as an emerging genre of
computer-mediated communication
Focus
● survey of a new form of domain-specific publishing
● both linguistic and extra-linguistic aspects
Question: is the corporate blog a genre?
Research context
“A blog is a user-generated website where entries are made in journal style and displayed in a reverse chronological order. Blogs often provide commentary or news on a particular subject, such as food, politics, or local news; some function as more personal online diaries.” http://en.wikipedia.org/wiki/Blog
“Corporate blogging is the use of blogs to further organizational goals” Debbie Weil, The Corporate Blogging Book
Blogs? Corporate Blogs?
An example: GM Fastlane
Genre: “A class of communicative events with a shared set of communicative purposes” (Swales)
Text typology: “Linguistic features, their co-occurrence
and relative distribution in a text” (Biber, paraphrase)
Assumption: genre is one factor determining text typology
Genre vs. Text Typology
My focus: differences in the relative distribution of features
=> quantitative variation, which is shaped by:
- formal factors
- mode/channel
- register
- speaker
Quantifying stylistic variation
Quantifiable stylistic variation in blogs can occur on several levels:
1. post
2. author
3. blog
4. type of blog (corporate, ..)
Assumption: By vertically and horizontally assessing the
degree of variation on these levels for an emerging genre
(e.g. the corporate blog), we should be able to observe its
degree of typological stability.
Assessing variation on multiple levels
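A minimal sketch of how variation could be compared across the four levels above: per-post scores are grouped by a chosen level and summarised by mean and standard deviation. The record layout, the field names (`blog`, `author`, `type`, `score`) and the toy values are invented for illustration and are not taken from the study. The population standard deviation (`pstdev`) is used here only so that groups with a single post do not raise an error; the slides do not say which variant was used.

```python
from collections import defaultdict
from statistics import mean, pstdev


def variation_by_level(records, level):
    """Group per-post scores by `level` and return {key: (mean, stdev)}."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[level]].append(rec["score"])
    return {key: (mean(vals), pstdev(vals)) for key, vals in groups.items()}


# Invented toy records: one dictionary per post (hypothetical values).
records = [
    {"blog": "A", "author": "a1", "type": "corporate", "score": 60.0},
    {"blog": "A", "author": "a1", "type": "corporate", "score": 64.0},
    {"blog": "B", "author": "b1", "type": "private", "score": 40.0},
    {"blog": "B", "author": "b1", "type": "private", "score": 50.0},
]

by_blog = variation_by_level(records, "blog")
by_type = variation_by_level(records, "type")
```

A low standard deviation at the "type of blog" level, relative to the post or blog level, would be one possible indicator of typological stability in the sense of the assumption above.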
- web feeds (RSS and Atom protocols) used to retrieve, store and analyse language data
- implemented TreeTagger for automated tagging
- 134 blogs (115 corporate, 1 political*, 18 private)
- 3 press editorial sections (NYT, WashPo, LA Times)
- 5 press release sections (Microsoft, GM, Sun, Oracle, McD)
- 16,895 posts
- 4,041,133 tokens
The corpus
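As a rough illustration of the feed-based retrieval step, the sketch below extracts posts from an RSS 2.0 document using only the Python standard library. It is not the tooling used for the corpus: the sample feed, the function name `posts_from_rss` and the output fields are hypothetical, and Atom feeds (which use namespaced `entry` elements) would need separate handling.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real feed would be downloaded from the blog.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example blog</title>
    <item>
      <title>First post</title>
      <link>http://example.com/1</link>
      <description>Hello world, this is a post.</description>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/2</link>
      <description>Another short entry.</description>
    </item>
  </channel>
</rss>"""


def posts_from_rss(xml_text):
    """Return one dictionary per <item> element in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    posts = []
    for item in root.iter("item"):
        posts.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "text": item.findtext("description", default=""),
        })
    return posts


posts = posts_from_rss(SAMPLE_RSS)
```

The extracted `text` field is what would then be passed on to tokenisation and tagging.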
F-score (Heylighen & Dewaele): a metric to quantify the level of formality in a text, where formality is specifically defined as the diametrical opposite of contextuality
formula:
0.5 * ((N + ADJ + PRP + DET) - (PN + V + ADV + ITJ) + 100)
Measuring formality via f-scores
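The formula above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the implementation used in the study: each category is taken as a percentage of all tokens in the text, PRP is read as prepositions and PN as pronouns (following Heylighen & Dewaele's definition), and the input is assumed to be already mapped from the tagger's output to these eight coarse labels. The function name `f_score` and the example tag sequence are hypothetical.

```python
from collections import Counter

# Categories from the formula on the slide (label meanings are assumptions:
# PRP = prepositions, PN = pronouns, ITJ = interjections).
FORMAL = ("N", "ADJ", "PRP", "DET")
CONTEXTUAL = ("PN", "V", "ADV", "ITJ")


def f_score(tags):
    """Return the F-score for a sequence of coarse part-of-speech labels."""
    if not tags:
        raise ValueError("need at least one tagged token")
    counts = Counter(tags)
    total = len(tags)

    def pct(category):
        # Frequency of a category as a percentage of all tokens.
        return 100.0 * counts[category] / total

    formal = sum(pct(c) for c in FORMAL)
    contextual = sum(pct(c) for c in CONTEXTUAL)
    return 0.5 * (formal - contextual + 100)


# Invented example: 8 of 10 tokens are formal categories, 2 are contextual.
example = ["DET", "N", "V", "DET", "ADJ", "N", "PRP", "DET", "N", "ADV"]
score = f_score(example)
```

Under this reading, a text made up only of formal categories scores 100 and one made up only of contextual categories scores 0, so higher values indicate a more nominal, less context-dependent style.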
The Toshiba Portege R400 is a Windows Vista-inspired signature mobile PC that incorporates innovative connectivity and display technologies to provide timely access to e-mail and appointments via Active Notifications and is built on Windows SideShow™ technology. [...]
http://www.microsoft.com/presspass/press/2007/jan07/01-07CES2007PR.mspx
- high noun frequency
- high adjective frequency
- more nominal than verbal
- often relate complex information
- often describe future events/potentiality
Example: high f-score (press release)
OK, OK, I'm partly at fault here. But, hear me out. Last year at Gnomedex I had my son demonstrate Second Life up on stage while I was hosting a panel discussion. Someone from Linden Labs (the folks who make Second Life), Beth Goza (she now works at Microsoft), saw that, and told me and my son to knock it off. People under 18 aren't allowed in Second Life. So, what did I do? I just told Patrick never to go into Second Life and I didn't go back into Second Life either. [...]
http://scobleizer.com/2007/02/18/second-life-has-my-credit-card-and-wont-let-go/
- high frequency of personal pronouns
- more verbal than nominal
- often describe past events, personal impressions, feelings
Example: low f-score (blog)
F-score & variability: Baking with Rose
F-score & variability: Jonathan Schwartz
Scores for both blogs plotted together
F-scores and stdev for all sources
O1: Blogs are characterized by dynamicity and fluidity
O2: Functional blog subtypes tend to be consistent
O3: Internal variation may correlate with functional complexity
O4: Quantitative variation can be a measure of
a) genre dynamicity and
b) genre stability/fluidity
Observations