hcc class lecture 8 - university of california, berkeley
HCC class lecture 8
John Canny
2/23/09
Vygotsky’s Genetic Planes
Phylogenetic
Social-historical
Ontogenetic
Microgenetic
What did he mean by genetic?
Internalization
Social functions
Internal (mental) functions
Social Plane
Internal (mental) Plane
Internalization
Scaffolding
Showing, explaining
Listening and reading
Externalization
Social/historical artifacts
Internal (mental) functions
Social/historical Plane
Internal (mental) Plane
Externalization
Talking, writing
Internalization/Externalization
Power Laws
Pick a corpus such as:
English (collection of many samples)
Works of Shakespeare
James Joyce’s “Ulysses”
and count the occurrences of each word. Sort the words in decreasing order of count, and let r be the rank in this order. Then
$f(r) \approx c / r^{\alpha}$
where $f(r)$ is the frequency of the word of rank r.
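The counting and ranking procedure on this slide can be sketched in Python as follows. This is a minimal illustration, not the method used to produce the lecture's plots: the tokenizer, the function names `rank_frequency` and `fit_alpha`, and the use of an ordinary least-squares fit on the log-log data are all assumptions made here. It also assumes the rank-frequency law has the decreasing form f(r) ≈ c / r^α.

```python
import math
import re
from collections import Counter


def rank_frequency(text):
    """Return word counts sorted in decreasing order; index 0 is rank 1."""
    words = re.findall(r"[a-z']+", text.lower())
    return sorted(Counter(words).values(), reverse=True)


def fit_alpha(freqs):
    """Estimate alpha as the negated least-squares slope of log f(r) vs. log r.

    Needs at least two ranks, otherwise the slope is undefined.
    """
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return -sxy / sxx


# Synthetic check: frequencies constructed to follow f(r) = 1000 / r.
synthetic = [1000 / r for r in range(1, 51)]
alpha_hat = fit_alpha(synthetic)
```

On a real corpus, one would pass the corpus text to `rank_frequency` and the result to `fit_alpha`. A straight-line fit on log-log axes is only a rough estimate of the exponent, since the tail of the ranking is noisy.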
Power Law – alternate form
Instead of frequency vs. rank, we can plot frequency vs. number of sets with that frequency:
$g(i) \approx c' / i^{\beta}$
where $g(i)$ is the number of sets with frequency i.
The value β in this form is related to α via β = 1 + 1/α.
This was Zipf’s original form, and the one analyzed by Newell.
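The relation between the two exponents can be recovered with a short continuous approximation. The sketch below assumes the rank-frequency form is f(r) ≈ c r^{-α}, treats rank and frequency as continuous variables, and uses the fact that the rank of an item is the number of items with frequency at least as large. The constant c' is introduced here only to name the resulting prefactor.

```latex
\begin{align*}
f(r) &\approx c\, r^{-\alpha}
  && \text{rank-frequency form} \\
r(f) &\approx \left(\frac{c}{f}\right)^{1/\alpha}
  && \text{rank of an item with frequency } f \\
g(i) &\approx -\left.\frac{dr}{df}\right|_{f=i}
  = \frac{c^{1/\alpha}}{\alpha}\, i^{-(1 + 1/\alpha)}
  && \text{number of items with frequency near } i \\
\beta &= 1 + \frac{1}{\alpha}, \qquad c' = \frac{c^{1/\alpha}}{\alpha}
\end{align*}
```

With α = 1 this gives β = 2, and a larger β corresponds to a smaller α.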
Examples of Power Laws
Note: size vs frequency of that size – Zipf’s original form
Examples of Power Laws
These are in rank-frequency form.
Examples of Power Laws
Also:
Number of users’ Facebook friends
The popularity of Facebook apps
Number of pages in web sites
Number of links into a web site
Number of links out of a web site
Preferential Attachment
Yule’s law (1925)
Figure: number of species in each genus, plotted over genera
Pure birth process: only new species are added
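A pure birth process with preferential attachment can be simulated in a few lines. The sketch below is a simplified discrete-time version and not Yule's original continuous-time formulation: the function name `simulate_yule` and the parameter `new_genus_prob` are assumptions made for illustration. At each step a new species either founds a new genus or joins the genus of a uniformly chosen existing species, so larger genera attract new species in proportion to their size, and nothing is ever removed.

```python
import random
from collections import Counter


def simulate_yule(steps, new_genus_prob, seed=0):
    """Return genus sizes (species per genus), sorted in decreasing order."""
    rng = random.Random(seed)
    species_genus = [0]  # species_genus[k] is the genus id of species k
    n_genera = 1
    for _ in range(steps):
        if rng.random() < new_genus_prob:
            # A new species founds a new genus.
            species_genus.append(n_genera)
            n_genera += 1
        else:
            # Picking a random existing species selects its genus with
            # probability proportional to genus size.
            species_genus.append(rng.choice(species_genus))
    return sorted(Counter(species_genus).values(), reverse=True)


genus_sizes = simulate_yule(steps=5000, new_genus_prob=0.1)
```

Plotting `genus_sizes` against rank on log-log axes should show a few very large genera and many genera with a single species.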
Literary Theory: Structuralism
Looks for “structures” in the domain of study, e.g. literature or anthropology, and their relation to other
Structure includes local (sentence) structure as on the next slide.
Also includes deeper structures such as role and plot. E.g. “West Side Story” has the same plot structure as “Romeo and Juliet”.
Structuralists often look for “universal” structures, e.g. Freud’s Oedipal complex.
Literary Theory: Structuralism
Bakhtin: “The Dialogic Imagination”
Multiple voices are evident in a text: heteroglossia or multivocality or polyphony.
Kristeva: Intertextuality
Kristeva elaborated Bakhtin’s ideas into the theory of intertextuality: Texts borrowed and adapted from other texts.
Allusion
Characters
Plot
Form
Scene
Barthes: “S/Z”
“A text is... a multidimensional space in which a variety of writings, none of them original, blend and clash. The text is a tissue of quotations... The writer can only imitate a gesture that is always anterior, never original. His only power is to mix writings, to counter the ones with the others, in such a way as never to rest on any one of them”
Lexia
Simon’s model of texts
Text is built by sampling earlier texts:
Association: sampling earlier passages in the same corpus.
Imitation: “sampling segments of word sequences from other works he has written, from works of other authors, and, of course, from sequences he has heard.”
Simon’s model of texts
Stratified sampling:
Sampling and re-assembly of small segments of text.
The choice of which segments to assemble does not have to be random.
Simon’s model of texts
Simon’s model explains the familiar Zipf curve.
Limitations:
Pure “birth” process*
Should work for different notions of “strata”
* But birth-death processes in equilibrium also produce Zipf curves
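The sampling idea behind Simon's model can be illustrated with a small simulation. This is a sketch under stated assumptions rather than Simon's own formulation: the function name `simon_text`, the parameters `new_word_prob` and `max_segment`, and the choice to copy short contiguous segments from random positions are all hypothetical. Words are represented by integer ids. The process is a pure birth process, since tokens are only ever added.

```python
import random
from collections import Counter


def simon_text(length, new_word_prob, max_segment=3, seed=0):
    """Grow a token sequence by coining new words or re-using earlier segments."""
    rng = random.Random(seed)
    text = [0]    # start with a single word, id 0
    next_id = 1
    while len(text) < length:
        if rng.random() < new_word_prob:
            # Introduce a word that has not been used before.
            text.append(next_id)
            next_id += 1
        else:
            # Sample a short segment of the text written so far and re-use it.
            start = rng.randrange(len(text))
            seg_len = rng.randint(1, max_segment)
            text.extend(text[start:start + seg_len])
    return text[:length]


tokens = simon_text(length=20000, new_word_prob=0.05)
word_counts = sorted(Counter(tokens).values(), reverse=True)
```

Because earlier tokens are copied in proportion to how often they already occur, frequent words become more frequent. `word_counts` can be plotted against rank on log-log axes to compare with the Zipf curve.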
Genetic Laws
We have given an explanation of power law behavior in texts via internalization/externalization.
Genetic Laws
Other similar phenomena may be explained in this way:
Sales of books, or many other items
Citations of scientific articles
Number of pages in web sites
Number of links into a web site
Number of links out of a web site
Number of users’ Facebook friends
The popularity of Facebook apps
Language as Action
What we have seen so far:
Many choice phenomena show the fingerprint of internalization/externalization and genetic origin. This includes language – both collective and individual.
Is there a more general link between language and action, as Vygotsky and others have suggested?
Georgia Tech Home
26 occupancy sensors
Data recorded over several weeks
N-grams
N-grams are sequences of n tokens, in this case n sensors.
The following is a 6-gram sequence of locations: 3-11-27-12-19-20
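Extracting n-grams from a sequence of sensor activations is a sliding-window operation. The sketch below assumes the recorded data has already been reduced to an ordered list of sensor ids; the function names `ngrams` and `ngram_counts` are chosen here for illustration, and the only data used is the 6-gram from the slide.

```python
from collections import Counter


def ngrams(sequence, n):
    """Return every run of n consecutive tokens, as tuples, in order."""
    return [tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)]


def ngram_counts(sequence, n):
    """Count how often each n-gram occurs in the sequence."""
    return Counter(ngrams(sequence, n))


# The 6-gram of locations from the slide, used as a tiny example trace.
sensor_trace = [3, 11, 27, 12, 19, 20]
bigrams = ngrams(sensor_trace, 2)
```

For this trace, `bigrams` holds the five adjacent pairs starting with `(3, 11)`, and `ngrams(sensor_trace, 6)` holds the single 6-gram. A sequence shorter than n yields an empty list.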
N-gram statistics
Not only words in English, but n-grams of words in English follow power laws*.
In the smart home data, n-grams are a more reasonable unit of analysis than individual sensor sites.
We might expect to see power law behavior if movement about the house is governed by “familiar habit” rather than optimal movement or planning.
* For small corpora, the n-gram stats for n>1 are often closer to an exact power law than for 1-grams (words).
N-gram statistics
Here is the data from the smart home experiment in Zipf’s original form. All plots show a β close to 2, which corresponds to α close to 1.
Slope β increases slightly as n increases (so α decreasing)
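Zipf's original form tabulates, for each count i, how many distinct n-grams occur exactly i times. The sketch below shows that tabulation and the conversion from β back to α, assuming the relation β = 1 + 1/α stated earlier in the lecture. The function names are hypothetical, and the example counts are made up for illustration; they are not the smart home data.

```python
from collections import Counter


def frequency_of_frequencies(counts):
    """Map each count i to the number of items that occur exactly i times."""
    return dict(sorted(Counter(counts).items()))


def alpha_from_beta(beta):
    """Invert beta = 1 + 1/alpha; only defined for beta > 1."""
    if beta <= 1:
        raise ValueError("beta must be greater than 1")
    return 1.0 / (beta - 1.0)


# Made-up occurrence counts for five distinct n-grams.
example_counts = [3, 1, 1, 2, 1]
g = frequency_of_frequencies(example_counts)
```

Here `g` records three n-grams seen once, one seen twice, and one seen three times. `alpha_from_beta(2.0)` returns 1.0, and any β above 2 gives an α below 1, which matches the direction noted on the slide.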
Conclusions
There appears to be a genetic mechanism at play, even in simple physical movement about the house.
At least from one perspective (n-gram analysis), language and one type of action are remarkably similar.
Many other human phenomena show power law behavior, either through internalization/externalization or purely internal mechanisms.
Discussion questions
1. Suggest another measure of human behavior that might show genetic dynamics, and research whether it shows power law behavior (do a web search). Be prepared to explain the genetic mechanism.
2. Discuss the freedom of the author given the statistical similarities of new texts to old ones.