ComPair: Compare and Visualise the Usage of Language, David Beavan, University of Glasgow, DH2011
Post on 20-Jun-2015
ComPair: Compare and Visualise the Usage of Language
David Beavan University of Glasgow David.Beavan@glasgow.ac.uk @DavidBeavan
‘You shall know a word by the company it keeps’
Firth, John R., 1957. Modes of meaning. Oxford: Oxford University Press.
Collocation
• Words which go together
• More than by chance, they show an association

• Take a corpus
• Search for a term (node word)
• Examine words in a window (e.g. 5 words) either side of the node
• Aggregate these co-occurring words
• Rank (e.g. by frequency or collocational strength)
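The steps above can be sketched in a few lines of Python. This is only an illustration, not the code behind the BNC tools shown in these slides: the function name `collocates`, the whitespace tokenisation, and the toy sentence are all assumptions made for the example, and ranking here is by raw frequency only.

```python
from collections import Counter


def collocates(tokens, node, window=5):
    """Count the words found within `window` tokens either side of `node`."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != node:
            continue
        # Clamp the window to the edges of the token list.
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # skip the node occurrence itself
                counts[tokens[j]] += 1
    return counts


# Toy "corpus" (made up for illustration), tokenised on whitespace.
tokens = "the university of stanford is near the stanford campus".split()
counts = collocates(tokens, "stanford", window=2)

# Rank the aggregated collocates by frequency, most frequent first.
ranked = counts.most_common()
```

Ranking by collocational strength instead of frequency would need corpus-wide counts for each word as well, which this sketch does not compute.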
‘Stanford’ collocate search via Davies, Mark. (2004-) BYU-BNC: The British National Corpus. Available online at http://corpus.byu.edu/bnc.
Collocates
Collocate Cloud
‘Stanford’ search via Beavan, David. (2008-) BNC Collocate Cloud. Available online at http://www.scottishcorpus.ac.uk/corpus/bnc/collocatecloud.php
Collocate Cloud properties
• 100 most frequent collocates listed alphabetically
• Font size shows frequency of word
• Brightness shows collocational strength of word
• Interactively create new clouds
• Best New Idea for Improving a Current Web-Based Tool, 2008 TADA Research Evaluation eXchange (T-REX)
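One way to picture the cloud's visual encoding is the Python sketch below. The slides do not say how the Collocate Cloud scales its values, so the linear scaling, the point-size range, and the names `scale` and `cloud_entries` are assumptions made purely for illustration; the sample figures are invented.

```python
def scale(value, lo, hi, out_lo, out_hi):
    """Linearly map `value` from the range [lo, hi] to [out_lo, out_hi]."""
    if hi == lo:
        return (out_lo + out_hi) / 2
    return out_lo + (value - lo) * (out_hi - out_lo) / (hi - lo)


def cloud_entries(collocates, top_n=100, min_pt=10, max_pt=40):
    """Build cloud entries from a dict of word -> (frequency, strength)."""
    # Keep only the most frequent collocates.
    top = sorted(collocates.items(), key=lambda kv: kv[1][0], reverse=True)[:top_n]
    if not top:
        return []
    freqs = [freq for _, (freq, _) in top]
    strengths = [strength for _, (_, strength) in top]
    entries = []
    # List the retained words alphabetically, as the cloud does.
    for word, (freq, strength) in sorted(top):
        entries.append({
            "word": word,
            # Font size encodes frequency.
            "font_pt": scale(freq, min(freqs), max(freqs), min_pt, max_pt),
            # Brightness (0 = dim, 1 = bright) encodes collocational strength.
            "brightness": scale(strength, min(strengths), max(strengths), 0.0, 1.0),
        })
    return entries


# Invented frequencies and strengths, for illustration only.
sample = {"university": (50, 2.0), "california": (10, 6.0), "campus": (30, 4.0)}
entries = cloud_entries(sample)
```

In this toy input, "california" is rare but strongly associated, so it would appear small and bright, while "university" would appear large and dim.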
Comparison
• Investigate and compare word usage
  – Expose attitudes and cultures
  – Investigate degrees of synonymy
• Semantic prosody
  – How synonymous words can actually take on positive or negative connotations
• Applications for language learning
  – Examine real-world usage of words
ComPair properties
• Visualise usage of two node words
• Distribute 150+ collocates on a continuum
• Colour shows attraction to node
• Brightness shows degree of collocational attraction
• Currently uses British National Corpus
• Can be applied to any corpus or dataset (in progress)
ComPair how-to
• Take two collocate word lists
  – Same corpus, different node words
  – Different corpora, same node word
• Calculate collocational strength towards each node
  – Mutual Information etc.
• Place collocates on continuum between node words
  – Those with attraction to a single node appear near that node
  – Those with little attraction to either node appear central and dim
  – Those with attraction to both nodes appear central and bright
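The placement rules above can be sketched as follows. This is a hypothetical mapping, not ComPair's actual code: the slides do not give the formula, so the position and brightness calculations, the function names `mutual_information` and `place`, and the sample strengths are all assumptions. The Mutual Information helper uses the common form log2(f(node, collocate) × N / (f(node) × f(collocate))); tools differ in how they adjust this for window size.

```python
import math


def mutual_information(pair_freq, node_freq, collocate_freq, corpus_size):
    """MI score for a node/collocate pair, from raw corpus counts."""
    return math.log2(pair_freq * corpus_size / (node_freq * collocate_freq))


def place(strength_a, strength_b):
    """Lay out collocates between node A (position 0.0) and node B (position 1.0).

    strength_a / strength_b: dict of word -> collocational strength towards
    that node. Negative strengths are treated as no attraction.
    """
    clipped_a = {w: max(s, 0.0) for w, s in strength_a.items()}
    clipped_b = {w: max(s, 0.0) for w, s in strength_b.items()}
    # Strongest attraction seen anywhere; fall back to 1.0 to avoid dividing by zero.
    peak = max(list(clipped_a.values()) + list(clipped_b.values()), default=0.0) or 1.0
    layout = {}
    for word in sorted(set(clipped_a) | set(clipped_b)):
        a = clipped_a.get(word, 0.0)
        b = clipped_b.get(word, 0.0)
        layout[word] = {
            # Pulled towards whichever node attracts it more; equal pulls cancel.
            "position": 0.5 + (b - a) / (2 * peak),
            # Bright if strongly attracted to at least one node.
            "brightness": max(a, b) / peak,
        }
    return layout


# Invented strengths, for illustration only.
to_a = {"only_a": 8.0, "both": 8.0, "weak": 1.0}
to_b = {"only_b": 8.0, "both": 8.0, "weak": 1.0}
layout = place(to_a, to_b)
```

With these values, "only_a" and "only_b" sit at opposite ends, "both" is central and bright, and "weak" is central and dim, which mirrors the three cases listed above.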
ComPair: http://www.scottishcorpus.ac.uk/corpus/bnc/compair.php