Textual Portraits: Using Word Clouds as an Analysis Tool for Digitized Texts. Shonn Haren, Wichita State University Libraries, February 28, 2015



Textual Portraits: Using Word Clouds as an Analysis Tool for Digitized Texts

Shonn Haren

Wichita State University Libraries

February 28, 2015


Visualizing the Text

“…often the most effective way to describe, explore and summarize a set of numbers – even a very large set – is to look at pictures of those numbers.”

– Edward Tufte, The Visual Display of Quantitative Information


Word Clouds for Text Search

• Google Books

– https://books.google.com

– Accessible from the book’s record page

– Click on a word to see the places where it occurs within the text.


Word Clouds for Textual Analysis

• Brad Borevitz’s State of the Union Website

– http://stateoftheunion.onetwothree.net/index.shtml

– Compare the clouds of speeches from Washington to Obama

– Track the use of terms over time
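Tracking a term over time comes down to measuring how often it appears in each dated text, normalized by text length so that long and short speeches are comparable. The following is a minimal sketch of that idea, not a description of how the State of the Union site works. The function names `term_rate` and `track`, and the placeholder strings in `speeches`, are assumptions made up for illustration.

```python
import re

def term_rate(text, term):
    """Return occurrences of `term` per 1,000 words of `text`."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return 1000 * words.count(term.lower()) / len(words)

# Hypothetical placeholder texts keyed by year; not real speech excerpts.
speeches = {
    1800: "peace and commerce with all nations",
    1900: "commerce and industry and commerce again",
    2000: "jobs and peace and jobs",
}

def track(term, corpus):
    """Map each year (in chronological order) to the term's rate that year."""
    return {year: term_rate(text, term) for year, text in sorted(corpus.items())}

trend = track("commerce", speeches)
for year, rate in trend.items():
    print(year, round(rate, 1))
```

Normalizing per 1,000 words is one reasonable choice among several; raw counts would favor longer texts.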


What are Word Clouds?

• Machine-readable text is broken down into its component words

• Common words (a, an, the, etc.) are filtered out

• Remaining words are assigned values based on frequency of use

• The words are arranged randomly in a cloud, with each word’s size determined by its frequency of use.
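The steps above can be sketched in a few lines of Python. This is an illustrative outline only, not the algorithm any particular word cloud tool uses. The stop-word list is deliberately tiny, the linear font scaling and the point sizes are assumptions, and the random placement step is left out.

```python
import re
from collections import Counter

# Assumed, deliberately short stop-word list; real tools ship much longer ones.
STOP_WORDS = {"a", "an", "the", "and", "of", "to", "in", "that",
              "we", "it", "is", "for", "on", "by", "or"}

def word_frequencies(text, stop_words=STOP_WORDS):
    """Break text into lowercase words, drop common words, count the rest."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stop_words)

def font_sizes(freqs, min_pt=10, max_pt=72):
    """Scale each count linearly so the most frequent word gets max_pt."""
    if not freqs:
        return {}
    top = max(freqs.values())
    return {w: min_pt + (max_pt - min_pt) * c / top for w, c in freqs.items()}

sample = "government of the people, by the people, for the people"
freqs = word_frequencies(sample)
sizes = font_sizes(freqs)
print(freqs.most_common())
```

Here "people" occurs three times and "government" once, so "people" would be drawn at the largest size.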


The Bill of Rights


The Gettysburg Address


Comparing Word Clouds

Topeka Constitution Lecompton Constitution


Comparing Word Clouds

Topeka Constitution Lecompton Constitution
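A side-by-side comparison of two clouds can also be approximated numerically by asking which words take up a larger share of one text than of the other. The sketch below is one possible way to do that and is not the method used to produce the slides. The function names and the two toy strings are hypothetical, and stop-word filtering is omitted for brevity.

```python
import re
from collections import Counter

def relative_freqs(text):
    """Return each word's share of the total word count."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words) or 1
    return {w: c / total for w, c in Counter(words).items()}

def distinctive(text_a, text_b, n=3):
    """Words whose share of text_a most exceeds their share of text_b."""
    fa, fb = relative_freqs(text_a), relative_freqs(text_b)
    diffs = {w: fa[w] - fb.get(w, 0.0) for w in fa}
    # Sort by largest difference first, breaking ties alphabetically.
    return sorted(diffs, key=lambda w: (-diffs[w], w))[:n]

# Hypothetical toy strings, not quotations from either constitution.
doc_a = "free state free labor free soil"
doc_b = "state property state rights property"

print(distinctive(doc_a, doc_b, 1))
print(distinctive(doc_b, doc_a, 2))
```

Using shares rather than raw counts keeps the comparison fair when the two documents differ in length.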


So How Does This Work?

• http://www.wordle.net


Benefits and Drawbacks

• Benefits

– Wordle.net is free to use

– Shallow learning curve

– Images produced can be used for outreach too

• Drawbacks

– Text must be machine-readable

– Synonyms aren’t eliminated

– Archaic texts and translations contain common words that standard filters miss, so they must be removed manually
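The manual removal mentioned in the last drawback amounts to extending the list of filtered words. Below is a minimal sketch of that step when counting words in code rather than in Wordle. Both word lists are assumed examples, and the sample sentence is only there to illustrate the effect.

```python
import re
from collections import Counter

# Assumed short list of modern common words.
BASE_STOP_WORDS = {"a", "an", "the", "and", "of", "to", "in", "is", "you", "your"}
# Assumed examples of archaic function words that a modern list would miss.
ARCHAIC_STOP_WORDS = {"thou", "thee", "thy", "thine", "hath", "doth", "ye", "unto"}

def count_words(text, stop_words):
    """Count lowercase words in text, skipping anything in stop_words."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stop_words)

# Illustrative sentence in an archaic register.
sample = "Thou shalt love thy neighbour and thy neighbour shall love thee"

before = count_words(sample, BASE_STOP_WORDS)
after = count_words(sample, BASE_STOP_WORDS | ARCHAIC_STOP_WORDS)
print(before.most_common(3))
print(after.most_common(3))
```

With only the modern list, "thy" is counted as often as "love" and "neighbour" and would appear just as large in the cloud. After the archaic words are added, only the content words remain.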


In Conclusion…

• Word clouds don’t replace traditional textual analysis or reading

• They are a first-glance tool for comparing unfamiliar texts with minimal metadata


Any Questions?