Demystifying Digital Humanities: Winter 2014 session #1
DESCRIPTION
Slides from the January 18th Demystifying Digital Humanities workshop on Exploring Programming in the Humanities, held at the Simpson Center for the Humanities, and taught by Paige Morgan, Sarah Kremen-Hicks, and Brian Gutierrez
TRANSCRIPT
![Page 1: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/1.jpg)
DMDH Winter 2014 Session #1: Exploring Programming in the Digital Humanities
![Page 2: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/2.jpg)
Programming is complex enough that just figuring out what you want to do and
what sort of language you need is work.
![Page 3: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/3.jpg)
Thinking that you ought to be able to do everything almost immediately is a recipe for
feeling terrible.
![Page 4: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/4.jpg)
Being aware that it is genuine work, and not just work for newbies,
matters.
![Page 5: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/5.jpg)
Photo by MK Fautoyére, via Flickr
![Page 6: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/6.jpg)
There will always be new programs and
platforms that you will want to experiment
with.
![Page 7: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/7.jpg)
Working with technology means periodically
starting from scratch -- a bit like working with a
new time period or culture; or figuring out
how to teach a new class.
![Page 8: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/8.jpg)
What can programming languages do?
![Page 9: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/9.jpg)
Programming languages can...
![Page 10: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/10.jpg)
They can also do all these things in combination.
![Page 11: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/11.jpg)
Example #1
• find all the statements in quotes ("") from a novel
• count how many words are in each statement
• put the statements in order from fewest words to most
• write all the statements from the novel in a text file
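The four steps of Example #1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample text, the function names, and the output file name are all made up for the sketch, and it only recognizes straight double quotes (a real novel may use curly quotes or single quotes, which would need a different pattern).

```python
import re
import tempfile
from pathlib import Path

# Hypothetical sample text standing in for a whole novel.
SAMPLE = (
    'She looked up. "Where are you going?" she asked. '
    '"Out," he said. "I will be back before it gets dark."'
)

def find_quotations(text):
    """Return every passage enclosed in straight double quotes."""
    return re.findall(r'"([^"]*)"', text)

def sort_by_word_count(statements):
    """Order statements from fewest words to most."""
    return sorted(statements, key=lambda s: len(s.split()))

def write_statements(statements, path):
    """Write one statement per line, prefixed by its word count."""
    lines = [f"{len(s.split())}\t{s}" for s in statements]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")

quotes = sort_by_word_count(find_quotations(SAMPLE))
# Hypothetical output location in the system's temporary folder.
out_file = Path(tempfile.gettempdir()) / "quotations.txt"
write_statements(quotes, out_file)
print(quotes)
```

Note that "a word" here simply means anything separated by spaces, which is itself a decision about the data.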
![Page 12: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/12.jpg)
Example #2
• allow a user to type in some information, e.g., "Benedict Cumberbatch"
• compare "Benedict Cumberbatch" to a much larger file
• retrieve any data that matches the information
• print the retrieved information on screen
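A minimal sketch of Example #2 in Python might look like the following. The record lines and the function name are hypothetical, and the sketch makes one particular choice about what "matching data" means: it returns the whole line in which the name appears, ignoring upper and lower case.

```python
def find_matches(query, lines):
    """Return (line_number, line) pairs whose text contains the query, ignoring case."""
    needle = query.casefold()
    return [
        (number, line.strip())
        for number, line in enumerate(lines, start=1)
        if needle in line.casefold()
    ]

# Hypothetical stand-in for "a much larger file".
RECORDS = [
    "Benedict Cumberbatch | cast list | page 12",
    "Martin Freeman | cast list | page 12",
    "Interview with Benedict Cumberbatch | page 40",
]

for number, line in find_matches("benedict cumberbatch", RECORDS):
    print(f"{number}: {line}")
```

In an interactive version the query could come from Python's built-in `input()`, and the lines could be read from an open file instead of a list.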
![Page 13: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/13.jpg)
Example #3
• "read" two texts -- say, two plays by Seneca
• search for any words that the two plays have in common
• print the words that they have in common on screen
• calculate what percentage of the words in each play are shared
• print that percentage on screen
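Example #3 could be sketched like this. The two one-line "plays" are invented placeholders, and the sketch assumes that "percentage of the words" means the share of each text's distinct vocabulary; counting every occurrence of every word would give different numbers.

```python
import re

def vocabulary(text):
    """Return the set of distinct lowercase words in a text."""
    return set(re.findall(r"[a-z']+", text.lower()))

def shared_vocabulary(text_a, text_b):
    """Return the shared words and the share of each text's vocabulary they make up."""
    words_a, words_b = vocabulary(text_a), vocabulary(text_b)
    common = words_a & words_b
    percent_a = 100 * len(common) / len(words_a) if words_a else 0.0
    percent_b = 100 * len(common) / len(words_b) if words_b else 0.0
    return common, percent_a, percent_b

# Hypothetical one-line "plays"; real input would be the full texts.
PLAY_A = "The king speaks and the chorus answers"
PLAY_B = "The queen speaks and the messenger answers"

common, pct_a, pct_b = shared_vocabulary(PLAY_A, PLAY_B)
print(sorted(common))
print(f"{pct_a:.1f}% of play A's vocabulary, {pct_b:.1f}% of play B's")
```

The word pattern only recognizes the unaccented letters a to z and apostrophes, so texts with accented characters would need a broader pattern.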
![Page 14: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/14.jpg)
Example #4
•if the user is located in geographic location Z, e.g., 45th and University, go to an online address and retrieve some text
•print that text on the user’s tablet screen
•receive input from the user and respond
![Page 15: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/15.jpg)
However...
• In Example #1, the computer is focusing on things that characters say. But what if you want to isolate speeches from just one character?
• In Example #2, how does the computer know how much text to print? Will it just print "Benedict Cumberbatch" 379 times, because that's how often it appears in the larger file?
![Page 16: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/16.jpg)
These are the areas of programming where critical thinking and
humanities skills become vital.
![Page 17: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/17.jpg)
The Difference
•Humans are good at differentiating between material in complex and sophisticated ways.
•Computers are good at not differentiating between material unless they’ve been specifically instructed to do so.
![Page 18: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/18.jpg)
Computers work with data.
You work with data, too -- but in
most cases, you'll have to make
your data readable by computer.
![Page 19: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/19.jpg)
How to make your data machine-readable
•Annotate it with markup language
•Organize it in patterns that the computer can understand
•Add data that is not explicitly readable in the current format (e.g., hardbound/softbound binding; language: English; date of record creation)
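As a small illustration of the third point, here is a sketch of a record whose otherwise implicit facts are written out explicitly. Every field name and value is a made-up example, and JSON is just one convenient machine-readable format among many.

```python
import json

# A hypothetical catalogue record; every field name and value is illustrative.
record = {
    "title": "Songs of Innocence",
    "binding": "hardbound",
    "language": "English",
    "record_created": "2014-01-18",
}

# Serialize to JSON so that another program can read the same facts back.
encoded = json.dumps(record, indent=2, sort_keys=True)
decoded = json.loads(encoded)
print(encoded)
```

A person holding the book can see that it is hardbound; a computer only knows it once the fact has been recorded as data.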
![Page 20: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/20.jpg)
Depending on the data you have, and the way
you annotate or structure it, different
things become possible.
![Page 21: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/21.jpg)
Your goal is to make the data As Simple As Possible -- but not so simple that it
stops being useful.
![Page 22: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/22.jpg)
Depending on the data you work with, the
work of structuring or annotating becomes
more challenging, but also more useful.
![Page 23: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/23.jpg)
The work of creating data is social.
![Page 24: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/24.jpg)
Many programming languages have governing bodies that establish standards for their
use:
•the World Wide Web Consortium (W3C) (http://www.w3.org/standards/)
•the TEI Technical Council
![Page 25: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/25.jpg)
BREAK!
![Page 26: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/26.jpg)
Data Examples
•Annotated (Markup Languages: HTML, TEI)
•Structured (MySQL)
•Combination (Semantic Web)
![Page 27: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/27.jpg)
Markup: HTML
<i> This text is italic.</i> =
This text is italic.
![Page 28: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/28.jpg)
Markup: HTML
<a href="http://www.dmdh.org">This text</a> will take you to a webpage. =
This text will take you to a webpage.
![Page 29: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/29.jpg)
Markup: HTML
Anything can be data -- and markup languages provide instructions for how
computers should treat that data.
![Page 30: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/30.jpg)
Markup: HTML
HTML is used to format text on webpages.
<p> separates text into paragraphs.
<em> emphasizes text (browsers usually display it in italics; <strong> is the tag that usually displays as bold).
These are just a few of the HTML formatting instructions that you can use.
![Page 31: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/31.jpg)
HTML Syntax Rules
•Opening and closing tags: <> and </>
•Attributes (2nd-level information) defined using =""
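To see these two rules from the computer's side, the sketch below uses `HTMLParser` from Python's standard library to report each opening tag (with its attributes) and each closing tag in the link example from the earlier slide. The class name `TagLister` and the surrounding `<p>` tags are additions made for the illustration.

```python
from html.parser import HTMLParser

class TagLister(HTMLParser):
    """Collect each opening tag with its attributes, and each closing tag."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        self.events.append(("open", tag, dict(attrs)))

    def handle_endtag(self, tag):
        self.events.append(("close", tag, {}))

parser = TagLister()
parser.feed('<p><a href="http://www.dmdh.org">This text</a> will take you to a webpage.</p>')
parser.close()
for event in parser.events:
    print(event)
```

The attribute `href` travels with its opening tag, which is why the slide calls attributes second-level information.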
![Page 32: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/32.jpg)
Markup languages are popular in digital
humanities because lots of humanists work
with texts.
![Page 33: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/33.jpg)
Without markup languages, the things that a computer can
search for are limited.
![Page 34: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/34.jpg)
Ctrl + F: any text in iambic pentameter.
![Page 35: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/35.jpg)
Markup: TEI
With markup, the things you can search for are limited only by your interpretation.
![Page 36: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/36.jpg)
Markup: TEI
TEI (Text Encoding Initiative)
![Page 37: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/37.jpg)
Poetry w/ TEI
<text xmlns="http://www.tei-c.org/ns/1.0" xml:id="d1">
  <body xml:id="d2">
    <div1 type="book" xml:id="d3">
      <head>Songs of Innocence</head>
      <pb n="4"/>
      <div2 type="poem" xml:id="d4">
        <head>Introduction</head>
        <lg type="stanza">
          <l>Piping down the valleys wild, </l>
          <l>Piping songs of pleasant glee, </l>
          <l>On a cloud I saw a child, </l>
          <l>And he laughing said to me: </l>
        </lg>
![Page 38: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/38.jpg)
Grammar w/ TEI
<entry>
  <form>
    <orth>pamplemousse</orth>
  </form>
  <gramGrp>
    <gram type="pos">noun</gram>
    <gram type="gen">masculine</gram>
  </gramGrp>
</entry>
![Page 39: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/39.jpg)
TEI’s syntax rules are nearly identical to HTML’s (TEI is written in XML, which is stricter: every opening tag must be closed) -- though your normal browser can’t work with TEI the way it works with HTML.
![Page 40: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/40.jpg)
TEI is meant to be a highly social language
-- meaning that the committee that
maintains its standards wants it to be something that anyone
can use.
![Page 41: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/41.jpg)
In order for TEI to successfully encode texts, it has to be
adaptable to individual projects.
![Page 42: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/42.jpg)
Anything that you can isolate (and put in brackets) can (theoretically) be pulled
out and displayed for a reader.
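As a hedged illustration of "pulling out" what has been marked up, the sketch below uses Python's standard `xml.etree.ElementTree` module on the Blake stanza from the "Poetry w/ TEI" slide. The closing tags are added and the `xml:id` attributes and page break are dropped so that the excerpt parses as a complete document; that tidying is an assumption made for the sketch, not part of the slide.

```python
import xml.etree.ElementTree as ET

# TEI elements live in this namespace, so searches must name it.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# The slide's stanza, abbreviated and closed so that it is well-formed XML.
POEM = """
<text xmlns="http://www.tei-c.org/ns/1.0">
  <body>
    <div1 type="book">
      <head>Songs of Innocence</head>
      <div2 type="poem">
        <head>Introduction</head>
        <lg type="stanza">
          <l>Piping down the valleys wild, </l>
          <l>Piping songs of pleasant glee, </l>
          <l>On a cloud I saw a child, </l>
          <l>And he laughing said to me: </l>
        </lg>
      </div2>
    </div1>
  </body>
</text>
"""

root = ET.fromstring(POEM.strip())
# Pull out the poem's title and every verse line (<l>) by tag name.
title = root.find(".//tei:div2/tei:head", TEI_NS).text
verse_lines = [line.text.strip() for line in root.iterfind(".//tei:l", TEI_NS)]

print(title)
for verse in verse_lines:
    print(verse)
```

Because the lines were tagged as `<l>` and the title as `<head>`, the program can ask for either one without knowing anything about poetry.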
![Page 43: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/43.jpg)
TEI can be used to encode more than just text:
<div type="shot">
  <view>BBC World symbol</view>
  <sp>
    <speaker>Voice Over</speaker>
    <p>Monty Python's Flying Circus tonight comes to you live from the Grillomat Snack Bar, Paignton.</p>
  </sp>
</div>
<div type="shot">
  <view>Interior of a nasty snack bar. Customers around, preferably real people. Linkman sitting at one of the plastic tables.</view>
  <sp>
    <speaker>Linkman</speaker>
    <p>Hello to you live from the Grillomat Snack Bar.</p>
  </sp>
</div>
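This kind of encoding is what makes it possible to answer the earlier question about isolating speeches from just one character. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the enclosing `<body>` element and the function name `speeches_by` are assumptions added so that the two shots form a single parseable document.

```python
import xml.etree.ElementTree as ET

# The two shots from the slide, wrapped in a hypothetical <body> root element.
SCRIPT = """
<body>
  <div type="shot">
    <view>BBC World symbol</view>
    <sp>
      <speaker>Voice Over</speaker>
      <p>Monty Python's Flying Circus tonight comes to you live from the Grillomat Snack Bar, Paignton.</p>
    </sp>
  </div>
  <div type="shot">
    <view>Interior of a nasty snack bar. Customers around, preferably real people. Linkman sitting at one of the plastic tables.</view>
    <sp>
      <speaker>Linkman</speaker>
      <p>Hello to you live from the Grillomat Snack Bar.</p>
    </sp>
  </div>
</body>
"""

def speeches_by(speaker_name, root):
    """Return the text of every <p> inside an <sp> whose <speaker> matches."""
    found = []
    for sp in root.iter("sp"):
        if sp.findtext("speaker") == speaker_name:
            found.extend(p.text for p in sp.findall("p"))
    return found

root = ET.fromstring(SCRIPT.strip())
print(speeches_by("Linkman", root))
```

The stage directions in `<view>` are ignored here only because the function never asks for them; a different question could pull out those instead.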
![Page 44: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/44.jpg)
Or, you could encode all of Stephenie Meyer’s Twilight according to its emotional register.
![Page 45: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/45.jpg)
Whether you include or exclude some
aspect of the text in your markup can be very important from
an academic perspective.
![Page 46: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/46.jpg)
The challenge of creating good data is
one reason that collaboration is so important to digital
scholarship.
![Page 47: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/47.jpg)
Data Collaboration
•Avoid reinventing the wheel (has the markup for this text already been done?)
•Consider the labor involved vs. the outcome (and future use of the data you create).
![Page 48: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/48.jpg)
Structured Data
![Page 49: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/49.jpg)
Study Scenario #1
•You study urban espresso stands: their hours, brands of coffee, whether or not they sell pastries, and how far the espresso stands are from major roadways.
![Page 50: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/50.jpg)
Study Scenario #2
•You study female characters in novels written between 1700 and 1850. Encoding a whole novel just to study female characters isn’t practical for you.
![Page 51: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/51.jpg)
Both scenarios involve aggregating
information, rather than encoding it.
![Page 52: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/52.jpg)
Structured Data: Example #1 (MySQL)

| ID | Name | Location | Hours | Coffee Brand | Pastries (Y/N) | Distance from Street |
|---|---|---|---|---|---|---|
| 008 | Java the Hut | 56 Farringdon Road, London, UK | 7:00 a.m.-2:00 p.m. | Square Mile Roasters | N | 25 meters |
| 009 | Prufrock Coffee | 18 Shoreditch High Street | 7:00 a.m.-10:00 p.m. | Monmouth | Y | 10 meters |
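Once the espresso stands are stored as rows and columns, questions about them become queries. The sketch below uses SQLite (through Python's built-in `sqlite3` module) rather than MySQL, simply because it needs no server; the SQL involved is very similar. The table name, column names, and the choice to store Y/N as 1/0 and distances as whole meters are all assumptions made for the illustration.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE espresso_stands (
        id TEXT PRIMARY KEY,
        name TEXT,
        location TEXT,
        hours TEXT,
        coffee_brand TEXT,
        pastries INTEGER,              -- 1 = yes, 0 = no
        distance_from_street_m INTEGER
    )
    """
)
conn.executemany(
    "INSERT INTO espresso_stands VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("008", "Java the Hut", "56 Farringdon Road, London, UK",
         "7:00 a.m.-2:00 p.m.", "Square Mile Roasters", 0, 25),
        ("009", "Prufrock Coffee", "18 Shoreditch High Street",
         "7:00 a.m.-10:00 p.m.", "Monmouth", 1, 10),
    ],
)

# Which stands sell pastries, nearest to the street first?
rows = conn.execute(
    "SELECT name, distance_from_street_m FROM espresso_stands "
    "WHERE pastries = 1 ORDER BY distance_from_street_m"
).fetchall()
print(rows)
```

Storing the distance as a number rather than as the text "25 meters" is what allows the database to sort and compare it.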
![Page 53: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/53.jpg)
![Page 54: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/54.jpg)
Structured Data: Example #2 (RDF)
![Page 55: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/55.jpg)
How your data is (or can be) structured will
influence the technology that you
(can) use to work with it.
![Page 56: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/56.jpg)
Digital humanists see creating machine-readable data as
valuable scholarship.
![Page 57: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/57.jpg)
Examples
•Homer Multi-Text Project
•Modernist Versions Project
•Scalar (platform)
•Century Ireland
![Page 58: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/58.jpg)
Exercise: You Create the Data!
![Page 59: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/59.jpg)
Your data determines
your project.
![Page 60: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/60.jpg)
Every project has data.
Text objects, images, tags, geographical coordinates, categories, records, creator
metadata, etc.
![Page 61: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/61.jpg)
Even if you’re not planning to learn any programming skills, you are still working
with data.
![Page 62: Demystifying Digital Humanities: Winter 2014 session #1](https://reader036.vdocuments.us/reader036/viewer/2022062709/558dff441a28aba30d8b476d/html5/thumbnails/62.jpg)
Next time: Programming on the Whiteboard
February 1st, 9:30, CMU 202
•Cleaning data before you work with it!
•Identifying specific programming tasks
•How access affects your project idea
•Flash project development
•Homework: bring some data to work with.
Please take our quick eval survey! http://tinyurl.com/dmdh14jan