DMDS Winter 2015 Workshop 1 slides

Winter 2015 Session #1: Exploring Programming in Digital Scholarship February 12, 2015 Paige Morgan Sherman Centre for Digital Scholarship

Upload: paige-morgan

Post on 16-Jul-2015

TRANSCRIPT

Page 1: DMDS Winter 2015 Workshop 1 slides

Winter 2015 Session #1:

Exploring Programming in Digital Scholarship

February 12, 2015

Paige Morgan

Sherman Centre for Digital Scholarship

Page 2: DMDS Winter 2015 Workshop 1 slides

Programming is complex

enough that just figuring

out what you want to do

and what sort of language

you need is work.

Page 3: DMDS Winter 2015 Workshop 1 slides

Thinking that you ought to be able

to do everything almost

immediately is a recipe for feeling

terrible.

Page 4: DMDS Winter 2015 Workshop 1 slides

Photo by MK Fautoyére, via Flickr

Page 5: DMDS Winter 2015 Workshop 1 slides

There will always be new

programs and platforms

that you will want to

experiment with.

Page 6: DMDS Winter 2015 Workshop 1 slides

Working with technology

means periodically starting

from scratch -- a bit like

working with a new time

period or culture; or figuring

out how to teach a new

class.

Page 7: DMDS Winter 2015 Workshop 1 slides
Page 8: DMDS Winter 2015 Workshop 1 slides

Being able to effectively

communicate about your

project as it relates to

programming is a skill in

itself.

Page 9: DMDS Winter 2015 Workshop 1 slides

What can programming

languages do?

Page 10: DMDS Winter 2015 Workshop 1 slides

Programming languages can...

• search for things

• match things

• read things

• write things

• receive information, and give it

back, changed or unchanged

• count things

• do math

• arrange things in quantitative or

random order

• respond: if x, do y OR do x until

y happens

• compare things for similarity

• go to a file at a location, and

retrieve readable text

• display things according to

instructions that you provide

• draw points, lines, and shapes

Page 11: DMDS Winter 2015 Workshop 1 slides

They can also do many or

all of these things in

combination.

Page 12: DMDS Winter 2015 Workshop 1 slides

Example #1

• find all the statements in quotes ("") from a novel.

• count how many words are in each statement

• put the statements in order from the fewest words to the most

• write all the statements from the novel in a

text file
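The four steps in Example #1 could be sketched in Python roughly as follows. This is a minimal illustration, not a method from the workshop: the sample text, the function names, and the output file name are all invented here, and the pattern only recognizes straight double quotes (a real novel with curly quotes or nested quotations would need more care).

```python
import re
import tempfile
from pathlib import Path

# Hypothetical sample text standing in for a novel.
SAMPLE = (
    'She looked up. "Where are you going?" he asked. '
    '"Out," she said. "I will be back before the rain starts."'
)

def quoted_statements(text):
    """Return every run of text found between straight double quotes."""
    return re.findall(r'"([^"]*)"', text)

def sort_by_word_count(statements):
    """Order statements from fewest words to most."""
    return sorted(statements, key=lambda s: len(s.split()))

# Steps 1-3: find the statements, count their words, and sort them.
statements = sort_by_word_count(quoted_statements(SAMPLE))

# Step 4: write one statement per line to a text file in a temporary folder.
out_path = Path(tempfile.mkdtemp()) / "statements.txt"
out_path.write_text("\n".join(statements), encoding="utf-8")

for s in statements:
    print(len(s.split()), s)
```

Note that "a word" here simply means anything separated by spaces, which is itself a decision someone has to make.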

Page 13: DMDS Winter 2015 Workshop 1 slides

Example #2

• allow a user to type in some information, e.g., "Benedict Cumberbatch"

• compare “Benedict Cumberbatch” to a much

larger file

• retrieve any data that matches the

information

• print the retrieved information on screen
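A small Python sketch of Example #2, under stated assumptions: the "much larger file" is replaced by three sample lines written for this illustration, the user's input is passed in as an ordinary string, and `matching_lines` is a hypothetical helper name.

```python
# Hypothetical lines standing in for the "much larger file".
LARGER_FILE = [
    "Benedict Cumberbatch played Sherlock Holmes.",
    "Martin Freeman played John Watson.",
    "In 2014, Benedict Cumberbatch starred in The Imitation Game.",
]

def matching_lines(query, lines):
    """Return each line containing the query, ignoring upper/lower case."""
    needle = query.lower()
    return [line for line in lines if needle in line.lower()]

# The query would normally come from the user; it is hard-coded here.
results = matching_lines("benedict cumberbatch", LARGER_FILE)
for line in results:
    print(line)
```

Returning the whole line around each match is only one possible choice about how much text to retrieve; a sentence, a paragraph, or a database record would be equally valid units.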

Page 14: DMDS Winter 2015 Workshop 1 slides

Example #3

• "read" two texts -- say, two plays by Seneca

• search for any words that the two plays have in

common

• print the words that they have in common on

screen

• calculate what percentage of the words in each

play are shared

• print that percentage onscreen
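One way Example #3 might look in Python. The two "plays" below are single invented sentences, not text by Seneca, and the function names are illustrative. The sketch also assumes that "words in common" means distinct words, compared in lowercase.

```python
import re

# Hypothetical one-line stand-ins for two plays.
PLAY_A = "Fortune rules the lives of men and fortune is blind"
PLAY_B = "Blind anger rules the heart and ruins men"

def vocabulary(text):
    """Return the set of distinct lowercase words in a text."""
    return set(re.findall(r"[a-z']+", text.lower()))

def percent_shared(own_words, shared_words):
    """Percentage of a text's distinct words that also occur in the other."""
    return 100 * len(shared_words) / len(own_words)

words_a = vocabulary(PLAY_A)
words_b = vocabulary(PLAY_B)
shared = words_a & words_b  # set intersection: words found in both texts

print(sorted(shared))
print(round(percent_shared(words_a, shared), 1), "% of play A's vocabulary")
print(round(percent_shared(words_b, shared), 1), "% of play B's vocabulary")
```

Counting distinct words rather than every occurrence, and treating "Fortune" and "fortune" as the same word, are choices that change the percentages.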

Page 15: DMDS Winter 2015 Workshop 1 slides

Example #4

• if the user is located in geographic location Z, e.g., 45th and University, go to an online address and retrieve some text

• print that text on the user’s tablet

screen

• receive input from the user and respond

Page 16: DMDS Winter 2015 Workshop 1 slides

However...

• In Example #1, the computer is focusing on

things that characters say. But what if you want

to isolate speeches from just one character?

• In Example #2, how does the computer know

how much text to print? Will it just print

"Benedict Cumberbatch" 379 times, because

that's how often it appears in the larger file?

Page 17: DMDS Winter 2015 Workshop 1 slides

These are the areas of

programming where

critical thinking and

specialized disciplinary

knowledge become vital.

Page 18: DMDS Winter 2015 Workshop 1 slides

The Difference

• Humans are good at differentiating

between material in complex and

sophisticated ways.

• Computers are good at not

differentiating between material unless

they’ve been specifically instructed to

do so.

Page 19: DMDS Winter 2015 Workshop 1 slides

Computers work with

data.

You work with data, too --

but you may have to do

extra work to make your

data readable by

computer.

Page 20: DMDS Winter 2015 Workshop 1 slides

Ways to make your data machine-readable

• Annotate it with a markup language

• Organize it in patterns that the

computer can understand

• Add metadata that is not explicitly readable in the current format (e.g., hardbound/softbound binding; language: English; date of record creation)

Page 21: DMDS Winter 2015 Workshop 1 slides

Depending on the data

you have, and the way

you annotate or structure

it, different things become

possible.

Page 22: DMDS Winter 2015 Workshop 1 slides

Your goal is to make the

data As Simple As

Possible -- but not so

simple that it stops being

useful.

Page 23: DMDS Winter 2015 Workshop 1 slides

Depending on the data

you work with, the work of

structuring or annotating

becomes more

challenging, but also

more useful.

Page 24: DMDS Winter 2015 Workshop 1 slides

The work of creating data

is social.

Page 25: DMDS Winter 2015 Workshop 1 slides

Many programming languages

have governing bodies that

establish standards for their

use:

• the World Wide Web Consortium (W3C)

(http://www.w3.org/standards/)

• the TEI Technical Council

Page 26: DMDS Winter 2015 Workshop 1 slides

Data Examples

• Annotated (Markup Languages: HTML,

TEI)

• Structured (MySQL)

• Combination (Linked Open Data)

• Object-Oriented Programming (Java,

Python, Ruby on Rails)

Page 27: DMDS Winter 2015 Workshop 1 slides

Markup: HTML

<i>This text is italic.</i> = This text is italic.

Page 28: DMDS Winter 2015 Workshop 1 slides

Markup: HTML

<a href="http://www.dmdh.org">This text</a> will take you to a webpage.

=

This text will take you to a webpage.

Page 29: DMDS Winter 2015 Workshop 1 slides

Markup: HTML

Anything can be data -- and markup

languages provide instructions for how

computers should treat that data.

Page 30: DMDS Winter 2015 Workshop 1 slides

Markup: HTML

HTML is used to format text on webpages.

<p> separates text into paragraphs.

<em> emphasizes text (browsers usually display it in italics; <strong> is usually displayed in bold).

These are just a few of the HTML formatting instructions

that you can use.

Page 31: DMDS Winter 2015 Workshop 1 slides

HTML Syntax Rules

• Opening and closing tags: <> and </>

• Attributes (2nd-level information) defined using =""
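To show how a program can use these two rules, here is a minimal Python sketch that reads a piece of HTML and pulls out each link's `href` attribute along with the text between the opening and closing `<a>` tags. It uses `html.parser` from Python's standard library; the `LinkCollector` class name is invented for this illustration.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag and the text inside it."""

    def __init__(self):
        super().__init__()
        self.links = []     # list of (href, link text) pairs
        self._href = None   # href of the <a> tag currently open, if any
        self._text = []     # pieces of text seen inside that tag

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs, e.g. ("href", "...")
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text)))
            self._href = None

collector = LinkCollector()
collector.feed(
    '<p><a href="http://www.dmdh.org">This text</a> '
    "will take you to a webpage.</p>"
)
print(collector.links)
```

The attribute value has to be wrapped in straight quotes (""); curly quotes pasted in from a word processor become part of the value rather than marking its boundaries, so the link no longer points where you expect.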

Page 32: DMDS Winter 2015 Workshop 1 slides

Markup languages are

popular in digital

humanities because lots

of humanists work with

texts.

Page 33: DMDS Winter 2015 Workshop 1 slides

Without markup

languages, the things that

a computer can search for

are limited.

Page 34: DMDS Winter 2015 Workshop 1 slides

Ctrl + F: any text in iambic

pentameter.

Page 35: DMDS Winter 2015 Workshop 1 slides

With markup, the

things you can

search for are only

limited by your

interpretation.

Markup: TEI

Page 36: DMDS Winter 2015 Workshop 1 slides

TEI

(Text Encoding Initiative)

Markup: TEI

Page 37: DMDS Winter 2015 Workshop 1 slides

Poetry w/ TEI

<text xmlns="http://www.tei-c.org/ns/1.0" xml:id="d1">

<body xml:id="d2">

<div1 type="book" xml:id="d3">

<head>Songs of Innocence</head>

<pb n="4"/>

<div2 type="poem" xml:id="d4">

<head>Introduction</head>

<lg type="stanza">

<l>Piping down the valleys wild, </l>

<l>Piping songs of pleasant glee, </l>

<l>On a cloud I saw a child, </l>

<l>And he laughing said to me: </l>

</lg>

Page 38: DMDS Winter 2015 Workshop 1 slides

Grammar w/ TEI

<entry>

<form>

<orth>pamplemousse</orth>

</form>

<gramGrp>

<gram type="pos">noun</gram>

<gram type="gen">masculine</gram>

</gramGrp>

</entry>

Page 39: DMDS Winter 2015 Workshop 1 slides

TEI’s syntax rules are

nearly identical to HTML’s --

though your normal

browser can’t work with

TEI the way it works with

HTML.

Page 40: DMDS Winter 2015 Workshop 1 slides

TEI is meant to be a

highly social language

that anyone can use and

adapt for new purposes.

Page 41: DMDS Winter 2015 Workshop 1 slides

In order for TEI to

successfully encode texts,

it has to be adaptable to

individual projects.

Page 42: DMDS Winter 2015 Workshop 1 slides

Anything that you can isolate

(and put in brackets) can

(theoretically) be pulled out and

displayed for a reader.

Page 43: DMDS Winter 2015 Workshop 1 slides

TEI can be used to encode more than just text:

<div type="shot">

<view>BBC World symbol</view>

<sp>

<speaker>Voice Over</speaker>

<p>Monty Python's Flying Circus tonight comes to you live

from the Grillomat Snack Bar, Paignton.</p>

</sp>

</div>

<div type="shot">

<view>Interior of a nasty snack bar. Customers around, preferably

real people. Linkman sitting at one of the plastic tables.</view>

<sp>

<speaker>Linkman</speaker>

<p>Hello to you live from the Grillomat Snack Bar.</p>

</sp>

</div>
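Because each speech is wrapped in `<sp>` and labelled with a `<speaker>`, a program can pull out the lines of just one character. The Python sketch below uses the standard library's `xml.etree.ElementTree`. It makes two assumptions: the two `<div>` elements are wrapped in a `<body>` element so the fragment has a single root, and no TEI namespace is declared (a full TEI file would declare one, and the tag lookups would need to include it). `speeches_by` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# The fragment from the slide, wrapped in <body> so it has one root element.
TEI_FRAGMENT = """
<body>
  <div type="shot">
    <view>BBC World symbol</view>
    <sp>
      <speaker>Voice Over</speaker>
      <p>Monty Python's Flying Circus tonight comes to you live from the Grillomat Snack Bar, Paignton.</p>
    </sp>
  </div>
  <div type="shot">
    <view>Interior of a nasty snack bar. Customers around, preferably real people. Linkman sitting at one of the plastic tables.</view>
    <sp>
      <speaker>Linkman</speaker>
      <p>Hello to you live from the Grillomat Snack Bar.</p>
    </sp>
  </div>
</body>
"""

def speeches_by(speaker_name, root):
    """Return the text of every <p> inside an <sp> spoken by speaker_name."""
    found = []
    for sp in root.iter("sp"):
        if sp.findtext("speaker") == speaker_name:
            found.extend(p.text for p in sp.findall("p"))
    return found

root = ET.fromstring(TEI_FRAGMENT.strip())
print(speeches_by("Linkman", root))
```

The same approach would work for any element you chose to mark up: the program can only find what the encoding has isolated.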

Page 44: DMDS Winter 2015 Workshop 1 slides

Or, you could encode all of

Stephenie Meyer’s

Twilight according to its

emotional register.

Page 45: DMDS Winter 2015 Workshop 1 slides

Whether you include or

exclude some aspect of

the text in your markup

can be very important

from an academic

perspective.

Page 46: DMDS Winter 2015 Workshop 1 slides

The challenge of creating

good data is one reason

that collaboration is so

important to digital

scholarship.

Page 47: DMDS Winter 2015 Workshop 1 slides

Wise Data Collaboration

• Avoid reinventing the wheel (has

someone else already created an

effective method for working with this

data?)

• Consider the labor involved vs. the

outcome (and future use of the data you

create.)

Page 48: DMDS Winter 2015 Workshop 1 slides

Structured Data

Page 49: DMDS Winter 2015 Workshop 1 slides

Study Scenario #1

• You study urban espresso stands: their

hours, brands of coffee, whether or not

they sell pastries, and how far the

espresso stands are from major

roadways.

Page 50: DMDS Winter 2015 Workshop 1 slides

Study Scenario #2

• You study female characters in novels

written between 1700 and 1850.

Encoding a whole novel just to study

female characters isn’t practical for you.

Page 51: DMDS Winter 2015 Workshop 1 slides

Both scenarios involve

aggregating information,

rather than encoding it.

Page 52: DMDS Winter 2015 Workshop 1 slides

Structured Data: Example #1 (MySQL)

ID  | Name            | Location                       | Hours                  | Coffee Brand         | Pastries (Y/N) | Distance from Street
008 | Java the Hut    | 56 Farringdon Road, London, UK | 7:00 a.m. - 2:00 p.m.  | Square Mile Roasters | N              | 25 meters
009 | Prufrock Coffee | 18 Shoreditch High Street      | 7:00 a.m. - 10:00 p.m. | Monmouth             | Y              | 10 meters
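Once the espresso-stand records are stored in a table, questions become queries. The sketch below uses SQLite through Python's built-in `sqlite3` module as a stand-in for MySQL, so that it runs without a database server; the SQL shown is basic enough that the same statements should work in MySQL with little or no change. The table and column names are invented for this illustration, the IDs are stored as plain integers (8 and 9), and distances are stored as numbers of meters so they can be sorted.

```python
import sqlite3

# An in-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE espresso_stands (
           id INTEGER PRIMARY KEY,
           name TEXT,
           location TEXT,
           hours TEXT,
           coffee_brand TEXT,
           pastries TEXT,       -- 'Y' or 'N'
           distance_m INTEGER   -- distance from street, in meters
       )"""
)
conn.executemany(
    "INSERT INTO espresso_stands VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (8, "Java the Hut", "56 Farringdon Road, London, UK",
         "7:00 a.m.-2:00 p.m.", "Square Mile Roasters", "N", 25),
        (9, "Prufrock Coffee", "18 Shoreditch High Street",
         "7:00 a.m.-10:00 p.m.", "Monmouth", "Y", 10),
    ],
)

# Which stands sell pastries, listed from closest to the street outward?
rows = conn.execute(
    "SELECT name, distance_m FROM espresso_stands "
    "WHERE pastries = 'Y' ORDER BY distance_m"
).fetchall()
print(rows)
```

Storing the distance as the number 25 rather than the text "25 meters" is an example of structuring data so that the computer can compare and sort it.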

Page 53: DMDS Winter 2015 Workshop 1 slides
Page 54: DMDS Winter 2015 Workshop 1 slides

Structured Data:

Example #2 (RDF)

Page 55: DMDS Winter 2015 Workshop 1 slides

Object-Oriented

Programming

• Java, Python, C++, Perl, PHP, Ruby, etc.

• Widely used, highly flexible, very powerful

Page 56: DMDS Winter 2015 Workshop 1 slides

What’s an “object”?

• An object is a structure that contains data in

one or more forms.

• Common forms include strings, integers, and

arrays (groups of data).

• Example (handout)
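The workshop handout is not reproduced in this transcript. As a separate, hypothetical illustration, here is what a small object might look like in Python: a `Novel` class (a name invented here) that holds a string, an integer, and a list, which is Python's usual array-like structure.

```python
class Novel:
    """A hypothetical object describing one novel."""

    def __init__(self, title, year, characters):
        self.title = title            # a string
        self.year = year              # an integer
        self.characters = characters  # a list (a group of data)

    def character_count(self):
        """Return how many characters are recorded for this novel."""
        return len(self.characters)

# Create one object and ask it questions about its own data.
pamela = Novel("Pamela", 1740, ["Pamela Andrews", "Mr. B", "Mrs. Jewkes"])
print(pamela.title, pamela.year, pamela.character_count())
```

The object bundles the data together with the operations that make sense for it, which is the core idea behind the term "object-oriented".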

Page 57: DMDS Winter 2015 Workshop 1 slides

Object-oriented programming, cont’d

• Learning a bit about an OOP language can

help you become accustomed to working

with programming

• Reading OOP code can also be useful

• Many free tutorials are available

• Goal: to be able to converse more effectively

with professional programmers, rather than

become an expert yourself.

Page 58: DMDS Winter 2015 Workshop 1 slides

How your data is

structured will influence

the technology that you

(can) use to work with it.

Page 59: DMDS Winter 2015 Workshop 1 slides

Digital scholars see

creating machine-

readable data as valuable

scholarship.

Page 60: DMDS Winter 2015 Workshop 1 slides

Examples

• Homer Multi-Text Project

• Modernist Versions Project

• Scalar (platform)

• Century Ireland

Page 61: DMDS Winter 2015 Workshop 1 slides

Exercise:

You Create the Data!

Page 62: DMDS Winter 2015 Workshop 1 slides

Your data determines your

project.

Page 63: DMDS Winter 2015 Workshop 1 slides

Every project has data.

Text objects, images, tags, geographical

coordinates, categories, records, creator

metadata, etc.

Page 64: DMDS Winter 2015 Workshop 1 slides

Even if you’re not planning to

learn any programming skills,

you are still working with data.

Page 65: DMDS Winter 2015 Workshop 1 slides

Next time:

Programming on the Whiteboard

February 19th, 3:00-5:00 p.m., Sherman

Centre

• Cleaning data before you work with it!

• Identifying specific programming tasks

• How access affects your project idea

• Flash project development

• Homework: bring some data to work

with.