data for the humanities

Post on 22-Jan-2018


Data for the Humanities

February 21, 2017

Rafia Mirza, Digital Humanities Librarian, rafia@uta.edu, @librarianrafia

Peace Ossom Williamson, Director of Research Data Services, peace@uta.edu, @123POW

Learning Outcomes

• Understand the use of data in answering humanities research questions

• Understand descriptive metadata and the rationale for its use

• Recognize areas of potential bias and ambiguous or misleading representation in reporting

What are data?

“All content in digital formats can be characterized as structured or unstructured data.”

Introduction to Digital Humanities: Concepts, Methods, and Tutorials

Examples:

• Audio

• Notes

• Geospatial

• Textual

Data are more than numbers

https://www.lib.umn.edu/datamanagement/whatdata

What is data literacy?

The ability to read, create, utilize, communicate, and criticize data.

Data Literacy

data quality

accessibility, usability, and understandability on the basis of context, provenance, and metadata

Data Literacy

data structure

of different objects in a way that works to evaluate developing hypotheses

Data Literacy

recognize research potential

be aware of research methods

understand context and provenience

Humanities Data Literacy

“Humanists have data, and they need data skills.”

Digital Humanities Data Curation

Data in the Humanities

Types of Humanities Data

• Scholarly editions

• Text corpora

• Text with markup

• Thematic research collections

• Data with accompanying analysis or annotation

• Finding aids and other information maps, such as bibliographies

Digital Humanities Data Curation Introduction

Big Data Digital Humanities vs. Small Data Digital Humanities

• “Research in Big Data Digital Humanities focuses on large or dense cultural datasets, which call for new processing and interpretation methods”

• “… Small Data Digital Humanities regroup more focused works that do not use massive data processing …”

• A map for big data research in digital humanities, Frédéric Kaplan

1. research the context: know the data about the data (so meta!)

How to understand data

Data versus Metadata

Big? Smart? Clean? Messy? Data in the Humanities, Christof Schöch

Metadata Metadata Metadata Metadata

data data data data

data data data data

data data data data

data data data data

About this dataset:

Title: Metadata
Date Created: Metadata
Creator: Metadata
Methods Used: Metadata
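As a minimal sketch of the distinction, the four descriptive fields from the slide (title, date created, creator, methods used) can be kept beside the data rows rather than mixed into them. The class name `Dataset`, its field names, and every sample value below are hypothetical placeholders for illustration, not a metadata standard.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Data rows bundled with the metadata that describes them."""
    title: str
    date_created: str
    creator: str
    methods_used: str
    rows: list = field(default_factory=list)

    def describe(self) -> str:
        # Summarize the metadata without touching the data itself.
        return (
            f"{self.title} ({self.date_created}), created by {self.creator}; "
            f"{len(self.rows)} rows; methods: {self.methods_used}"
        )


# Hypothetical values, for illustration only.
letters = Dataset(
    title="Example letters corpus",
    date_created="2017-02-21",
    creator="Example Archive",
    methods_used="manual transcription",
    rows=[{"id": 1, "text": "Dear friend"}, {"id": 2, "text": "Yours truly"}],
)
print(letters.describe())
```

Keeping the description separate means the context of a dataset can be read and checked before anyone opens the rows.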

2. research who the data is about

How to understand data

What are historical contexts around their language and style?

A note on data ethics.

Zine Librarians Code of Ethics

• “Zines are not like mass-distributed books. They are often self-published and self-distributed, and sometimes printed in very small runs, intended for a small audience. In addition, perzines are by definition “personal”, and zinesters may feel different about having their zines distributed in print than they would about having them openly available on the internet or print. This can be especially true in the case of “historical” zines in library collections — for example, a teen girl writing a zine for her close friends in 1994 may not want her zine distributed online or in print 20 years later.”

• Via Zinelibraries

3. investigate the source

How to understand data

Recognizing uncertainty and bias

Data on killings in the Syrian conflict.

https://responsibledata.io/reflection-stories/uncertainty-statistics/

Let’s investigate the source…

Recognizing uncertainty and bias

Sources include

• Syrian government

• Syrian Center for Statistics and Research

• Syrian Network for Human Rights

• Syrian Observatory for Human Rights

and many more.

https://responsibledata.io/reflection-stories/uncertainty-statistics/

there are lots of human decisions that go into creating these statistics

without knowing how these deaths have been coded, it’s difficult to trust in the figures

4. highlight un/common data entries to gain rough insights

How to understand data

Descriptive analysis

i.e., description of the data from a sample

Quick descriptive statistics

• frequency

• rank from lowest to highest

• average (mean, median, mode)

• variability
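The four quick statistics listed above can be sketched with Python's standard library. The sample values are hypothetical (imagined simile counts for nine scenes), and standard deviation is only one of several possible measures of variability.

```python
from collections import Counter
import statistics

# Hypothetical sample: number of similes counted in each of nine scenes.
similes_per_scene = [2, 5, 3, 5, 8, 1, 5, 3, 4]

frequency = Counter(similes_per_scene)        # how often each value occurs
ranked = sorted(similes_per_scene)            # rank from lowest to highest
mean = statistics.mean(similes_per_scene)     # arithmetic average
median = statistics.median(similes_per_scene) # middle value of the ranked list
mode = statistics.mode(similes_per_scene)     # most frequent value
value_range = ranked[-1] - ranked[0]          # simplest measure of variability
spread = statistics.stdev(similes_per_scene)  # sample standard deviation

print("frequency:", dict(frequency))
print("ranked:", ranked)
print("mean:", mean, "median:", median, "mode:", mode)
print("range:", value_range, "standard deviation:", round(spread, 2))
```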

Bivariate descriptive statistics

fancy way of saying we are looking at two variables at once

            Hamlet   Macbeth   Othello
Similes         50         9        59
Metaphors       20        38        58
Total           70        47       117
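A rough sketch of reading two variables at once (play and rhetorical device) is to convert the counts into shares within each play. In the table, the last figure in each row equals the sum of the first two, so this sketch assumes it is a row total: it uses only the Hamlet and Macbeth counts and recomputes the totals. The function name `column_shares` is a hypothetical helper.

```python
# Counts taken from the Hamlet and Macbeth columns of the table.
counts = {
    "Hamlet": {"Similes": 50, "Metaphors": 20},
    "Macbeth": {"Similes": 9, "Metaphors": 38},
}


def column_shares(table):
    """Return each device's share of its play's total, as fractions of 1."""
    shares = {}
    for play, devices in table.items():
        total = sum(devices.values())
        shares[play] = {device: n / total for device, n in devices.items()}
    return shares


shares = column_shares(counts)
simile_total = sum(devices["Similes"] for devices in counts.values())
metaphor_total = sum(devices["Metaphors"] for devices in counts.values())

for play, devices in shares.items():
    for device, share in devices.items():
        print(f"{play:8s} {device:10s} {share:.0%}")
print("Similes overall:", simile_total, "Metaphors overall:", metaphor_total)
```

The overall totals are nearly equal, while the shares within each play differ sharply. That is the kind of pattern a two-variable view can surface.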

Evaluating Comparison Methods

Correlation

most common way to describe a relationship between two measures
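The slide does not name a specific coefficient. As an assumption, the sketch below uses Pearson's r, a widely used correlation measure that ranges from -1 to 1. The function name `pearson_r` and the sample numbers are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical measures: scene length in lines and metaphors counted per scene.
scene_lengths = [120, 200, 150, 300, 250]
metaphor_counts = [3, 6, 4, 9, 7]

r = pearson_r(scene_lengths, metaphor_counts)
print(f"r = {r:.3f}")
```

A value near 1 or -1 describes a strong linear relationship between the two measures. It does not by itself show that one causes the other.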

What if the dataset you need does not exist?

How to data

1. Determine what to say

2. Find/collect/create the data you need

3. Wrangle!

4. Clean!

5. Do it many more times.

ID      Religion      Income      Age   Q1    Q2   Q3
26371   Jewish        <$10K       19    Yes   6    20
26372   Atheist       $50-75K     24    -     4    21
26373   Catholic      $75-100K    56    Yes   3    21
26374   Withheld      $75-100K    33    No    6    21
26375   Pentecostal   withheld    49    Yes   8    20
26376   Jewish        $40-50K     29    Yes   5    19
26377   Catholic      $20-30K     37    No    4    22
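A minimal cleaning pass over the rows above might look like the following sketch. It assumes that "-" and "withheld" (in any capitalization) both mean a missing answer and that Q1 is a yes/no question. The slides do not state either point, so both are labeled as assumptions in the code. The helper names are hypothetical.

```python
# Rows copied from the survey table, kept as raw strings.
raw_rows = [
    ("26371", "Jewish", "<$10K", "19", "Yes", "6", "20"),
    ("26372", "Atheist", "$50-75K", "24", "-", "4", "21"),
    ("26373", "Catholic", "$75-100K", "56", "Yes", "3", "21"),
    ("26374", "Withheld", "$75-100K", "33", "No", "6", "21"),
    ("26375", "Pentecostal", "withheld", "49", "Yes", "8", "20"),
    ("26376", "Jewish", "$40-50K", "29", "Yes", "5", "19"),
    ("26377", "Catholic", "$20-30K", "37", "No", "4", "22"),
]

# Assumption: these markers all stand for "no answer recorded".
MISSING = {"-", "withheld", ""}


def clean_value(value):
    """Strip whitespace and map every missing-value marker to None."""
    text = value.strip()
    return None if text.lower() in MISSING else text


def clean_row(row):
    """Convert one raw row into a dict with consistent types."""
    id_, religion, income, age, q1, q2, q3 = (clean_value(v) for v in row)
    return {
        "id": int(id_),  # ID, age, Q2 and Q3 are present in every sample row
        "religion": religion,
        "income": income,
        "age": int(age),
        "q1": None if q1 is None else q1 == "Yes",  # assumption: yes/no question
        "q2": int(q2),
        "q3": int(q3),
    }


cleaned = [clean_row(row) for row in raw_rows]
missing_cells = sum(value is None for row in cleaned for value in row.values())
print("rows:", len(cleaned), "missing cells:", missing_cells)
```

Mapping the two spellings of "withheld" and the dash to one missing marker keeps later counts from treating them as three different answers.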

http://vita.had.co.nz/papers/tidy-data.pdf

Tidy Data

Most common problems

• Column headers are values, not variable names.

• Multiple variables are stored in one column.

• Variables are stored in both rows and columns.

• Multiple types of observational units are stored in the same table.

• A single observational unit is stored in multiple tables.

http://vita.had.co.nz/papers/tidy-data.pdf
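The first problem in the list, column headers that are values rather than variable names, can be sketched as follows. The wide table is hypothetical: its income brackets appear as column headers, and the counts are invented for illustration. The `melt` helper is a hand-written stand-in, not a library function.

```python
# Hypothetical wide table: income brackets are column headers, not values.
wide = [
    {"religion": "Catholic", "<$10K": 3, "$10-20K": 5},
    {"religion": "Jewish", "<$10K": 1, "$10-20K": 2},
]


def melt(rows, id_column, var_name, value_name):
    """Turn every non-id column into its own row (one observation per row)."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue
            long_rows.append(
                {id_column: row[id_column], var_name: column, value_name: value}
            )
    return long_rows


tidy = melt(wide, id_column="religion", var_name="income", value_name="count")
for row in tidy:
    print(row)
```

After reshaping, each row holds one observation and each column one variable (religion, income, count), which makes filtering and counting straightforward.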

if you torture data long enough,

it will confess to anything

How can a visualization be misleading?

What’s wrong?

A little less dramatic than you thought.

http://www.visualisingdata.com/2014/04/the-fine-line-between-confusion-and-deception/

https://thesyriacampaign.org/

Open Data: Things to Consider

http://www.slideshare.net/libereurope/humanities-data-literacy-student-perspective-on-digital-cultural-heritage-collections?qid=70bd86f2-10c5-43a6-b053-56d264ca28ab&v=&b=&from_search=1

Recommended Reading / Viewing

“Numbers are Only Human” – Brian Root

“Ethical Principles of Psychologists and Code of Conduct” – American Psychological Association

“On Not Looking: Ethics and Access in the Digital Humanities” – Kimberly Christen-Withey

Upcoming Workshops and Events

library.uta.edu/scholcomm

Rafia Mirza, rafia@uta.edu, @librarianrafia

Peace Ossom Williamson, peace@uta.edu, @123POW
