Preservation and institutional repositories for the digital arts and humanities

Post on 21-Oct-2014

DESCRIPTION

For the Digital Humanities Data Curation Institute

TRANSCRIPT

Dorothea Salo, University of Wisconsin

salo@wisc.edu

Institutional repositories for the digital arts and humanities

Preservation for the digital arts and humanities

Preservation and institutional repositories for the digital arts and humanities

And I said...

... you're giving me how much time for this?

Environment

As several of you are intimately aware, higher ed is trying to figure out What To Do About Data.

This spells opportunity... IF you can get a seat at the table, and IF you know what to ask for!

Humanists will not be the first people they think of, sadly.

Serious (insoluble?) problem: data diversity. Expect compromise solutions.

Do not let IT pros intimidate you. They do not know everything they think they know.

Friendly word of advice:

PICK SOFTWARE LAST.

Photo: Briana Calderon; future educator of america. http://www.flickr.com/photos/46132085@N03/4703617843/

Arielle Calderon / CC-BY 2.0

It's not what the software does that'll kill you.

IT'S WHAT THE SOFTWARE WON'T DO.

Another friendly word of advice:

DON'T CHASE THE SHINY.

Photo: Sparkle Texture http://www.flickr.com/photos/abbylanes/3214921616/ Abby Lane / CC-BY 2.0

In five years...

it's much less shiny.

In ten years...

it's not shiny at all.

In twenty years...

it's probably useless.

NOT A SOLUTION: your graduate students

You have a bright, tech-savvy grad student. She builds an Awesome Tech Thing. You have no idea how it works. She graduates. You're hosed.

Because she didn't (know how to) build it sustainably... Because you don't have any documentation... Because nobody made contingency plans for it...

I have seen this pattern over and over again. It's killed more digital culture and research materials than anything I can think of in academe.

Am I saying don't experiment?

Nah, of course not. I'm saying know what an experiment means. I'm saying don't mistake an experiment for an archive. I'm saying don't experiment and then expect everybody else to pick up your pieces because you didn't plan for metadata or preservation.

That said?

You gotta do what you gotta do. Some friendly advice:

Know where the exits are. (Can you export your data? In a reusable format?)

When you finish a project, USE that export. Triply true if you're relying on the cloud!

Your overriding goal, while a project is in progress: keep your eventual options open!
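What "using that export" might look like in practice depends entirely on your tools. As one hedged illustration only, suppose (this is an assumption, not something from the talk) that a project keeps its data in a SQLite file. A minimal Python sketch that dumps every table to plain CSV, a format almost anything can read later, could look like this. The function name and file paths are hypothetical.

```python
import csv
import sqlite3
from pathlib import Path


def export_tables_to_csv(db_path, out_dir):
    """Write every user table in a SQLite database to its own CSV file.

    Hypothetical helper for illustration; returns the list of files written.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    conn = sqlite3.connect(db_path)
    try:
        # Ask the database which tables it holds, skipping SQLite's internal ones.
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )]
        for table in tables:
            # Quote the table name in case it contains odd characters.
            quoted = '"' + table.replace('"', '""') + '"'
            cursor = conn.execute(f"SELECT * FROM {quoted}")
            target = out / f"{table}.csv"
            with target.open("w", newline="", encoding="utf-8") as handle:
                writer = csv.writer(handle)
                # Header row first, so the columns stay self-describing.
                writer.writerow([col[0] for col in cursor.description])
                writer.writerows(cursor)
            written.append(target)
    finally:
        conn.close()
    return written
```

Calling `export_tables_to_csv("project.db", "export/")` (both names are placeholders) would leave one CSV per table in `export/`. The point is less this particular script than the habit: run the export when the project ends, and check that the result opens somewhere other than the tool that made it.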

Long-term... is a totally other kettle of fish.

Your best strategy

The single best strategy for a digital humanist concerned about long-term preservation... ... is to figure out how to make it Somebody Else's Problem. Right now, this is hard. I do believe it will get easier.

It's a lot easier to figure this out from the start than at the end.

Different Somebody Elses will have different things that they want. If you know that from the get-go, you're much better off.

Institution-internal solutions

Rolling your own: Please don't, if you can possibly avoid it.

Adopting open-source software, e.g. Omeka, Dataverse, ArchivesSpace... Better, but not foolproof. Upgrades? Security? Backups? Writing plugins/mods = rolling your own. Avoid if possible.

Adopting institutional infrastructure: Make sure it'll survive your departure from the institution!

Outside the institution

Lists of data repositories: Databib: http://databib.org/ re3data: http://re3data.org/ N.b. you will find less here on the humanities than you would probably prefer. Long story.

Figshare ... and other web services springing up, e.g. omeka.net

You will be limited by...

Infrastructure your library/IT has already committed to. This is why you want to be in on ground-floor discussions!

Their willingness and ability to tweak, rewrite, or replace it with something suiting your needs

Your willingness and ability to evaluate, install, and maintain a software stack that suits you... perhaps indefinitely!

The availability of hosted solutions, and your ability to pay for them (perhaps indefinitely!)

You need to know what the options are like.

Your library and IT folks may well need guidance. At minimum, they need clearly-expressed requirements.

The requirements you give them need to go beyond end-user access, use, and UI.

Back end: getting material in as efficiently as possible, allowing for additions/changes/deletions

Preservation requirements: data and metadata purity, clarity, preservability, reusability, mashuppability, migratability, standards

Institutional repositories

What's an IR?

"[A]ttics (and often fairly empty ones), with random assortments of content of questionable importance"

Brown, Griffiths, Rascoff, University publishing in a digital age. Ithaka 2007. http://www.sr.ithaka.org/research-publications/university-publishing-digital-age

A basic digital preservation-and-access platform designed to allow faculty to deposit and describe single PDFs.

Quite commonly available in research libraries or through library consortia.

You probably have one available to you!

IR software

Open source:

Fedora Commons: http://fedora-commons.info/ (you'll need a layer on top of this)

DSpace: http://dspace.org/

EPrints: http://eprints.org/

Commercial:

ContentDM: http://contentdm.com/

DigiTool: http://www.exlibrisgroup.com/category/DigiToolOverview

Hosted:

ContentDM: http://contentdm.com/

BePress: http://bepress.com/

Two minutes!

Find an IR available to you for depositing content.

You can typically expect...

To get in touch with someone in the library to get an account set up, and a space for you to deposit into

Have a collection name and description ready. Default descriptors, if you have any, also a good idea. Need access controls? To delegate deposit? Talk about this.

To be able to put materials in on your own, through web forms

To find the deposit process fiddly and annoying

To have material appear on the web right after deposit.

IRs work for...

Small(ish), discrete files that never change. So an Excel-using researcher is just fine with an IR.

Documentation for data held elsewhere

Some IRs can handle static website captures.

Files with uncomplicated IP lives... which complicates the static website question.

Access restriction may be possible, as may dark archiving; it depends on the IR platform. Expect it to be annoying to implement, though.

IRs don't work for...

Really Big Data, including, sometimes, audio and video. This is less a reflection on IR software than of most IRs being horribly underprovisioned with storage and bandwidth.

Work in progress; files that may change or be updated

Complex digital objects