Oxford Text Archive



Funded by:

© AHDS

Oxford Text Archive and good practice in the creation of electronic resources

http://ota.ahds.ac.uk

Martin Wynne [email protected]

with a lot of help from Ylva Berglund [email protected]

Case study: Terms of Address in Ben Jonson’s plays

• Database of all address terms
• Instances coded for various parameters
  – addresser/addressee, type of address, reference, etc.

• Stored on computer + back-ups on floppy disks

• The creator is happy to share the resource…

Problem 1: Hardware/media

• Changes in hardware → data on old computer and floppies now largely inaccessible
• Storage media vulnerable

Problem 2: Software

• Specialist software → data tied into application

• Proprietary software → ongoing support for application reliant on commercial interests

• Program versions and platforms → files created in one version or on one platform may not be compatible with another

Problem 3: Coding/mark-up

• Personal scheme
  – not comparable to other resources
  – doesn’t work with generic software
• Documentation
  – resource soon becomes unusable if documentation is inadequate or impossible to find

Problem 4: Sharing

• Dissemination
  – Awareness
  – Distribution
  – Sustainability

Solution 1: Hardware/media

• Migrate data from old hardware
• Don’t keep on only one machine
• Don’t store on vulnerable media

Solution 2: Software

• Avoid specialised software unless possible to migrate data
• Consider open source

Solution 3: Coding/mark-up

• Use standards
• Document!

Example: HTML

This is bold
This is <b>bold</b>

This is a noun
This is a <b>noun</b>

Example: COCOA

<Q FRENCH><A SARTRE><T NAUSEE><I 9><L 1>%"$C' EST UN GAR*CON SANS IMPORTANCE COLLECTIVE,%

%C' EST TOUT JUSTE UN INDIVIDU."%\L#-\F# \CE[LINE.%\L' E[GLISE\.%<F 00><P 11><L 1>%%AVERTISSEMENT DES E[DITEURS%%%$CES CAHIERS ONT E[TE[ TROUVE[S PARMI LES PAPIERS D'

\ANTOINE R\OQUENTIN.\%%$NOUS LES PUBLIONS SANS Y RIEN CHANGER.%…
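The angle-bracketed references in the COCOA sample can be pulled out mechanically, but what each letter means is defined only by the project that created the file. The sketch below is a minimal illustration, assuming every reference has the form of a single capital letter, a space, and a value; the function name `cocoa_references` is hypothetical.

```python
import re

# Assumption: a COCOA reference is <LETTER value>, as in the sample above.
COCOA_TAG = re.compile(r"<([A-Z]) ([^>]*)>")


def cocoa_references(text):
    """Return (category, value) pairs in order of appearance."""
    return COCOA_TAG.findall(text)


# The opening references from the sample on the slide.
sample = "<Q FRENCH><A SARTRE><T NAUSEE><I 9><L 1>"
refs = cocoa_references(sample)
print(refs)
```

The pairs come back as plain letters and values. Without documentation, a later user has to guess what categories such as Q or I were meant to record.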

Example: XML

<s n="97"><w type="AJ0">Normal</w> <w type="NN1">economy</w> <w type="NN1">return</w> <w type="VBZ">is</w> <w type="NN0">£262</w> <c type="PUN">.</c></s>
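Because the sentence above is well-formed XML, generic tools can read it without knowing anything about the project that produced it. As a minimal sketch using Python's standard library, the following parses the sentence and selects the words whose `type` value starts with "NN". Reading those values as noun tags is an assumption based on the example.

```python
import xml.etree.ElementTree as ET

# The sentence from the slide, as a string.
sentence = (
    '<s n="97"><w type="AJ0">Normal</w> <w type="NN1">economy</w> '
    '<w type="NN1">return</w> <w type="VBZ">is</w> '
    '<w type="NN0">£262</w> <c type="PUN">.</c></s>'
)

root = ET.fromstring(sentence)

# One (element name, type attribute, text) triple per token.
tokens = [(el.tag, el.get("type"), el.text) for el in root]

# Assumption: type values beginning with "NN" mark nouns.
nouns = [text for tag, pos, text in tokens if tag == "w" and pos.startswith("NN")]
print(root.get("n"), nouns)
```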

Text Encoding Initiative (TEI)

• Guidelines for the encoding of electronic texts using XML
• For interchange
• Guidelines freely available
• http://www.tei-c.org/

Encoding

• Choose language/format
  – e.g. XML
• Choose coding scheme
  – e.g. TEI
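To make the two choices concrete, here is a minimal sketch that builds a skeleton TEI document in XML with Python's standard library. It assumes the TEI P5 namespace (the slides do not name a TEI version). The helper names `tei` and `minimal_tei` are hypothetical, and the title, publication and source values are placeholders invented for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the TEI P5 namespace; the slides do not name a TEI version.
TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)  # serialise TEI as the default namespace


def tei(tag):
    """Qualify a tag name with the TEI namespace."""
    return "{%s}%s" % (TEI_NS, tag)


def minimal_tei(title, publication, source, paragraphs):
    """Build a skeleton TEI document: header metadata plus a body of paragraphs."""
    root = ET.Element(tei("TEI"))
    file_desc = ET.SubElement(ET.SubElement(root, tei("teiHeader")), tei("fileDesc"))
    title_stmt = ET.SubElement(file_desc, tei("titleStmt"))
    ET.SubElement(title_stmt, tei("title")).text = title
    publication_stmt = ET.SubElement(file_desc, tei("publicationStmt"))
    ET.SubElement(publication_stmt, tei("p")).text = publication
    source_desc = ET.SubElement(file_desc, tei("sourceDesc"))
    ET.SubElement(source_desc, tei("p")).text = source
    body = ET.SubElement(ET.SubElement(root, tei("text")), tei("body"))
    for paragraph in paragraphs:
        ET.SubElement(body, tei("p")).text = paragraph
    return root


# Placeholder values, invented for illustration only.
doc = minimal_tei(
    title="Sample text",
    publication="Unpublished working copy",
    source="Transcribed from a printed edition",
    paragraphs=["First paragraph.", "Second paragraph."],
)
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)
```

The header carries the documentation (title, publication status, source) inside the same file as the text, so the two cannot easily be separated.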

Markup and encoding options

• Word processing files
• PDF
• Database
• HTML
• SGML
• XML
• Plain text
• Unicode

Advantages of XML

• Standard not only in scholarly text encoding, but publishing, web, etc.
• Growing number of tools
• Community of users, support
• Re-usable skills, useful to learn
• XML resources good for repurposing
• XML resources good for preservation
• Disadvantages? Cost, complexity

Advantages of TEI

• Standard in scholarly text encoding
• Community of users, support available
• Extensible
• TEI resources good for interchange
• Disadvantages:
  – Cost, complexity
  – Compromises to text integrity
  – Overlapping hierarchies

Disadvantages of TEI

• Cost, complexity
• Compromises to text integrity
• Overlapping hierarchies…

<p><sp cat="NRS">One officer said:</sp><sp cat="DS">'This is like an episode from Inspector Morse.</p>
<p>"The victim was single but we believe he had several lady friends.</p>
<p>"It is possible that it was something in the background of one of those relationships that caused his death. .</p>
<p>"We don't think he was linked with any criminals or involved in any secret wrong doing." .</p></sp>
<p><sp cat="N">Police have not ruled out the possibility of a contract killing by a hitman.</sp></p>
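The excerpt above opens a speech element inside one paragraph and closes it after a later one, so the two hierarchies (paragraphs and stretches of speech) overlap. XML requires elements to nest, and a conforming parser rejects such a document. The sketch below shows this with Python's standard library. The helper name `is_well_formed` is hypothetical, and the fragments are simplified placeholders rather than the slide's text.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if the fragment parses as XML, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Speech starts inside the first paragraph and ends after the second: overlap.
overlapping = (
    '<text><p><sp cat="DS">quoted speech starts</p>'
    "<p>and continues</p></sp></text>"
)

# The same content with the speech element wrapped around whole paragraphs.
nested = (
    '<text><sp cat="DS"><p>quoted speech starts</p>'
    "<p>and continues</p></sp></text>"
)

print(is_well_formed(overlapping), is_well_formed(nested))
```

Restructuring so that one hierarchy contains the other makes the file parse. It does so by giving one hierarchy priority, which is the kind of compromise listed among the disadvantages.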

Solution 4: Sharing

• Inform user/subject communities
• Metadata
• Consider using archives

Oxford Text Archive

• Founded in 1976
• Collect, catalogue, preserve, and distribute high-quality electronic resources
• Advise creators and users
• Part of the AHDS

AHDS http://www.ahds.ac.uk

• Archaeology
• History
• Literature, Languages and Linguistics
• Performing Arts
• Visual Arts

+

• Executive

The OTA today

• About 2000 resources, in 25 languages
• Mostly primary texts, ‘classics’
• Language corpora
• Increasingly, more complex resources, with more intellectual content
• http://www.ota.ahds.ac.uk/

The OTA in the future

• New webpage
  – More information
  – New ways of accessing resources
• Workshops and training events
  – Digitisation workshop
  – For new projects
  – Specialised workshops

• New resources

Oxford Text Archive and good practice in the creation of electronic resources

http://ota.ahds.ac.uk

Martin Wynne [email protected]

with a lot of help from Ylva Berglund [email protected]