digitisation workshop pres 2009(v1)

Managing Digitisation Programs Workshop Sydney, 16 July 2009 Mal Booth DERSU

Upload: mal-booth

Post on 19-May-2015


DESCRIPTION

Slides from a half day workshop that I gave a couple of times in 2009. Better late than never I suppose. You need to read my blog post here: http://frommelbin.blogspot.com/2010/09/some-old-news-about-digitisation.html for an explanation about some slides and for references.

TRANSCRIPT

Page 1: Digitisation workshop pres 2009(v1)

Managing Digitisation Programs

Workshop, Sydney, 16 July 2009

Mal Booth – DERSU

Page 2: Digitisation workshop pres 2009(v1)

My background?

The Australian War Memorial’s Research Centre functions as a library and an archive. It develops, manages and provides public access to Australia’s official, personal, & published records of war.

Page 3: Digitisation workshop pres 2009(v1)

Global trends in digitisation

• Faster, better, cheaper equipment & storage

• Better DAMS & CMS software

• Institutional & shared repositories

• More audio & film

• Collaboration

• Shared collections (eg. Picture Australia)

• Mass digitisation programs: Google, Microsoft, Yahoo, Open Content Alliance (OCA), Internet Archive

• Pressure for online access & pressures on real storage space

Page 4: Digitisation workshop pres 2009(v1)

I’m not sure what these are, but they are important!

• Dynamism

• Preservation (as a benefit & obligation where necessary)

• Playing

• Management & planning

• Compromise

• Access

Page 5: Digitisation workshop pres 2009(v1)

Recent Examples - AWM

• WW1, WW2, Korea & Vietnam unit war diaries

• 260k+ images of our collections

• Official histories (published works)

• Digitisation on demand

Page 6: Digitisation workshop pres 2009(v1)

Digitisation for Access

c. 90,000 pp per year

Page 7: Digitisation workshop pres 2009(v1)

Recent Examples – UTS Library

Supporting Teaching & Learning

• Digital Resource Register

• Alternative Format Service

• Exam Papers

Access only

Supporting Research

• eScholarship (UTS ePress, iResearch, eData)

• Australian Digital Theses Collection

Access & Preservation (data curation)

Page 8: Digitisation workshop pres 2009(v1)

About one fifth of these images

Page 9: Digitisation workshop pres 2009(v1)

What we will cover today

1. GETTING STARTED

a. Why and what to digitise?

b. How (preservation/access) & Principles

c. Copyright and IP considerations (briefly)

d. Resources needed; in-house or outsource?

e. Process outline: from planning to long term maintenance (life-cycle)

2. METHODS, CONTENT & STORAGE

a. Production: file formats & standards, scanners & cameras, software

b. Output: indexing, access, search optimisation, delivery options

c. Storage, ongoing maintenance & management requirements

d. Just doing it, lessons learned & key issues

Page 10: Digitisation workshop pres 2009(v1)

Why and what to digitise?

WHY

• Increase & broaden access (remote & 24/7)

• Fragile, valuable &/or unique materials (loss or damage would be catastrophic)

• Support research & education

• Anticipating future use or re-use

• Improved search, retrieval & storage

• Promoting knowledge, understanding & recognition of collections

• Relationships to other collections

• Preservation of at-risk collections by risk reduction & conservation

WHAT: popular collections; fragile/unique; at-risk; significant priorities; relationships (corporate or collaborative); & what you have the right to digitise!

Page 11: Digitisation workshop pres 2009(v1)

How: some Principles* - Collections (organised groups of objects)

• Agreed collection development policy

• Sound description

• Lifecycle curation

• Broad access to all

• Respect for IP

• Evaluation for use & usefulness

• Interoperability

• Integration of staff & user workflows

• Sustainability & continued usability

* NISO Framework of Guidance for the Building of Good Digital Collections

Page 12: Digitisation workshop pres 2009(v1)

How: some Principles - Objects (digital assets)

• Production ensures collection priorities & maintains interoperability and re-use

• Preservability: persistence & accessibility over time; across evolving media, software & formats

• Meaningful outside its context: portable, reusable, interoperable

• Persistent identifiers: URLs or URIs

• Authentication: veracity, accuracy & authenticity

• Inclusion of associated metadata: descriptive, administrative & structural

Page 13: Digitisation workshop pres 2009(v1)

How: some Principles - Metadata

(selection and implementation of information about objects: descriptive; administrative; technical; structural; & preservation)

• Appropriate to materials, users and use

• Support for interoperability: mappings & crosswalks between schemes

• Use of authority control and content standards

• Includes a clear statement on conditions of use for the objects (eg. fair use)

• Support for long term management, eg. PREMIS

• Metadata records are treated as digital objects

RUBRIC overview:

http://cairss.caul.edu.au/packages/RUBRIC_Toolkit/docs/Metadata_lite.htm
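The "mappings & crosswalks between schemes" principle above can be sketched in a few lines. This is only an illustration under stated assumptions: the local field names on the left are invented, the targets on the right are element names from the Dublin Core element set, and the sample record is made up. A real crosswalk would also need to handle repeated fields, controlled vocabularies and date formats.

```python
# Hypothetical local catalogue field names mapped to Dublin Core elements.
# The left-hand names are invented for illustration only.
LOCAL_TO_DC = {
    "item_title": "title",
    "maker": "creator",
    "date_made": "date",
    "summary": "description",
    "accession_number": "identifier",
    "conditions_of_use": "rights",
}


def crosswalk(record, mapping=LOCAL_TO_DC):
    """Return (mapped, unmapped).

    mapped   -- the record re-keyed to the target scheme
    unmapped -- fields the mapping does not cover, kept so nothing
                is silently lost during the conversion
    """
    mapped, unmapped = {}, {}
    for field, value in record.items():
        target = mapping.get(field)
        if target is None:
            unmapped[field] = value
        else:
            # Several source fields may map to one target element,
            # so values are collected in lists.
            mapped.setdefault(target, []).append(value)
    return mapped, unmapped


if __name__ == "__main__":
    # Made-up sample record, not taken from any real collection.
    sample = {
        "item_title": "Example unit war diary",
        "accession_number": "EXAMPLE-001",
        "shelf_location": "Bay 12",
    }
    print(crosswalk(sample))
```

Returning the unmapped fields separately is a deliberate choice: it makes gaps in the mapping visible for review instead of dropping local information.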

Page 14: Digitisation workshop pres 2009(v1)

How: some Principles - Initiatives (the creation & management of collections)

• A substantial design and planning component

• Appropriate staffing and expertise

• Best practice project management

• An evaluation plan

• A project report that documents the process & outcomes

• Consideration of the entire lifecycle (ongoing management)

Page 15: Digitisation workshop pres 2009(v1)

Copyright & Intellectual Property (1)

Concerns:

• What sort of items are protected by copyright?

• What is the duration of copyright protection?

• What sorts of activities infringe copyright?

• When is a copyright licence required?

• Understanding the “exceptions” to copyright infringement

See: Copyright and Cultural Institutions: Short Guidelines for Digitisation by Emily Hudson and Andrew Kenyon

& ACC’s Special case exception: education, libraries, collections (deals with the new section 200AB)

Page 17: Digitisation workshop pres 2009(v1)
Page 18: Digitisation workshop pres 2009(v1)
Page 19: Digitisation workshop pres 2009(v1)
Page 20: Digitisation workshop pres 2009(v1)

Resources required (1)

• Hardware – scanners, cameras, computers, monitors, digital storage, memory & processing power

• Software – scanning, OCR, office apps, image editing & management, DAM?, video/audio capture, metadata capture?, file conversion, calibration

• Furnishings – for staff, computers, scanners, storage

• Facility space – scanning, preparation & storage, QA

• Specialist staff – curatorial, cataloguers, IT/DBA, web, scanning, project management, conservators

• Training needs

• Conservation needs – archival supplies & consultancies

• Budget funds – salaries, hardware/software purchases & lease, licences, running/ongoing costs, contingency

• Corporate support – context within corporate or other priorities and strategies

Page 21: Digitisation workshop pres 2009(v1)

WW1 Diaries scanning facilities

Approximately 200,000 high-res. images per year

Page 22: Digitisation workshop pres 2009(v1)

Outsource or In-house?

• Contractor responsible for capital equipment, training and technology obsolescence costs

• No need to find scanning space

• Less need for digitisation knowledge

• Economies of scale (& capability for large volumes & throughput)

• The bureau may be able to achieve a better quality result & have a broader range of services

• A better fix on costs and timescales (but these can vary widely)

• Better institutional knowledge, understanding & capacity

• Less risk than working with external parties

• Better ability to meet specific needs and deadlines?

• Cheaper costs for oversized or non-standard materials?

• QA may be more efficient

• Saving on transport and insurance and less risk with onsite scanning

• Assured staff and expertise

Page 23: Digitisation workshop pres 2009(v1)

Dealing with an external bureau

• Clear contracts are important

• Choosing a bureau – check with reference sites

• Range and scope of material - non-standard materials

• Collaboration with others to achieve further economies of scale may be possible

• QA can be a project killer

• Metadata – what will the bureau record?

• Consider partial outsourcing or bringing a specialist partner onsite

Page 24: Digitisation workshop pres 2009(v1)

Some funding options

• Program funding – dependent on corporate priorities

• User pays – but will they?

• Grants - eg. http://www.nla.gov.au/chg/

• Donors or sponsors - from or associated with a web presence

• Collection Depreciation – depends on valuation and an accounting standard

• As a training activity – can be a viable learning experience for a small team & project

• New policy proposals

Page 25: Digitisation workshop pres 2009(v1)

“Investing in an Intangible Asset”

• The benefits of long term preservation of digital assets are difficult to value (reliably and objectively), but the costs of inaction are high. More information on costs and benefits is needed.

• Digital preservation is still new, so there is scope for market creation & development, research and experimentation.

• Information managers know why such programs are important, but find it hard to communicate this to those who control our finances. Business cases based on empirical evidence need something like the balanced scorecard approach to bridge the gap between us and decision makers.

• Digital preservation is still an organisational innovation and must be managed effectively as it is dependent on independently driven technological developments.

From DCC’s Investment in an Intangible Asset

Page 26: Digitisation workshop pres 2009(v1)

The AWM Document Digitisation Process

1. Appraise & Scope

2. Determine Specifications & Purpose

3. Estimate resources

4. Create databases

5. Prepare collections

6. Scan, manipulate & save

7. QA

8. Archive & create derivatives

9. Metadata creation

10. Image back-up

11. Create web export & pages

12. Ongoing DAM

13. Retrieval as required
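One way to support the QA, archive and back-up steps in a process like this is a checksum manifest: record a checksum for every master file when it is archived, then re-check the back-up copies later. The slides do not say that the AWM did this, so treat the following as a generic sketch. The function names are illustrative and SHA-256 is an assumed choice of algorithm.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large master images need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to folder) to its SHA-256 checksum."""
    root = Path(folder)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(folder, manifest):
    """Return a sorted list of files that are missing or whose checksum changed."""
    current = build_manifest(folder)
    return sorted(
        name for name, checksum in manifest.items()
        if current.get(name) != checksum
    )
```

In use, `build_manifest` would run once at the archive step and its result would be stored with the collection metadata; `verify` would then run against each back-up copy on a schedule, and any non-empty result would be followed up.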

Page 27: Digitisation workshop pres 2009(v1)

Cornell’s digital imaging process map

• Radiating out from the goals and deliverables of the project are the institutional resources

• The outer wheel represents the processes or stages of digital imaging initiatives – clockwise from Selection

Page 28: Digitisation workshop pres 2009(v1)
Page 30: Digitisation workshop pres 2009(v1)

PRODUCTION: file formats – how and where they are used

Page 31: Digitisation workshop pres 2009(v1)

PRODUCTION: scanners & cameras

• Flatbed scanners

• Map/plan scanners

• Overhead scanners

• Digital cameras

• Book scanners

• Book-edge scanners

• Microfilm and slide scanners

Page 32: Digitisation workshop pres 2009(v1)

PRODUCTION: software

Image editing software

• Consider: cost; hardware requirements; usability; functionality

• Options: Adobe Photoshop CS3 (expensive/best) & Photoshop Elements (cheap); GIMP (free); + proprietary software for RAW files

• Derivative, OCR and PDF production: Adobe Acrobat 9 Pro; OmniPage; ImageMagick (conversion software); Ghostscript (PDF interpreter); & pdftk (PDF toolkit)
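As a rough sketch of derivative production with ImageMagick, the helper below builds the command for a JPEG access copy from a master image without running it. The folder names, the 1024-pixel limit and the quality setting are assumptions for illustration, not settings from the workshop. `convert`, `-resize` and `-quality` are standard ImageMagick usage; ImageMagick 7 uses `magick` as the command name.

```python
from pathlib import Path


def derivative_command(master, out_dir, max_px=1024, quality=85):
    """Build (but do not run) an ImageMagick command that makes a JPEG
    access copy no larger than max_px on its longest side."""
    master = Path(master)
    target = Path(out_dir) / (master.stem + ".jpg")
    return [
        "convert",                          # "magick" in ImageMagick 7
        str(master),
        "-resize", f"{max_px}x{max_px}>",   # '>' shrinks larger images only
        "-quality", str(quality),           # JPEG compression quality
        str(target),
    ]


if __name__ == "__main__":
    # Hypothetical paths; print the command rather than executing it.
    print(" ".join(derivative_command("masters/page_0001.tif", "access")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine with ImageMagick installed. Building the command separately keeps the master file untouched and makes the settings easy to review before a batch run.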

Other useful open source software:

• JHOVE object validation

• Fedora Commons object repository management system

• ebXML e-business suite

• Xena digital document preservation software (from NAA)

• DSpace institutional repository system

• DROID automated batch identification of file formats (from TNA UK)

• OpenEdit; Razuna; ResourceSpace - Open source & free DAM software
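To give a feel for what automated format identification involves, here is a much-simplified sketch that recognises a few common formats from their leading "magic" bytes. It is not how DROID is implemented (DROID works from the far richer PRONOM signature registry), and the function names are illustrative. The byte signatures themselves are the standard ones for these formats.

```python
# Leading byte signatures for a handful of common formats.
SIGNATURES = [
    (b"II*\x00", "TIFF (little-endian)"),
    (b"MM\x00*", "TIFF (big-endian)"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"%PDF-", "PDF"),
]


def identify_bytes(header):
    """Return a format name for the given leading bytes, or 'unknown'."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


def identify_file(path):
    """Identify a file from its first bytes, ignoring its extension."""
    with open(path, "rb") as handle:
        return identify_bytes(handle.read(16))
```

Checking content rather than file extensions matters for preservation work, because extensions are easily wrong or missing after files have been renamed or migrated.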

Page 34: Digitisation workshop pres 2009(v1)

STORAGE & MAINTENANCE

Storage

Consider: Speed (read/write, data transfer); Capacity; Reliability (stability, redundancy); Standardization; Cost; & Fitness to task

Management, maintenance & preservation

• Digital preservation practices

• Preservation metadata

• Trusted digital repositories?
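The capacity consideration above can be roughed out with simple arithmetic. The sketch below estimates uncompressed image size and yearly storage. The page size, resolution, bit depth and number of copies are all assumptions to be replaced with your own specifications; only the 200,000 images per year figure comes from the WW1 diaries slide.

```python
def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel=24):
    """Size of one uncompressed image: pixels wide x pixels high x bytes per pixel."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


def yearly_storage_tb(images_per_year, bytes_per_image, copies=1):
    """Storage needed per year in terabytes (10**12 bytes), across all copies."""
    return images_per_year * bytes_per_image * copies / 1e12


if __name__ == "__main__":
    # Assumed specification: A4 page (8.27 x 11.69 in), 300 dpi, 24-bit colour.
    per_image = uncompressed_bytes(8.27, 11.69, 300)
    # 200,000 images per year is the figure quoted for the WW1 diaries.
    print(per_image, yearly_storage_tb(200_000, per_image, copies=2))
```

Under those assumptions an image is roughly 26 MB uncompressed, so 200,000 images come to a little over 5 TB per year for each copy kept. Doubling the resolution quadruples the figure, which is why the specification step needs to be settled before storage is purchased.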

Page 35: Digitisation workshop pres 2009(v1)

Lessons

What we want

● Accuracy / authenticity

● Accessibility

● Searchability

● Easy navigation & download

● Cost effectiveness

● Good quality product

● Text capture and search (OCR) where poss.

● Integration

● Scalability

● Web interactivity

● Simple solutions

What we are finding

● Cost estimates escalate

● Technology has limits, but is improving

● You learn with new technology by doing

● There is more to copyright than owning it

● Anticipate needs & increasing expectations

● $ hard to find for access (sponsorship?)

● Better management & storage of assets

● A need to educate managers & suppliers!

● Keeping trained staff is a challenge

● Costs/benefits of new technologies (risk?)

● Importance of QA in projects!

● Need for a strategic plan(s)

● Be prepared to compromise

Page 36: Digitisation workshop pres 2009(v1)

Enterprise Content Management: management, search & web facilities for digital assets and services

• Extensive digital asset management features

• Excellent electronic document & record management

• Intuitive web content management features

• Facilitate simple and complex workflow processes

• Extensive and unified searching constructs

• Scalable

• Compliant with all government recordkeeping requirements & emerging digital preservation standards

• Integrate easily with existing systems

• Simple to administer in terms of security, auditing & storage management

Page 37: Digitisation workshop pres 2009(v1)

implementing user-friendly technologies

• make sure they are findable and useable

• pick a few “winners” & lead by example

• collaborate & network

• get involved in your core business

• don't leave it just to IT-staff (get involved)

• learn to compromise (the 80:20 rule)

• experiment

• start now! it is sometimes easier to seek forgiveness than gain permission

Page 38: Digitisation workshop pres 2009(v1)

JISC 2007 – five key issues for digitisation

1. Re-focus on the user (simple, easily found & used output)

2. Aggregate and present content that can resonate with multiple communities

3. Learn from Google & YouTube but keep your values

4. New business models are needed, collaborating with and without the private sector

5. More collaboration between publishers, curators, funders, users, vendors and standards bodies