digitisation workshop pres 2008(v1)

Digitisation: Revolutionising Library Management, Day 2. Sydney, April 2007. Mal Booth, Head, Research Centre

Upload: mal-booth

Post on 19-May-2015


DESCRIPTION

Slides and notes from a presentation that I gave as part of a masterclass for library managers in April 2008. Some slides contain links and the slides are best read in conjunction with the notes that appear at the bottom of the slideshare screen.

TRANSCRIPT

Page 1: Digitisation Workshop Pres 2008(V1)

Digitisation

Revolutionising Library Management

Day 2

Sydney, April 2007

Mal Booth – Head, Research Centre

Page 2: Digitisation Workshop Pres 2008(V1)

Where am I from?

The Memorial’s Research Centre functions as a library and an archive. We develop, manage and provide public access to Australia’s official, personal, & published records of war.

Page 3: Digitisation Workshop Pres 2008(V1)

Global trends in digitisation

• Faster, better, cheaper equipment & storage

• Better DAMS & CMS software

• Institutional repositories

• More audio & film

• Collaboration

• Shared collections (eg. Picture Australia)

• Mass digitisation programs: Google, Microsoft, Yahoo, Open Content Alliance (OCA), Internet Archive

Page 4: Digitisation Workshop Pres 2008(V1)

I’m not sure what these are, but they are important!

• Dynamism

• Preservation (as a benefit & obligation)

• Playing

• Management & planning

• Compromise

• Access

Page 5: Digitisation Workshop Pres 2008(V1)

Recent Digitisation Examples

• WW1, WW2, Korea & Vietnam unit war diaries

• 260k+ images of our collections

• Official histories (published works)

• Digitisation on demand

Page 6: Digitisation Workshop Pres 2008(V1)

Digitisation on demand

Currently running at 90,000 pp p.a.

Page 7: Digitisation Workshop Pres 2008(V1)

About one fifth of these images

Page 8: Digitisation Workshop Pres 2008(V1)

What we will cover today

1. GETTING STARTED

a. Why and what to digitise?

b. How (preservation/access) & Principles

c. Copyright and IP considerations (briefly)

d. Resources needed; in-house or outsource?

e. Process outline: from planning to long term maintenance (life-cycle)

2. METHODS, CONTENT & STORAGE

a. Production: file formats & standards, scanners & cameras, software

b. Output: indexing, access, search optimisation, delivery options

c. Storage, ongoing maintenance & management requirements

d. Just doing it, lessons learned & key issues

Page 9: Digitisation Workshop Pres 2008(V1)

Why and what to digitise?

WHY

• Increase & broaden access (remote & 24/7)

• Fragile, valuable &/or unique materials (loss or damage would be catastrophic)

• Support research & education

• Anticipating future use or re-use

• Improved search & retrieval

• Promoting knowledge, understanding & recognition of collections

• Relationships to other collections

• Preservation of at-risk collections by risk reduction & conservation

WHAT: popular collections; fragile/unique; at-risk; significant priorities; relationships (corporate or collaborative); & what you have the right to digitise!

Page 10: Digitisation Workshop Pres 2008(V1)

How: some Principles* - Collections (organised groups of objects)

• Agreed collection development policy

• Sound description

• Lifecycle curation

• Broad access to all

• Respect for IP

• Evaluation for use & usefulness

• Interoperability

• Integration of staff & user workflows

• Sustainability & continued usability

* NISO Framework of Guidance for the Building of Good Digital Collections

Page 11: Digitisation Workshop Pres 2008(V1)

How: some Principles - Objects (digital assets)

• Production ensures collection priorities & maintains interoperability and re-use

• Preservability: persistence & accessibility over time; across evolving media, software & formats

• Meaningful outside its context: portable, reusable, interoperable

• Persistent identifiers: URLs or URIs

• Authentication: veracity, accuracy & authenticity

• Inclusion of associated metadata: descriptive, administrative & structural

Page 12: Digitisation Workshop Pres 2008(V1)

How: some Principles - Metadata

(selection and implementation of information about objects: descriptive; administrative; technical; structural; & preservation)

• Appropriate to materials, users and use

• Support for interoperability: mappings & crosswalks between schemes

• Use of authority control and content standards

• Includes a clear statement on conditions of use for the objects (eg. fair use)

• Support for long term management, eg. PREMIS

• Metadata records are treated as digital objects
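A crosswalk between schemes can be as simple as a lookup table. The sketch below is illustrative only: it maps a small subset of MARC 21 fields to Dublin Core elements, and the sample record, the `crosswalk` function and the triple-based record layout are assumptions made for the example, not something taken from the slides.

```python
# Illustrative subset of a MARC 21 -> Dublin Core crosswalk.
# Keys are (MARC tag, subfield code); values are Dublin Core element names.
MARC_TO_DC = {
    ("100", "a"): "creator",
    ("245", "a"): "title",
    ("260", "b"): "publisher",
    ("260", "c"): "date",
    ("520", "a"): "description",
    ("650", "a"): "subject",
}


def crosswalk(marc_fields):
    """Map (tag, subfield code, value) triples to a Dublin Core dict."""
    dc = {}
    for tag, code, value in marc_fields:
        element = MARC_TO_DC.get((tag, code))
        if element is None:
            continue  # no mapping defined in this subset, so skip the field
        dc.setdefault(element, []).append(value.strip())
    return dc


# Hypothetical record, invented for the example.
sample_record = [
    ("245", "a", "Unit war diary (hypothetical example)"),
    ("260", "c", "1916"),
    ("650", "a", "World War, 1914-1918"),
    ("650", "a", "Military records"),
    ("999", "a", "local field with no mapping"),
]
dublin_core = crosswalk(sample_record)
```

Unmapped fields are dropped silently here; a real project would probably log them so that local fields are not lost without anyone noticing.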

Page 13: Digitisation Workshop Pres 2008(V1)

How: some Principles - Initiatives (the creation & management of collections)

• A substantial design and planning component

• Appropriate staffing and expertise

• Best practice project management

• An evaluation plan

• A project report that documents the process & outcomes

• Consideration of the entire lifecycle (ongoing management)

Page 14: Digitisation Workshop Pres 2008(V1)

Copyright & Intellectual Property (1)

Concerns:

• What sort of items are protected by copyright?

• What is the duration of copyright protection?

• What sorts of activities infringe copyright?

• When is a copyright licence required?

• Understanding the “exceptions” to copyright infringement

See: Copyright and Cultural Institutions: Short Guidelines for Digitisation by Emily Hudson and Andrew Kenyon

& ACC’s Special case exception: education, libraries, collections (deals with the new section 200AB)

Page 16: Digitisation Workshop Pres 2008(V1)
Page 17: Digitisation Workshop Pres 2008(V1)
Page 18: Digitisation Workshop Pres 2008(V1)
Page 19: Digitisation Workshop Pres 2008(V1)

Resources required (1)

• Hardware – scanners, cameras, computers, monitors, digital storage, memory & processing power

• Software – scanning, OCR, office apps, image editing & management, DAM?, video/audio capture, metadata capture?, file conversion, calibration

• Furnishings – for staff, computers, scanners, storage

• Facility space – scanning, preparation & storage, QA

• Specialist staff – curatorial, cataloguers, IT/DBA, web, scanning, project management, conservators

• Training needs

• Conservation needs – archival supplies & consultancies

• Budget funds – salaries, hardware/software purchases & lease, licenses, running/ongoing costs, contingency

• Corporate support – context within corporate or other priorities and strategies

Page 20: Digitisation Workshop Pres 2008(V1)

WW1 Diaries scanning facilities

Approximately 200,000 high res. images per year

Page 21: Digitisation Workshop Pres 2008(V1)

Outsource or Inhouse?

Outsource:

• Contractor responsible for capital equipment, training and technology obsolescence costs

• No need to find scanning space

• Less need for digitisation knowledge

• Economies of scale (& capability for large volumes & throughput)

• The bureau may be able to achieve a better quality result & have a broader range of services

• A better fix on costs and timescales (but these can vary widely)

In-house:

• Better institutional knowledge, understanding & capacity

• Less risk than working with external parties

• Better ability to meet specific needs and deadlines?

• Cheaper costs for oversized or non-standard materials?

• QA may be more efficient

• Saving on transport and insurance and less risk with onsite scanning

• Assured staff and expertise

Page 22: Digitisation Workshop Pres 2008(V1)

Dealing with an external bureau

• Clear contracts are important

• Choosing a bureau – check with reference sites

• Range and scope of material - non-standard materials

• Collaboration with others to achieve further economies of scale may be possible

• QA can be a project killer

• Metadata – what will the bureau record?

• Consider partial outsourcing or bringing a specialist partner onsite

Page 23: Digitisation Workshop Pres 2008(V1)

Some funding options

• Program funding – dependent on corporate priorities

• User pays – but will they?

• Grants - eg. http://www.nla.gov.au/chg/

• Donors or sponsors - from or associated with a web presence

• Collection Depreciation – depends on valuation and an accounting standard

• As a training activity – can be viable learning experience for a small team & project

• New policy proposals

Page 24: Digitisation Workshop Pres 2008(V1)

“Investing in an Intangible Asset”

• The benefits of long term preservation of digital assets are difficult to value (reliably and objectively), but the costs of inaction are high. More information on costs and benefits is needed.

• Digital preservation is still new, so there is scope for market creation & development, research and experimentation.

• Information managers know why such programs are important, but find it hard to communicate this to those who control our finances. Business cases based on empirical evidence need something like the balanced scorecard approach to bridge the gap between us and decision makers.

• Digital preservation is still an organisational innovation and must be managed effectively as it is dependent on independently driven technological developments.

From DCC’s Investment in an Intangible Asset

Page 25: Digitisation Workshop Pres 2008(V1)

The AWM Document Digitisation Process

Page 26: Digitisation Workshop Pres 2008(V1)

Cornell’s digital imaging process map

• Radiating out from the goals and deliverables of the project are the institutional resources

• The outer wheel represents the processes or stages of digital imaging initiatives – clockwise from Selection

Page 27: Digitisation Workshop Pres 2008(V1)

Draft DCC Curation Lifecycle

Page 29: Digitisation Workshop Pres 2008(V1)

PRODUCTION: file formats – how and where they are used

Page 30: Digitisation Workshop Pres 2008(V1)

PRODUCTION: scanners & cameras

• Flatbed scanners

• Map/plan scanners

• Overhead scanners

• Digital cameras

• Book scanners

• Book-edge scanners

• Microfilm and slide scanners

Page 31: Digitisation Workshop Pres 2008(V1)

PRODUCTION: software

Image editing software

• Consider: cost; hardware requirements; usability; functionality

• Options: Adobe Photoshop CS3 (expensive/best) & Photoshop Elements (cheap); Gimp (free); + proprietary software for RAW files

• Derivative and pdf production: Acrobat Writer (expensive); ImageMagick (conversion software); Ghostscript (pdf interpreter); & pdftk (pdf toolkit)
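As a rough illustration of derivative production with ImageMagick, the sketch below builds the command line for one JPEG access copy from a TIFF master. The folder names, the 1024-pixel bounding box and the quality setting are assumptions chosen for the example, not figures from our workflow; `derivative_command` is a hypothetical helper.

```python
from pathlib import Path


def derivative_command(master, out_dir, max_px=1024, quality=85):
    """Build an ImageMagick command that makes a JPEG access copy.

    -resize WxH fits the image inside the box and keeps its aspect ratio;
    -quality sets the JPEG compression level.
    """
    master = Path(master)
    target = Path(out_dir) / (master.stem + ".jpg")
    return [
        "convert",  # the binary is called "magick" in ImageMagick 7
        str(master),
        "-resize", f"{max_px}x{max_px}",
        "-quality", str(quality),
        str(target),
    ]


# Hypothetical file and folder names.
cmd = derivative_command("masters/diary_0001.tif", "access")
# To actually run it: subprocess.run(cmd, check=True)
```

Building the command as a list, rather than a single string, avoids shell quoting problems when file names contain spaces.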

Other useful open source software:

• JHOVE object validation

• FedoraCommons object repository management system

• ebXML e-business suite

• Xena digital document preservation software (from NAA)

• DSpace institutional repository system

• DROID automated batch identification of file formats (from TNA UK)
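To give a feel for what signature-based format identification involves, here is a minimal sketch that checks the first bytes of a file against a handful of well-known signatures. It is not how DROID or JHOVE are implemented (they rely on far richer signature registries and validation rules); the table and function names below are assumptions for the example.

```python
# A few well-known file signatures ("magic numbers").
SIGNATURES = [
    (b"II*\x00", "TIFF (little-endian)"),
    (b"MM\x00*", "TIFF (big-endian)"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"%PDF-", "PDF"),
]


def identify(header: bytes) -> str:
    """Return a format name for the leading bytes of a file, or 'unknown'."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


def identify_file(path) -> str:
    """Read just enough of a file to test it against the signature table."""
    with open(path, "rb") as fh:
        return identify(fh.read(16))
```

Identification by content rather than by file extension matters for preservation because extensions are easily changed or lost.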

Page 32: Digitisation Workshop Pres 2008(V1)

OUTPUT

Indexing

• Most descriptive metadata will come from your MARC records

• If a separate database is needed: Access, SQL & Oracle

Access options (also part of just doing it)

• Collection OPACs, databases, Zoomify, EAD, DVDs, CDs

• Other: Blogs, Facebook ArtShare, Flickr, Facebook page

Search engine optimisation

• How can I create a Google-friendly site?

Page 33: Digitisation Workshop Pres 2008(V1)

STORAGE & MAINTENANCE

Storage

Consider: Speed (read/write, data transfer); Capacity; Reliability (stability, redundancy); Standardization; Cost; & Fitness to task
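Capacity can be estimated with simple arithmetic before any equipment is bought. The sketch below is a back-of-envelope calculation only: the 200,000 images per year comes from the WW1 diaries slide, but the page size (A4 at 300 ppi, roughly 2480 x 3508 pixels), 24-bit colour, uncompressed masters and two stored copies are all assumptions for illustration.

```python
def uncompressed_bytes(width_px, height_px, channels=3, bits_per_channel=8):
    """Size of one uncompressed raster image in bytes."""
    return width_px * height_px * channels * bits_per_channel // 8


def yearly_storage_tb(images_per_year, bytes_per_image, copies=2):
    """Storage needed per year in terabytes (10**12 bytes), all copies included."""
    return images_per_year * bytes_per_image * copies / 1e12


# Assumed master format: A4 at 300 ppi, 24-bit colour, no compression.
per_image = uncompressed_bytes(2480, 3508)

# 200,000 images per year, with an assumed second copy kept for redundancy.
estimate_tb = yearly_storage_tb(200_000, per_image, copies=2)
```

Under these assumptions each master is about 26 MB and a year of scanning needs roughly 10 TB. Lossless compression, higher resolutions or extra copies would change the figure considerably.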

Management, maintenance & preservation

• Digital preservation practices

• Preservation metadata

• Trusted digital repositories?
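One widely used digital preservation practice is fixity checking: record a checksum for every file, then recompute it later to detect silent corruption or loss. The sketch below is a minimal version, assuming SHA-256 and an in-memory manifest; the function names are hypothetical and this is not a description of any particular repository system.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file under root (by relative path) to its checksum."""
    root = Path(root)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root, manifest):
    """List files whose checksum differs from the manifest or that are missing."""
    current = build_manifest(root)
    return sorted(
        name for name in manifest
        if current.get(name) != manifest[name]
    )
```

In practice the manifest would be stored separately from the files it describes, and the check would be run on a regular schedule.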

Page 34: Digitisation Workshop Pres 2008(V1)

What we are finding

What we want

● Accuracy / authenticity

● Searchability

● Easy navigation & download

● Cost effectiveness

● Good quality product

● Text capture and search (OCR) where poss.

● Integration

● Scalability

● Web interactivity

● Simple solutions

Lessons

● Cost estimates escalate

● Technology has limits, but is improving

● You learn with new technology by doing

● There is more to copyright than owning it

● Anticipate needs & increasing expectations

● $ hard to find for access (sponsorship?)

● Better management & storage of assets

● A need to educate managers & suppliers!

● Keeping trained staff is a challenge

● Costs/benefits of new technologies (risk?)

● Importance of QA in projects!

● Need for a strategic plan(s)

● Be prepared to compromise

Page 35: Digitisation Workshop Pres 2008(V1)

Enterprise Content Management: management, search & web facilities for digital assets and services

• Extensive digital asset management features

• Excellent electronic document & record management

• Intuitive web content management features

• Facilitate simple and complex workflow processes

• Extensive and unified searching constructs

• Scaleable

• Compliant with all government recordkeeping requirements & emerging digital preservation standards

• Integrate easily with existing Memorial systems

• Simple to administer in terms of security, auditing & storage management

Page 36: Digitisation Workshop Pres 2008(V1)

[Diagram: ECM - Conceptual Overview. Labels shown: Digital Asset Management; Electronic Document & Records Management; Web Content Management; Record Management; E-mail; Memorial Intranet; Website; AJRP; Lotus Notes; OAI Interface; FIRST OPAC; MICA OPAC (CAS); CMS; Digital Object Mgmt System (DOMS); Biographical Databases & War Diaries; RecordSearch (NAA); Other Corporate Systems: Collection Mgmt (MICA), Library System (FIRST), Fund Raising System (Raisers Edge), Financial & HR System (SAP), POS System (Advance Retail); CAS; Internal Orders; OnLine Shop; Search; Photocopy Quotes; ReQuest; eSales; PICTION]

Page 37: Digitisation Workshop Pres 2008(V1)

implementing user-friendly technologies

• make sure they are findable and useable

• pick a few “winners” & lead by example

• collaborate & network

• get involved in your core business

• don't leave it to IT-staff

• learn to compromise (the 80:20 rule)

• experiment

• start now! it is sometimes easier to seek forgiveness than gain permission

Page 38: Digitisation Workshop Pres 2008(V1)

JISC 2007 – five key issues for digitisation

1. Re-focus on the user (simple, easily found & used output)

2. Aggregate and present content that can resonate with multiple communities

3. Learn from Google & YouTube but keep our values

4. New business models are needed, collaborating with and without the private sector

5. More collaboration between publishers, curators, funders, users, vendors and standards bodies