Prioritizing digitization British Library Centre for Conservation, February 23 2010 The scanning on demand system of the Amsterdam City Archives

TRANSCRIPT

Page 1: Prioritizing digitization British Library Centre for Conservation, February 23 2010 The scanning on demand system of the Amsterdam City Archives

Prioritizing digitization

British Library Centre for Conservation, February 23 2010

The scanning on demand system of the Amsterdam City Archives

Page 2: Prioritizing digitization British Library Centre for Conservation, February 23 2010 The scanning on demand system of the Amsterdam City Archives

Who am I?

Marc Holtman

Started working at the Amsterdam Archives in 2001

Project leader for the Image Bank

Project leader for the development of search and retrieval applications

Project leader for digitization

Current job

Coordination of all digitization projects

Development of the workflow

Development of workflow tools

Archiefbank

Image Bank

Page 3: Prioritizing digitization British Library Centre for Conservation, February 23 2010 The scanning on demand system of the Amsterdam City Archives

History

Brief history of digitization at the Amsterdam Archives

2000 – Start of digitization of highlights from the collections and three large genealogy sources

2001 – Development of the Image Bank: building an application and digitizing 25.000 photos, drawings and prints

2003 – Development of an application for online inventories: all inventories, no scans

2006 – Development of the Archiefbank: expansion of the inventories with integration of scans, indexes, a scanning on demand service and a workflow for large scale digitization

2010 – Archiefbank online with more than 7 million scans and 15.000 registered users

2010 – Image Bank online with 300.000 high-end quality scans

Page 4: Prioritizing digitization British Library Centre for Conservation, February 23 2010 The scanning on demand system of the Amsterdam City Archives

History

From small selections to large scale digitization

Turning point in 2006

From (relatively) small scale digitization to a “scan it all” approach

Trigger was an ongoing decline in visitors to our reading rooms

And a spectacular growth of users on the website

Visitors

Year   Reading rooms   Website
1982   24.027
1988   29.788
1992   27.738
1998   26.598          40.048
2002   25.014          224.050
2006   17.958          512.592

Page 5: Prioritizing digitization British Library Centre for Conservation, February 23 2010 The scanning on demand system of the Amsterdam City Archives

Strategy

After the realization of the online inventories users started to ask

“Where’s the button for the images?”

Users expect to find everything digitally available

……when we have 20 miles of archives in our repositories

Everybody should be able to consult digitized documents 24/7 online

But where to start?

And how to finance?

Page 6: Prioritizing digitization British Library Centre for Conservation, February 23 2010 The scanning on demand system of the Amsterdam City Archives

Strategy

The pessimistic math

1 foot = 2.000 scans

Production = 10.000 scans a week

Q. How many scans can be made from 20 miles of archives?

A. 211.200.000 scans (20 miles = 105.600 feet)

Q. How long does it take to scan it all?

A. 406 years
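The arithmetic behind this slide can be re-run as a short sketch. It takes the slide's own inputs (20 miles of archives, 2.000 scans per foot, 10.000 scans a week) and adds two assumptions that the slide does not state: a statute mile of 5,280 feet and 52 production weeks per year. The function names are illustrative only.

```python
# Assumption (not on the slide): a statute mile of 5,280 feet.
FEET_PER_MILE = 5280


def total_scans(miles: int, scans_per_foot: int) -> int:
    """Number of scans needed to digitize a given length of shelving."""
    return miles * FEET_PER_MILE * scans_per_foot


def years_to_scan(scans: int, scans_per_week: int, weeks_per_year: int = 52) -> float:
    """Years of production needed; 52 weeks per year is an assumption."""
    return scans / scans_per_week / weeks_per_year


if __name__ == "__main__":
    scans = total_scans(miles=20, scans_per_foot=2000)
    years = years_to_scan(scans, scans_per_week=10_000)
    print(f"{scans:,} scans, about {years:.0f} years")
```

With these inputs the sketch gives 211,200,000 scans and roughly 406 years of production.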

Page 7: Prioritizing digitization British Library Centre for Conservation, February 23 2010 The scanning on demand system of the Amsterdam City Archives

Strategy

It was clear we had to:

Rethink our policy in prioritizing digitization

Rethink our financial principles on digitization

Develop a workflow in which large scale and low costs are starting points

Develop a user friendly web application

Page 8: Prioritizing digitization British Library Centre for Conservation, February 23 2010 The scanning on demand system of the Amsterdam City Archives

Prioritizing digitization

The user priorities

We stopped thinking about the 20 miles of archives in our repositories

And started thinking about the documents the users need for their research

Users only need a few documents, not everything that is being digitized

The documents needed for your research should be the first documents to be digitized, not the last

This asks for client-driven digitization

Page 9: Prioritizing digitization British Library Centre for Conservation, February 23 2010 The scanning on demand system of the Amsterdam City Archives

Prioritizing digitization

The user priorities

In the Archiefbank we let the user set priorities in digitization

All archive files can be requested for digitization via the online inventories

In principle all requests are honored, unless

The material cannot be digitized for material reasons

The material is under copyright

Disclosure restrictions apply

The user doesn’t commit to anything by placing a request, but neither does the archive
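The rule on this slide (every request is honored unless one of three exceptions applies) can be written down as a small decision function. This is only a sketch: the `ArchiveFile` class, its field names and the sample inventory number are hypothetical and do not describe how the Archiefbank is actually implemented.

```python
from dataclasses import dataclass


@dataclass
class ArchiveFile:
    """Hypothetical record for one requested inventory number."""
    inventory_number: str
    too_fragile: bool = False            # cannot be digitized for material reasons
    under_copyright: bool = False        # copyright material
    disclosure_restricted: bool = False  # disclosure restrictions apply


def refusal_reasons(archive_file: ArchiveFile) -> list[str]:
    """Return the exceptions that block a request; an empty list means honored."""
    reasons = []
    if archive_file.too_fragile:
        reasons.append("material reasons")
    if archive_file.under_copyright:
        reasons.append("copyright")
    if archive_file.disclosure_restricted:
        reasons.append("disclosure restrictions")
    return reasons


def is_honored(archive_file: ArchiveFile) -> bool:
    """A request is honored in principle when no exception applies."""
    return not refusal_reasons(archive_file)


if __name__ == "__main__":
    request = ArchiveFile("example-123", under_copyright=True)
    print(is_honored(request), refusal_reasons(request))
```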

Page 10: Prioritizing digitization British Library Centre for Conservation, February 23 2010 The scanning on demand system of the Amsterdam City Archives

Prioritizing digitization


Page 11: Prioritizing digitization British Library Centre for Conservation, February 23 2010 The scanning on demand system of the Amsterdam City Archives

Prioritizing digitization

The preservation side

Conservation of the originals remains our major concern

The scans in the scanning on request service are made for the purpose of archival research

Not as a substitute for the originals

Nevertheless, digitization does have a real conservation function

After digitization the originals can no longer be requested in the reading room

Damage or loss of the originals caused by use is ruled out

Page 12: Prioritizing digitization British Library Centre for Conservation, February 23 2010 The scanning on demand system of the Amsterdam City Archives

Prioritizing digitization

The preservation side

Basic rules:

All inventory numbers are checked before they are transported to the digitizer

If necessary (and possible) our restoration employees perform small restorations

If the material is too fragile, or requires complex restoration, we cancel the request for digitization

We perform small preservation tasks

Removal of staples

Repackaging when necessary

We do not number the originals

The sequence of the originals is not checked or altered

Page 13: Prioritizing digitization British Library Centre for Conservation, February 23 2010 The scanning on demand system of the Amsterdam City Archives

Prioritizing digitization

The preservation side


Page 14: Prioritizing digitization British Library Centre for Conservation, February 23 2010 The scanning on demand system of the Amsterdam City Archives

Prioritizing digitization

The preservation side

Digital preservation: all scans are stored in a controlled e-repository environment (OAIS)

Page 15: Prioritizing digitization British Library Centre for Conservation, February 23 2010 The scanning on demand system of the Amsterdam City Archives

Prioritizing digitization

Digitization projects

Besides the selection made by users we scan on a project basis

Hundreds to millions of scans in each project

Purpose of digitization varies from accessibility to substitution of the originals

Grants from (national) programs, often on specific topics

Cooperation with Amsterdam district councils and services

Page 16: Prioritizing digitization British Library Centre for Conservation, February 23 2010 The scanning on demand system of the Amsterdam City Archives

Financing

Grants for digitization are not enough for realizing our vision

Users have to pay to get access to the scans

The idea is that by buying scans the audience makes (part of the) financing of digitization possible

In the Netherlands free access to archives is legislated

But for reproductions you have to pay

We regard reading and downloading of digitized archival documents via the web as delivery of reproductions

But: consulting the scans at our reading rooms is free

Page 17: Prioritizing digitization British Library Centre for Conservation, February 23 2010 The scanning on demand system of the Amsterdam City Archives

Financing

Pricing policy

Customers think a low price is important

Archival research easily runs into the use of dozens to hundreds of documents

100 scans should not cost € 1000

The price of an ordinary copy in our reading room should be the benchmark

The costs when purchasing scans online should be competitive with travel costs when visiting our reading room

This means that costs for producing and storing scans have to be as low as possible

Page 18: Prioritizing digitization British Library Centre for Conservation, February 23 2010 The scanning on demand system of the Amsterdam City Archives

Financing

Reducing costs

Digitization on a large scale is only possible when both incidental and structural costs are as low as possible

Reducing incidental costs (production of scans):

1. Standardized and efficiently organized workflow

2. Choosing quality standards that fit the purpose of the scans

Reducing structural costs (storage of scans):

3. File sizes as small as possible

Page 19: Prioritizing digitization British Library Centre for Conservation, February 23 2010 The scanning on demand system of the Amsterdam City Archives

Financing

Reducing costs

2. Choosing quality standards that fit the purpose of the scans

In every project we choose a quality that fits the purpose of the digitizing

Scanning a modern, printed book for accessibility is not the same as scanning a vulnerable charter for preservation

Price comparison scanning costs

Price rates scanning, external partner

Quality                    Price
High-end                   € 2 – 10
“Legibility”               € 0,20 – 0,40
“Legibility”, auto-feed    € 0,10

Page 20: Prioritizing digitization British Library Centre for Conservation, February 23 2010 The scanning on demand system of the Amsterdam City Archives

Financing

Reducing costs

2. Choosing quality standards that fit the purpose of the scans

Example of a scan with a “legibility” standard of quality

Is this scan ok for the purpose of doing archival research: yes

Is this scan ok for publication in an art book: no

Page 21: Prioritizing digitization British Library Centre for Conservation, February 23 2010 The scanning on demand system of the Amsterdam City Archives

Financing

Reducing costs

3. File sizes as small as possible

Storage costs are still considerable when producing large quantities of scans

In order to bring structural costs down, the file size of the scans has to be as small as possible

This can be achieved in three ways

1. Skimping on resolution

2. Skimping on bit depth / number of colors (only possible in formats like TIFF and PNG)

3. Using (lossless or lossy) compression on the files

We use a combination of 1 and 3
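To make the first two options concrete: the uncompressed size of a raster image is simply width × height × bit depth. The sketch below shows how halving the resolution or dropping from 24-bit color to 8-bit grayscale changes that size. The pixel dimensions are hypothetical examples, not figures from the Amsterdam City Archives, and compression (option 3) is deliberately left out because its effect depends on the image content.

```python
def uncompressed_bytes(width_px: int, height_px: int, bits_per_pixel: int) -> int:
    """Size of an uncompressed raster image in bytes (headers ignored)."""
    return width_px * height_px * bits_per_pixel // 8


if __name__ == "__main__":
    # Hypothetical page scan: 2000 x 3000 pixels in 24-bit color.
    full = uncompressed_bytes(2000, 3000, 24)
    # Halving the resolution in both directions quarters the pixel count.
    half_resolution = uncompressed_bytes(1000, 1500, 24)
    # Same pixel count at 8-bit grayscale: one third of the 24-bit size.
    grayscale = uncompressed_bytes(2000, 3000, 8)
    for label, size in [("full", full), ("half resolution", half_resolution), ("8-bit", grayscale)]:
        print(f"{label}: {size / 1_000_000:.1f} MB")
```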

Page 22: Prioritizing digitization British Library Centre for Conservation, February 23 2010 The scanning on demand system of the Amsterdam City Archives

Financing

Reducing costs

3. File sizes as small as possible

Storage of 500.000 images. Average size per scan, uncompressed = 22,1 MB

File format           Storage   Costs 1 year   Costs 10 years
TIFF uncompressed     11 TB     € 38.500       € 385.000
JPEG 10               1,1 TB    € 3.850        € 38.500
JPEG 4 (200 dpi)      124 GB    € 434          € 4.340
JPEG 2000 (part 1)    6 TB      € 21.000       € 210.000

Price rate: 1 TB, storage in a controlled e-repository environment on two separate locations, including IT costs: € 3.500 per year (NLD, Jan 2010)
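The cost columns follow directly from the storage volumes and the quoted price rate. A minimal sketch, assuming the rate of € 3.500 applies per TB per year (as the one-year column implies) and that 1 TB is counted as 1000 GB; the names are illustrative.

```python
# Price rate from the slide; "per TB per year" is inferred from the one-year column.
RATE_EUR_PER_TB_YEAR = 3500

# Storage volumes from the slide, in GB, assuming 1 TB = 1000 GB.
STORAGE_GB = {
    "TIFF uncompressed": 11_000,
    "JPEG 10": 1_100,
    "JPEG 4 (200 dpi)": 124,
    "JPEG 2000 (part 1)": 6_000,
}


def storage_cost(gigabytes: int, years: int = 1) -> float:
    """Storage cost in euros for a given volume and number of years."""
    return gigabytes * RATE_EUR_PER_TB_YEAR * years / 1000


if __name__ == "__main__":
    for file_format, gigabytes in STORAGE_GB.items():
        one_year = storage_cost(gigabytes)
        ten_years = storage_cost(gigabytes, years=10)
        print(f"{file_format:20} {one_year:>10,.0f} {ten_years:>10,.0f}")
```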

Page 23: Prioritizing digitization British Library Centre for Conservation, February 23 2010 The scanning on demand system of the Amsterdam City Archives

Financing

Costs and income Archiefbank

Calculating real costs and income is difficult

What should we put in and what not?

For example, after digitizing, logistics and a physical reading room with climate control and security are no longer necessary for these documents when requested

What we win by digitization is more than what we can simply measure in euros as income

Also, digitization simply is a powerful way to fulfill our mission: making our archives accessible

Costs Archiefbank (2009)
Digitization on request      € 140.000
Digitization projects        € 200.000
Webservices                  € 50.000
Total                        € 390.000

Income Archiefbank (2009)
Sales of scans               € 100.000
Project funding              € 200.000
Government (digitization)    € 90.000
Total                        € 390.000
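The two 2009 tables can be checked against each other with a few lines of code. The sketch below only re-adds the figures from the slide and computes the share of income that comes from selling scans; the dictionary and function names are illustrative.

```python
# Figures for 2009 as given on the slide, in euros.
COSTS = {
    "Digitization on request": 140_000,
    "Digitization projects": 200_000,
    "Webservices": 50_000,
}
INCOME = {
    "Sales of scans": 100_000,
    "Project funding": 200_000,
    "Government (digitization)": 90_000,
}


def total(items: dict[str, int]) -> int:
    """Sum of all entries in a cost or income table."""
    return sum(items.values())


def share(items: dict[str, int], key: str) -> float:
    """Fraction of the table total contributed by one entry."""
    return items[key] / total(items)


if __name__ == "__main__":
    print("Total costs: ", total(COSTS))
    print("Total income:", total(INCOME))
    print(f"Sales of scans: {share(INCOME, 'Sales of scans'):.0%} of income")
```

Both totals come to € 390.000, and sales of scans account for roughly a quarter of the income.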

Page 24: Prioritizing digitization British Library Centre for Conservation, February 23 2010 The scanning on demand system of the Amsterdam City Archives

Financing

Costs and income Archiefbank

Conclusion in our framework is that the scanning on request service is financially feasible


Page 25: Prioritizing digitization British Library Centre for Conservation, February 23 2010 The scanning on demand system of the Amsterdam City Archives

Workflow

Standardized workflow

We developed a standardized workflow for all digitization

Every type of document can be digitized in this workflow

We always work on a project basis

Goals of digitization projects vary from access to substitution of the originals

In every project quality standard and method are set, depending on purpose and type of material

1. Requesting digitization
2. Providing order number(s)
3. Assessing the originals
4. Preparing the originals
5. Transport
6. Scanning
7. Assigning filenames
8. Transport
9. Checking originals
10. Checking scans
11. Checking metadata
12. Originals back to repository
13. Import in controlled storage system
14. Import in metadata system
15. Export scans
16. Export metadata
17. Import scans
18. Import metadata

Page 26: Prioritizing digitization British Library Centre for Conservation, February 23 2010 The scanning on demand system of the Amsterdam City Archives

Workflow

Workflow principles

Scanning is contracted out

Always scanning of complete inventory numbers

Identification of the file and assigning filenames by means of an order ticket

Use of workflow tools for managing the originals and performing checks on scans


Page 27: Prioritizing digitization British Library Centre for Conservation, February 23 2010 The scanning on demand system of the Amsterdam City Archives

Workflow


Weekly schedule scanning on demand


Task                               Hours
Retrieving the originals           4
Preparing the originals            6
Checking scans                     6
Returning the originals            4
Contact with customers             1
Coordination and administration    3
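Adding up the weekly schedule gives the in-house staff time that the scanning on demand service takes. The sketch below uses only the hours from the table; the names are illustrative.

```python
# Weekly hours per task, as listed in the schedule.
WEEKLY_HOURS = {
    "Retrieving the originals": 4,
    "Preparing the originals": 6,
    "Checking scans": 6,
    "Returning the originals": 4,
    "Contact with customers": 1,
    "Coordination and administration": 3,
}


def total_hours(schedule: dict[str, int]) -> int:
    """Total staff hours per week across all tasks."""
    return sum(schedule.values())


if __name__ == "__main__":
    total = total_hours(WEEKLY_HOURS)
    print(f"Total: {total} hours a week")
    for task, hours in WEEKLY_HOURS.items():
        print(f"{task:33} {hours:>2} h  {hours / total:.0%}")
```

The listed tasks add up to 24 hours a week.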

Page 28: Prioritizing digitization British Library Centre for Conservation, February 23 2010 The scanning on demand system of the Amsterdam City Archives

Archiefbank


Demonstration of the Archiefbank


More:

http://www.slideshare.net/ktheimer