You ask, we scan MARAC Conference October 30 2009 The Amsterdam City Archives and the Archiefbank

Upload: kate-theimer

Post on 03-Nov-2014


DESCRIPTION

This is a copy of the presentation given by Ellen Fleurbaay and Marc Holtman of the Amsterdam City Archives at the MARAC Plenary Session in Jersey City on Friday October 30, 2009.

TRANSCRIPT

Page 1: You Ask We Scan

You ask, we scan

MARAC Conference October 30 2009

The Amsterdam City Archives

and the Archiefbank

Page 2: You Ask We Scan

This morning

• Ellen: Amsterdam City Archives – a new Service Concept

• Marc: large scale scanning – on request, by order and in subsidised projects – economic principles and workflow

• Ellen: Mission accomplished – government and customers satisfied

Page 3: You Ask We Scan

City Archives 1848 - 2009

Growing FAST

- 5.000 different archives
- 15 repositories with 20 miles of shelf length
- 91.000 prints, maps and drawings
- 824.000 photos
- 372.000 reference books
- 16.000 video and audio tapes


Page 5: You Ask We Scan

BUT…

Fewer visitors each year

Visitors

Year   Reading rooms
1982   24.027
1988   29.788
1992   27.738
1998   26.598
2002   25.014
2006   17.958

Page 6: You Ask We Scan

Archives are dusty


Page 7: You Ask We Scan

And…

MORE web visitors

Visitors

Year   Reading rooms   Website
1982   24.027          -
1988   29.788          -
1992   27.738          -
1998   26.598          40.048
2002   25.014          224.050
2006   17.958          512.592

Page 8: You Ask We Scan

New Service Concept

1. We want visitors to come to the archives

– to experience the look and feel of authentic archival documents

– to teach them the pleasure of doing their own historical research

2. Everybody should be able to use all archival collections at home 24/7


Page 13: You Ask We Scan

How to attract visitors?

To experience the look and feel of an archive

– be where the visitors are: in the city centre

– new corporate identity, new name, new logo: City Archives

– new products: museum night, historical building, open at weekends on Saturday and Sunday

Page 14: You Ask We Scan

How to attract visitors?

To experience the pleasure of research

- New reading room formula: use the internet, use the reference library, no silence please, do discuss with your fellow researchers


Page 17: You Ask We Scan

How to attract visitors?

To experience the pleasure of research

- New reading room formula: use the reference library, use the internet and no silence please

- Staff walk around and offer free assistance

- Staff are trained in educational and social skills

Page 18: You Ask We Scan

How to create an internet reading room?

All documents online?

– do not think about the 20 miles in your repository, think about the few thousand documents your customers use per week

Realistic and economic principles

– estimate the costs of the complete process, not just the costs of scan production

– Dutch legislation: consulting the original is free, reproduction is paid for, so scans have to be paid for


Page 20: You Ask We Scan

You ask, We Scan

Scanning on customer’s request, economic principles, technical issues and work process

Page 21: You Ask We Scan

You ask: Scanning on customer’s request, economic principles

We Scan: Image quality and workflow principles

We Store: Compression and file size

We Do: Workflow, tools and practical issues

Page 22: You Ask We Scan

Will this be our ultimate solution?

Q. How many scans can be made from 20 miles of archives?

1 foot = 2.000 scans

A. 739.200.001 scans

Q. How long does it take to scan it all?

Production = 10.000 scans a week

A. 406 years
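
The arithmetic on this slide can be checked with a short calculation. The following is a minimal sketch, assuming 5280 feet per mile and 52 production weeks per year (neither is stated on the slide); the function names are illustrative only. With the stated 2.000 scans per foot, 20 miles comes to about 211 million scans, which matches the 406 years at 10.000 scans a week; the 739.200.001 figure on the slide would correspond to roughly 7.000 scans per foot.

```python
FEET_PER_MILE = 5280   # assumption: statute mile
WEEKS_PER_YEAR = 52    # assumption: production runs every week of the year


def total_scans(miles, scans_per_foot):
    """Estimate the number of scans needed for a given shelf length."""
    return round(miles * FEET_PER_MILE * scans_per_foot)


def years_to_scan(scans, scans_per_week):
    """Years needed to produce `scans` at a constant weekly production."""
    return scans / scans_per_week / WEEKS_PER_YEAR


scans = total_scans(20, 2_000)
print(scans)                                # 211200000
print(round(years_to_scan(scans, 10_000)))  # 406
```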

Page 23: You Ask We Scan

You ask

1. Scanning at customer’s request

We let our users set priorities in digitization

All archive files can be requested for digitization via the online finding aids

In principle all requests are honored, unless

- It cannot be digitized for material reasons
- Copyright material
- Disclosure restrictions apply

We speak of a request for digitization and not of an order

The user doesn’t commit to anything by placing a request, but neither does the archive

Page 24: You Ask We Scan

You ask

1. Scanning at customer’s request

The requester is not obliged to buy all scans

All scans made are available for all users

Scans available are integrated in the online finding aids

Costs for purchasing scans are equal for all users (the more you buy, the cheaper it gets)

Page 25: You Ask We Scan

You ask

2. Low costs

Archival research easily runs into the use of dozens to hundreds of documents

Customers think a low price is important

The price of an ordinary copy in our reading room should be the benchmark

100 scans should not cost $ 100

The costs when purchasing scans online should be competitive with travel costs when visiting our reading room

This means that costs for producing and storing scans have to be as low as possible

Page 26: You Ask We Scan

You ask

3. Fast delivery

Digitization takes time, but research should not have to be planned weeks ahead

Delivery time in a scanning on request service should be as short as possible

Aim is a delivery time of 2 – 3 weeks

This calls for a streamlined, efficiently organized work process

Page 27: You Ask We Scan

You ask

Conclusion

If we can make sure that

- All finding aids can be selected for digitization by users
- The scans are delivered in a short time
- For low costs

it can be stated that we have no backlog in digitizing and the objective that the customer is able to consult digitized items has been achieved

We need:

- Low incidental and structural costs
- An efficiently organized work process

Page 28: You Ask We Scan

We scan

Digitization at the Amsterdam City Archives in general

In this presentation the focus is on large scale digitization at customer’s request

However, scanning on request is only a part of all digitization that takes place in the archives

Besides scanning on request, projects are based on:

- Grant money (often on specific topics, like WWII)
- Selections of photographs, drawings etc. for the Imagebank (Beeldbank)
- Cooperation with Amsterdam district councils and services

Page 29: You Ask We Scan

We scan

Digitization at the Amsterdam City Archives in general

We always work on a project basis

For all projects we have one workflow

In every project the quality standard and method are set, depending on purpose and type of material

Goals of digitization projects vary from access to substitution of the originals

Page 30: You Ask We Scan

We scan

1. At large scale

Large scale production is a prerequisite in order to keep production costs as low as possible

The more scans being made, the lower the price per scan

2. With a constant production

Large scale production can only be organized effectively when constant production is assumed

This way tasks can be planned best and deployment of staff is most efficient

Experience shows that a constant production of 10.000 scans (at customer’s request) each week is achievable

Page 31: You Ask We Scan

We scan

3. A broad spectrum of document types

Documents that are being digitized in this reproduction process can have the following forms

- Small and large size
- Bound and loose-leafed entities
- Card indexes
- Old and modern material
- Low and high contrast documents
- Text alone, text and image together
- Hybrid forms

Page 32: You Ask We Scan

We scan

4. For archival research from screen or print

Purpose of the scans: archival research using the web, straight from screen or print

Costs for producing and storing scans are determined to a high extent by the quality standard set for the scans

The higher the standard of quality, the higher the costs will be

In order to keep costs low it is prudent to let the standard of quality follow from the requirement the end user places on the scan

Textual information legible in the originals must be legible in the scans

Page 33: You Ask We Scan

We scan

4. For archival research from screen or print

Specified (basic) quality standard:

Reproduction of all significant information

Reproduction of details which are not part of the textual information is not required

A quality higher than that inevitably will push up both incidental and structural costs

But has no added value for the customer at all

Page 34: You Ask We Scan

We scan

Scan quality and legibility

Example: very “light” original

High quality scan: optimal tonal range, excellent flexibility, poor legibility

Modified scan (contrast): poor tonal range, little flexibility, excellent legibility

Which one would you buy?

Experience in practice teaches that what is experienced as “good legibility” is very personal.

We decided to solve this problem with a smart filter in the document viewer.
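
The slides do not describe how the viewer filter works. As an illustration only, one common way to make a very light scan easier to read at display time is a linear contrast stretch, which leaves the stored file untouched. The sketch below assumes 8-bit grey values in a plain list; the function name and parameters are hypothetical.

```python
def stretch_contrast(pixels, low=None, high=None):
    """Linearly map grey values so that `low` becomes 0 and `high` becomes 255."""
    low = min(pixels) if low is None else low
    high = max(pixels) if high is None else high
    if high <= low:
        return list(pixels)  # flat image: nothing to stretch
    stretched = []
    for value in pixels:
        clipped = min(max(value, low), high)  # keep values inside [low, high]
        stretched.append(round(255 * (clipped - low) / (high - low)))
    return stretched


# A "light" original: all values sit near the white end of the scale.
light_scan = [200, 210, 250]
print(stretch_contrast(light_scan))  # [0, 51, 255]
```

Because the stretch is applied when the image is shown, each user could choose their own settings without a second copy of the scan being stored.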

Page 35: You Ask We Scan

We scan

4. For archival research from screen or print

It does make sense to let the standard of quality follow from the purpose the end user has for the scans

Price comparison scanning costs

Price rates scanning, external partner

High-end                3 – 10 $
Legibility              0,30 – 0,75 $
Legibility, auto-feed   0,05 $

Skimping on the quality of scans (it can be better) is purely an economic decision, not one taken on principle
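
To make the difference between these rates concrete, they can be applied to a weekly production of 10.000 scans, the volume quoted elsewhere in this presentation. This is a rough sketch that assumes the rates are prices per scan (the slide does not state the unit); the dictionary and function names are illustrative.

```python
# Rates from the slide in US cents, as (low, high) ranges.
# Assumption: the rates are prices per scan.
RATES_CENTS = {
    "High-end": (300, 1000),
    "Legibility": (30, 75),
    "Legibility, auto-feed": (5, 5),
}


def weekly_cost_usd(scans_per_week, rate_range_cents):
    """Return the (low, high) weekly cost in dollars for a rate range in cents."""
    low, high = rate_range_cents
    return scans_per_week * low / 100, scans_per_week * high / 100


for label, rate in RATES_CENTS.items():
    low, high = weekly_cost_usd(10_000, rate)
    print(f"{label}: $ {low:,.0f} to $ {high:,.0f} per week")
```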

Page 36: You Ask We Scan

We scan

5. For conservation and security

The scans in the scanning on request service are made for the purpose of access / archival research

Not as a substitute for the originals

Conservation of the originals remains the major concern

Nevertheless, digitization does have a real conservation function

After digitization the originals cannot be requested in the reading room anymore

This way damage or loss of the originals is ruled out

Page 37: You Ask We Scan

We scan

6. Always complete files

A file can contain one to hundreds of documents

By definition the entire file is scanned

Never just a selection of pages

There are a few reasons for this:

- The costs for scanning are not so much a factor of quantity, but rather of the manual processing involved in it
- In the originals or the metadata it has to be indicated which documents are being digitized
- When shown in the Archiefbank, the user expects completeness
- When non-scanned pages have to be digitized later, the entire preparation process has to be gone through once again

Page 38: You Ask We Scan

We scan

7. Contracting out the scanning to external partners

The in-house scan facilities are not designed for large-scale digitizing

The complexity of the workflow and material to be scanned calls for

- Specialized hard- and software
- Specialized set-ups
- Knowledge
- Very complex technical infrastructure

Investing only makes sense with very high production, organized on a large scale

Contracting out of scanning was a logical choice

Page 39: You Ask We Scan

We scan

7. Contracting out the scanning to external partners

There are many scanning companies

Most do have experience in bulk processing

But not in this degree of complexity and diversity

Contracting out scanning is more than awarding a contract to a supplier

This calls for intensive collaboration

Also, the workflows of archive and digitizer have to dovetail

Page 40: You Ask We Scan

We store

Scans with a file size as small as possible

Storage costs are still considerable when producing large quantities of scans

In order to bring structural costs down, the file size of the scans has to be as low as possible

This can be achieved in three ways

1. Skimping on resolution

2. Skimping on bit depth / amount of colors (only possible in formats like TIFF and PNG)

3. Using (lossless or lossy) compression on the files

We use a combination of 1 and 3

Page 41: You Ask We Scan

We store

Scans with a file size as small as possible

Resolution, compression and legibility: an example

300 dpi, high quality JPEG

200 dpi, low quality JPEG

Page 42: You Ask We Scan

We store

Scans with a file size as small as possible

Storage of compressed files as master images was “not done”

The main arguments were

- When using lossy compression you’ll lose information
- Compressed files are more vulnerable (preservation)

But no compression means: large files, high storage costs

Research into these arguments showed:

- Even when using strong lossy compression legibility is still guaranteed
- Compressed files are not more vulnerable to loss than uncompressed files

Storage of uncompressed files is not necessary

Page 43: You Ask We Scan

We store

Scans with a file size as small as possible

Comparison between file format, compression, resolution and file size

File size

Format     Compression   Type       Resolution   Color     Avg per scan   500.000 scans   %
TIFF       No            ---        300 dpi      24 bits   22,1 MB        11 TB           100%
JPEG       Qua (ps) 12   Lossy      300 dpi      24 bits   7,5 MB         3,7 TB          34%
JPEG       Qua (ps) 10   Lossy      300 dpi      24 bits   2,1 MB         1,1 TB          10%
JPEG       Qua (ps) 4    Lossy      200 dpi      24 bits   255 KB         124 GB          1,1%
JPEG       Qua (ps) 10   Lossy      400 dpi      24 bits   3,3 MB         1,6 TB          15%
JPEG2000   Part 1        Lossless   300 dpi      24 bits   12 MB          6 TB            55%
JPEG2000   Part 6        Lossy      300 dpi      24 bits   120 KB         59 GB           0,5%
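
The last two columns of this table follow from the average file size per scan. A minimal sketch of that calculation, assuming decimal units (1 TB = 1,000,000 MB) and using a few rows from the table; the names are illustrative. Results can differ slightly from the slide because of rounding and unit conventions.

```python
# Average size per scan in MB, taken from the table (decimal comma written as a point).
AVG_SIZE_MB = {
    "TIFF uncompressed, 300 dpi": 22.1,
    "JPEG quality 12, 300 dpi": 7.5,
    "JPEG quality 10, 300 dpi": 2.1,
    "JPEG2000 part 1 lossless, 300 dpi": 12.0,
}

MB_PER_TB = 1_000_000  # assumption: decimal units


def total_tb(avg_mb, n_scans):
    """Total storage in TB for `n_scans` scans of `avg_mb` MB each."""
    return avg_mb * n_scans / MB_PER_TB


def percent_of(avg_mb, baseline_mb):
    """File size as a percentage of the uncompressed baseline."""
    return 100 * avg_mb / baseline_mb


baseline = AVG_SIZE_MB["TIFF uncompressed, 300 dpi"]
for label, size in AVG_SIZE_MB.items():
    print(f"{label}: {total_tb(size, 500_000):.2f} TB, "
          f"{percent_of(size, baseline):.0f}% of TIFF")
```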

Page 44: You Ask We Scan

We store

Scans with a file size as small as possible

Comparison between file format, compression, resolution and file size (table as on page 43)

Example shown: TIFF uncompressed

Page 45: You Ask We Scan

We store

Scans with a file size as small as possible

Comparison between file format, compression, resolution and file size (table as on page 43)

Example shown: JPEG (psd) 10

Page 46: You Ask We Scan

We store

Scans with a file size as small as possible

Comparison between file format, compression, resolution and file size (table as on page 43)

Example shown: JPEG (psd) 4

Page 47: You Ask We Scan

We store

Scans with a file size as small as possible

Comparison between file format, compression, resolution and file size (table as on page 43)

Example shown: JPEG2000 lossless

Page 48: You Ask We Scan

We store

Scans with a file size as small as possible

Comparison storage costs

Storage of 500.000 images; avg size per scan uncompressed = 22,1 MB

Price rate: 1 TB, storage in a controlled e-repository environment on two separate locations, including IT costs: $ 7.000 (NLD, nov 2009)

File format              Storage   Costs 1 year   Costs 10 years
TIFF uncompressed        11 TB     $ 77.000       $ 770.000
JPEG 10                  1,1 TB    $ 7.700        $ 77.000
JPEG 4 (200 dpi)         124 GB    $ 868          $ 8.680
JPEG 2000 (part 1, ll)   6 TB      $ 42.000       $ 420.000

(File) size still does matter!
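
The cost figures in this table are the storage volume multiplied by the price rate. A minimal sketch, assuming the rate of $ 7.000 per TB applies per year and that 1 TB = 1000 GB; the names are illustrative.

```python
PRICE_PER_TB_PER_YEAR_USD = 7_000  # from the slide; "per year" is an assumption
GB_PER_TB = 1_000                  # assumption: decimal units

# Storage needed for 500,000 scans, in GB, as listed in the table.
STORAGE_GB = {
    "TIFF uncompressed": 11_000,
    "JPEG 10": 1_100,
    "JPEG 4 (200 dpi)": 124,
    "JPEG 2000 (part 1, lossless)": 6_000,
}


def storage_cost_usd(size_gb, years=1):
    """Storage cost in dollars for `size_gb` GB kept for `years` years."""
    return size_gb * PRICE_PER_TB_PER_YEAR_USD * years / GB_PER_TB


for label, size_gb in STORAGE_GB.items():
    print(f"{label}: $ {storage_cost_usd(size_gb):,.0f} for 1 year, "
          f"$ {storage_cost_usd(size_gb, 10):,.0f} for 10 years")
```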

Page 49: You Ask We Scan

We Do

Developing the reproduction process

Projects with different goals, document types and partners take place at the same time

A streamlined, standardized process is indispensable when digitizing on a large scale

Guidelines and best practices often take no account of these complex factors and the amount of scans to be produced

We developed a process in which large scale and flexibility are starting points

All digitization projects follow this process

1. Requesting digitization
2. Providing order number(s)
3. Assessing the originals
4. Preparing the originals
5. Transport
6. Scanning
7. Assigning filenames
8. Transport
9. Checking originals
10. Checking scans
11. Checking metadata
12. Originals back to repository
13. Import in controlled storage system
14. Import in metadata system
15. Export scans
16. Export metadata
17. Import scans
18. Import metadata

Page 50: You Ask We Scan

We Do

Developing the reproduction process

For all projects, at any moment, it has to be clear:

- What the current status is of each unit to be digitized
- Where each unit can be located
- What current and succeeding tasks are to be performed on each unit

This calls for workflow management with a user-friendly application

We developed a simple, but effective workflow application in-house
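
The in-house application itself is not shown in the slides. As a hypothetical illustration of the three questions above (status, location, current and next task), a unit could be tracked against the 18 workflow steps roughly as follows; the class, field names and example locations are assumptions.

```python
from dataclasses import dataclass

# The 18 steps of the reproduction process, in order.
STEPS = [
    "Requesting digitization", "Providing order number(s)",
    "Assessing the originals", "Preparing the originals", "Transport",
    "Scanning", "Assigning filenames", "Transport", "Checking originals",
    "Checking scans", "Checking metadata", "Originals back to repository",
    "Import in controlled storage system", "Import in metadata system",
    "Export scans", "Export metadata", "Import scans", "Import metadata",
]


@dataclass
class Unit:
    """One unit to be digitized (hypothetical record layout)."""
    order_number: str
    location: str
    step: int = 1  # 1-based position in STEPS

    def current_task(self):
        return STEPS[self.step - 1]

    def next_task(self):
        # None once the last step has been reached.
        return STEPS[self.step] if self.step < len(STEPS) else None

    def advance(self, new_location=None):
        if self.step >= len(STEPS):
            raise ValueError("unit has already completed the workflow")
        self.step += 1
        if new_location is not None:
            self.location = new_location


unit = Unit(order_number="A20758", location="repository")
unit.advance()
print(unit.current_task(), "->", unit.next_task())
```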

Page 51: You Ask We Scan

We Do

Developing the reproduction process

In the following slides we focus on the weekly production of 10.000 scans in the digitizing on request service

Page 52: You Ask We Scan

We Do

Production of 10.000 scans on a weekly basis

1. Requesting digitization

All public files can be requested for digitization via the finding aids in the Archiefbank

Just by clicking on the “digitize” button

Page 53: You Ask We Scan

We Do

2. Providing order numbers

A unit to be digitized must be identifiable at each step of the handling process

The units therefore get a unique meaningless order number

An order number is provided by the metadata management system and is the basis for

- Communication with the digitizer
- Scanning
- Assigning filenames
- Registration of filenames
- Billing by digitizer

In practice: all units to be digitized get an order ticket


Page 55: You Ask We Scan

We Do

3. Assessing the originals

The workflow system generates a list of all originals to assess from the repositories

The list is sorted on repository / shelf to make retrieval efficient
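
Sorting a retrieval list on repository and shelf is a plain multi-key sort. A minimal sketch with hypothetical field names and made-up example values; the actual workflow system is not described in the slides.

```python
from typing import NamedTuple


class Original(NamedTuple):
    """One requested original and where it is stored (hypothetical fields)."""
    order_number: str
    repository: int
    shelf: int


def retrieval_list(originals):
    """Order the originals so each repository is walked once, shelf by shelf."""
    return sorted(originals, key=lambda o: (o.repository, o.shelf))


# Made-up requests in the order they came in.
requests = [
    Original("A20760", repository=7, shelf=12),
    Original("A20758", repository=3, shelf=40),
    Original("A20759", repository=3, shelf=5),
]
for item in retrieval_list(requests):
    print(item.repository, item.shelf, item.order_number)
```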

Page 56: You Ask We Scan

We Do

4. Checking the originals

All assessed originals are stored in a special room

In this room all checks are executed

Page 57: You Ask We Scan

We Do

4. Checking the originals

Information about the originals in our management systems is not always complete

A rough check of the originals takes place

A. Content

Copyrights, Publicity, Privacy

If an item falls into one of these categories the request is rejected

B. Condition of the material

Items that are in such a condition that digitizing or transport could cause damage, or are packaged in a way that scanning in conventional set-ups is not possible, do not qualify for the standard way of digitization


Page 59: You Ask We Scan

We Do

4. Checking the originals

Material preparation is limited to the bare minimum

We Do

- Staples are removed as a rule
- Small repairs are carried out by our restoration employees

We don’t

- The sequence of the originals as found in the repository is not checked or altered
- The originals are not numbered

Page 60: You Ask We Scan

We Do

Not number the originals

Numbering the originals has one advantage:

The completeness of the scans (compared to the originals) can be guaranteed

But this is only true when the numbering tallies exactly, because:

- A missing number in a sequence of scans leads to the conclusion that there is one original that has not been scanned
- Numbers that are assigned double lead to illogical end numbers (100 scans: scan 100 has been numbered as 99)

Experiments with numbering in practice showed that faultless numbering cannot be realized
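
The two numbering failures described on this slide can be illustrated with a small check. This sketch assumes the numbers written on the originals are available as a list of integers starting at 1; the function name is hypothetical.

```python
from collections import Counter


def numbering_problems(numbers):
    """Return (missing, doubles) for a run of numbers that should be 1..n."""
    counts = Counter(numbers)
    missing = [n for n in range(1, max(numbers) + 1) if n not in counts]
    doubles = [n for n, count in sorted(counts.items()) if count > 1]
    return missing, doubles


# 100 originals, but number 50 was written on two of them,
# so the last original ends up with number 99.
numbers = list(range(1, 51)) + list(range(50, 100))
print(len(numbers), max(numbers))   # 100 99
print(numbering_problems(numbers))  # ([], [50])
```

A skipped number shows up the other way round: `numbering_problems([1, 2, 4])` reports 3 as missing even though no original was left out.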

Page 61: You Ask We Scan

We Do

Not number the originals

Securing completeness can be realized by other means:

- Comparing scans to originals 1:1 after digitization
- Scanning the originals twice

Example: low quality scans, # scans = 365; high quality master files, # scans = 365

Page 62: You Ask We Scan

We Do

5. Transport

For secure transport, special flight cases are used

Page 63: You Ask We Scan

We Do

6. / 7. Scanning and assigning filenames

After scanning, the scan operator or data manager has to assign filenames to the scans

It has to be perfectly clear which filenames these should be

As a rule, filenames contain no meaningful information, because when the meaning changes, the filenames would have to change too

Filenames are the key between scans and metadata


Page 64: You Ask We Scan

Assigning filenames at City Archives Amsterdam

Customer request

Management systems

Order ticket: Archive 195, File 836, Order: A20758

Range: A20758000001 – A20758999999

Filename: 12 characters (first 6: order number; last 6: serial number)

Scanning the order: A20758000001, A20758000002, A20758000003, A20758000004, A20758000005

Scan report: A20758000001, A20758000002, A20758000003, A20758000004, A20758000005

Registration of filenames

Import
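The naming scheme on this slide can be sketched as follows: a six-character order number followed by a six-digit serial number, giving twelve characters with no descriptive content. The pattern (one capital letter plus five digits for the order number) is inferred from the single example A20758 and may not hold for every order; the function names are hypothetical.

```python
import re

# Assumed pattern, inferred from the example A20758000001:
# one capital letter and five digits (order number), then six digits (serial).
FILENAME_RE = re.compile(r"^[A-Z]\d{5}\d{6}$")


def make_filename(order_nr: str, serial: int) -> str:
    """Join an order number and a zero-padded six-digit serial number."""
    if not 1 <= serial <= 999_999:
        raise ValueError("serial number must be between 1 and 999999")
    return f"{order_nr}{serial:06d}"


def filenames_for_order(order_nr: str, count: int) -> list[str]:
    """Filenames for the first `count` scans of one order, in scan sequence."""
    return [make_filename(order_nr, n) for n in range(1, count + 1)]


def is_valid_filename(name: str) -> bool:
    """Check a name against the assumed 12-character pattern."""
    return FILENAME_RE.match(name) is not None
```

For example, `filenames_for_order("A20758", 5)` reproduces the five names listed in the scan report, and `make_filename("A20758", 999999)` gives the upper end of the range on the order ticket.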

Page 65: You Ask We Scan

We Do

10. / 11. Checking scans and metadata

Scans and metadata are checked efficiently

Where possible checks are automated

An application from which all checks can be executed is in development

Basic checks

Check                  Method
Viruses                Virus checker
Data integrity         MD5 checksum comparison
File format validity   JHOVE
Quality of scans       Visual check of reference scans; visual check of production scans
Completeness           Depends on project
Filenames              Script
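The data integrity check in the table can be illustrated with a minimal sketch: compute the MD5 checksum of a file after transport and compare it with the checksum recorded before transport. How and where the archive records its checksums is not stated on the slide, so the function names and the demo file are assumptions.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path: Path, expected_md5: str) -> bool:
    """Compare the file's current checksum with a previously recorded one."""
    return md5_of_file(path) == expected_md5.lower()


def demo() -> tuple[bool, bool]:
    """Record a checksum, verify it, then alter the file and verify again."""
    with tempfile.TemporaryDirectory() as tmp:
        scan = Path(tmp, "A20758000001.tif")  # stand-in file, not a real image
        scan.write_bytes(b"stand-in for image data")
        recorded = md5_of_file(scan)
        ok_before = is_intact(scan, recorded)
        scan.write_bytes(b"stand-in for image dat4")  # simulate corruption
        ok_after = is_intact(scan, recorded)
        return ok_before, ok_after
```

MD5 is adequate for detecting accidental damage during copying or transport; it is not a defence against deliberate tampering.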


Page 66: You Ask We Scan

We Do

13. / 14. Import metadata and scans into management systems

After all checks have been approved, scans and metadata are imported into the management systems

The imports are executed automatically, on the basis of scripts and standard protocols for file transfer

After import the “order for digitization” of each unit is completed


Page 67: You Ask We Scan

We Do

18. Import metadata into the website

The website is hosted from an external location

Metadata are uploaded to the webserver by simple HTTP transfer, from any workstation at the archive, directly via the CMS of the website

For exchange of finding aids we use EAD

After import the metadata are optimized for the search system


Page 68: You Ask We Scan

We Do

17. Import scans into the website

Transport medium

The bandwidth of the internet connections at the archive is still too small for direct SFTP (or similar) upload of large quantities of scans to the webserver

It seems likely that this will change in the near future

Until then, scans are transported on portable USB hard disks


Page 69: You Ask We Scan

We Do

17. Import scans into the website

Import

After the hard disk is connected to the server, the import process starts

Some basic checks are executed on the scans

Derivatives for thumbnails and zoom / contrast functionality are made


Page 70: You Ask We Scan

We Do

Request complete!

The happy customer:

When both scans and metadata have been imported, an e-mail is automatically sent to the requester of the digitization

This e-mail contains a link to the finding aid and thumbnails on the website

The requester can decide whether to buy scans or not
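The notification step can be sketched roughly as below: once both imports have finished, a message is built that links to the finding aid. This is a hedged illustration only. The sender address, subject line, message wording and URL are all hypothetical, and the sketch builds the message without sending it.

```python
from email.message import EmailMessage


def ready_to_notify(scans_imported: bool, metadata_imported: bool) -> bool:
    """The e-mail goes out only when both imports have completed."""
    return scans_imported and metadata_imported


def build_notification(requester: str, order_nr: str, finding_aid_url: str) -> EmailMessage:
    """Build (but do not send) the message announcing that scans are online."""
    msg = EmailMessage()
    msg["To"] = requester
    msg["From"] = "archive@example.org"  # hypothetical sender address
    msg["Subject"] = f"Your scans for order {order_nr} are online"
    msg.set_content(
        "The scans you requested have been imported.\n"
        f"Finding aid and thumbnails: {finding_aid_url}\n"
    )
    return msg
```

Actual delivery would go through a mail server, for instance with `smtplib.SMTP.send_message`, which is left out here because the archive's mail setup is not described.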

Page 71: You Ask We Scan


Page 72: You Ask We Scan


Page 73: You Ask We Scan


Page 74: You Ask We Scan

Mission accomplished

1. Government satisfied: number of visitors increased fivefold

2. Management satisfied: costs and funding balance each other

3. Staff satisfied: enjoy their new role

4. Customers satisfied: lots of compliments


Accomplished

Page 75: You Ask We Scan

Government

Government satisfied

Visitors

Year         Reading rooms   Website
1982         24.027
1988         29.788
1992         27.738
1998         26.598          40.048
2002         25.014          224.050
2006         17.958          512.592
2007         92.678          520.483
2008         118.312         538.483
2009 (3/4)   77.298          531.143

Page 76: You Ask We Scan

Management

Management satisfied

Costs Archiefbank (2008)

Digitization on request € 140,000

Web services € 52,000

Digitization projects € 200,000

Income Archiefbank (2008)

Digitization on request € 100,000

Project funding € 330,350

Government € 40,000
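Adding up the figures on this slide gives a quick view of how costs and income compare. The sketch below assumes that the three cost items and three income items listed are the complete picture for 2008, which the slide does not state; the dictionary keys simply repeat the slide labels.

```python
# Amounts in euros, copied from the slide (Archiefbank, 2008).
costs = {
    "Digitization on request": 140_000,
    "Web services": 52_000,
    "Digitization projects": 200_000,
}
income = {
    "Digitization on request": 100_000,
    "Project funding": 330_350,
    "Government": 40_000,
}

total_costs = sum(costs.values())      # 392000
total_income = sum(income.values())    # 470350
balance = total_income - total_costs   # 78350

print(f"Costs:   EUR {total_costs:,}")
print(f"Income:  EUR {total_income:,}")
print(f"Balance: EUR {balance:,}")
```

On these numbers alone, the listed income exceeds the listed costs by € 78,350. Digitization on request by itself costs € 40,000 more than it brings in, the same amount as the government contribution.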

Page 77: You Ask We Scan

Customers

Customers satisfied

Registered users: ca. 15.000

Requests: 10.605

Scans online: more than 7 million

Archives Next and Computable awards

Page 78: You Ask We Scan


Thanks

Free [email protected]

Page 79: You Ask We Scan