2015
Created by: Lea Edgar, Librarian/Archivist & Duncan MacLeod, Curator of Collections
VANCOUVER MARITIME MUSEUM DIGITIZATION STRATEGY
This document provides a process through which digitization projects are selected, digitized, described, and made available.
1 Vancouver Maritime Museum Digitization Strategy
TABLE OF CONTENTS
Introduction
  Preamble
  Definition of terms
  Past efforts
  The current situation
    Evaluating our assets
  Moving forward
Goals
  Goal 1: digital assets
  Goal 2: digitization program
  Goal 3: organizational capacity
Project planning
  Processes occurring prior to digitization
    Introduction
    Establishing a policy for digital materials
  Digitization Advisory Group
    Selection Criteria
    Limitations
    Maintenance and Removal
    Preparation for digitization
Digital Conversion
Post-Digitization Work & Analysis
Timelines
Project Recommendations
References
Appendix 1 – File Formats
  Textual documents
  Audio
  Film/Video
Appendix 2 – Metadata
Appendix 3 – File Naming
Appendix 4 – Forms
  Digitization Project Proposal
INTRODUCTION
PREAMBLE
The creation and preservation of digital collections allows the Vancouver Maritime Museum to enhance its services and access to unique collections for visitors and researchers around the world. This digitization plan creates a process through which digitization projects are selected, digitized, described, and made available. This reflects the mission and vision of Vancouver Maritime Museum:
“The Vancouver Maritime Museum celebrates the profound significance of the ocean and waterways of the Pacific and Arctic, through the preservation and growth of our extraordinary collection, and as a centre for dialogue, research and experience.”
“We are a maritime museum that conforms to the highest international standards and that anchors our activities, staff, volunteer and partners in a rich and growing collection of maritime artefacts.”
DEFINITION OF TERMS
What is Digitization?
We use the term digitization to refer to a set of processes that create digital objects from physical originals.[1] We can then share these materials through digital devices, equipment, and networks. They form a new type of collection, a digital collection, that requires special care and preservation.
[1] So-called “born digital” objects (both born-digital content and current business records) are not included in the scope of this document at this time.
To avoid digitized materials becoming obsolete, we must digitize at the highest quality, migrate to the latest storage and formats when necessary, and maintain the links to the descriptive information that makes digital materials meaningful.
Digitization is a complete process that broadly includes: selection, assessment, prioritization, project management and tracking, preparation of originals for digitization, metadata collection and creation, digitizing, quality management, data collection and management, submission of digital resources to delivery systems and into a storage environment, and assessment and evaluation of the digitization effort.
The Vancouver Maritime Museum Collections Department generally engages in digitization projects that fall into one of the following categories:
• Ongoing digitization handles entire collections or other larger groups of artifacts and archival or library materials that are not subject to deadlines. These projects are proposed by Collections staff. Final decisions for projects to be undertaken are at the discretion of the Collections Department, based on the technical feasibility of the project. However, input will be sought from other Museum staff and proposals for the ongoing digitization workflow will be reviewed based on the limitations and preferences outlined in this plan in addition to factors such as collection development policies and the Museum’s strategic goals. Complete collections are digitized throughout the year as time permits. These materials are stored in the Digital Collections Repository (see below for a definition of this term). Ongoing digitization projects are the principal activity of the Collections Department.
• On-demand digitization handles immediate requests from staff and patrons for digital reproductions of artifacts or archival materials. These requests generally arise as part of work with the public or through regular activities such as exhibit planning or preservation. When the material to be digitized meets the digitization selection criteria (see Selection Criteria), it is scanned to preservation quality standards and, although this is not currently the practice, described through robust metadata records. These materials are then stored in the Digital Collections Repository. Items that do not meet these guidelines may still be digitized at the discretion of Collections staff, but will generally not be added to the Digital Collections Repository.
• Grant-based digitization occurs for specially funded projects. Special staff may be hired for these processes, although Collections staff will be involved in the overall management and development of these projects. Decisions on grants to pursue are made by Collections in collaboration with the Executive Director and Fundraising and Development Manager. These materials may be stored in the Digital Collections Repository or in another repository designed particularly for the project.
Digital Collections Repository
A digital collections repository is a mechanism for managing and storing digital content. Putting content into a repository enables staff to manage and preserve it, and therefore derive maximum value from it. In the case of the VMM, a digital repository can be a server, database, external hard drive or any device used to store and provide access to digital media.
Metadata
Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource. Metadata is often called data about data or information about information.
PAST EFFORTS
The Vancouver Maritime Museum has never had a formal digitization policy, plan or strategy. Digitization has been undertaken by means of grant funding for special projects and on-demand digitization. On-demand digitization (see Definition of Terms) has been accomplished using in-house scanners (such as the Epson Expression 10000XL) and museum-owned cameras, and by using third-party vendors, primarily for oversized and audio-visual material.
THE CURRENT SITUATION
Because no institution-wide policy, plan or strategy is in place, digital collections are currently stored on two external hard drives. These drives are both located in the Henry Larsen Rare Book Room, with no off-site backup. At the current rate of on-demand digitization, one drive (1 TB) will be full by the end of 2015, leaving only the other drive (2 TB) to house the collection, with no backup system in place.
The current digital holdings include still images, archives (records), ship plans, charts, maps, images of artifacts, films and oral histories.
EVALUATING OUR ASSETS
A survey of all of the digital holdings of the VMM has been carried out to determine not only what digital objects are held in different parts of the Museum but also in what formats these images are currently available.
STILL IMAGES
Currently, a 2 TB external hard drive stored in the Henry Larsen Rare Book Room houses all the still images that have been digitized so far (this includes prints, negatives, slides, and images from albums). Unfortunately, the vast majority of these digital images are not saved in an archival master file format (they are largely JPEG files; see Appendix 1 for more information), were not scanned at a high enough resolution, contain no metadata (see Appendix 2), and are not saved according to a file naming schema (see Appendix 3). Therefore, to ensure sufficient usability and longevity, these images will need to be re-scanned according to the standards prescribed in the future policy.
Today, still images are scanned only on demand. As of July 2013, newly scanned images are saved in TIFF format as archival masters, and the resolution generally follows the rule of retaining 3000 pixels along the long edge of the image. However, generally no image is scanned at under 600 dpi (dots per inch).
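The two resolution rules above can be combined into a single calculation. The following is a minimal sketch only, assuming the rules are read as "at least 3000 pixels on the long edge, and never below 600 dpi"; the function names and the use of inches are illustrative assumptions, not part of current Museum practice.

```python
import math

MIN_LONG_EDGE_PX = 3000  # pixels retained along the long edge (from the text)
MIN_DPI = 600            # floor below which images are generally not scanned


def scan_dpi(long_edge_inches: float) -> int:
    """Return the lowest whole-number dpi that satisfies both rules.

    Hypothetical helper: assumes the original's long edge is measured in inches.
    """
    if long_edge_inches <= 0:
        raise ValueError("long edge must be a positive length in inches")
    # dpi needed so that the long edge yields at least 3000 pixels
    dpi_for_pixels = math.ceil(MIN_LONG_EDGE_PX / long_edge_inches)
    # never drop below the 600 dpi floor
    return max(MIN_DPI, dpi_for_pixels)


def long_edge_pixels(long_edge_inches: float) -> int:
    """Pixels along the long edge when scanned at scan_dpi()."""
    return math.floor(scan_dpi(long_edge_inches) * long_edge_inches)


if __name__ == "__main__":
    for inches in (1.5, 6.0, 10.0):
        print(inches, scan_dpi(inches), long_edge_pixels(inches))
```

Under this reading, small originals such as slides are governed by the 3000-pixel rule, while prints with a long edge of more than 5 inches are governed by the 600 dpi floor.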
Still images currently take up 86.6 GB of space on the hard drive, representing approximately 8,893 digital images in total.
St. Roch Research Photograph Collection
The largest effort at digitization was the scanning of the St. Roch Research Photograph Collection. We believe the scanning was actually completed by Parks Canada staff at an unknown date. This is the largest collection of digital materials in the Museum collections. The images are currently stored on the 2 TB external hard drive. These images were scanned at varying resolutions and saved as lossy JPEG files. They were, however, named according to the file naming schema originally created by Parks Canada.
ARCHIVAL MATERIALS
There have been no major efforts at digitizing archives (records). The one exception is the Lord Nelson Collection.
Lord Nelson Collection
The scanning of the Lord Nelson collection took place in 2013. The project followed basic professional standards and should be viable for years to come. The scanning was completed over the course of approximately 5 months by the Librarian/Archivist working on the project for 2 days per week. The images are currently stored on the 2 TB external hard drive which is stored in the Rare Book Room. The image files take up 51.6 GB and represent 587 files total.
CHARTS, MAPS & SHIP PLANS
Very few charts, maps, or ship plans have been scanned. The digital files that exist today were scanned on demand.
Charts currently take up 4.21 GB of space on the same external hard drive as the still images, and consist of 34 digital files in total. The images were saved as JPEGs, TIFFs, and PDFs and vary in quality. Some will be suitable to be archival master files, such as the Bayly charts. Metadata and a file naming schema will need to be applied to the files in the future.
Ship plans currently take up 213 MB and consist of 7 digital files. The images were saved as JPEGs, TIFFs, and PDFs and vary in quality. Some will be suitable to be archival master files. Metadata and a file naming schema will need to be applied to the files in the future. No maps have yet been digitized.
AUDIO VISUAL
The only films that have been digitized are three relating to the St. Roch. The titles of these films are:
• Through the Northwest Passage: 1940-1942 by F.S. Farrar, 1st Mate, RCMP St. Roch,
• Arrival St Roch [in Vancouver], and
• The RCMP Presents: St. Roch Sails South.
They were sent to Scene Savers, a professional digitization company recommended by professional archivists. Upon receipt, Scene Savers inspected and cleaned the films. They then transferred the films using a wet-gate process, made an MPEG-4 (access file) and an uncompressed QuickTime (master file), and saved them to the hard drive. The films were wound onto archival cores and rehoused into archival canisters (originally they were in metal tins). They also created DVD copies of the films for us.
This process used professional standards of the time and therefore the resulting files and metadata should be usable for many years to come. In addition, the films were cleaned and are stored appropriately for long-term preservation.
In 2013 an effort was made to record oral histories for the exhibit Komagata Maru: Challenging Injustice. The resulting audio files are in MP3 format and total 10 files (8 interviews). They are stored on the 2 TB external hard drive as well as the Curator’s computer. The files take up 696 MB of disk space.
ARTIFACT IMAGES
Currently 5.6 GB of artifact images are stored on nine CDs as well as on the 2 TB external hard drive. These images were all professionally taken for publication in Waterfront, written by James P. Delgado. Some of these images were also used in an online exhibit titled Vancouver Maritime Museum Treasures. These images, however, do not have accession or artifact numbers connected to their files. The curator’s computer has hundreds of images of artifacts, but these have not been taken professionally and are for curatorial reference only. Many of these images were taken of objects from Building 14 during the rapid, large-scale deaccessioning program that took place in 2013. Many of the images are poor quality and have not been connected to accession or artifact numbers.
The next step will be an assessment of the analog images currently available. Digitizing already available images, such as slides of museum artifacts, will be a less costly and time-consuming process than beginning 'from scratch'. If images from photographs or slides are being scanned or have been scanned, only good-quality images should be used. Some objects will need to be re-photographed if the images on hand are in poor condition or are not good representations of the original object. Ideally only good, professionally photographed images created with a colour bar or grey scale should be digitized.
If previously created digital images are available, consider whether the quality is high enough for current needs, and whether the associated documentation is adequate. New photography will add significantly to the time and money required for a digitization project, particularly when the objects to be photographed require significant preparation time. For example, large objects such as canoes may have to be transported from storage to a suitable place to be photographed; complex objects, such as costumes, may require a great deal of preparation.
MOVING FORWARD
The Vancouver Maritime Museum wishes to retain and expand upon its current digital collections by creating a Digitization Plan that will outline the museum’s strategy for creating and maintaining digital holdings. The plan will cover these key areas:
1. Project planning
2. Processes occurring prior to digitization
3. Digital conversion
4. Post-digitization work and analysis
The activities described within each phase address archival issues, imaging and conversion work, and IT infrastructure issues. Archival issues include preparation of originals for digitization, indexing, collection and creation of metadata of all types, and quality control of the digital versions, indexing data, and other metadata. Imaging/conversion work includes digitization, creation of derivative versions for access, quality control, and metadata creation. IT infrastructure issues include collection and transfer of data to other systems, networked and Web services, databases, and managed storage and backup. Additional IT infrastructure issues include short-term/intermediate data storage, backup of digital resources for disaster recovery, and safeguards and checks to protect against data loss and to ensure data integrity.
Timing and costs
How long will digitization take and how much will it cost? It is difficult to estimate total costs and timelines at this time. The Digitization Strategy’s primary task is to determine guidelines for setting priorities about what will be digitized and how. Costs and timing will be determined on a project-by-project basis. While we will not digitize all of our collections, the cost is still large. Added to the direct cost of digitization are the staff hours needed to find and research objects and data and the rights associated with them. Digitization is an ongoing process that will require ongoing resources. We have been digitizing, and will continue to do so, as funds become available. This plan will allow the VMM to work across the museum from a single plan that outlines a comprehensive and systemic approach. Any consideration of cost is balanced by what we stand to gain by making our collections available 24/7 from all over the world.
Digitization will also result in considerable savings relating to the preservation of our collections in that having digital surrogates will ease the need for handling of the original objects and records. In addition, Museum staff will be able to access documents and collections much more quickly and simply.
GOALS
Three broad goals address content, infrastructure, and resources. While listed in priority order, the goals address issues that are interdependent, so we will implement them concurrently.
GOAL 1: DIGITAL ASSETS
PROVIDE ONLINE ACCESS TO VMM COLLECTIONS BY CREATING, MANAGING, AND PROMOTING THE MUSEUM’S DIGITAL MATERIALS.
We seek to increase the amount and availability of our digital materials, and introduce comprehensive and systematic digital asset management planning. To accomplish this, we must first assess existing digitized materials across the Museum and the technologies with which they were created. We then have to define criteria for selecting and prioritizing VMM resources to digitize, because we will not have sufficient financial resources to meet the total demand. We must establish a trusted digital collections repository to preserve the materials once digitized, and then ensure that we can integrate them across the Museum and into the broader online arena (VMM website, databases, social media, etc.). Finally, we have to develop strategies for promoting greater use of our materials within the Museum and throughout the world.
GOAL 2: DIGITIZATION PROGRAM
TO PURSUE ITS MISSION IN THE 21ST CENTURY, INTEGRATE DIGITIZATION INTO THE CORE FUNCTIONS OF THE VMM.
We will move digitization from an activity handled differently and infrequently in each department to an integrated VMM digitization program that meets both internal needs and external expectations. This requires us to formulate a Digitization Plan to guide digitization activities, outline program expectations, and provide a basis for consistent decision-making that aligns with other relevant VMM policies. A Digitization Program will develop and oversee ongoing digitization strategies. These and other efforts will create a Museum culture that embraces digitization and the sharing of collections, research, and expertise.
GOAL 3: ORGANIZATIONAL CAPACITY
THROUGH NOVEL, INNOVATIVE APPROACHES, SECURE SUFFICIENT RESOURCES AND BUILD CAPACITY TO CREATE AND SUSTAIN A DIGITAL VMM.
A plan of this magnitude requires significant financial and human resources. To ensure that we have sufficient funds to sustain the VMM’s Digitization Plan over time, we will integrate digitization into the VMM’s strategic plan; develop guidelines for partnerships and sponsorships; and identify alternative sources and mechanisms of support. We will offer training and tools to enhance staff competencies, and supplement their efforts with the knowledge and skills offered by colleagues in related organizations engaged in similar work by partnering whenever possible.
PROJECT PLANNING
PROCESSES OCCURRING PRIOR TO DIGITIZATION
INTRODUCTION
Before the Museum embarks on a digitization project, it should allocate adequate resources of time and money for at least the following:
• Assessing the Museum's needs, deciding where digitization is appropriate and where it is not,
• Defining the project,
• Researching technological options,
• Choosing standards,
• Developing requirements statements,
• Planning the implementation of the project, including milestones and a timetable, and
• Monitoring, evaluating and adjusting the project as required.
ESTABLISHING A POLICY FOR DIGITAL MATERIALS
Establishing a policy for managing digital materials should be part of the planning process. Just as the Museum needs a Collections Policy, so too should it have a policy on creating and managing digital materials, which form a valuable 'collection' of a new kind. The policy should define at least the following:
• Copyright and legal policies for staff,
• How digital images, once created, will be managed,
• How image content and technical information will be documented,
• Plans for safe storage, conservation and preservation of master images and surrogate images to ensure their longevity,
• Plans for migration to new formats and technologies as needed, and
• Plans for digitizing new objects or archival materials.
The policy should be reviewed periodically to determine whether project plans or policies need adjusting.
GENERAL POLICY ISSUES
• Define policies and procedures
• Implement and ensure compliance with policies and procedures
• Different procedures will be selected depending on class or category of project
• Determine the complement of digital objects, file types, and file formats for digitization projects
• Recommendations for sustainable formats for digital archival records and digital master copies (see Appendix 1)
  o Digital object characteristics (i.e., nature of raster image files, digital audio and video files, etc.)
  o Digital conversion parameters (i.e., technical specifications followed for the creation of digital objects)
  o Minimum complement of metadata required
• Creation and management of metadata schema(s)
  o Define the complement of metadata needed for preservation of digital objects
  o Define appropriate complement of various metadata to ensure management of assets for desired retention time period (short or long term)
  o Determine when in workflow what metadata is added or created
  o Determine where metadata will be stored (embedded or in external system or both)
  o Determine in what formats metadata will exist
  o Determine relationship to identifiers of digital objects
• Definition of essential characteristics (significant properties) of the original – curatorial/archival and technical
• Determination of appropriate approach and quality levels for digitization
  o Approaches to digitization
  o Conversion specifications for preservation reformatting
  o Establish quality management approach
  o Establish metrological approach for scanner and digital conversion equipment performance
• Ensure authenticity of digital copies
  o Verification procedures for digital copies
  o Comparison and review of digital copy to original record from the archival/curatorial perspective to ensure digital copy satisfies requirements for authentic digital versions
  o Document chain of custody – both original records during digitization and the digital copies
  o Audit trail – history of actions on the digital copies and related metadata, from creation through final submission to digital repository
  o Verification of fixity information such as checksums and digital signatures
  o Relationship to identifiers of digital objects
• When a records management plan is created, and if applicable, ensure appropriate records management of the digital resources to be created
  o Define records management issues related to the digital copy
  o Define class/status of new copies (e.g., records, copies, non-records, master files, access/distribution/derivative files) in order to determine digitization approaches and management of file types and content over time
  o Identify party/parties responsible for managing the various digital versions and ensure appropriate records management depending on status of digital copies
  o Define procedures and methods for accessing the digital copies for most or all use requests
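The list above mentions verification of fixity information such as checksums. As an illustration only, fixity checking could be sketched as follows using Python's standard hashlib module. The choice of SHA-256, the manifest layout, and the function names are assumptions made for this example; the plan itself does not prescribe an algorithm or tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path) -> dict[str, str]:
    """Map each file's path (relative to folder) to its checksum."""
    return {
        p.relative_to(folder).as_posix(): sha256_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def changed_files(folder: Path, manifest: dict[str, str]) -> list[str]:
    """List manifest entries that are now missing or no longer match."""
    current = build_manifest(folder)
    return sorted(
        name for name, checksum in manifest.items()
        if current.get(name) != checksum
    )
```

A manifest built when master files are first stored can be kept with the audit trail and re-checked after each copy, backup, or migration; any name returned by changed_files would indicate a file that needs to be restored or investigated.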
DIGITIZATION ADVISORY GROUP
The selection and prioritization of ongoing digitization projects will be determined by a group of board members and staff at the Vancouver Maritime Museum. Scan-on-demand and grant-based digitization projects are typically selected and managed by Collections staff. The primary responsibility of the advisory group is to coordinate and facilitate production of each digitization project.
SELECTION CRITERIA
Project proposals can be submitted by Collections or VMM staff to the Digitization Advisory Group. A digitization project proposal form (see Appendix 4) will include questions regarding the following criteria for selection:
VALUE
• Informational value: items that offer significant information on the key people, places, events, objects, periods, activities, projects, and processes reflected in the Collections Policy and the Museum’s mission.
• Administrative value: items with functional usefulness to the Vancouver Maritime Museum and its partners.
• Artifactual value: rare items or unique objects of material culture with intrinsic value.
USE
• High use materials have higher priority.
• Projects that would enhance access and bring greater attention to an existing collection have higher priority.
• Collections and items should have an RAD compliant finding aid or other access point before being selected for digitization.
RISK
• Materials at risk of loss due to deterioration have higher priority.
RIGHTS
• Materials in the public domain or with documented permission to publish have higher priority.
• Materials owned by VMM, or to which VMM holds the intellectual property rights, have higher priority.
• Materials may be made available for research, teaching, or private study according to Canadian Copyright Act (R.S.C., 1985, c. C-42) Section 29.
LIMITATIONS
Materials that meet the above criteria for format and subject must also be free of any of the following limitations for ingest into the digital repository:
The item must either be in the public domain, or the copyright must be owned by the Vancouver Maritime Museum or granted by the copyright holder in writing. In some cases, if the material does not fall into the above category, but the copyright holder is unknown or unreachable, materials will also be added to the repository.
A digitized version of an appropriate quality must not already be available online. This may extend to materials digitized and provided by another institution.
Digitization of the item must be complete. Portions of materials, such as a single page from a book, will not be added to the repository unless:
• The portion to be digitized is significant in its own right, such as a map or illustration or an image of a famous maritime personality
• The portion to be digitized is frequently requested, such as an image of the St. Roch appearing in a publication
MAINTENANCE AND REMOVAL
Generally, all digital objects will remain as accessible as possible, but removal may occur for reasons including collection weeding, storage issues, and data curation. Such decisions will be made in collaboration with the Digitization Advisory Group, Collections Department, and/or VMM staff. Migrations to new formats, and the usage or disposal of the pre-migrated file, will be decided at the discretion of the Collections Department.
PREPARATION FOR DIGITIZATION
Curatorial/archival preparation of physical originals/records
• Analysis of originals (formats, organization, condition, copies, size, etc.)
• Physical and intellectual organization
• Collect and record a more detailed level of descriptive metadata during the course of curatorial/archival preparation work to enhance description in existing systems
• Create, assign, and record appropriate records management/administrative metadata for new digital resources
• Batch records for conversion
Preservation preparation
• Evaluation of physical condition and readiness for scanning
• Holdings maintenance, if needed
• Conservation prep, if needed
• Batch records for conversion
DIGITAL CONVERSION
DATA ENTRY
• Record any pre-existing metadata needed to begin conversion (may include job tracking information, descriptive metadata, etc.)
DIGITAL CONVERSION
• Capture done according to specifications in-house, by partners, and/or by contractors
• Image target use for performance verification
• Device conformance testing and calibration
• Initial and on-going testing of digital image quality and equipment based on established benchmarks and specifications
• Digitization of existing documentation, if not already in electronic form
• Digitization of descriptive information, finding aids, indices, folder lists, inventories, etc. if not in electronic format
• Perform any correction/editing/processing to digital files
• Image evaluation – objective and subjective
• Create and track production metadata
• OCR and text conversion/mark-up, rekeying, etc.
TECHNICAL, STRUCTURAL, ADMINISTRATIVE, AND DESCRIPTIVE METADATA CREATION AND COLLECTION
• Define requirements for and record metadata for different collections/groupings/classes of resources at different levels
• Create and record/embed metadata into appropriate systems/headers
• Auto characterization and manual and automated collection of technical and other metadata to carry forward as files are moved into other systems
INDEXING
• Minimal intellectual organization of digital objects to match the appropriate level within the archival descriptive hierarchy or to match the intellectual organization of the collection. Indexing is primarily geared towards describing and organizing large groups of digital versions of physical records. Indexing provides a level of association and organization of digital resources so they can be effectively searched and retrieved.
QUALITY MANAGEMENT
• Quality assurance and quality control of digital copies and metadata to ensure conformance to guidelines
• As with any manufacturing process, exceptions or defects can consume an inordinate amount of resources; the further downstream the error detection, the greater the resources used to correct it
• Defect identification and inspection and verification of files
o Automated quality assurance/quality control for both digital objects and for related metadata (all types of metadata – including technical, administrative, descriptive, etc.)
o Follow up by staff on problems identified by automated checks
o Statistically valid sampling checks by staff, automated identification of resources to be checked
• Rework for error identification
• Ensure compliance with templates/profiles
• Follow established metrology protocols and document certifications, or correct and replace as required
• Documentation of quality assurance/quality control process
• Create and record QC/QA metadata
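The automated identification of resources for staff sampling checks, mentioned above, could be sketched as follows. This is only an illustration: the 10% fraction, the function name `pick_sample`, and the file names are assumptions and not part of this strategy, and a statistically valid sample size would have to come from a sampling plan agreed for each project.

```python
import random


def pick_sample(file_names, fraction=0.1, minimum=1, seed=None):
    """Pick a reproducible random subset of files for manual inspection.

    fraction and minimum are illustrative defaults (assumptions), not
    values prescribed by the strategy.
    """
    names = sorted(file_names)
    if not names:
        return []
    # Never ask for more files than exist, and always check at least `minimum`.
    size = min(len(names), max(minimum, round(len(names) * fraction)))
    rng = random.Random(seed)  # a fixed seed makes the selection repeatable
    return sorted(rng.sample(names, size))


# Hypothetical batch of 100 scans: ten files are selected for review.
batch = [f"img_{i:03d}.tif" for i in range(100)]
to_check = pick_sample(batch, seed=2015)
```

Recording the seed alongside the QC/QA metadata would let another staff member reproduce which files were inspected.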
DATA ENTRY/IMPORT
• Import technical, structural, descriptive, production, administrative, rights, QC/QA metadata into appropriate systems on local level
• Import assets into appropriate systems on local level
• Collect and manage new data in central and local systems
VERSION CONTROL
• Define and record relationship between types of files (such as preservation master, production master, derivative files, etc.)
• Automate production of derivative files and versions
• Automation of metadata into and out of header tags and files (such as XMP, IPTC, etc.)
• Perform inspection and verification of derivative files and versions
• Create and apply checksums to appropriate versions
• Create batches
• Aggregate multiple versions, files, and metadata files into a “package” for submission/delivery into storage
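As an illustration of creating and applying checksums, the sketch below builds a manifest of checksums for every file in a package folder and later reports files whose contents have changed. The choice of SHA-256, the function names, and the folder layout are assumptions made for this example; the strategy does not prescribe a checksum algorithm.

```python
import hashlib
from pathlib import Path


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of one file, read in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to folder) to its checksum."""
    folder = Path(folder)
    return {
        path.relative_to(folder).as_posix(): file_checksum(path)
        for path in sorted(folder.rglob("*"))
        if path.is_file()
    }


def changed_files(folder, manifest):
    """List files that are missing or whose checksum differs from the manifest."""
    current = build_manifest(folder)
    return sorted(name for name in manifest if current.get(name) != manifest[name])
```

A manifest produced at packaging time can be stored with the package and re-checked after each transfer or migration.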
COPY STATUS AND RECORDS MANAGEMENT
• Manage and document process appropriately to ensure authenticity of digital copies
POST-DIGITIZATION WORK & ANALYSIS
COPY STATUS AND RECORDS MANAGEMENT
• Finalize status of digital copies and related metadata
• Update status of original records/originals if needed
COMPLETE BIBLIOGRAPHIC/ARCHIVAL DESCRIPTION
COLLECTION AND CREATION OF ANY ADDITIONAL APPROPRIATE METADATA (DESCRIPTIVE, STRUCTURAL, ADMINISTRATIVE, TECHNICAL) NOT COLLECTED IN EARLIER PROCESSES
FINALIZE THE COMPLEMENT OF METADATA NEEDED
• Appropriate complement of various metadata to ensure management of assets for desired retention time period (short or long term)
QUALITY ASSURANCE AND QUALITY CONTROL OF METADATA AND DIGITAL OBJECTS
• Conformance to standards, data types, templates/profiles
• Accuracy
• Defect identification and error correction
• Automated quality assurance/quality control for both digital objects and for related metadata (all types of metadata – including technical, administrative, descriptive, etc.)
• Follow up by staff on problems identified by automated checks
• Statistically valid visual checks by staff (i.e., color and tone accuracy), automated identification of resources to be checked
• Record actual rework/defect correction efforts
• Documentation of quality assurance/quality control process
CURATORIAL/ARCHIVAL VALIDATION AND VERIFICATION
• Digital versions in comparison to originals from curatorial/archival perspective to ensure digital copies satisfy requirements for authentic digital versions
TECHNICAL VALIDATION
• To industry specifications for well-formed digital objects and data formats; and assessment of digital objects to verify they meet local profile and requirements
ACCESS
• Make digital objects and metadata available to staff and researchers
• Deliver digital objects via web-based/delivery systems for research
• Deliver high-quality digital products via the web and via optical media
AGGREGATE AND ASSOCIATE DIGITAL OBJECTS AND METADATA FILES FOR PACKAGING AND TRANSFER
• Create and associate multiple low resolution derivative files
• Assign checksums
• Export – flexible packaging of both digital objects and metadata for delivery into other systems using different metadata schema
o Submit resources to access/delivery systems and make resources available online
o Submit resources to digital repository
• Export metadata in different formats to other systems
• Export digital files to other systems
• Acceptance/confirmation of export/submission of digital objects and metadata into other systems
UPDATE METADATA
• In other management and access systems as needed to synchronize or replace with new metadata generated during digitization projects
• Linking of metadata between systems
PROVIDE ROUTINE REFERENCE TO DIGITIZED RECORDS VIA ON-LINE SYSTEMS
TRACK AND ASSOCIATE NEW DIGITAL/ANALOG VERSIONS TO THE PHYSICAL ORIGINALS
MANAGE DIGITAL RESOURCES IN APPROPRIATE ACTIVELY MANAGED STORAGE ENVIRONMENT
• After submission of completed digital objects and related metadata to long term digital repository
o Ensure provenance and authenticity of digital resources
o Ensure data integrity
o Ensure disaster recovery
PROJECT ASSESSMENT, REPORTING AND EVALUATION
• Project Assessment
• Web, Image File, and Database Usage Analyses
• Cost-Benefit Analyses
ASSESSMENT OF IMPACT ON OTHER ACTIVITIES
• Assess effects of digitization on traditional reference activities (e.g., online access, in-person access, send all source analog content offsite?) and researcher requests, and update procedures
IDENTIFY AND CORRECT PROBLEMS AND ERRORS RELATING TO BOTH DIGITAL OBJECTS AND RELATED METADATA
• Correct problems/deficiencies on a routine basis for all categories of digitization
LESSONS LEARNED
• Unexpected results, scoping errors, etc.
PROCESS IMPROVEMENT
• As needed update workflows, tools, procedures, policies, etc.
TIMELINES
These guidelines assume that all scans can be done in one pass and that no additional digital manipulation of the images is required.
For projects not requiring selection (for example, complete photograph fonds), scanning & re-foldering:
• 50-70 images/day (10 minutes per image, for scan & thumbnail)
For projects requiring selection:
• 25-35 images/day (15 minutes per image)
The time requirements for projects with more complicated scanning or digital manipulation components vary too widely to allow standardised guidelines. For projects involving the scanning of audio/visual materials, oversized materials that will be scanned separately then stitched together or photographed in a copying station, or projects such as virtual displays which may require additional post-processing of images, a small sample should first be done in order to create reliable time estimates. Alternatively, this kind of project can be contracted out to a third party who will provide an estimate of time and costs that can be used as the basis of a project application.
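The daily throughput figures above can be turned into a rough schedule with a short calculation. The sketch below is a minimal illustration that assumes whole working days and the 50-70 and 25-35 images/day ranges given in these guidelines; the function name `estimate_days` and the 1,000-image example are hypothetical.

```python
import math


def estimate_days(image_count, per_day_low, per_day_high):
    """Return (fewest, most) whole working days needed to scan image_count images."""
    if image_count < 0 or per_day_low <= 0 or per_day_high < per_day_low:
        raise ValueError("expected image_count >= 0 and 0 < per_day_low <= per_day_high")
    fewest = math.ceil(image_count / per_day_high)  # best case: fastest daily rate
    most = math.ceil(image_count / per_day_low)     # worst case: slowest daily rate
    return fewest, most


# Hypothetical 1,000-image project that needs no selection (50-70 images/day).
print(estimate_days(1000, 50, 70))  # (15, 20)
```

Multiplying the resulting range by a daily staff rate gives a corresponding cost range for a project application.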
PROJECT RECOMMENDATIONS
Register of Seamen Shipped
Extent: 47 volumes
Date range: 1898 - 1967
Significance of collection: The Register of Seamen Shipped consists of 47 volumes documenting the seamen who shipped out of the Port of Vancouver. The collection dates from 1898 to 1967 and is a unique record of merchant marine history. These large, hand-written volumes include the name of the ship, a list of the crew and their position on the ship, wages earned, and a home address. Popular with genealogists, the Register has also been used by Veterans Affairs in allocating merchant marine wartime service pensions. Note: the records do not include lists of passengers, but some of these can be found in Library and Archives Canada's immigration records.
Project details:
• package each register for transfer to SFU Library, liaise with SFU copy team
• scan/photograph each page of individual registers
• Librarian/Archivist perform quality control and review scans
• develop searchable index for vessels listed in each register (this currently exists in hardcopy)
• purchase and rehouse registers in archival boxes, complete any related conservation work
• create finding aid for collection, make index accessible on VMM website
Project costs:
Supplies:
- Archival storage boxes (Bury #38-361-001, 47 boxes at $24.90 each) = $1,170.30
- taxes & shipping costs = $155.00
Total supplies: $1,325.30
Scanning fees:
- Outsourced scanning to SFU Library (Digitization Dept)
- quote at $0.20/page
- note: each page carries information on both sides, so each page is counted as 2 pages to be scanned
• 15 large volumes x 400 pages x 2 sides = 12,000
• 10 medium volumes x 300 pages x 2 sides = 6,000
• 22 small volumes x 150 pages x 2 sides = 6,600
• total pages to be scanned: 24,600
• 24,600 pages x $0.20/page = $4,920.00
Total scanning fees: $4,920.00
- SFU Project administration fees/staff costs/metadata development = est. $15,000.00
Total estimated project cost: $21,245.30
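The arithmetic behind the estimate above can be checked with a few lines of code. The sketch below uses only the figures quoted in this project recommendation; the variable and function names are illustrative, and `Decimal` is used so that dollar amounts are not affected by floating-point rounding.

```python
from decimal import Decimal

# (number of volumes, pages per volume), as listed in the estimate above
VOLUME_GROUPS = [(15, 400), (10, 300), (22, 150)]
SIDES_PER_PAGE = 2                    # both sides of every page are scanned
RATE_PER_SIDE = Decimal("0.20")       # quoted scanning fee per scanned page side
SUPPLIES = Decimal("1325.30")         # archival boxes plus taxes and shipping
ADMIN_ESTIMATE = Decimal("15000.00")  # estimated SFU administration/staff/metadata costs


def scanned_sides(groups, sides=SIDES_PER_PAGE):
    """Total number of page sides to scan across all volume groups."""
    return sum(volumes * pages * sides for volumes, pages in groups)


def project_total(groups):
    """Return (scanning fees, total estimated project cost)."""
    scanning = scanned_sides(groups) * RATE_PER_SIDE
    return scanning, SUPPLIES + scanning + ADMIN_ESTIMATE


print(scanned_sides(VOLUME_GROUPS))  # 24600
print(project_total(VOLUME_GROUPS))  # (Decimal('4920.00'), Decimal('21245.30'))
```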
St. Roch Photographic Collection
Extent: 2200 images (photographs and negatives)
Date range: 1927 - Present
Significance of collection: The St. Roch Photographic collection represents the photographic history and documentation of the Vancouver Maritime Museum’s central showpiece, the designated National Historic Site RCMP St. Roch. The images cover the construction of the vessel, its work in the Arctic, and its installation and interpretation in the Vancouver Maritime Museum. The collection consists of photographs, slides and negatives.
Project details:
• scan each image using the VMM’s archive scanner
• Librarian/Archivist perform quality control and review scans
• Scanning to be performed by hired archives assistant
Project costs:
Scanning fees:
- Assistant $110/day (2200 images @ 50-70 images/day = Approximately 40 days)
Total estimated project cost: Approximately $4,400
Cyril L. Littlebury Photographic Collection
Extent: 782 negatives
Date range: 1900s – 1930s
Significance of collection: The Cyril L. Littlebury collection comprises 782 negatives, which represent photographs taken by Cyril Littlebury between 1900 and his death in 1936. The content of the images relates to western Canada and half of the negatives relate specifically to the west coast. The collection is currently housed in the Vancouver Maritime Museum’s vault as it has been registered with CCPERB, but requires digitization to the current museum standards.
Project details:
• scan each individual negative using the VMM’s archive scanner
• Librarian/Archivist perform quality control and review scans
• Scanning to be performed by hired archives assistant
Project costs:
Scanning fees:
- Assistant $110/day (782 images @ 50-70 images/day = Approximately 13 days)
Total estimated project cost: Approximately $1,430
REFERENCES
The following works were used in the creation of this strategy:
• Avery, Cheryl & O’Brien, Jeff. (2008). The National Archival Development Program (NADP) Project Time Guidelines. Saskatchewan Council for Archives and Archivists.
• Federal Agencies Digitization Guidelines Initiative – Still Image Working Group. (2009). Digitization Activities: Project Planning and Management Outline. Federal Agencies Digitization Guidelines Initiative. Retrieved from http://www.digitizationguidelines.gov/guidelines/DigActivities-FADGI-v1-20091104.pdf
• Federal Agencies Digitization Guidelines Initiative – Still Image Working Group. (2009). Technical Guidelines for Digitizing Cultural Heritage Materials: Creation of Raster Image Master Files. Federal Agencies Digitization Guidelines Initiative. Retrieved from http://www.digitizationguidelines.gov/guidelines/FADGI_Still_Image-Tech_Guidelines_2010-08-24.pdf
• Sitts, Maxine K. (Ed.) (2000). Handbook for Digital Projects: A Management Tool for Preservation and Access (1st ed.). Northeast Document Conservation Center. Retrieved from http://www.nedcc.org/resources/digitalhandbook/dman.pdf
• Smithsonian Institution. (2010). Creating a digital Smithsonian: Digitization strategic plan. Washington, D.C.
• Yun, Audra Eagle. (2010). Z. Smith Reynolds Library: Digitization project proposal. Retrieved from https://librarchivist.files.wordpress.com/2010/05/digitizationprojectproposal.pdf.
APPENDIX 1 – FILE FORMATS
The following formats have been determined by using Library and Archives Canada, Library of Congress, and National Archives and Records Administration (NARA) standards for heritage preservation.
TEXTUAL DOCUMENTS
PDF-Archive (PDF/A)
The Association for Suppliers of Printing, Publishing and Converting Technologies (NPES), and the Association for Information and Image Management International (AIIM International) have developed an international standard that defines the use of PDF for archiving and preserving documents. The format is known as PDF-Archive (PDF/A) and has been adopted by the ISO (ISO standard 19005-1:2005).
Currently the Vancouver Maritime Museum does not have the software to save in PDF/A format; however, simple PDF format will still be accepted.
Portable Document Format (PDF)
PDF is an open, de facto standard that was developed by Adobe for the electronic distribution of textually based documents in raster format. It is a widely used format that preserves all the fonts, formatting, graphics and colours contained in the original source document after its conversion to the PDF format. PDF is fully backwards compatible and platform independent.
Where possible, all scanned textual documents should be run through an Optical Character Recognition (OCR) program. One such program – ABBYY FineReader 6.0 Sprint – is installed on the Librarian/Archivist’s office computer.
AUDIO
Sound is digitized by "measuring" the voltage (produced by a microphone) representation of the sound wave at regular intervals. How often the sound wave is measured is called the sampling rate and is generally expressed in kHz (i.e. thousands of times per second). The standard sampling rate for compact disc recordings is 44.1 kHz or 44,100 times per second. Master quality recordings have a minimum sampling rate of 96 kHz.
The "measurements" are expressed in bits. Audio CD’s have a bit depth (number of bits used to measure the voltage) of 16 which yields a total of 65,536 values. Master quality recordings commonly use 24 bits, which yields 16,777,216 values.
Bit rate is the total number of bits used per second (sampling rate X bit depth). Therefore a CD quality representation of sound of 44.1 kHz/16 bits would have a bit rate of: 44,100 X 16 = 705,600 bits per second per channel (seeing that CDs are stereo the total would be 1,411,200 bits per second).
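The relationships described above (number of values = 2 to the power of the bit depth; bit rate = sampling rate x bit depth x channels) can be expressed as a short sketch. The function names are illustrative only.

```python
def quantization_levels(bit_depth):
    """Number of distinct values a sample of the given bit depth can take."""
    return 2 ** bit_depth


def bit_rate(sample_rate_hz, bit_depth, channels=1):
    """Uncompressed bit rate in bits per second."""
    return sample_rate_hz * bit_depth * channels


print(quantization_levels(16))             # 65536
print(quantization_levels(24))             # 16777216
print(bit_rate(44_100, 16))                # 705600 (one CD-quality channel)
print(bit_rate(44_100, 16, channels=2))    # 1411200 (stereo CD)
```

The same function can be used to compare capture settings when estimating storage, since file size grows in direct proportion to the bit rate.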
Audio analog-to-digital converters allow for 192 kHz sampling rate and 24 bit amplitude resolution. The International Association of Sound and Audiovisual Archives (IASA) recommends a minimum digital resolution of 48 kHz sampling rate at 24-bit resolution for analog originals. However, IASA acknowledges that the higher resolution of 96 kHz/24 bit has become the standard for heritage organizations. IASA also recommends that spoken word recordings be captured at the same rate as music recordings.
Preferred digitization quality: 96 kHz/24 bit
Minimum digitization quality: 48 kHz/24 bit
The recommended audio file formats are classified as being uncompressed file formats.
Broadcast Wave Format (BWF)
The European Broadcasting Union (EBU) introduced BWF in 1996 to allow files to be exchanged between digital audio workstations during radio and television productions. It is now used in every aspect of professional audio. Based on Microsoft's and IBM’s WAV format, BWF can carry PCM (Pulse Code Modulation) or MPEG encoded audio which can be enhanced with metadata describing information about the originator, date and coding history of the recording. A BWF file is fully compatible with any playback software that supports regular WAV files. The International Association of Sound and Audiovisual Archives (IASA) recommends the use of BWF as an archival audio file format, and LAC has recently switched to this from standard WAV.
If Broadcast Wave format is not possible in a digitization project, WAV and the encoding of linear pulse code modulation (LPCM) will be used. MP3 may be the format for dissemination of digital files.
FILM/VIDEO
Digital video is comprised of a sequence of bitmap digital images displayed in rapid succession at a constant rate. In the context of video these images are called frames. Every bitmap frame comprises a raster of pixels.
The higher the frame rate the better the motion, and the higher the bits per pixel, the better the colour quality. Unlike analog video which is typically stored on magnetic tape, subject to physical deterioration and signal degradation with each subsequent copy, it is possible to copy multiple generations of digital video files with no loss in quality.
In the perfect scenario, it would be desirable to ensure that all digitized video be uncompressed. Obviously, this demand may not be feasible as many digital formats still use some form of compression and the potential storage requirements involved may not be available.
Preferred file types
Acceptable file formats, in order of preference. Note that for audio streams in MPEG-2 and -4 formats, AAC is preferred to other audio encodings.
• Motion JPEG 2000 (ISO/IEC 15444-3) (*.mj2)
• AVI (uncompressed, motion JPEG) (*.avi)
• QuickTime Movie (uncompressed, motion JPEG) (*.mov)
• MPEG-2
• MPEG-4_AVC
• MPEG-4_V
• MPEG-1
• Compressed in wrappers like AVI, QuickTime, WMV, etc.
APPENDIX 2 – METADATA
An image is not considered to be of high quality unless metadata is associated with the file. Metadata makes possible several key functions – the identification, management, access, use, and preservation of a digital resource – and is therefore directly associated with most of the steps in a digital imaging project workflow: file naming, capture, processing, quality control, production tracking, search and retrieval design, storage, and long-term management. Although it can be costly and time-consuming to produce, metadata adds value to master image files: images without sufficient metadata are at greater risk of being lost.
No single metadata element set or standard will be suitable for all projects or all collections. Likewise, different original source formats (text, image, audio, video, etc.) and different digital file formats may require varying metadata sets and depths of description. Element sets should be adapted to fit requirements for particular materials, business processes and system capabilities.
Although there is benefit to recording metadata on the item level to facilitate more precise retrieval of images, we realize that this level of description is not always practical. Different projects and collections may warrant more in-depth metadata capture than others; a deep level of description at the item level, for example, is not usually accommodated by traditional archival descriptive practices. The functional purpose of metadata often determines the amount of metadata that is needed. Identification and retrieval of digital images may be accomplished on a very small amount of metadata; however, management of and preservation services performed on digital images will require more finely detailed metadata – particularly at the technical level, in order to render the file; and at the structural level, in order to describe the relationships among different files and versions of files.
For more detailed information on metadata, please see Technical Guidelines for Digitizing Cultural Heritage Materials: Creation of Raster Image Master Files by the Federal Agencies Digitization Guidelines Initiative Still Image Working Group.
APPENDIX 3 – FILE NAMING
A file-naming scheme should be established prior to each digitization project. File names can either be meaningful (such as the adoption of an existing identification scheme which correlates the digital file with the source material), or non-descriptive (such as a sequential numerical string). Meaningful file names contain metadata that is self-referencing; non-descriptive file names are associated with metadata stored elsewhere that serves to identify the file, for example the item number. In general, smaller-scale projects may design descriptive file names that facilitate browsing and retrieval; large-scale projects may use machine-generated names and rely on the database for sophisticated searching and retrieval of associated metadata.
A file naming system based on non-descriptive, non-mnemonic, unique identifiers usually requires a limited amount of metadata to be embedded within the file header, as well as an external database which would include descriptive, technical, and administrative metadata from the source object and the related digital files.
Recommendations for file names
• Names are unique – no other digital resource should duplicate or share the same identifier as another resource. In a meaningful file-naming scheme, names of related resources may be similar, but will often have different characters, prefixes, or suffixes appended to delineate certain characteristics of the file. An attempt to streamline multiple versions and/or copies should be made.
• Consistently structured – file names should follow a consistent pattern and contain consistent information to aid in identification of the file as well as management of all digital resources in a similar manner. All files created in digitization projects should contain this same information in the same defined sequence.
• Well-defined – a well-defined rationale for how/why files are named assists with standardization and consistency in naming and will ease in identification of files during the digitization process and long afterwards. An approach to file naming should be formalized for digitization projects and integrated into systems that manage digital resources.
• Persistent – files should be named in a manner that has relevance over time and is not tied to any one process or system. Information represented in a file name should not refer to anything that might change over time. The concept of persistent identifiers is often linked to file names in an online environment that remain persistent and relevant across location changes or changes in protocols to access the file.
• Observant of any technical restrictions – file names should be compliant with any character restrictions (such as the use of special characters, spaces, or periods in the name, except in front of the file extension), as well as with any limitations on character length. Ideally, file names should not contain too many characters. Most current operating systems can handle long file names, although some applications will truncate file names in order to open the file, and certain types of networking protocols and file directory systems will shorten file names during transfer. Best practice is to limit character length to no more than 32 characters per file name.
• It is recommended to use a period followed by a three-character file extension at the end of all file names for identification of data format (for example, .tif, .jpg, .gif, .pdf, .wav, .mpg, etc.). A file format extension must always be present.
• Take into account the maximum number of items to be scanned and reflect that in the number of digits used (if following a numerical scheme).
• Use leading 0’s to facilitate sorting in numerical order (if following a numerical scheme).
• Do not use an overly complex or lengthy naming scheme that is susceptible to human error during manual input.
• Use lowercase characters and file extensions.
• Record metadata embedded in file names (such as scan date, page number, etc.) in another location in addition to the file name. This provides a safety net for moving files across systems in the future, in the event that they must be renamed.
• In particular, sequencing information and major structural divisions of multi-part objects should be explicitly recorded in the structural metadata and not only embedded in filenames.
• Although it is not recommended to embed too much information into the file name, a certain amount of information can serve as minimal descriptive metadata for the file, as an economical alternative to the provision of richer data elsewhere.
• Alternatively, if meaning is judged to be temporal, it may be more practical to use a simple numbering system. An intellectually meaningful name will then have to be correlated with the digital resource in an external database.
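Several of the recommendations above (lowercase names, a three-character extension, a 32-character limit, leading zeros sized to the number of items) lend themselves to an automated check. The sketch below is one possible reading of them, not a prescribed implementation: the allowed character set (letters, digits, underscore, hyphen), counting the extension within the 32 characters, and the `vmm` prefix are all assumptions made for illustration.

```python
import re

MAX_LENGTH = 32  # assumption: the limit applies to the whole name, extension included

# Assumed character set: lowercase letters, digits, underscore and hyphen,
# followed by a period and a three-character extension.
NAME_PATTERN = re.compile(r"[a-z0-9_-]+\.[a-z0-9]{3}")


def is_valid_name(name):
    """Check a file name against the assumed reading of the recommendations."""
    return len(name) <= MAX_LENGTH and NAME_PATTERN.fullmatch(name) is not None


def sequential_name(prefix, number, max_items, extension):
    """Build a zero-padded name wide enough for the largest expected item number."""
    width = len(str(max_items))
    return f"{prefix}_{number:0{width}d}.{extension}"


name = sequential_name("vmm", 7, 2200, "tif")  # hypothetical prefix
print(name)                            # vmm_0007.tif
print(is_valid_name(name))             # True
print(is_valid_name("St Roch 7.TIF"))  # False: spaces and uppercase characters
```

Running such a check before ingest catches names that would sort incorrectly or be truncated by other systems.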
Versioning
For various reasons, a single scanned object may have multiple but differing versions associated with it (for example, the same image prepped for different output intents, versions with additional edits, layers, or alpha channels that are worth saving, versions scanned on different scanners, scanned from different original media, scanned at different times by different scanner operators, etc.). Ideally, the description and intent of different versions should be reflected in the metadata; but if the naming convention is consistent, distinguishing versions in the file name will allow for quick identification of a particular image. Like derivative files, this usually implies the application of a qualifier to part of the file name. The reason to use qualifiers rather than entirely new names is to keep all versions associated with a logical object under the same identifier. An approach to naming versions should be well thought out; adding 001, 002, etc. to the base file name to indicate different versions is an option; however, if 001 and 002 already denote page numbers, a different approach will be required.
Naming Derivative Files
The file naming system should also take into account the creation of derivative image files made from the master files. In general, derivative file names are inherited from the masters, usually with a qualifier added on to distinguish the role of the derivative from other files (e.g., “pr” for printing version, “t” for thumbnail, etc.). Derived files usually imply a change in image dimensions, image resolution, and/or file format from the master. Derivative file names do not have to be descriptive as long as they can be linked back to the master file. For derivative files intended primarily for Web display, one consideration for naming is that images may need to be cited by users in order to retrieve other higher-quality versions. If so, the derivative file name should contain enough descriptive or numerical meaning to allow for easy retrieval of the original or other digital versions.
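A minimal sketch of this inheritance, assuming an underscore separates the master name from the qualifier (the separator and the example file names are assumptions, not part of this strategy):

```python
from pathlib import PurePath


def derivative_name(master_name, qualifier, extension=None):
    """Derive a file name from the master by appending a role qualifier.

    If extension is None, the master's extension is kept.
    """
    master = PurePath(master_name)
    suffix = master.suffix if extension is None else f".{extension}"
    return f"{master.stem}_{qualifier}{suffix}"


def master_stem(derived_name):
    """Recover the master's base name by dropping the final qualifier."""
    return PurePath(derived_name).stem.rsplit("_", 1)[0]


thumb = derivative_name("vmm_0007.tif", "t", "jpg")
print(thumb)                                  # vmm_0007_t.jpg
print(derivative_name("vmm_0007.tif", "pr"))  # vmm_0007_pr.tif
print(master_stem(thumb))                     # vmm_0007
```

Because the qualifier is always the last underscore-separated element, every derivative can be traced back to its master without consulting the database.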
APPENDIX 4 – FORMS
DIGITIZATION PROJECT PROPOSAL
1. Name:
2. Materials Nominated for Digitization: (Please indicate collection/item name, number, series number, box number, folder number, creator(s), and other relevant information to the fullest extent possible.)
3. Timeline: (Please give a rough timeline for the project, including goals and preferred start/end dates.)
4. Reasons for Nomination: (Describe why the materials are important, according to the digital project selection criteria. Why should these items be made available digitally?)
a. Value. Please indicate any informational, administrative, or artifactual value. Is this a collaborative project?
b. Use. Please indicate usage information about the item(s). Do the items include RAD records or an adequate curatorial description?
c. Risk. Are there preservation concerns with the item(s)?
d. Rights. Please describe whether the VMM has permission to publish the item(s), detailing current copyright status or intellectual property rights.
5. Resource Requirements (Please indicate technical information for the digitization project.)
a. Extent. Detail the number of physical items to be scanned (including number of pages).
b. Format(s). Please indicate the digital output(s) desired (e.g. TIFF, JPEG, MOV, etc.)
c. Metadata. Please indicate schema(s) preferred for describing the digital item(s).
Adapted from: Sitts, Maxine K., ed. Handbook for Digital Projects: A Management Tool for Preservation and Access. 2000 (first edition). Northeast Document Conservation Center. http://www.nedcc.org/resources/digitalhandbook/dman.pdf; and Z. Smith Reynolds Library, Digitization Project Proposal.