tdr overview pres advocates
DESCRIPTION
Presentation on the TDR - as of 10th October 2007. Made to the Faculty of Advocates
TRANSCRIPT
1
Overview of the NLS TDR
For the Faculty of Advocates
James Toon
TDR Project Manager
Lawnmarket – Room 208, Ext [email protected]
2
Outline
A – Introduction and Background
B – What does it comprise
C – Policies
D – Strategic Approach
E – Schedule and Happenings
F – Questions
3
Section A – Introduction and Background
4
Part of the NLS Digital Vision
• “A Trusted Digital Repository must be at the heart of the Digital NLS to ensure long-term digital preservation and access.”
Strategic Objectives
• Develop and implement a trusted digital repository (TDR) infrastructure based around international standards and best practices (build and acquire)
• Provide necessary storage capacity for the TDR (store)
• Implement discovery and access strategies to deliver data stored in the TDR (make available)
Background
5
The Trusted Digital Repository is a repository system that will allow the National Library of Scotland to preserve and manage digital content of enduring value to the Nation. Its core drivers are;
• The 2003 Legal Deposit Libraries Act, which extended our legal deposit privilege to non-print materials
• The need to preserve our growing collection of digitised material
• The need to preserve digital collections purchased by NLS
• The potential to host and preserve digital content for partner institutions in Scotland
What is the TDR?
6
In this context, the mission of the NLS TDR is as follows:
• Collect and store digital 'stuff' (be it 'born digital' or a 'surrogate' of print material)
• Make this material easily accessible to users (although subject to some constraints)
• Provide a platform for the development of significant digital collections (of all types)
• Ensure what is available today remains available for many years to come through the application of digital preservation standards and technologies
TDR Mission
7
• Options appraisal undertaken in 2005 compared the build vs. buy options. Decision to build, based on an open source platform.
• Grant of £1.8M Awarded by Scottish Executive Aug 2006 over 2 years
• TDR Project concentrating on web archiving as primary objective.
• Setting up an in house development team of full time and contract developers.
• Procure and install significant storage system (up to 200 TB)
• Building the software in 4 key release cycles over the development of the product. Release 4 due end Sept 2008
• Setting out a 5 year strategic plan for the TDR as a key part of the library, with associated implementation costs and roadmap
TDR Project overview
8
Section B – What does it comprise
9
• Method of delivering content into the system from the ‘outside’
• Method of harvesting content from the internet
• Methods of organising and managing the content objects and their metadata
• A repository system to enforce and service the content objects and their metadata
• A method of discovering and retrieving the content stored in the repository
• A method of carrying out services for users based on the content in the repository
Software components
10
• Commenced full EU procurement process in August 2006
• Contracts signed end December 2006
• First Storage Area Network (SAN) system installed March 2007
• Fibre optic line installed between NLS and data centre July 2007
• Second mirror SAN installed August 2007
• Final storage network testing carried out Sept 2007
Hardware component
11
Make up of a repository;
• 1 – Digital Object. A package of information that includes the content of the work and data about the work object, including policies which aid discovery and that may dictate the usage and availability of the work.
• 2 – Repository. A location where digital objects are stored, and which is responsible for enforcing the policies bound to these objects
• 3 – Service. Software products that provide services based on the contents of the repositories and their policies, such as dissemination, transformation, rights management and preservation
What is a repository?
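The three parts described on this slide can be pictured with a minimal Python sketch. It is an illustration only, not the TDR's actual design: the class names, the `public` policy key and the `disseminate` function are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    """A package holding the work's content, its metadata and its policies."""
    identifier: str
    content: bytes
    metadata: dict = field(default_factory=dict)
    policies: dict = field(default_factory=dict)  # hypothetical, e.g. {"public": False}


class Repository:
    """Stores digital objects and enforces the policies bound to them."""

    def __init__(self):
        self._objects = {}

    def store(self, obj):
        self._objects[obj.identifier] = obj

    def retrieve(self, identifier, user_is_staff=False):
        obj = self._objects[identifier]
        # Hypothetical policy: non-public objects are limited to staff.
        if not obj.policies.get("public", False) and not user_is_staff:
            raise PermissionError(f"{identifier} is not publicly available")
        return obj


def disseminate(repo, identifier, user_is_staff=False):
    """A service: builds a delivery summary from the repository's contents."""
    obj = repo.retrieve(identifier, user_is_staff)
    return {
        "id": obj.identifier,
        "title": obj.metadata.get("title"),
        "size": len(obj.content),
    }
```

The point of the split is that the service never reads storage directly: it goes through the repository, so the policy check cannot be bypassed.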
12
Section C – Policies
13
The following are examples of the many policies that must be developed and integrated into the library operational structure;
• Digital Preservation
• Rights Management and security
• Web Archiving
• Cataloguing and metadata
• Acquisition and permissions
Integrated policy decisions
14
Policies based around a simple three-stage approach to preservation
1 – Archive. The objects cannot be preserved if they are not in the archive.
2 – Risk Management. Assessment of the likely preservation risk at ingest. Allows a pragmatic but proactive approach to individual preservation cases, for example file normalisation at ingest as a preventive measure (e.g. Adobe PDF conversion)
3 – Preserve. Provide a method for migration/emulation on a per object basis. The original bit stream is the main archive, and we maintain the method of preservation rather than carrying out transformation en masse
Digital Preservation Policies
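As a rough illustration of the three stages above, the sketch below archives the original bit stream with a checksum, records a risk rating at ingest, and migrates a single object on demand while leaving the original untouched. The risk table, the function names and the dictionary used as an archive are assumptions made for the example, not part of the TDR.

```python
import hashlib

# Hypothetical risk table; a real one would draw on a format registry.
FORMAT_RISK = {"pdf": "low", "tiff": "low", "doc": "medium", "wpd": "high"}


def ingest(archive, identifier, filename, data):
    """Stages 1 and 2: archive the original bit stream and assess its risk."""
    extension = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    record = {
        "original": data,  # the original bit stream is never altered
        "sha256": hashlib.sha256(data).hexdigest(),
        "format": extension,
        "risk": FORMAT_RISK.get(extension, "unknown"),
        "derivatives": {},
    }
    archive[identifier] = record
    return record


def preserve(archive, identifier, target_format, migrate):
    """Stage 3: migrate one object on demand, keeping the original."""
    record = archive[identifier]
    record["derivatives"][target_format] = migrate(record["original"])
    return record
```

Here `migrate` is any caller-supplied function from bytes to bytes, which reflects the idea of maintaining the method of preservation per object rather than transforming everything at once.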
15
Security and rights enforcement based around;
• User security associating individuals to access groups (such as creating a set of users who are just Advocates)
• User security applicable down to individual object level
• Controlled availability for delivery (such as per IP address, or book in/book out methods) within Legal Deposit boundaries
• Association of individual object security policies
• Management of individual or group digital signatures
• Full logging and audit trail
• Enforcement of publisher DRM requirements – not victim of DRM requirements
Rights Management Policies
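A minimal sketch of how group membership, object-level policies, IP-based delivery control and an audit trail might combine. All user names, group names, object identifiers and network ranges are invented for illustration (the addresses come from ranges reserved for documentation).

```python
import ipaddress

# Hypothetical data: group membership and per-object access policies.
GROUPS = {"advocates": {"alice"}, "staff": {"bob"}}
POLICIES = {
    "obj-1": {"groups": {"advocates", "staff"}, "networks": ["192.0.2.0/24"]},
}
AUDIT_LOG = []


def can_access(user, object_id, client_ip):
    """Allow access only if both the group and the network rules are met."""
    policy = POLICIES.get(object_id)
    allowed = False
    if policy is not None:
        in_group = any(user in GROUPS.get(g, set()) for g in policy["groups"])
        address = ipaddress.ip_address(client_ip)
        on_network = any(
            address in ipaddress.ip_network(net) for net in policy["networks"]
        )
        allowed = in_group and on_network
    # Every decision is recorded, whether granted or refused.
    AUDIT_LOG.append((user, object_id, client_ip, allowed))
    return allowed
```

Objects with no policy are refused by default in this sketch, which is one possible reading of enforcing security down to individual object level.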
16
Emphasis on high quality collections based on;
• Selective, thematic collections
– Collection areas such as Scottish music, sport, politics etc
– Collection sub areas such as bagpipe music, Scottish rugby, political party sites – all with inherited collection metadata
• Event Based collections
– Scottish elections, special events, disasters (such as flood in NLS building!)
• Domain level collections
– Yearly/twice yearly broad brush collection of all Scottish and/or UK websites
Web Archiving Policies
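The idea of sub areas inheriting collection metadata can be sketched as follows. The `Collection` class and the metadata keys are hypothetical and chosen only to show the inheritance rule: a sub area starts from its parent's metadata and overrides what it defines itself.

```python
class Collection:
    """A web archive collection that inherits metadata from its parent."""

    def __init__(self, name, metadata=None, parent=None):
        self.name = name
        self.own_metadata = metadata or {}
        self.parent = parent

    def metadata(self):
        # Start from the parent's metadata, then apply local overrides.
        inherited = self.parent.metadata() if self.parent else {}
        return {**inherited, **self.own_metadata}


music = Collection("Scottish music", {"subject": "Music", "country": "Scotland"})
bagpipes = Collection("Bagpipe music", {"subject": "Bagpipe music"}, parent=music)
```

With these two collections, `bagpipes.metadata()` carries the parent's `country` value and its own, more specific `subject`.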
17
Not dissimilar to NLS standard practices, but with a few additions;
• Descriptive metadata with MARC using AACR2
• Use of METS for data wrappers
• Automatic technical metadata extraction through DROID/PRONOM database from The National Archives
• Use of PREMIS metadata for technical details (amongst others)
• Metadata mapping to allow bulk ingest of objects from other databases
Cataloguing and Metadata Policies
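Metadata mapping for bulk ingest could look something like the sketch below. The source column names and target field names are invented for the example; a real mapping would be defined per source database.

```python
# Hypothetical mapping from a source database's column names to
# the field names used by the target repository.
FIELD_MAP = {"ttl": "title", "auth": "creator", "yr": "date"}


def map_record(source, field_map=FIELD_MAP):
    """Rename mapped fields and collect anything unmapped for review."""
    mapped, unmapped = {}, {}
    for key, value in source.items():
        if key in field_map:
            mapped[field_map[key]] = value
        else:
            unmapped[key] = value
    return mapped, unmapped
```

Returning the unmapped fields separately, rather than dropping them, lets a cataloguer decide what to do with them before the batch is ingested.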
18
Significant administrative overhead
• Open archive suggestions (i.e. from Curators and users alike) for web collections
• Ability for curators to deposit selected digital objects (such as PDFs) to collections
• Ability for users of hosted repository to self archive
• Permissions carefully managed, even post Legal Deposit implementation
• Copyright and DRM management
• Watching brief on new DRM/Legal deposit issues such as ACAP (Automated Content Access Protocol)
Acquisition and permissions policies
19
Section D – Strategic Approach
20
Key focus on sustainability
• In parallel with the development activity is the construction of a 5 year strategy for the TDR, looking at where we want to be;
• International Standards. OAIS ISO standard, RLG OCLC TDR Checklist
• Plan for auditing the ongoing compliance
• Risk management strategy to manage processes
• Benefits management realisation plan to guarantee success
• Operational integration into NLS either as centralised or decentralised model
• Facilitator of information management for Scotland
Strategic Approach
21
Section E – Schedule and Happenings
22
• Milestone release 1 – End June 2007. Implementation of pilot standalone web archiving system using IIPC tool set
• Milestone release 2 – End Dec 2007. Release of hosted repository system, including use of Fedora with associated ingest mechanism and metadata manager to meet OAIS requirements
• Milestone release 3 – End March 2008. Replacement of pilot web archiving system with NLS own workflow management tool
• Milestone release 4 – End Sept 2008. Implementation of resource discovery/delivery tool for TDR/Digital Library to provide access
Development Milestone plan
23
• System development (planning, design, team recruitment etc)
• Developing collections policy, default criteria for collection and specific selective and event collection plans
• Working with national partners on full domain web archiving
• Undertaking benchmarking activity for strategy plan
• Writing a PESTLE (Political, Environmental, Social, Technological, Legal and Ethical) analysis of international repository effort
• Communication with national and international stakeholders via International Internet Preservation Consortium (IIPC)
• Starting to better understand the pros and cons of web archiving through practice!
What’s happening now?
24
Any questions?