On sweet spots and trustworthiness:
a tale of mass digitisation and digital preservation
Diastor Conference 5th of June 2014
Tom De Smet
OUR MISSION
Collections including:
• Film from 1898 onwards
• Advertising 1920
• Cinema journals ‘22–’80
• Radio from 1934
• Television from 1951
• News, Current affairs, Sport
• Dutch royal house collection
• Dutch Premier League archive
• National Music Archive
• Former broadcast museum
• Amateur film
• Documentary film
• Photographs
• Visual art collections
• And much more…
23 June 2014
Nederlands Instituut voor Beeld en Geluid
CENTRAL POSITION OF THE COLLECTION Content is key
• 80% audiovisual heritage
• > 850.000 hrs of audiovisual material
• 2 million photos
• 20.000 objects
• and counting…
→ 1 catalogue → 1 annotation process
Demanding users: Broadcast Professional
The Digital Archive
• 137.200 hours video
• 17.510 hours film
• 123.900 hours audio
• 1.200.000 photos
Yearly:
• 8.000 hours TV (video)
• 54.000 hours radio
• 1.500 hours of miscellaneous material
• Approx. 350 TB per year
Born digital: Dutch television & radio and other sources
Europe’s largest digitisation project
SOURCE
TIME COST OUTPUT
Triangle of Challenges
Accessibility Search
Retrieval Re-use
Permanence Storage costs
Coding Resolution
Research Questions
[slide text not recoverable]
26/09/2011
Challenge 1: The Material
• Formats & types - Colour / BW
• Cinematographical quality
• Physical state and problems:
  - Acidification (shrinkage)
  - Glue remains
  - Splices
  - Perforations
  - A/B winds (emulsion side)
• Sound & synchronisation
• Sizes of film and cans
• Number of items per carrier and physical marks
• Status of metadata in catalogue
Source material
Budget and time Quality of result
Challenge 2: Digital (& Video) Domain
• Film market in development
• Hardware development
• Digital format standardisation vs market acceptance
• End-user requirements
[Figure: collection categories plotted by relative importance in collection against cinematographical & technical quality, grouped by resolution: standard resolution, HD resolution, 2K resolution]
Categories shown: Sports (old), Sports (more recent), Sports (selected), Tele-recordings, News, News (selected), Current Affairs, Current Affairs (selected), Drama, Drama (selected), Documentaries, Documentaries (selected), Commissioned films (opdrachtfilms), Polygoon (35mm), Advertorials
Format selection & scanning
• Other tape format(s)?
• Data file format (JPEG2000 feasible)?
• DPX is the best option
Workflows:
• Scan to Digi → Transcode lo-res proxy
• Scan to Digi → Encode to MXF D10-50 → Transcode lo-res proxy
• Scan to DPX → Encode to XDCAM HD422 → Transcode lo-res proxy
Formats since 2010
DPX
- SMPTE 268M-2003 v2
- 10 bit log
- RGB or single channel BW
- Scanner colour space (specified and with LUT for Rec.709 - XDCAM HD)
BWAV
- EBU Tech 3285-2001 v1
- 24 bit linear PCM, 48 kHz sample rate
XDCAM HD 422
- SMPTE 381, 382, XDCAM_MXF_HD422_v080
- MXF OP1a: SMPTE 377, 378, 379
- Timecode Track in material package MXF (EBU Rec 122)
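To get a feel for the storage these master formats imply, a rough back-of-envelope sketch follows. It is not taken from the slides: it assumes a 2K full-aperture frame of 2048 x 1556 pixels, 24 frames per second, 10-bit samples packed three to a 32-bit word with word-aligned scan lines, and two audio channels. Header overhead is ignored.

```python
# Back-of-envelope storage estimate for 10-bit DPX frames and 24-bit PCM audio.
# Frame size, frame rate, packing and channel count are assumptions, not slide data.

def dpx_bytes_per_frame(width: int, height: int, rgb: bool = True) -> int:
    """Image payload of one 10-bit DPX frame, file headers ignored."""
    if rgb:
        # R, G and B (3 x 10 bit) share one 32-bit word: 4 bytes per pixel.
        return width * height * 4
    # Single channel (B/W): three 10-bit samples per 32-bit word,
    # each scan line rounded up to a whole word (assumed).
    words_per_line = -(-width // 3)  # ceiling division
    return words_per_line * 4 * height


def pcm_bytes_per_second(sample_rate: int = 48_000,
                         bit_depth: int = 24,
                         channels: int = 2) -> int:
    """Data rate of linear PCM audio as carried in a BWAV file."""
    return sample_rate * (bit_depth // 8) * channels


def terabytes_per_hour(bytes_per_frame: int, fps: int = 24) -> float:
    """Decimal terabytes needed for one hour of image data."""
    return bytes_per_frame * fps * 3600 / 1e12


if __name__ == "__main__":
    colour = dpx_bytes_per_frame(2048, 1556)
    mono = dpx_bytes_per_frame(2048, 1556, rgb=False)
    print(f"2K RGB frame: {colour} bytes, {terabytes_per_hour(colour):.2f} TB/hour")
    print(f"2K B/W frame: {mono} bytes, {terabytes_per_hour(mono):.2f} TB/hour")
    print(f"Stereo PCM:   {pcm_bytes_per_second()} bytes/second")
```

Under these assumptions a single-channel B/W master needs about a third of the space of an RGB master, which is one way to read the choice to store B/W material on one channel only.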
IMMIX CATALOGUE
SD Workflow
IMMIX CATALOGUE
HD / 2K Workflow
WRITE LTO4
Scanning Room
• 2010 switch from SD to HD = ‘video only’ to data + video
• Digitising both internally (1/3) and externally (2/3)
• Film preparation covered by digitising party
• Digital Master in DPX format
• B/W materials saved on one channel only (Y)
• Delivery/mezzanine format XDCAM HD422
• Resolutions 2K / HD depend on source material (economic constraints)
• Scanner colour space (native; RGB)
• Colour correction needed in case of colour fading
• Automated workflows with emphasis on QC in all stages
• Managed storage (DPX from mid-2011)
Sound and Vision’s approach - summary
Trustworthy? OAIS
ISO 14721: functional model
ISO 16363: guidelines for certification
“An OAIS is an Archive […] that has accepted the responsibility to preserve information and make it available for a Designated Community.”
Digital preservation at Sound and Vision
- Impact of data loss is big (and risky!)
- IT infrastructure needs to be updated constantly
- We feel digital preservation needs to be stated as a clear objective in your long-term goals and mission
- We set up a project for Digital Object Management requirements:
  - Preservation metadata dictionary
  - Preservation workflow model
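Because the impact of data loss is large, preservation workflows commonly record a checksum for every file at ingest and re-verify it later. The sketch below shows one generic way to do that with SHA-256. The function names and the manifest layout (a dict of relative path to digest) are hypothetical and are not taken from Sound and Vision's systems.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large masters never have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map every file under root (by relative path) to its SHA-256 digest."""
    return {
        item.relative_to(root).as_posix(): sha256_of(item)
        for item in sorted(root.rglob("*"))
        if item.is_file()
    }


def verify_manifest(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or no longer match."""
    failures = []
    for relative, expected in manifest.items():
        target = root / relative
        if not target.is_file() or sha256_of(target) != expected:
            failures.append(relative)
    return failures
```

A manifest built at ingest can be stored alongside the preservation metadata and checked on a schedule; any path returned by `verify_manifest` would then need to be restored from another copy.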
Topic / Lesson
1 - Collection
  How much is actually known/correct/complete about the source material (with regard to digitisation)? Don’t overestimate this part.
2 - Technology
  • Archival workflow has specific demands
  • Archival material poses specific challenges that suppliers have limited access to / knowledge about.
  • Take more time to fix start-up problems if you are an early adopter.
3 - Outsourcing
  Even when material properties are specified, this doesn’t prevent wrong assumptions about the workload (e.g. film prep).
Lessons learned (1)
Topic / Lesson
4 - Formats
  Due to the film grain, the high detail level in the DPX and the coding properties of XDCAM HD422, post-processing is needed to prevent visual artefacts (blocking) in the latter.
5 - Workflow & QC
  Devising a digitisation workflow is not difficult, but it takes time to understand:
  • how all the components (have to) work together
  • what the properties of the formats are
  • how to perform the verification of formats
  • the extraction and mapping of metadata to internal systems.
Lessons Learned (2)
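As a small illustration of the format verification mentioned under Workflow & QC, the sketch below identifies a file by its leading signature bytes. It is only a first-line check under stated assumptions: it confirms the container type, not conformance to SMPTE 268M, EBU Tech 3285 or the MXF specifications, and it assumes an MXF file without run-in bytes before the first key. The function names are hypothetical.

```python
# Signature-based container sniffing: a first QC gate, not a conformance check.

MXF_KEY_PREFIX = bytes.fromhex("060e2b34")  # start of every SMPTE universal label


def sniff_format(header: bytes) -> str:
    """Guess the container from the first bytes of a file."""
    if header[:4] in (b"SDPX", b"XPDS"):
        # DPX magic number; the two spellings signal the header byte order.
        return "dpx"
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        # RIFF/WAVE container, which BWAV builds on.
        return "wave"
    if header[:4] == MXF_KEY_PREFIX:
        # Assumes the file starts directly with a partition pack key.
        return "mxf"
    return "unknown"


def check_file(path: str, expected: str) -> bool:
    """Return True when the file on disk starts with the expected signature."""
    with open(path, "rb") as handle:
        return sniff_format(handle.read(16)) == expected
```

In a workflow, a check like `check_file(path, "dpx")` could run directly after delivery from a digitising party, before any deeper validation of headers and metadata.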
Lessons learned en route to TDR
- Not all at once
- Choose pilots/examples of workflows that are closest to your archive/institution
- Define your SIP and DIP in collaboration with your designated communities to avoid disappointment/frustration and a lot of manual work before and after ingest!