![Page 2: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/2.jpg)
how’s and what’s of a digital archive / library
• what is a (good) digital library?
• how should a digital library be designed?
• how should a digital library be created?
• how is a digital library measured?
• how should a digital project be executed?
• how should a digital library or a digital project be managed?
![Page 3: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/3.jpg)
why a digital project?
• to enhance accessibility of the content in libraries and archives
• to increase collaboration and cooperation between libraries and archives around the world
• to promote research
• to provide opportunities for entrepreneurs
![Page 4: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/4.jpg)
digital projects overview
• collections: organized groups of digital objects
![Page 5: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/5.jpg)
digital collections
Library and Archives Canada
![Page 6: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/6.jpg)
digital projects overview
• collections: organized groups of digital objects
• objects: digital materials
![Page 7: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/7.jpg)
digital object
issue from the California Digital Newspaper Collection
![Page 8: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/8.jpg)
digital projects overview
• collections: organized groups of digital objects
• objects: digital materials
• metadata: information about objects and collections
![Page 9: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/9.jpg)
digital object metadata
metadata from the Singapore National Library
![Page 10: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/10.jpg)
project phases
• assess
• design
• implement
• measure
• preserve
• manage
![Page 11: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/11.jpg)
assess
• select the collection or content
• define the goals
• identify the users
• identify ownership and legal risks
• identify applicable standards
• evaluate capabilities
![Page 12: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/12.jpg)
design: standards
• METS: XML for descriptive, structural, technical, and administrative metadata
• descriptive metadata
  • Metadata Object Description Schema (MODS): selected metadata from MARC
  • Dublin Core: fundamental group of text elements for describing and cataloging
• technical metadata
  • ALTO for OCR text
  • PREMIS for digital preservation
  • MIX for images
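To make the descriptive metadata bullet a little more concrete, here is a minimal sketch of a Dublin Core record built with Python's standard library. The namespace URI and the element names (title, creator, date, language) are those of the Dublin Core element set; the wrapper element `record`, the function name, and all field values are invented for illustration and are not taken from any collection mentioned in this presentation.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build an XML element holding one dc:* child per (name, value) pair."""
    root = ET.Element("record")  # hypothetical wrapper element
    for name, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


# Invented example values for a single newspaper issue.
fields = [
    ("title", "Example Daily News, 1 January 1900"),
    ("creator", "Example Publishing Company"),
    ("date", "1900-01-01"),
    ("language", "en"),
]

record = dublin_core_record(fields)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

In a real project a record like this would typically be embedded in, or referenced from, a METS document rather than stored on its own.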
![Page 13: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/13.jpg)
design: standards
• image standards
  • TIFF
  • JPEG2000
  • JPEG
  • ANSI/NISO Z39.87
• file standards
  • PDF, PDF/A, PDF/A-1b, PDF/A-1a
  • TEI
• record standards
  • ISAD(G)
  • ERA
![Page 14: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/14.jpg)
design: access
• user community
• user interface (UI)
• search
• authentication and user management
• digital object presentation
• portability
• administration
![Page 15: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/15.jpg)
implement: pilot
create requirements and acceptance criteria
repeat {
    digitize (small) pilot batch
    test data against acceptance criteria
    adjust requirements and acceptance criteria
} until (no more adjustments are necessary)
digitize more data
NB: pilot batches are VERY VERY important!!
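The pilot loop above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function names (`run_pilot`, `digitize_batch`, `passes`, `adjust`), the `max_rounds` safety limit, and the toy 300 dpi acceptance rule are all hypothetical stand-ins, not part of the original presentation.

```python
def run_pilot(requirements, digitize_batch, passes, adjust, max_rounds=10):
    """Repeat small pilot batches until a batch needs no further adjustments.

    Returns the final requirements and the number of pilot rounds used.
    """
    for round_number in range(1, max_rounds + 1):
        batch = digitize_batch(requirements)               # digitize (small) pilot batch
        failures = [item for item in batch if not passes(item)]  # test against criteria
        if not failures:
            return requirements, round_number              # no more adjustments necessary
        requirements = adjust(requirements, failures)      # adjust and try again
    raise RuntimeError("pilot did not converge; revisit the requirements")


# Toy stand-ins (assumptions for illustration only).
def digitize_batch(requirements):
    # Pretend to scan three pages at the currently required resolution.
    return [{"page": n, "dpi": requirements["dpi"]} for n in range(3)]


def passes(item):
    # Hypothetical acceptance criterion: at least 300 dpi.
    return item["dpi"] >= 300


def adjust(requirements, failures):
    # Raise the required resolution after a failed pilot batch.
    return {**requirements, "dpi": requirements["dpi"] + 100}


final_requirements, rounds = run_pilot({"dpi": 200}, digitize_batch, passes, adjust)
print(final_requirements, rounds)
```

Only after the loop returns would full-scale production ("digitize more data") begin, using the requirements that survived the pilot.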
![Page 16: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/16.jpg)
implement: in-house
reasons for in-house production
• collection cannot be moved
• collection is badly organized
• digitization must be done slowly over a long period
• digitization is very simple
![Page 17: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/17.jpg)
implement: outsource
reasons for outsourced production
• originals can’t be scanned in-house because…
  • equipment is too expensive
  • output data is beyond staff experience
  • labor is too expensive
• large volume of work in a short time
• insufficient space, infrastructure, or staff
![Page 18: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/18.jpg)
implement: software
• commercial off-the-shelf (COTS)
• open source
• customized COTS
• customized open source
• custom in-house
![Page 19: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/19.jpg)
implement: crowd sourcing
• FamilySearch.org
• National Library of Australia Newspapers Digitisation Program
• Library and Archives Canada
• Wikipedia
![Page 20: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/20.jpg)
measure: acceptance criteria
• automatic quality checks
  • is the digital object complete?
  • is the digital object verifiable?
  • is the digital object uncorrupted?
• manual quality checks
  • does the metadata meet accuracy specifications?
  • does the text meet accuracy specifications?
  • is the image quality satisfactory?
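One common way to turn "does the text meet accuracy specifications?" into a number is character accuracy: compare a manually corrected sample with the delivered text and count the edits needed to turn one into the other. The sketch below assumes that approach. The function names, the sample strings, and the 98 percent threshold are illustrative assumptions, not requirements from any project named in these slides.

```python
def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def character_accuracy(reference, delivered):
    """Share of reference characters that did not need correcting."""
    if not reference:
        return 1.0 if not delivered else 0.0
    errors = edit_distance(reference, delivered)
    return max(0.0, 1.0 - errors / len(reference))


# A hand-corrected sample versus the delivered OCR text (invented strings).
reference = "digital library"
delivered = "digita1 1ibrary"

accuracy = character_accuracy(reference, delivered)
REQUIRED_ACCURACY = 0.98  # hypothetical acceptance threshold
print(f"accuracy {accuracy:.3f}, accepted: {accuracy >= REQUIRED_ACCURACY}")
```

Because the reference text has to be produced by hand, a check like this is usually run on a sample of pages rather than on every delivered object.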
![Page 21: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/21.jpg)
measure: image quality
“…images which are ultimately to be viewed by human beings, the only “correct” method of quantifying visual image quality is through subjective evaluation. In practice, however, subjective evaluation is usually too inconvenient, time-consuming and expensive…”
“…best way to assess the quality of an image is to look at it because human eyes are the ultimate viewers of most images…”
Zhou Wang, Alan C. Bovik, Hamid R. Sheikh, and Eero P. Simoncelli. Image Quality Assessment: From Error Visibility to Structural Similarity. IEEE Transactions on Image Processing. April 2004
Zhou Wang, Alan C. Bovik, and Ligang Lu. Why is image quality assessment so difficult? IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP). May 2002
![Page 22: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/22.jpg)
measure: use
• who is using the collection?
• what is the collection being used for?
• how many page views per day / week / month?
• how long do visitors to the collection stay?
• how many repeat visitors to the collection?
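Two of the questions above (page views per day, repeat visitors) can be answered from a simple visit log. The sketch below assumes a log of (visitor id, date) pairs. The data, the visitor ids, and the definition of a repeat visitor as someone seen on more than one day are assumptions made for illustration.

```python
from collections import Counter
from datetime import date

# Invented visit log: one entry per page view.
visits = [
    ("alice", date(2011, 11, 1)),
    ("bob", date(2011, 11, 1)),
    ("alice", date(2011, 11, 2)),
    ("carol", date(2011, 11, 2)),
    ("alice", date(2011, 11, 2)),
]


def page_views_per_day(visits):
    """Count page views for each calendar day."""
    return Counter(day for _, day in visits)


def repeat_visitors(visits):
    """Visitors seen on more than one distinct day (assumed definition)."""
    days_by_visitor = {}
    for visitor, day in visits:
        days_by_visitor.setdefault(visitor, set()).add(day)
    return {visitor for visitor, days in days_by_visitor.items() if len(days) > 1}


views = page_views_per_day(visits)
repeats = repeat_visitors(visits)
print(dict(views), repeats)
```

Questions such as who is using the collection and what for generally need surveys or interviews; log analysis alone will not answer them.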
![Page 23: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/23.jpg)
preserve
• bit rot
• format obsolescence
• media obsolescence / decay
• migration to new media or hardware
• standards obsolescence
![Page 24: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/24.jpg)
preserve: bit rot
gradual decay of …
• storage media because of media quality
• storage media because of improper storage
• data due to random events (bit-flip, …)
• software due to interface changes
• software due to non-obvious or inadvertent configuration changes
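Silent data decay such as a bit-flip is usually detected with fixity checks: record a checksum when an object is accepted, then recompute and compare it on a schedule. The following is a minimal sketch using SHA-256 from Python's standard library. The file name, the in-memory manifest format, and the function names are illustrative assumptions, and a missing file is not handled here (it would raise `FileNotFoundError`).

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(manifest):
    """Return paths whose current digest differs from the recorded digest."""
    return [path for path, recorded in manifest.items()
            if sha256_of(path) != recorded]


# Demonstration on a temporary file standing in for a master image.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "page_0001.tif")  # hypothetical file name
    with open(path, "wb") as handle:
        handle.write(b"\x00" * 1024)

    manifest = {path: sha256_of(path)}  # digest recorded at acceptance
    before = changed_files(manifest)    # nothing has changed yet

    with open(path, "r+b") as handle:   # simulate a single flipped bit
        handle.seek(100)
        handle.write(b"\x01")
    after = changed_files(manifest)     # the damaged file is reported

print(before, [os.path.basename(p) for p in after])
```

A checksum only detects damage. Repairing it requires a second, independently stored copy of the object.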
![Page 25: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/25.jpg)
preserve: media decay
a report by NIST and the Library of Congress says that
• virtually all CD-Rs tested indicated an estimated life expectancy beyond 15 years
• only 47 percent of recordable DVDs indicated an estimated life expectancy beyond 15 years; some had a life expectancy as short as 1.9 years
• in practice, actual lifetimes may be considerably shorter
![Page 26: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/26.jpg)
preserve: media obsolescence
• 5 ¼” floppy disks
• 8 track tapes
• 3 ½” floppy disks
• ZIP drives
• CD-R, CD-RW, Blu-Ray
• microfilm
![Page 27: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/27.jpg)
preserve: migration
• file format changes
• file name differences: case sensitive / insensitive
• extended file attributes
• file permissions
• soft links / hard links
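The file name issue can be checked before a migration starts: names that differ only in letter case coexist on a case-sensitive file system but collide on a case-insensitive one. The sketch below looks for such collisions in a list of paths. The paths and the function name are invented for illustration, and comparing with `str.casefold()` is only an approximation of how any particular file system folds case.

```python
from collections import defaultdict


def case_collisions(paths):
    """Group paths that differ only in letter case."""
    groups = defaultdict(list)
    for path in paths:
        groups[path.casefold()].append(path)
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Invented listing from a case-sensitive source system.
paths = [
    "scans/Page1.tif",
    "scans/page1.tif",
    "scans/page2.tif",
    "meta/README",
    "meta/readme",
]

collisions = case_collisions(paths)
for group in collisions:
    print("would collide:", group)
```

Every reported group needs a renaming decision before the copy, because otherwise one file may silently overwrite the other on the target system.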
![Page 28: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/28.jpg)
preserve: standards obsolescence
remember …
• WordPerfect ?
• MARC records ?
• Adobe Flash ?
![Page 29: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/29.jpg)
preservation
Open Archival Information System (OAIS) reference model
![Page 30: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/30.jpg)
the problem
![Page 31: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/31.jpg)
the problem
the 2009 CHAOS Report (The Standish Group) reports that of all software projects surveyed, 44% were “challenged”, 24% failed, and only 32% succeeded
![Page 32: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/32.jpg)
the problem
Roger Sessions estimates that the worldwide cost of IT failure is USD $500 billion per month
Roger Sessions: CTO of ObjectWatch. He has written seven books including Simple Architectures for Complex Enterprises and many articles. He is a founding member of the Board of Directors of the International Association of Software Architects.
![Page 33: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/33.jpg)
the problem
in a recent survey of 1230 IT professionals conducted by Embarcadero Technologies, 2 of the 3 biggest project challenges cited by the IT pros are “poor planning” and “poor or no requirements”
![Page 34: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/34.jpg)
the problem
in a March 2007 web poll conducted by the Computing Technology Industry Association "nearly 28 percent of the more than 1,000 respondents singled out poor communications as the number one cause of project failure"
![Page 35: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/35.jpg)
the problem
in a white paper written for Project Perfect, Taimour al Neimat lists
• poor planning
• unclear goals and objectives
• objectives changing during the project
• unrealistic time or resource estimates
• lack of executive support and user involvement
• failure to communicate and act as a team
• inappropriate skills
as primary causes for the failure of complex IT projects
![Page 36: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/36.jpg)
the problem
a recent tender from an (anonymous) government agency
• project to convert ~ 170,000 text images to xml
• value of project ~ USD $180,000
• 19 pages of definitions, governing law, proposal evaluation criteria, contractual conditions, instructions about tender response format, etc
• technical requirements description? < 1 page
• data acceptance criteria? “a high level of accuracy”
![Page 37: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/37.jpg)
the problem
a recent program established by a prominent national library
• digitize more than 20 million text pages
• high level image and xml requirements
• value of work awarded? > USD $5,000,000
• after award of work, technical requirements expand to 43+ pages from ~3 pages
• acceptance criteria? added as an afterthought and not well defined
![Page 38: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/38.jpg)
the problem
typical tender evaluation criteria in priority order
1. understanding of requirements
2. reputation of service bureau
3. price
![Page 39: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/39.jpg)
![Page 40: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/40.jpg)
the problem
requirements
![Page 41: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/41.jpg)
requirements
Library of Congress JPEG2000 profile
![Page 42: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/42.jpg)
the problem
requirements
acceptance
![Page 43: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/43.jpg)
acceptance
National Library of Australia NDP
![Page 44: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/44.jpg)
the problem
requirements
acceptance
communication
![Page 45: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/45.jpg)
communication
“projects are about communication, communication, and communication”
Elenbass, B. (2000). “Staging a Project: Are You Setting Your Project Up for Success?”. Proceedings of the Project Management Institute Annual Seminars & Symposiums.
![Page 46: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/46.jpg)
references
• METS, MODS, ALTO, PREMIS, etc.: http://www.loc.gov/standards
• OAIS: http://public.ccsds.org/publications/RefModel.aspx
• NISO standards and guidelines: http://www.niso.org/publications/rp
• good practice guides: http://www.ukoln.ac.uk
• and many, many more
![Page 47: Digital projects best practices [xxxiii reunión nacional de archivos 201111]](https://reader033.vdocuments.us/reader033/viewer/2022042714/554bd6b0b4c905706a8b514c/html5/thumbnails/47.jpg)
questions?
Frederick Zarndt
This work is licensed under the Creative Commons Attribution-ShareAlike (CC BY-SA) License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/