Noark 5 validator
Summary
Noark 5 extraction format
Noark 5 extraction validator
Demo
Issues with the Noark 5 extraction format
Future steps
N5 extraction
N5 extraction – contd.
N5 extraction – arkivstruktur.xml
N5 extraction – arkivstruktur.xml – contd.
N5 extraction – endringslogg.xml
N5 extraction – arkivuttrekk.xml
N5 extraction – arkivuttrekk.xml – contd.
Validating N5 extractions
Verify vs Validate
Does validation depend on the system?
Validation tool
• Noark 5 v3
• 10 validations currently
• Plug-in oriented
• Extend with new features
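A plug-in oriented design can be sketched as a registry of independent checks. This is only an illustration of the idea, not the tool's actual architecture: the names `Result`, `VALIDATIONS`, `validation` and `run_all` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Result:
    name: str
    passed: bool
    message: str = ""


# Hypothetical registry: validation name -> function taking the extraction directory.
VALIDATIONS: Dict[str, Callable[[str], Result]] = {}


def validation(name: str):
    """Decorator that registers a check under the given name."""
    def register(func: Callable[[str], Result]) -> Callable[[str], Result]:
        VALIDATIONS[name] = func
        return func
    return register


@validation("example-always-passes")
def always_passes(extraction_dir: str) -> Result:
    # Placeholder check; a real plug-in would inspect files in extraction_dir.
    return Result("example-always-passes", True, "nothing checked")


def run_all(extraction_dir: str) -> List[Result]:
    """Run every registered check and collect the results."""
    return [check(extraction_dir) for check in VALIDATIONS.values()]
```

With this layout, extending the tool with a new feature means adding one more decorated function; the runner itself does not change.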
Jhove
Pdfinfo
LibTIFF
What to do if there are validation errors?
Validation 1
Validate all XML files against their XSD schemas
Validation 2
Validate checksums and file format of all documents referenced in arkivstruktur.xml
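The checksum half of this validation could look roughly like the sketch below. It assumes that each «dokumentobjekt» in arkivstruktur.xml carries a «referanseDokumentfil» path relative to the extraction directory and a SHA-256 «sjekksum»; those element names and the algorithm are assumptions here, and the file format check is left out.

```python
import hashlib
import xml.etree.ElementTree as ET
from pathlib import Path


def local(tag: str) -> str:
    """Strip an XML namespace prefix such as '{uri}name' down to 'name'."""
    return tag.rsplit("}", 1)[-1]


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large documents are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_mismatches(extraction_dir: str) -> list:
    """Return referenced document paths that are missing or fail the checksum."""
    base = Path(extraction_dir)
    tree = ET.parse(base / "arkivstruktur.xml")
    mismatches = []
    for elem in tree.iter():
        if local(elem.tag) != "dokumentobjekt":
            continue
        # Assumed child elements: referanseDokumentfil and sjekksum.
        fields = {local(child.tag): (child.text or "").strip() for child in elem}
        ref = fields.get("referanseDokumentfil", "")
        expected = fields.get("sjekksum", "").lower()
        path = base / ref
        if not path.is_file() or sha256_of(path) != expected:
            mismatches.append(ref)
    return mismatches
```

An empty list means every referenced document exists and matches its recorded checksum.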
Validation 3
Validate the number of documents referenced in arkivstruktur.xml against the «antallDokumentfiler» value in arkivuttrekk.xml
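A minimal sketch of this comparison follows. It assumes that document references appear as «referanseDokumentfil» elements and that arkivuttrekk.xml stores the declared count as a `property` element with `name="antallDokumentfiler"` and a `value` child; both layouts are assumptions, not taken from the slides.

```python
import xml.etree.ElementTree as ET


def local(tag: str) -> str:
    """Strip an XML namespace prefix such as '{uri}name' down to 'name'."""
    return tag.rsplit("}", 1)[-1]


def count_document_references(arkivstruktur_xml: str) -> int:
    """Count referanseDokumentfil elements anywhere in the structure."""
    root = ET.fromstring(arkivstruktur_xml)
    return sum(1 for e in root.iter() if local(e.tag) == "referanseDokumentfil")


def declared_document_count(arkivuttrekk_xml: str) -> int:
    """Read the declared count (assumed layout: property/value)."""
    root = ET.fromstring(arkivuttrekk_xml)
    for prop in root.iter():
        if local(prop.tag) == "property" and prop.get("name") == "antallDokumentfiler":
            for child in prop:
                if local(child.tag) == "value":
                    return int((child.text or "").strip())
    raise ValueError("antallDokumentfiler not found")


def document_count_matches(arkivstruktur_xml: str, arkivuttrekk_xml: str) -> bool:
    """True when the referenced count equals the declared count."""
    return count_document_references(arkivstruktur_xml) == declared_document_count(
        arkivuttrekk_xml
    )
```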
Validation 4
Validate that the number of documents referenced in arkivstruktur.xml is greater than or equal to the actual number of files in the «dokumenter» directory
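The directory side of this check can be sketched as below. The function names are hypothetical, the reference count is taken as an input rather than parsed here, and counting files in subdirectories is an assumption about how «dokumenter» is laid out.

```python
from pathlib import Path


def count_files(doc_dir) -> int:
    """Count regular files under the directory, including subdirectories."""
    return sum(1 for p in Path(doc_dir).rglob("*") if p.is_file())


def references_cover_files(referenced_count: int, doc_dir) -> bool:
    """True when there are at least as many references as files on disk."""
    return referenced_count >= count_files(doc_dir)
```

A failure here indicates that the directory holds files that nothing in arkivstruktur.xml points to.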
Validation 5
Validate the number of «mappe» elements in arkivstruktur.xml against the «mappe numberOfOccurrences» value in arkivuttrekk.xml
Validation 6
Validate the number of «registrering» elements in arkivstruktur.xml against the «registrering numberOfOccurrences» value in arkivuttrekk.xml
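Validations 5 and 6 are the same kind of count comparison, so one sketch covers both. The declared values are passed in as a plain dictionary because the exact place where arkivuttrekk.xml keeps «numberOfOccurrences» is not shown here; the function names are hypothetical.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def element_counts(arkivstruktur_xml: str) -> Counter:
    """Count elements by local name, ignoring any XML namespace."""
    root = ET.fromstring(arkivstruktur_xml)
    return Counter(e.tag.rsplit("}", 1)[-1] for e in root.iter())


def occurrence_mismatches(arkivstruktur_xml: str, declared: dict) -> dict:
    """Map each mismatching element name to (declared, actual)."""
    actual = element_counts(arkivstruktur_xml)
    return {
        name: (expected, actual[name])
        for name, expected in declared.items()
        if actual[name] != expected
    }
```

For example, `occurrence_mismatches(xml, {"mappe": 120, "registrering": 450})` returns an empty dictionary when both counts agree with the structure file. Note that this counts nested «mappe» elements as well.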
Validation 7
Validate the checksums of all files (XML and XSD) referenced in arkivuttrekk.xml
Validation 8
Validate the entries in endringslogg.xml against the objects in arkivstruktur.xml
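One way to read this validation is that every log entry must point at an object that exists in the structure file. The sketch below assumes that log entries identify their target through a «referanseArkivenhet» element holding the target's «systemID»; both element names are assumptions, and the actual tool may check more than this.

```python
import xml.etree.ElementTree as ET


def texts_of(xml_text: str, name: str) -> list:
    """Collect the stripped text of every element with the given local name."""
    root = ET.fromstring(xml_text)
    return [
        (e.text or "").strip()
        for e in root.iter()
        if e.tag.rsplit("}", 1)[-1] == name
    ]


def dangling_log_references(arkivstruktur_xml: str, endringslogg_xml: str) -> list:
    """Return log references that match no systemID in arkivstruktur.xml."""
    known = set(texts_of(arkivstruktur_xml, "systemID"))
    referenced = set(texts_of(endringslogg_xml, "referanseArkivenhet"))
    return sorted(referenced - known)
```

An empty result means every change log entry refers to a known object.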
Validation 9
Validate the checksum of arkivuttrekk.xml
Validation 10
Validate the custom metadata in arkivstruktur.xml against a schema (custom_metadata.xsd)
N5 extraction format – issues and possibilities
References to documents
Initial object revisions
DocumentObject does not have a systemID
Can all objects have custom metadata?
Export user registry
Import a Noark 5 extraction
Data quality