DLI Training
April 2004
Kingston, Ontario
DDI
What, Why, How?
Data Documentation Initiative
• The Data Documentation Initiative is an international effort to establish a standard for technical documentation describing social science data
• http://www.icpsr.umich.edu/DDI/index.html
Getting Used to Acronyms
• You’ll see…
DTD - Document Type Definition
• Consists of a Tag Library
• Tags have been developed by DDI
• A set of tags, when filled, is known as a codebook
• DDI intends to comply with Dublin Core
Tags
• Tags present English language descriptions of XML (eXtensible Markup Language)
• Each tag can be optional or mandatory, repeatable or non-repeatable
• There is a set of tags for each section of the DTD
5 Sections of DTD (Document Type Definition)
• 1.0 Document Description
• 2.0 Study Description
• 3.0 Data File Description
• 4.0 Variables Description
• 5.0 Other Study Materials
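As a rough illustration of how the five sections fit together, the sketch below builds an empty codebook skeleton in Python. The element names (`codeBook`, `docDscr`, `stdyDscr`, `fileDscr`, `dataDscr`, `otherMat`) are the ones we understand the DDI codebook DTD to use for these sections; they do not appear on the slides, so check them against the on-line tag library before relying on them.

```python
import xml.etree.ElementTree as ET

# Assumed DDI element name for each section listed on the slide.
SECTIONS = [
    ("docDscr", "1.0 Document Description"),
    ("stdyDscr", "2.0 Study Description"),
    ("fileDscr", "3.0 Data File Description"),
    ("dataDscr", "4.0 Variables Description"),
    ("otherMat", "5.0 Other Study Materials"),
]


def build_skeleton():
    """Return a codebook root element with one empty child per section."""
    root = ET.Element("codeBook")
    for tag, _title in SECTIONS:
        ET.SubElement(root, tag)
    return root


skeleton = build_skeleton()
for tag, title in SECTIONS:
    print(f"{title}: <{tag}>")
print(ET.tostring(skeleton, encoding="unicode"))
```

Each empty element is a container; the tags chosen for a template go inside the matching section.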
Document Description
• Bibliographic description of the DDI document itself, otherwise known as a marked-up codebook
Study Description
• Describes Study or Survey
• Includes title, abstract, keywords, author, publisher, collection methods, etc.
Data File Description
• Contains information describing the data file
• Includes file name, file type, case quantity, logical record length, total number of records, etc.
Variables Description
• Describes each variable
• Includes variable label, values, value label, question, summary statistics, etc.
Other Study Materials
• Includes documentation files in a variety of formats: PDF, Excel, Word, etc.
• Includes codebooks, questionnaires, user guides, variability tables, etc.
Choosing Tags
• http://www.icpsr.umich.edu/DDI/users/dtd/codedtd.html
• Select tags from each DTD section for the template
• Each tag represents a searchable element
• Choose carefully: definitions are provided online
Codebook Template
• Over 300 tags to choose from
• DDI suggests starting with the set of Recommended Elements
Making Templates
• Pre-fill tags which are common to each dataset and save as a template
• One template for Statistics Canada surveys
• One template for ICPSR surveys
• One generic template to cover other sources
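The pre-fill idea can be sketched in a few lines of Python: build the template once with the values shared by every dataset from one source, then copy it and fill in only what differs. The element path (`stdyDscr/citation/titlStmt/titl` and `prodStmt/producer`) is our reading of the DDI tag library, and the survey title is a made-up placeholder, not a real dataset.

```python
import copy
import xml.etree.ElementTree as ET


def make_template(producer):
    """Build a template with the producer pre-filled and the title left blank."""
    root = ET.Element("codeBook")
    study = ET.SubElement(root, "stdyDscr")
    citation = ET.SubElement(study, "citation")
    title_stmt = ET.SubElement(citation, "titlStmt")
    ET.SubElement(title_stmt, "titl")  # differs per dataset, so left empty
    prod_stmt = ET.SubElement(citation, "prodStmt")
    ET.SubElement(prod_stmt, "producer").text = producer  # common to all datasets
    return root


def new_codebook(template, title):
    """Copy the template and fill in the dataset-specific title."""
    codebook = copy.deepcopy(template)
    codebook.find("./stdyDscr/citation/titlStmt/titl").text = title
    return codebook


statcan_template = make_template("Statistics Canada")
# Hypothetical title, used only to show the fill-in step.
codebook = new_codebook(statcan_template, "Example Survey (hypothetical)")
print(ET.tostring(codebook, encoding="unicode"))
```

Copying with `deepcopy` keeps the saved template untouched, so the same template can be reused for the next survey from that source.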
Making Templates - example
Marking up the document
Tags can be filled by a variety of methods:
• Cut and paste or type in information
• Extract information using custom scripts, XML editors or custom software
• Follow tag descriptions carefully
• Establish guidelines
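A custom script of the kind mentioned above might look like the following Python sketch, which turns a small variable list into entries for the Variables Description section. The CSV layout and the sample variables are invented for illustration, and the element names (`dataDscr`, `var`, `labl`, `qstn`, `qstnLit`) are our reading of the DDI tag library rather than something shown on the slides.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical variable list, as it might be exported from a statistical package.
VARIABLE_LIST = """name,label,question
AGE,Age of respondent,How old are you?
SEX,Sex of respondent,
"""


def variables_to_data_dscr(csv_text):
    """Create one <var> element per CSV row inside a <dataDscr> element."""
    data_dscr = ET.Element("dataDscr")
    for row in csv.DictReader(io.StringIO(csv_text)):
        var = ET.SubElement(data_dscr, "var", name=row["name"])
        ET.SubElement(var, "labl").text = row["label"]
        if row["question"]:  # the question tag is only written when there is text
            qstn = ET.SubElement(var, "qstn")
            ET.SubElement(qstn, "qstnLit").text = row["question"]
    return data_dscr


data_dscr = variables_to_data_dscr(VARIABLE_LIST)
print(ET.tostring(data_dscr, encoding="unicode"))
```

Skipping empty fields rather than writing empty tags is one example of a guideline worth settling before marking up many surveys.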
Completed Codebook
Displaying the codebook
• XML is not a visual display language on the web
• Transform the codebook using XSL/XSLT (eXtensible Stylesheet Language)
• Stylesheets control the output display
• Software is required to process stylesheets, e.g. Cocoon, Saxon
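To make the transformation step concrete, here is a small Python stand-in for what a stylesheet does: it reads codebook XML and writes an HTML table of variables. This is not XSLT and not what Cocoon or Saxon run; it is only a sketch of the same input-to-output idea, and the sample codebook fragment and its element names are assumptions for illustration.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical codebook fragment with two variables.
CODEBOOK_XML = """<codeBook>
  <dataDscr>
    <var name="AGE"><labl>Age of respondent</labl></var>
    <var name="SEX"><labl>Sex of respondent</labl></var>
  </dataDscr>
</codeBook>"""


def codebook_to_html(xml_text):
    """Render each variable's name and label as one row of an HTML table."""
    root = ET.fromstring(xml_text)
    rows = []
    for var in root.iter("var"):
        name = html.escape(var.get("name", ""))
        label = html.escape(var.findtext("labl", default=""))
        rows.append(f"<tr><td>{name}</td><td>{label}</td></tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>"


page = codebook_to_html(CODEBOOK_XML)
print(page)
```

A real stylesheet expresses the same mapping declaratively, so the display can change without touching the codebook itself.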
Stylesheets – what they look like
Stylesheets
• Create your own
• Borrow from others: examples are available on the DDI site
• Purchase proprietary software to do it for you, for example NESSTAR
Final product
Where to go from here…
• Borrow
• Share
• Collaborate
• Discuss
• Become involved…
Appendix – Acronyms
• DDI – Data Documentation Initiative
• DTD – Document Type Definition
• XML – eXtensible Markup Language
• XSL/XSLT – eXtensible Stylesheet Language
• Template – file containing a set of chosen tags to be used for each dataset
• Marked-up codebook – all chosen tags are filled with survey information
The end
Carol Perry & Ernie Boyko