Developing data management expertise at King’s College London
Experience of the PEKin project
Gareth Knight, Centre for e-Research (CeRch)
Lindsay Ould, Archives & Information Management (AIM)
Overview
1. Aims & objectives of PEKin project
2. Project methodology
3. Findings on current state of data management
4. Action taken to address issues
5. Further work to be performed
6. Lessons learnt
7. Potential for reuse of project deliverables
What is PEKin?
Title: Preservation Exemplar at King’s (PEKin)
Funder: JISC, Preservation strand of Information Environment 09-11
Time period: 1 April 09 – 31 October 10
Project partners:
• Centre for e-Research (CeRch)
• Archives & Information Management (AIM)
• Based at King’s College London
What is a Digital Record?
“Recorded information generated, collected or received in the initiation, conduct or completion of an activity and that comprises sufficient content, context and structure to provide proof or evidence of that activity”
International Council on Archives (ICA)
New archiving challenges
Changing state of digital information:
• Changing notion of what constitutes a record of business:
  • Core business: student information, committees, estates, etc.
  • Increasingly research outputs (data, papers) – funder requirements
• Changing composition:
  • Born digital content (static and dynamic resources)
  • Hybrid (paper+digital), digital only
• Lifecycles:
  • Creation process: Create, revise, publish 1st version, revise, publish 2nd version. Repeat.
  • Access lifecycle: Technology dependencies (hardware & software)
Implications:
• Archival process: Archive at earlier stage? Capture using different technologies?
• Data value: Can we be sure that everything has business value?
Methodology
1. Evaluate existing information management procedures and working practices at institutional level and revise accordingly
  • What remains viable?
  • Elements that require revision
  • Gaps and omissions
2. Determine the data management needs that data producers and systems managers in academic units/professional services encounter and determine most effective approach to address requirements
3. Implement a technical system capable of curating and preserving digital records of long-term archival value.
Review of existing frameworks
Reviewed DAF, DRAMBORA, DIRKS, etc. All required further refinement before they could be applied to our own situation.
DAF: Data Asset/Audit Framework
+ Useful for gathering detailed information on data assets located in departments
+ Useful for analysing data management practices
- Time-consuming to perform
- Does not provide a method of evaluating problems & developing a mitigation strategy
DRAMBORA
+ Provides formal structure for identifying, describing & evaluating risks & developing a strategy to mitigate or avoid them
+ Well-defined list of risk categories and factors
- Intended for OAIS-like environments rather than less formalised research ‘systems’
- Focuses upon OAIS workflow rather than data creation lifecycle
Integrating frameworks
DAF & DRAMBORA are broadly similar, but some work needed:
• Normalised terminology and definitions & adopted some archival terminology
• Activity classification: Activities placed in different categories in DRAMBORA & DAF
• “Light touch” approach: establish balance between DRAMBORA system-level & DAF asset-level analysis
  • High-level analysis of data assets using DAF
  • Omitted various DRAMBORA risk categories unrelated to data management
• Adopted e-Research lifecycle model
• Stages were tied in with distinct project outputs
Audit Framework
1. Plan audit
• Initial contact with unit manager
• Establish buy-in
• Select methodology and prepare material
2. Analyse documentary sources
• Obtain and analyse publicly accessible resources
• Obtain and analyse internal resources
• Document information
3. Assess management approach
• Analyse data assets
• Analyse data management activities
• Produce case study
4. Analyse risk
• Identify risks and determine consequences
• Evaluate risks and determine consequences
• Prepare risk analysis report
5. Develop management strategy
• Develop mitigation strategy
• Present mitigation strategies to stakeholders
• Develop mitigation plan
6. Implement management strategy
• Implement mitigation plan
• Evaluate success of mitigation plan implementation
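The risk identification and evaluation steps in the audit framework can be illustrated with a small risk register. This is only a sketch and not the project's own templates: the `Risk` class, the 1 to 5 scales, the probability × impact scoring rule and the example entries are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    """One entry in a hypothetical risk register (field names are assumptions)."""
    description: str
    probability: int  # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int       # assumed scale: 1 (negligible) to 5 (severe)

    def __post_init__(self) -> None:
        # Reject scores outside the assumed 1-5 scale.
        for name in ("probability", "impact"):
            value = getattr(self, name)
            if not 1 <= value <= 5:
                raise ValueError(f"{name} must be between 1 and 5, got {value}")

    @property
    def severity(self) -> int:
        # Assumed scoring rule: severity = probability x impact.
        return self.probability * self.impact


def prioritise(risks: list[Risk]) -> list[Risk]:
    """Return risks ordered from highest to lowest severity."""
    return sorted(risks, key=lambda r: r.severity, reverse=True)


# Illustrative entries only; the scores are invented for the example.
register = [
    Risk("Staff leave without handing over data", probability=4, impact=4),
    Risk("No integrity monitoring of stored files", probability=3, impact=5),
    Risk("Duplicate paper and digital versions diverge", probability=2, impact=2),
]

for risk in prioritise(register):
    print(f"{risk.severity:>2}  {risk.description}")
```

Ordering the register by severity gives a starting point for the risk analysis report; the mitigation strategy would then address the highest-scoring entries first.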
Administrative Case Studies
Business departments & content types examined:
• Committee: Council, Academic Board and sub-committees
• Estates: Project & operational records
• Student: Records held outside the Student system (SITS)
• Digital records of archival value
• Mapped to current College paper holdings
Research Case Studies
Research groups/projects/departments examined:
• Environmental Research Group (ERG)
• Environment Monitoring group
• Environment Modelling group
• Twins Early Development Study (TEDS)
• Regional Information Collection Centre (RICC)
• Period of change: since April 2010, IT has been provided centrally, with a storage provision review underway
• Archives have previously accepted pioneering research data
• Acquisition policy is now under review for born digital/digitised records
Administrative study findings
• Opportunity to redefine collections
• All areas required digital records management support before archives could be identified
• Quality control varied between records
• Duplication with paper and born-digital versions retained
• Lack of ownership of born digital records by administrative staff
Research study findings
• Challenge to identify data sets of archival value
• TEDS & ERG funded dedicated data management roles including back-up & information security processes
• However, majority of research groups do not have equivalent support, placing data at risk
• Funding bids lacked formal data management plans to provide assurance or influence further funding
• Continuing preservation of data not considered, with focus on current work
Comparison of research & admin data management
• Individual researchers & administrative staff lack understanding of risk and take a personal approach to managing data
• Understanding of digital environment is still outside their comfort zone: hybrid duplicated collections
• High risk when staff (Principal Investigators or administrators) leave
• No point of contact for advice or support
Risk Assessment of research data management
• Multiple risks identified
• Active data management was good - recommendations made for best practice
• Mitigation
  • Content versioning system
  • Store multiple versions of each data file
  • Implement integrity monitoring
  • Data management plan to document approach
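The integrity monitoring recommended above could, for example, be approached by recording a checksum manifest for a data directory and comparing it against a later scan. The following is a minimal sketch under that assumption; the function names and the choice of SHA-1 are illustrative and do not describe the tooling used by the project or the research groups.

```python
import hashlib
from pathlib import Path


def sha1_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-1 hex digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its current checksum."""
    return {
        p.relative_to(root).as_posix(): sha1_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare(baseline: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Report files that changed, disappeared, or appeared since the baseline."""
    return {
        "changed": sorted(k for k in baseline if k in current and baseline[k] != current[k]),
        "missing": sorted(k for k in baseline if k not in current),
        "new": sorted(k for k in current if k not in baseline),
    }
```

A baseline from `build_manifest` would be stored alongside the data (or, preferably, somewhere separate), and a scheduled job would rebuild the manifest and pass both to `compare`. Any entry under `changed` or `missing` that does not correspond to a deliberate new version is a candidate for restoring from a stored copy.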
Risk Assessment of administrative data management
• More risks identified than with research data
• Lack of business owner for data sets
• ISS provide storage & systems management but little data management expertise
• ISS Data Management role now in place
• Move to digital capture will address risks
• Risk mitigation as for research records
Actions taken by project
1. Institution-level Policies
2. Work with departments to address data management risks
3. Documentation
4. Implementation of KCL Digital Archive
1. Institution-level Policies
Update of existing policies:
• Acquisition policy: Refinements to existing acquisition policies
• Retention Policy: Appraisal criteria for records of value
• Information Management: Appraisal criteria and advisory material
Develop new policies:
• Preservation Policy: content preservation strategy for institutional data of short and long-term value
See http://www.kcl.ac.uk/iss/igc/tools/staff.html for guidance currently available
2. Liaise with data creators & managers
• Enable management to gain a better understanding of data assets within their department/group and the potential risk factors that may limit data usage.
• Work with data producers & systems managers to address data management issues that they identified as a concern, e.g. versioning
• Make data producers & management aware of risk factors that exist and make recommendations for actions that may help to avoid or mitigate issues.
• Make them aware of support available within College & other departments/groups/projects that are working to resolve common issues.
3. Documentation
Self-help documentation to help data creators & managers to:
• Understand data management issues & key concepts
• Take practical steps to diagnose and address DM issues, and identify people to contact
Data Management ‘workbook’:
• Creating your data: Issues to consider prior to and in the early stages of development to ensure data is fit for purpose & usable over time.
• Organising your data: Methods for structuring & documenting data to enable it to be used & understood
• Maintaining access and use of data: Approaches that may be adopted to ensure continued access & use of data.
• Appraising your data: Recommendations for applying archival principles
Content Type Reports:
• Short pragmatic reports tailored to specific content types (raster images, audio, e-mail, documents)
To be published on KCL web site in near future
4. KCL Digital Archive
Implemented Alfresco ECM (Community Edition) to manage college data of long-term archival value
• Standards compliance:
  • OAIS RM, U.S. Department of Defense 5015.2-STD, ISO 15489, TRAC when in full service
• Bitstream preservation:
  • Fixity creation/verification, online + offline storage
• Information Content Preservation:
  • Format conversion, event logging – audit trail
• Access:
  • Limited to archive reading room, catalogue descriptive MD to common standard
Rules-based approach to data management
jBPM synchronous or asynchronous workflows
• Content model compliance
  • Conforms to defined structure & object types
• Fixity generation
  • All: MD5, SHA-1, CRC
• Format identification
  • All: File(1), DROID
• Technical metadata extraction
  • Format specific: JHOVE, MP3Info, others
• Conversion to preservation & dissemination derivative
  • Parameters for each format & MD criteria (e.g. OpenOffice, ImageMagick)
  • Record action results as PREMIS Event
• Close collection to prevent further update
• Obsolescence monitoring?
  • Risk assessment based upon future development of PRONOM/UDFR
• Manual activity for future date?
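The fixity generation rule and the recording of results as events can be sketched outside any particular repository. The example below is an illustration only: it does not use Alfresco or jBPM APIs, it assumes "CRC" means CRC-32, and the event is a plain dictionary whose keys are loosely modelled on PREMIS Event semantic units rather than a validated PREMIS record.

```python
import hashlib
import zlib
from datetime import datetime, timezone
from pathlib import Path


def generate_fixity(path: Path, chunk_size: int = 65536) -> dict[str, str]:
    """Compute MD5, SHA-1 and CRC-32 values for a file in a single pass."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    crc = 0
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
            crc = zlib.crc32(chunk, crc)  # running CRC-32 across chunks
    return {"MD5": md5.hexdigest(), "SHA-1": sha1.hexdigest(), "CRC32": f"{crc:08x}"}


def fixity_event(path: Path, expected: dict[str, str]) -> dict[str, str]:
    """Re-compute fixity and describe the outcome as a PREMIS-style event.

    The returned dictionary is an assumed, simplified shape, not a
    schema-valid PREMIS serialisation.
    """
    outcome = "pass" if generate_fixity(path) == expected else "fail"
    return {
        "eventType": "fixity check",
        "eventDateTime": datetime.now(timezone.utc).isoformat(),
        "eventOutcome": outcome,
        "linkingObjectIdentifier": path.name,
    }
```

In this sketch `generate_fixity` would run once at ingest and its result would be stored with the object; `fixity_event` would run on a schedule and its output would be appended to the audit trail.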
Future Plans
• Embedding approach into archives & wider institution
• Identify research management needs at early stage (funding proposal, active/semi-active use) rather than end
• Skills audit & needs assessment
• Support & training for data management staff
• College Storage strategy
• Increased availability of College storage
Lessons Learnt
• Better understanding of data ‘ecosystem’ in college – data lifecycle, infrastructure
• Progress made with identifying & addressing data management support – need to ‘scale-up’ to college as whole.
• Need to manage semi-current records, in addition to active and archival records
• Requirements for storage
• Raised profile for Archives & CeRch
• Need for cross-disciplinary approach to managing data – combination of expertise & shared language
What may be used by other projects?
Output | Use
Project methodology | Anyone wishing to adopt a combined archival/curation approach for managing digital records
Audit methodology + templates | Anyone wishing to perform a similar assessment and evaluation of DM activities
Data Management workbook & Content Type reports | Anyone wishing to implement DM practices in their own institution or compare against others; staff wishing to improve DM practices
Data management system | Experience & documentation on use of Alfresco as preservation system
Thank You
Any questions?
Gareth Knight, Centre for e-Research (CeRch)
Lindsay Ould, Archives & Information Management (AIM)