ensuring data quality

Ensuring data quality: Mapping outcomes for quality assurance & control. Data Topics Workshop Series: Fall 2014

Upload: heather-coates

Post on 16-Jul-2015

TRANSCRIPT

Page 1: Ensuring data quality

Ensuring data quality
Mapping outcomes for quality assurance & control

Data Topics Workshop Series: Fall 2014

Page 2: Ensuring data quality

Meet & Greet

• First Name
• Program or Department
• Current role in a research project

Heather Coates
Digital Scholarship & Data Management Librarian
Liaison to the Fairbanks School of Public Health
[email protected]

Page 3: Ensuring data quality

Timeline

• The Big Picture

• Practical Strategies

• Activities

• Presentation: 10 minutes
• Discussion: Defining quality
• Discussion: Mapping outcomes
• Review | Q&A

Agenda

Page 4: Ensuring data quality

Scenario

Four years after your article is published, a researcher in your field contacts you with questions about the integrity of the data.

• Can you find the files supporting your published findings?

• Can you access and view the files?

• Can you justify your rationale for the procedures based on your documentation?

• Can someone pick up your research and build on it?

Page 5: Ensuring data quality

Goals

• Recognize the need for quality standards.
• Begin to define quality standards for your research.
• Identify quality assurance and quality control activities.

Page 6: Ensuring data quality

Data Integrity

• Data have integrity if they have been maintained without unauthorized alteration or destruction

• Data integrity is data that has a complete or whole structure. (http://www.princeton.edu/~achaney/tmve/wiki100k/docs/Data_integrity.html)
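One common way to detect whether a data file has been altered is to record a checksum when the file is finalized and compare it later. The following is a minimal Python sketch and is not part of the original slides; the function names `file_checksum` and `is_unchanged` are hypothetical, and SHA-256 is an assumed choice of hash.

```python
import hashlib

def file_checksum(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large data files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def is_unchanged(path, recorded_checksum):
    """True if the file still matches the checksum recorded earlier."""
    return file_checksum(path) == recorded_checksum
```

A matching checksum only shows that the bytes have not changed; whether a change was authorized still has to come from your documentation.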

Page 7: Ensuring data quality

Data Quality

• Fitness for use (depends on context of your questions)
• Data quality is the most important aspect of data management
• Ensured by
  • Sufficient resources and expertise
  • Paying close attention to the design of data collection instruments
  • Creating appropriate entry, validation, and reporting processes
  • Ongoing QC processes
  • Understanding the data collected

Chapman, 2005

Source: Dept of Biostatistics – Data Management, IUSM

Page 8: Ensuring data quality

Data Quality Standards

• Check data for its logical consistency.
• Check data for reasonableness.
• Ensure adherence to sound estimation methodologies.
• Ensure adherence to monetary submission standards for stolen and recovered property.
• Ensure that other statistical edit functions are processed within established parameters.

FBI: http://www.fbi.gov/about-us/cjis/ucr/data_quality_guidelines

Source: Dept of Biostatistics – Data Management, IUSM
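The first two standards above, logical consistency and reasonableness, can be sketched as automated checks. This is a minimal Python illustration under stated assumptions, not the FBI's or the presenter's method: the field names (`age_years`, `birth_date`, `enrollment_date`) and the 0 to 120 range are hypothetical.

```python
from datetime import date

def check_record(record):
    """Return a list of human-readable problems found in one record."""
    problems = []
    # Reasonableness: the value must fall in a plausible range (assumed here).
    if not 0 <= record["age_years"] <= 120:
        problems.append("age_years outside plausible range 0-120")
    # Logical consistency: related fields must agree with one another.
    if record["enrollment_date"] < record["birth_date"]:
        problems.append("enrollment_date precedes birth_date")
    return problems

good = {"age_years": 34,
        "birth_date": date(1980, 5, 1),
        "enrollment_date": date(2014, 9, 15)}
bad = {"age_years": 150,
       "birth_date": date(2015, 1, 1),
       "enrollment_date": date(2014, 9, 15)}

print(check_record(good))
print(check_record(bad))
```

Returning a list of problems, rather than stopping at the first one, lets a reviewer see every issue in a record at once.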

Page 9: Ensuring data quality

Discussion: Defining Data Quality

Define data quality standards for the following variables:
• Age, BMI
• Life satisfaction scale
• Number of close friends
• Blood draw, bone fossil, water sample
• Satellite image, photograph,
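As a starting point for this discussion, standards for the numeric variables could be written down as explicit rules. The sketch below is one possible Python form; every threshold, the 1 to 7 scale, and the names `STANDARDS` and `meets_standard` are assumptions for illustration, and your own study would define its own.

```python
# Hypothetical standards: each variable gets a type and either a range
# or a set of allowed values. All limits here are illustrative only.
STANDARDS = {
    "age": {"type": int, "min": 18, "max": 110},
    "bmi": {"type": float, "min": 10.0, "max": 80.0},
    "life_satisfaction": {"type": int, "allowed": {1, 2, 3, 4, 5, 6, 7}},
    "close_friends": {"type": int, "min": 0, "max": 100},
}

def meets_standard(variable, value):
    """Check one value against the standard defined for its variable."""
    rule = STANDARDS[variable]
    if not isinstance(value, rule["type"]):
        return False
    if "allowed" in rule:
        return value in rule["allowed"]
    return rule["min"] <= value <= rule["max"]
```

Physical samples and images need standards of a different kind (collection protocol, storage conditions, resolution), which do not reduce to a range check like this.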

Page 10: Ensuring data quality

Defining QA/QC

• Strategies for preventing errors from entering a dataset
• Activities to ensure quality of data before collection
• Activities that involve monitoring and maintaining the quality of data during the study

Page 11: Ensuring data quality

QA/QC Before Collection

• Define & enforce standards
  • Formats
  • Codes
  • Measurement units
  • Metadata
• Assign responsibility for data quality
  • Be sure assigned person is educated in QA/QC
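Standards for formats, codes, and measurement units are easier to enforce when they are applied at the point of data entry. A minimal Python sketch follows; the ISO 8601 date format, the code list, the unit table, and all function names are assumptions chosen for illustration.

```python
from datetime import datetime

# Hypothetical code list and unit conversion table for this example.
SEX_CODES = {"F", "M", "U"}
CM_PER_UNIT = {"cm": 1.0, "m": 100.0, "in": 2.54}

def parse_visit_date(text):
    """Format standard: accept only YYYY-MM-DD; raise ValueError otherwise."""
    return datetime.strptime(text, "%Y-%m-%d").date()

def is_valid_sex_code(code):
    """Code standard: accept only values from the agreed code list."""
    return code in SEX_CODES

def height_to_cm(value, unit):
    """Unit standard: store every height in centimetres."""
    if unit not in CM_PER_UNIT:
        raise ValueError(f"unknown height unit: {unit!r}")
    return value * CM_PER_UNIT[unit]
```

Rejecting a nonconforming value at entry, as `parse_visit_date` does, is usually cheaper than cleaning it after collection.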

Page 12: Ensuring data quality

Quality Assurance v. Control

• QA: set of processes, procedures, and activities that are initiated prior to data collection to ensure the expected level of quality will be reached and data integrity will be maintained.

• QC: a system for verifying and maintaining a desired level of quality in a product or service.

http://c2.com/cgi/wiki?QualityAssuranceIsNotQualityControl

Page 13: Ensuring data quality

Quality Assurance in Practice

• CRF (data collection instrument) review & validation
• System/process testing & validation
• Training, education, communication of a team
• Standard Operating Procedures, Standard Operating Guidelines
• Site audits

Source: Dept of Biostatistics – Data Management, IUSM

Page 14: Ensuring data quality

Quality Control in Practice

• Set of processes, procedures, and activities associated with monitoring, detection, and action during and after data collection.
• Examples:
  • Errors in individual data fields
  • Systematic errors
  • Violation of protocol
  • Staff performance issues
  • Fraud or scientific misconduct

Source: Dept of Biostatistics – Data Management, IUSM
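Some of this monitoring can be run routinely over the collected records. The Python sketch below covers only two simple checks, duplicate identifiers and missing required values; the record layout (a list of dictionaries with an `id` key) and the function name `qc_summary` are assumptions made for the example.

```python
from collections import Counter

def qc_summary(records, required_fields):
    """Summarize duplicate IDs and missing required values in `records`."""
    id_counts = Counter(r["id"] for r in records)
    duplicate_ids = sorted(i for i, n in id_counts.items() if n > 1)
    # Treat an absent key, None, or an empty string as missing.
    missing = {
        field: sum(1 for r in records if r.get(field) in (None, ""))
        for field in required_fields
    }
    return {"duplicate_ids": duplicate_ids, "missing": missing}
```

A field whose missing count rises sharply at one site or for one staff member may point to a systematic error rather than isolated slips, which is a prompt for follow-up rather than a conclusion.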

Page 15: Ensuring data quality

General themes: GCDMP

• Plan, test, revise, test, revise, test…implement
• All stakeholders should be involved in designing protocol, data collection tools, data management plan, etc.
• Document, document, document
• Rule: the bigger and more complex the study (sites, data, people), the more planning you need

Page 16: Ensuring data quality

Relevant practices from GCDMP

• Specify documents required for reproducible research at various levels
  • Institutional: SOP
  • Study: protocol, manual of procedures, data management plan, statistical analysis plan

• Documentation serves practical purposes, among them a shared understanding of the project, and benefits the team immediately

• Specify roles and responsibilities from the beginning

Page 17: Ensuring data quality

Begin with the end in mind

Produce report-ready outputs

Collect data in a way to enable efficient data entry, processing, validation, analysis, reporting

Enabled by standardized data collection tools

Page 18: Ensuring data quality

Mapping research & data outcomes

• Review the instructions
• Review the example (on screen)
• Discussion

Page 19: Ensuring data quality
Page 20: Ensuring data quality

Resources

1. Department of Biostatistics – Data Management Team, Indiana University School of Medicine (2013). Data Management including REDCap. (provided via email)

2. Chapman, A. D. (2005). Principles of Data Quality, version 1.0. Report for the Global Biodiversity Information Facility, Copenhagen. ISBN 87-92020-03-8. http://www.gbif.org/resources/2829

3. DataONE Education Module: Data Quality Control and Assurance. DataONE. From http://www.dataone.org/sites/all/documents/L05_DataQualityControlAssurance.pptx

4. Good Clinical Data Management Practices (2013). Available at http://www.scdm.org/sitecore/content/be-bruga/scdm/Publications/gcdmp.aspx