database management for clinical research tables, normalization, queries, and forms michael a. kohn,...

41
Database Management for Clinical Research Tables, Normalization, Queries, and Forms Michael A. Kohn, MD, MPP 3 September 2013

Upload: lenard-bradford

Post on 01-Jan-2016

218 views

Category:

Documents


2 download

TRANSCRIPT

Database Management for Clinical ResearchTables, Normalization, Queries, and Forms

Michael A. Kohn, MD, MPP3 September 2013

Clinical Research* Choose the study design, and define the study

population, predictor variables, and outcome variables;

measure these variables and anticipate problems with measurement;

analyze the results

Today, we discuss the “nitty gritty” of collecting, storing, updating, and monitoring the study measurements.

*Private companies that make data management systems for clinical research understand “clinical research” to include only trials preparatory to FDA drug or device approval, not observational studies.

Outline Online Surveys Data Tables

Rows = Records; Columns = Fields Data Dictionary (Metadata)

On-screen forms (“EDC”, “eCRFs”) Normalization of Data Tables Database Queries Data management programs (platforms)

“All-that-apply” Questions

“All-that-apply” Questions

Single Best Response

Skip Logic

DCR-4 Website

Online Survey Tools

REDCap Qualtrics SurveyMonkey Zoomerang, Google Forms, etc.

Components of Data Management

1) Collecting/entering2) Monitoring/checking/cleaning3) Reporting4) Statistical analysis5) Maintaining confidentiality6) Back-ups

Default Option

Collect data by hand-writing on paper forms

Transcribe into Excel tables Import into Stata for monitoring,

reporting and analysis

Rows = Records = Entities

Columns = Fields = Attributes

Data Tables

DCR Chapter 16 Exercise 2

The PHTSE (Pre-Hospital Treatment of Status Epilepticus) Study was a randomized blinded trial of lorazepam, diazepam, or placebo in the treatment of pre-hospital status epilepticus. The primary endpoint was termination of convulsions by hospital arrival. To enroll patients, paramedics contacted base hospital physicians by radio. The following are base-hospital physician data collection forms for 2 enrolled patients:Lowenstein DH, Alldredge BK, Allen F, Neuhaus J, Corry M, Gottwald M, et al. The prehospital treatment of status epilepticus (PHTSE) study: design and methodology. Control Clin Trials 2001;22(3):290-309.

Alldredge BK, Gelb AM, Isaacs SM, Corry MD, Allen F, Ulrich S, et al. A comparison of lorazepam, diazepam, and placebo for the treatment of out-of-hospital status epilepticus. N Engl J Med 2001;345(9):631-7.

Demonstration: Creating a Data Table

Enter a row of data in datasheet view (PHTSE table)

Where is predictor on data collection form?

Demonstration: Data Dictionary

Table design view:•field (=column) names, •data types, •definitions, •validation rules

Create a 9-field data dictionary for the data table

Demonstration: Same Table in Excel, Stata

Excel Stata Etc

Rows = Records = EntitiesColumns = Fields =

AttributesAccess and Stata have a special row at the top for column headings (=field names); Excel just uses the first row.

(Use EnollmentWithUnblindedGroupAssignments query)

On-screen Forms

EDC = Electronic Data Capture

eCRF = Electronic Case Report Form

Advantages of On-Screen Forms

Data keyed directly into the computer data tables without a transcription step

Include validation checks and provide immediate feedback when a response is out of range

Incorporate skip logic

Demonstrate PHTSE Enrollment Form

Normalization

Table of Study Subjects

Row = Enrollment

Columns = ID#, KitNumber, Date/Time, HospArrSzAct

Research nurse collects several variables for each of the patient’s hospital days: Intubation, Lowest O2 Saturation, Tmax, anti-seizure medications and doses, …

Adverse event tracking

Call logs

Table of PHTSE Enrollments

If study subjects have an indeterminate number of hospital days, and each hospital day is associated with an indeterminate number of medications, “normalize” the records into 3 tables, one for subjects, one for hospital days, and one for medications.

Keep adverse events in a table with one row per event and calls in another table with row per call.

Normalization

For more on normalization, take Epi 218.

Database Queries*

SQL = Structured Query Language

* Not to be confused with queries to study sites, author queries, etc.

Select Queries

Select queries (aka “Views”) organize, sort, filter, and display data.

Queries use Standard Query Language (SQL), but you don’t have to learn it, because of graphical query design tools.

A query can join data from two or more tables, display only selected fields, and filter for records that meet certain criteria.

Platforms for Epi 218 Access (Labs 1, 2 and 3) REDCap (Lab 4, 5) QuesGen (Lab 6)

May use other data platforms for final project:-- Oncore-- Filemaker Pro-- OpenClinica-- Other

Note that the list does NOT include Excel.

Microsoft Access Integrated desktop database management

platform Uses SQL (Structured Query Language) Has an outstanding graphical query design tool Incorporates an excellent report writer Based on the principles of the Relational Model Relationships diagram has integrated referential

integrity Very flexible, infinitely customizable NOT browser based (desktop application) Using advanced features usually requires hiring

a developer

Microsoft Access

UCSF MyResearch DEB Terminal Server

DEB Terminal Server Provides a remote Windows desktop

with Microsoft Office Professional Remote Desktop client software freely

available for the Mac and already part of Windows

Obtain DEB Terminal Server username and password from [email protected]

Instructions available on Epi 218 course syllabus page

REDCap Web-based research data collection

system developed at Vanderbilt Available free through UCSF

Academic Research Systems http://tinyurl.com/yh5m6ka You are both the Principal

Investigator and User 1. Model= “Do-it-yourself”

QuesGen

Web-enabled research data collection and management platform developed (with UCSF input) by a private company based in Burlingame

More full-featured and customizable than REDCap, but primarily “pay-us-to-do-it” rather than “do-it-yourself”

Transcribing into Excel tables

No data dictionary (meta-data) No automatic data validation No on-screen forms No skip logic Significant security and file corruption

issues Not acceptable for Epi 218 Final

Project.

General description of database Data collection and entry Error checking and data validation Analysis (e.g., export to Stata) Security/confidentiality Back up

Elements of a Data Management Plan

Final Project

1) What is your study?  ("The [CUTE ACRONYM] study is a [DESIGN] study of the association between [PREDICTOR] and [OUTCOME] in [STUDY POPULATION]").

2) What data points are you collecting?  (Your data collection forms.)

3) Who collects the data? You?  RAs?  MDs?  Chart abstractors? Maybe the study subjects enter the data themselves?

Final Project (cont’d)4) How are the data collected? Written onto a paper

form and then transcribed into a computer database?  Entered directly into the computer?  (If data are transcribed, who does it? Have you hired/will you hire somebody? Or have you enlisted/will you enlist some med students?)

5) Will the above-mentioned computer database be in Access, REDCap, QuesGen, Oncore, OpenClinica, SurveyMonkey, Medidate RAVE, or something else?

6) Try to provide a detailed data dictionary with the name, data type, description, and validation rules for each field (column) in the data table(s).

Final Project7) If it's a multi-table database, even a hand-drawn

relationships diagram would help but is not required.8) How do you validate the data for correctness and

monitor the data collection effort?  (Usually you have some range checks on individual variables and you periodically query for outliers that are nonetheless within the allowed range.)

9) You should periodically analyze the data, not only to look for problems, but also to see where the study is headed.  How do/will you do this?  Query in Access and export to Stata?

10) How will you protect your subjects' identifying data?11) How will you ensure that you don't lose your data

file in a computer crash or if a water pipe leaks?

Answering these questions is an essential part of doing a clinical research study.