unc chapel hill 2014 ctc retreat - sas proc codebook sheps

PROC_CODEBOOK: An Automated, General Purpose Codebook Generator Kim Chantala, email: [email protected] Sheps Center for Health Services Research Jim Terry email: [email protected] Carolina Population Center University of North Carolina at Chapel Hill

Upload: jonathan-pletzke

Post on 26-Jun-2015


DESCRIPTION

SAS programming: Creating codebooks with SAS - Kim Chantala

TRANSCRIPT

Page 1: UNC Chapel Hill 2014 CTC Retreat - SAS Proc codebook sheps

PROC_CODEBOOK: An Automated, General Purpose Codebook Generator

Kim Chantala, email: [email protected]
Sheps Center for Health Services Research

Jim Terry, email: [email protected]
Carolina Population Center

University of North Carolina at Chapel Hill

Page 2: UNC Chapel Hill 2014 CTC Retreat - SAS Proc codebook sheps

PROC_CODEBOOK.SAS

A SAS macro program that is simple to use.

The user provides:

– titles for the codebook
– the file organization
– a SAS data set with labels and formats

Output is a comprehensive, well formatted, easy to read codebook.

Page 3: UNC Chapel Hill 2014 CTC Retreat - SAS Proc codebook sheps

Sample Code

%include 'C:\My_project\HWT_short_formats.sas';

%include 'C:\My_project\proc_codebook.sas';

libname here 'C:\My_project';

title1 'CODEBOOK FOR WAY TO HEALTH BASELINE HEIGHT/WEIGHT DATA';

footnote 'Created by: hwt_base_codebook.sas';

%let organization=One Record per Participant (ID);

%proc_codebook(lib=here,

file1=hwt_base,

fmtlib=work.formats,

pdffile=hwt_base_codebook.pdf);

run;

Page 4: UNC Chapel Hill 2014 CTC Retreat - SAS Proc codebook sheps

PDF file created by PROC_CODEBOOK

Page 5: UNC Chapel Hill 2014 CTC Retreat - SAS Proc codebook sheps

USING PROC_CODEBOOK.SAS

Create labels for all variables.

Data set must contain at least one formatted categorical variable and two numeric variables.

Assign FORMATs to all categorical variables.

– Standard formats should be used that assign only one value or a range of values to a unique value label.
– No testing has been done using hybrid formats or formats with multi-value labels.

Include a data set label on the SAS data file.

By default, the codebook is ordered by Variable Name.
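The preparation steps above can be sketched as follows. This is a minimal illustration, not taken from the macro's documentation: the data set work.demo, its variables, and the formats sexf and agegrp are all hypothetical names chosen for the example.

```sas
/* Hypothetical formats: each value or range maps to exactly one label */
proc format;
   value sexf   1 = 'Male'
                2 = 'Female';
   value agegrp low -< 18 = 'Under 18'
                18 -< 65  = '18 to 64'
                65 - high = '65 or older';
run;

/* Hypothetical data set carrying a data set label, variable labels,
   two formatted categorical variables (sex, age), and two unformatted
   numeric variables (id, weight_kg) */
data work.demo (label = 'Demo participant data');
   input id sex age weight_kg;
   label id        = 'Participant ID'
         sex       = 'Sex of participant'
         age       = 'Age in years'
         weight_kg = 'Weight in kilograms';
   format sex sexf. age agegrp.;
   datalines;
1 1 34 70.5
2 2 17 55.0
3 2 68 62.3
;
run;
```

A data set built this way meets the stated minimum of one formatted categorical variable and two numeric variables.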

Page 6: UNC Chapel Hill 2014 CTC Retreat - SAS Proc codebook sheps

ORDERING VARIABLES IN THE CODEBOOK

Create a simple two-variable data set called work.order before you call the macro.

– NAME: a 32-character field with your variable name in UPPER CASE.
– ORDER: a numeric field with the order you want the variables to print.

Example data step creating a work.order data set:

data order;
   length name $ 32;
   name = "T1 ";    ORDER = 1; OUTPUT;
   name = "HHID09"; ORDER = 2; OUTPUT;
   name = "LINE09"; ORDER = 3; OUTPUT;
   name = "H1D ";   ORDER = 4; OUTPUT;
run;
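For wide data sets, typing every name by hand becomes tedious. As a hedged sketch, assuming you want the codebook to follow each variable's position in the data set, work.order could instead be derived from PROC CONTENTS. The data set here.hwt_base is borrowed from the sample code, and work._vars is a hypothetical scratch name.

```sas
/* Write one row per variable, with its name and position (VARNUM) */
proc contents data=here.hwt_base
              out=work._vars (keep=name varnum)
              noprint;
run;

/* Reshape into the NAME / ORDER layout described above */
data work.order;
   length name $ 32;
   set work._vars;
   name  = upcase(name);   /* names must be in UPPER CASE */
   order = varnum;         /* print in data set position order */
   keep name order;
run;
```

Any other numeric ranking can be assigned to ORDER in the same data step if a different sequence is wanted.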

Page 7: UNC Chapel Hill 2014 CTC Retreat - SAS Proc codebook sheps

TITLES AND FOOTNOTES

TITLE1, TITLE2 and all FOOTNOTES are specified by user.

PROC_CODEBOOK creates the following titles:

– TITLE4 lists the number of observations in data set.

– TITLE5 lists the number of variables in the data set.

– TITLE6 lists the organization of the data set and is specified in a global macro variable by the user:

%let organization=One Record per Participant (ID);

Page 8: UNC Chapel Hill 2014 CTC Retreat - SAS Proc codebook sheps

PROC_CODEBOOK syntax

%proc_codebook(lib=libname,

file1=SAS_dataset,

fmtlib=work.formats,

pdffile=codebook_file.pdf,

include_warn=NO);

Page 9: UNC Chapel Hill 2014 CTC Retreat - SAS Proc codebook sheps

The macro variables

Required Variables:

– LIB = library for SAS data set (see FILE1 variable)
– FILE1 = SAS data set used to create the codebook
– FMTLIB = 2-level name of format library
– PDFFILE = name of PDF file for the codebook

Optional Variables:

– INCLUDE_WARN = flag to control printing of WARNING messages:

* YES = prints warnings in codebook (default)
* NO (or any other text) = warnings printed only in LOG file
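Because FMTLIB takes a 2-level name, the formats need not live in work.formats. The sketch below assumes the macro accepts any libref.catalog pair, which the slides imply but do not demonstrate; the format yesno is hypothetical, while the libref here, the data set hwt_base, and the PDF name come from the sample code.

```sas
libname here 'C:\My_project';

/* Store a hypothetical format in a permanent catalog, here.formats */
proc format library=here.formats;
   value yesno 0 = 'No'
               1 = 'Yes';
run;

/* Let SAS find formats in that catalog when the data set is read */
options fmtsearch=(here.formats);

/* TITLE1 and the ORGANIZATION macro variable are set as in the sample code */
title1 'CODEBOOK FOR WAY TO HEALTH BASELINE HEIGHT/WEIGHT DATA';
%let organization=One Record per Participant (ID);

%proc_codebook(lib=here,
               file1=hwt_base,
               fmtlib=here.formats,
               pdffile=hwt_base_codebook.pdf,
               include_warn=YES);
```

Setting include_warn=YES explicitly matches the default and prints the warnings in the codebook itself.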

Page 10: UNC Chapel Hill 2014 CTC Retreat - SAS Proc codebook sheps

Warning Messages

Categories of formats not used by a variable

Variables that have missing data for all observations

Variables that are constant
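Two of these conditions can be screened for before the codebook is built. The following is a rough sketch using PROC FREQ's NLEVELS option, not the macro's own logic; here.hwt_base is taken from the sample code and work.nlevels and work.suspect are hypothetical names.

```sas
/* Capture the number of distinct levels per variable */
ods output nlevels=work.nlevels;

proc freq data=here.hwt_base nlevels;
   tables _all_ / noprint;   /* suppress the frequency tables themselves */
run;

/* NLEVELS counts missing as a level, so a value of 1 means the variable
   is either entirely missing or constant with no missing values.
   A constant variable with some missing values has NLEVELS = 2 and is
   not caught by this simple filter. */
data work.suspect;
   set work.nlevels;
   where nlevels <= 1;
run;

proc print data=work.suspect;
   var tablevar nlevels;
run;
```

Variables listed in work.suspect are likely to trigger the corresponding warnings in the codebook or the LOG.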

Page 11: UNC Chapel Hill 2014 CTC Retreat - SAS Proc codebook sheps

Tips on embellishing your codebook: Add a LOGO to the codebook

%let organization=One Record per ORGID*INJURY_DATE*INJURY_TYPE;

ods escapechar='~';

title1 j=c '~S={preimage="H:\datalys\Logo\datalys_color_logo_final.JPG"}';

title2 j=l "CODEBOOK: Women's Volleyball Injury Data Set 2004-05 to 2008-09";

footnote1 j=l 'SAS data set: injwvb0409.sas7bdat';

footnote2 j=l 'Created by H:\datalys\Chantala\Data Dec2009\injwvb0409.sas';

footnote3 'Listed Format assignment not always stored with permanent SAS data set';

%proc_codebook(lib=work,

file1=injwvb0409,

fmtlib=work.formats,

pdffile=C:\My_project\injwvb0409_codebook.pdf,

include_warn=NO);

Page 12: UNC Chapel Hill 2014 CTC Retreat - SAS Proc codebook sheps

PDF file created by PROC_CODEBOOK:

Page 13: UNC Chapel Hill 2014 CTC Retreat - SAS Proc codebook sheps

SAS CODE

The SAS codebook macro & documentation can be downloaded from the following location:

http://www.cpc.unc.edu/research/tools/data_analysis/proc_codebook