easy come, easy go — interactions between the data step and external files andrew t. kuligowski,...

100
Easy Come, Easy Go — Interactions between the DATA Step and External Files Andrew T. Kuligowski, The Nielsen Company

Upload: leon-copeland

Post on 30-Dec-2015

222 views

Category:

Documents


1 download

TRANSCRIPT

Easy Come, Easy Go — Interactions between the

DATA Step and External Files

Andrew T. Kuligowski,

The Nielsen Company

Interactions Between the DATA Step and External Files 2

Interactions … External FilesIntroduction

• Introduction• Defining the external data source• Describing the appearance of the

data• Practical Application

• Conclusion

Interactions Between the DATA Step and External Files 3

Interactions … External FilesIntroduction

This presentation will provide basic answers to two questions:

• How do I get my data into SAS for analysis?

• How do I get my data out of SAS?

Interactions Between the DATA Step and External Files 4

Interactions … External FilesIntroduction

Process of moving data from an external file to SAS, and from SAS to an external source, are mirror images of each other.

(Almost)

Interactions Between the DATA Step and External Files 5

Interactions … External FilesIntroduction

You need to know the answer to two questions in order to code properly.

• Where is this external source / destination? (and, is there anything significant about this file?)

• What does the data look like coming in / going out?

Interactions Between the DATA Step and External Files 6

Interactions … External FilesIntroduction

"The time has come," the Walrus said,"To talk of many things:Of shoes--and ships--and sealing-wax--Of cabbages--and kings--And why the sea is boiling hot--And whether pigs have wings."

Lewis Carroll "I am the Walrus."

John Lennon

Interactions Between the DATA Step and External Files 7

Interactions … External FilesDefining the External Data Source

Two statements that will identify an external file to DATA step (and possibly provide a little information about its form).

• INFILE statement

• FILE statement

Interactions Between the DATA Step and External Files 8

Interactions … External FilesDefining the External Data Source

Most straightforward usage – hardcode the file name right in the statement.

DATA SAMPLE ; /* full file name - under Windows */ INFILE 'c:\sasconf\sasconf.dat';

Self-documenting code!Requires coding change to point

to a different dataset.

PLUS

MINUS

Interactions Between the DATA Step and External Files 9

Interactions … External FilesDefining the External Data Source

Variation – use a macro variable.

%LET FILENM = c:\sasconf\sasconf.dat;DATA SAMPLE ; /* macro file name - under Windows */ INFILE “&FILENM”;

Of course, this isn’t much more flexible. However, with a little creative coding, this approach can be made very flexible.

Interactions Between the DATA Step and External Files 10

Interactions … External FilesDefining the External Data Source

More flexibility with FILEVAR statement.

%LET FileNm = %STR(SASCONF);DATA SAMPLE ; RETAIN fn “&FileNm" ; File_in = TRIM( fn ) || ".dat" ; File_out = TRIM( fn ) || ".out" ; INFILE dummy FILEVAR=File_in ; FILE dummy2 FILEVAR=File_out ;

Can be hardcoded, or it can change from obs to obs.

Interactions Between the DATA Step and External Files 11

Interactions … External FilesDefining the External Data Source

More flexibility with FILEVAR statement.

%LET FileNm = %STR(SASCONF);DATA SAMPLE ; RETAIN fn “&FileNm" ; File_in = TRIM( fn ) || ".dat" ; File_out = TRIM( fn ) || ".out" ; INFILE dummy FILEVAR=File_in ; FILE dummy2 FILEVAR=File_out ;

Can be hardcoded, or it can change from obs to obs.

File Reference

Set to “dummy” to show these are not actually used by the

routine when FILEVAR used, but different from each other to show they

must be unique.

Interactions Between the DATA Step and External Files 12

Interactions … External FilesDefining the External Data Source

Defining a File Reference(1) In the Operating System:

//SAMPDATA DD DSN=SAS.GLOBAL.FORUM.SAMPDAT,// DISP=SHR

Example uses MVS – shows why File References are also called “DDNAMES”.

Interactions Between the DATA Step and External Files 13

Interactions … External FilesDefining the External Data Source

Defining a File Reference(2) Using the FILENAME statement:

FILENAME confdata 'c:\sasconf\sasconf.dat';

Example uses Windows.

Interactions Between the DATA Step and External Files 14

Interactions … External FilesDefining the External Data Source

Defining a File Reference(2) Using the FILENAME statement(2a) Special options for FILENAME:

FILENAME <fileref> CLEAR;

Remove / release specified File Ref.FILENAME <fileref> LIST;

Print a list of active File Refs to SASLOG.

Interactions Between the DATA Step and External Files 15

Interactions … External FilesDefining the External Data Source

Defining a File Reference(2) Using the FILENAME statement(2a) Special options for FILENAME:(2ai) Special pseudo File Reference:

FILENAME _ALL_ LIST;FILENAME _ALL_ CLEAR;

Perform action on all active File Refs

Interactions Between the DATA Step and External Files 16

Interactions … External FilesDefining the External Data Source

Defining a File Reference(3) Special “Aggregate File

Reference”MVS: Partition Dataset

FILENAME confdata 'userid.sasconf.data'; DATA SAS_Conf07;

INFILE confdata(sascon07);

Interactions Between the DATA Step and External Files 17

Interactions … External FilesDefining the External Data Source

Defining a File Reference(3) Special “Aggregate File

Reference”MVS: Partition DatasetUnix: DirectoryWindows: Directory FILENAME confdata 'c:\sasconf';

DATA SAS_Conf07;

INFILE confdata(sascon07);

Interactions Between the DATA Step and External Files 18

Interactions … External FilesDefining the External Data Source

Other related functions …FILEREF RetCd2 = FILEREF( "confdat2" );Return Code Meaning 0 File Reference is assigned>0 (+) File Reference not currently assigned<0 (-) File Reference assigned, but file it

refers to does not (currently) exist.

Interactions Between the DATA Step and External Files 19

Interactions … External FilesDefining the External Data Source

Other related functions …FILEEXIST IF FILEEXIST("c:\sasconf\sasconf2.dat")Return Code Meaning 0 External File does not exist 1 External File does exist

Interactions Between the DATA Step and External Files 20

Interactions … External FilesDefining the External Data Source

Other related functions …FEXIST IF FEXIST("confdat2" ) THEN DO;Return Code Meaning 0 External File does not exist 1 External File does exist

Interactions Between the DATA Step and External Files 21

Interactions … External FilesDefining the External Data Source

Specialized File Ref. – Input Data Only

DATALINES• Data can be found instream at end of

DATA step, denoted by DATALINES statement.

• Instream data concluded by semicolon.

Interactions Between the DATA Step and External Files 22

Interactions … External FilesDefining the External Data Source

Specialized File Ref. – Input Data Only

DATALINESDATA SAS_Conf07; INFILE DATALINES; /* optional for DATALINES */ INPUT @ 1 Start_Dt MMDDYY8. @ 10 End_Dt MMDDYY8. @ 19 ConfName $CHAR16. @ 36 ConfLoc $CHAR16. ;DATALINES;04/16/07 04/19/07 SAS Global Forum Orlando FL09/16/07 09/18/07 PNWSUG Seattle WA09/30/07 10/02/07 SCSUG Austin TX11/04/07 11/06/07 SESUG Hilton Head SC

;

Interactions Between the DATA Step and External Files 23

Interactions … External FilesDefining the External Data Source

Specialized File Ref. – Input Data OnlyDATALINES• Assumed to be

card-image (80 bytes)• Can override with system

option NOCARDIMAGE.

Interactions Between the DATA Step and External Files 24

Interactions … External FilesDefining the External Data Source

Specialized File Ref. – Input Data OnlyDATALINES• Alias = CARDS.• Can also use DATALINES4

(or CARDS4) if data mightcontain semicolon.

Interactions Between the DATA Step and External Files 25

Interactions … External FilesDefining the External Data Source

Specialized File Ref. – Input Data OnlyDATALINES vs. CARDS18 DATA _NULL_;19 SET SAS_Conf07;20 FILE CARDS;NOTE: The file CARDS cannot be opened for UPDATE processing.21 PUT _N_= ConfName ;22 RUN;NOTE: The SAS System stopped processing this step because of errors.

Interactions Between the DATA Step and External Files 26

Interactions … External FilesDefining the External Data Source

Specialized File Ref. – Input Data OnlyDATALINES vs. CARDS62 DATA _NULL_;63 SET SAS_Conf07;64 FILE DATALINES;65 PUT _N_= ConfName ;66 RUN;_N_=1 SAS Global Forum… … … … _N_=8 NESUGNOTE: There were 8 observations read from the data set WORK.SAS_CONF07.

Where’s the error message?

Interactions Between the DATA Step and External Files 27

Interactions … External FilesDefining the Appearance of the Data

Two questions (from early in this talk):Where is the data / where is it going?

What does the data look like?

5 types of INPUT and PUT, although SAS only counts 4 of them as separate types.• LIST input

• FORMATTED input

• COLUMN input

• NAMED input

• NULL input

Interactions Between the DATA Step and External Files 28

Interactions … External FilesDefining the Appearance of the Data

•Two questions (from early in this talk): Where is the data / where is it going?

• What does the data look like?

•5 types of INPUT and PUT, although SAS only counts 4 of them as separate types.• LIST input

• FORMATTED input

• COLUMN input

• NAMED input

• NULL input

Most references in

presentation will be to INPUT

for brevity’s sake, although

most are also valid for PUT.

Interactions Between the DATA Step and External Files 29

Interactions … External Files Defining the Appearance of the Data

LIST Input (and Output)

• Simply insert variables instatement.

• No descriptive informationnecessary (except “$”denoting character vars.)

• Blank space separates values.

Interactions Between the DATA Step and External Files 30

Interactions … External Files Defining the Appearance of the Data

LIST Input (and Output)Example of non-working codeDATA SAS_Conf07; INFILE DATALINES ; INPUT Start_Dt End_Dt ConfName $ ConfLoc $ ;DATALINES;04/16/07 04/19/07 SAS Global Forum Orlando FL06/03/07 06/06/07 PharmaSUG Denver CO09/16/07 09/18/07 PNWSUG Seattle WA09/30/07 10/02/07 SCSUG Austin TX10/17/07 10/19/07 WUSS San Francisco CA10/28/07 10/30/07 MWSUG Des Moines IA11/04/07 11/06/07 SESUG Hilton Head SC11/11/07 11/14/07 NESUG Baltimore MD;

PROBLEM: Dates have specialized formats.

Interactions Between the DATA Step and External Files 31

Interactions … External Files Defining the Appearance of the Data

LIST Input (and Output)Example of non-working codeDATA SAS_Conf07; INFORMAT Start_Dt End_Dt mmddyy8. ; INFILE DATALINES ; INPUT Start_Dt End_Dt ConfName $ ConfLoc $ ;DATALINES;04/16/07 04/19/07 SAS Global Forum Orlando FL06/03/07 06/06/07 PharmaSUG Denver CO09/16/07 09/18/07 PNWSUG Seattle WA09/30/07 10/02/07 SCSUG Austin TX10/17/07 10/19/07 WUSS San Francisco CA10/28/07 10/30/07 MWSUG Des Moines IA11/04/07 11/06/07 SESUG Hilton Head SC11/11/07 11/14/07 NESUG Baltimore MD;

SOLUTION: Use INFORMAT statement.

Interactions Between the DATA Step and External Files 32

Interactions … External Files Defining the Appearance of the Data

LIST Input (and Output)Example of non-working codeDATA SAS_Conf07; INFORMAT Start_Dt End_Dt mmddyy8. ; INFILE DATALINES ; INPUT Start_Dt End_Dt ConfName $ ConfLoc $ ;DATALINES;04/16/07 04/19/07 SAS Global Forum Orlando FL06/03/07 06/06/07 PharmaSUG Denver CO09/16/07 09/18/07 PNWSUG Seattle WA09/30/07 10/02/07 SCSUG Austin TX10/17/07 10/19/07 WUSS San Francisco CA10/28/07 10/30/07 MWSUG Des Moines IA11/04/07 11/06/07 SESUG Hilton Head SC11/11/07 11/14/07 NESUG Baltimore MD;

PROBLEM: Values longer than 8 characters.

Interactions Between the DATA Step and External Files 33

Interactions … External Files Defining the Appearance of the Data

LIST Input (and Output)Example of non-working codeDATA SAS_Conf07; INFORMAT Start_Dt End_Dt mmddyy8. ; LENGTH ConfName $ 16 ConfLoc $ 16. ; INFILE DATALINES ; INPUT Start_Dt End_Dt ConfName $ ConfLoc $ ;DATALINES;04/16/07 04/19/07 SAS Global Forum Orlando FL06/03/07 06/06/07 PharmaSUG Denver CO09/16/07 09/18/07 PNWSUG Seattle WA09/30/07 10/02/07 SCSUG Austin TX10/17/07 10/19/07 WUSS San Francisco CA10/28/07 10/30/07 MWSUG Des Moines IA11/04/07 11/06/07 SESUG Hilton Head SC11/11/07 11/14/07 NESUG Baltimore MD;

SOLUTION: UseLENGTH statement.

Interactions Between the DATA Step and External Files 34

Interactions … External Files Defining the Appearance of the Data

LIST Input (and Output)Example of non-working codeDATA SAS_Conf07; INFORMAT Start_Dt End_Dt mmddyy8. ; LENGTH ConfName $ 16 ConfLoc $ 16. ; INFILE DATALINES ; INPUT Start_Dt End_Dt ConfName $ ConfLoc $ ;DATALINES;04/16/07 04/19/07 SAS Global Forum Orlando FL06/03/07 06/06/07 PharmaSUG Denver CO09/16/07 09/18/07 PNWSUG Seattle WA09/30/07 10/02/07 SCSUG Austin TX10/17/07 10/19/07 WUSS San Francisco CA10/28/07 10/30/07 MWSUG Des Moines IA11/04/07 11/06/07 SESUG Hilton Head SC11/11/07 11/14/07 NESUG Baltimore MD;

PROBLEM: Embedded Blanks in character strings.

Interactions Between the DATA Step and External Files 35

Interactions … External Files Defining the Appearance of the Data

LIST Input (and Output)Example of non-working codeDATA SAS_Conf07; INFORMAT Start_Dt End_Dt mmddyy8. ; LENGTH ConfName $ 16 ConfLoc $ 16. ; INFILE DATALINES ; INPUT Start_Dt End_Dt ConfName $ & ConfLoc $ & ;DATALINES;04/16/07 04/19/07 SAS Global Forum Orlando FL06/03/07 06/06/07 PharmaSUG Denver CO09/16/07 09/18/07 PNWSUG Seattle WA09/30/07 10/02/07 SCSUG Austin TX10/17/07 10/19/07 WUSS San Francisco CA10/28/07 10/30/07 MWSUG Des Moines IA11/04/07 11/06/07 SESUG Hilton Head SC11/11/07 11/14/07 NESUG Baltimore MD;

SOLUTION: Use “&” format modifiers.

Interactions Between the DATA Step and External Files 36

LIST Input (and Output)QUICK ASIDE – Format Modifiers

& Ampersand Permit single embedded blanks in character variables.

: Colon Ignore its default 8-character maximum on character variables.

? Question Mark

Bypass error processing should invalid data be expected and acceptable to the application - for example, attempting to read a character field into a numeric variable. Suppress the “Invalid Data” message, allowing the other indications of an error to be dealt with normally.

?? Double Question Mark

Bypass error processing AND conceal every indication of an error. The “Invalid Data” message and the echoing of the input line containing the bad data are omitted from the SASLOG, and the _ERROR_ variable is not set to “true” (numeric 1).

~ Tilde Read delimiters within quoted character values as characters, and retains the quotation marks.

Interactions … External Files Defining the Appearance of the Data

Interactions Between the DATA Step and External Files 37

Interactions … External Files Defining the Appearance of the Data

LIST Input (and Output)Example of non-working codeDATA SAS_Conf07; INFORMAT Start_Dt End_Dt mmddyy8. ; LENGTH ConfName $ 16 ConfLoc $ 16. ; INFILE DATALINES ; INPUT Start_Dt End_Dt ConfName $ & ConfLoc $ & ;DATALINES;04/16/07 04/19/07 SAS Global Forum Orlando FL06/03/07 06/06/07 PharmaSUG Denver CO09/16/07 09/18/07 PNWSUG Seattle WA09/30/07 10/02/07 SCSUG Austin TX10/17/07 10/19/07 WUSS San Francisco CA10/28/07 10/30/07 MWSUG Des Moines IA11/04/07 11/06/07 SESUG Hilton Head SC11/11/07 11/14/07 NESUG Baltimore MD;

SOLUTION: Use “&” format modifiers.SOLUTION: Move CONFLOC variable 1 position to the right.

Interactions Between the DATA Step and External Files 38

Interactions … External Files Defining the Appearance of the Data

LIST Input (and Output)Example of non-working codeDATA SAS_Conf07; INFORMAT Start_Dt End_Dt mmddyy8. ; LENGTH ConfName $ 16 ConfLoc $ 16. ; INFILE DATALINES ; INPUT Start_Dt End_Dt ConfName $ & ConfLoc $ & ; ConfName = TRANSLATE( ConfName, ' ', '-' ); ConfLoc = TRANSLATE( ConfLoc, ' ', '-' );DATALINES;04/16/07 04/19/07 SAS Global Forum Orlando-------FL06/03/07 06/06/07 PharmaSUG Denver--------CO09/16/07 09/18/07 PNWSUG Seattle-------WA09/30/07 10/02/07 SCSUG Austin--------TX10/17/07 10/19/07 WUSS San Francisco-CA10/28/07 10/30/07 MWSUG Des Moines----IA11/04/07 11/06/07 SESUG Hilton Head---SC

SOLUTION: Use “&” format modifiers.SOLUTION: Move CONFLOC variable 1 position to the right.SOLUTION: Insert dashes to separate city from state, change them back to blanks.

Interactions Between the DATA Step and External Files 39

Interactions … External Files Defining the Appearance of the Data

LIST Input (and Output)Example of non-working codeDATA SAS_Conf07; INFORMAT Start_Dt End_Dt mmddyy8. ; LENGTH ConfName $ 16 ConfLoc $ 16. ; INFILE DATALINES ; INPUT Start_Dt End_Dt ConfName $ & ConfLoc $ & ; ConfName = TRANSLATE( ConfName, ' ', '-' ); ConfLoc = TRANSLATE( ConfLoc, ' ', '-' );DATALINES;04/16/07 04/19/07 SAS Global Forum Orlando-------FL06/03/07 06/06/07 PharmaSUG Denver--------CO09/16/07 09/18/07 PNWSUG Seattle-------WA09/30/07 10/02/07 SCSUG Austin--------TX10/17/07 10/19/07 WUSS San Francisco-CA10/28/07 10/30/07 MWSUG Des Moines----IA11/04/07 11/06/07 SESUG Hilton Head---SC

SOLUTION: Use “&” format modifiers.SOLUTION: Move CONFLOC variable 1 position to the right.SOLUTION: Insert dashes to separate city from state, change them back to blanks.

WAIT A MINUTE!!!

Am I really suggesting that

we change the data to

accommodate the

weaknesses of the routine?

Interactions Between the DATA Step and External Files 40

Interactions … External Files Defining the Appearance of the Data

Formatted Input (and Output)

• Can specify informat / format for variables.(What does each variable look like?)

• Can specify starting position in record.(Where is each variable on each record?)

Interactions Between the DATA Step and External Files 41

Interactions … External Files Defining the Appearance of the Data

Formatted Input (and Output)Informats and Formats

Handles length andappearance of data

Interactions Between the DATA Step and External Files 42

Interactions … External Files Defining the Appearance of the Data

Formatted Input (and Output)Informats and Formats

5 types of Informat / Format:

• Character

• Numeric

• Date / Time

• Column Binary

• User-Defined

Not covered inthis presentation.Not covered inthis presentation.PROC FORMAT

Interactions Between the DATA Step and External Files 43

Interactions … External Files Defining the Appearance of the Data

Formatted Input (and Output)Informats and Formats

Formats can be assigned with:

• INFORMAT statement(or FORMAT statement)

• ATTRIB statement, using INFORMAT=FORMAT=

• Specifying on INPUT / PUT

Interactions Between the DATA Step and External Files 44

Interactions … External Files Defining the Appearance of the Data

Formatted Input (and Output)Column Pointers

Two types of Column Pointers

• Relative Column Pointers +

• Absolute Column Pointers @

Interactions Between the DATA Step and External Files 45

Interactions … External Files Defining the Appearance of the Data

Formatted Input (and Output)Column Pointers

File: 'c:\sasconf\sascon07.dat'

04/16/07 04/19/07 SAS Global Forum Orlando FL06/03/07 06/06/07 PharmaSUG Denver CO09/16/07 09/18/07 PNWSUG Seattle WA09/30/07 10/02/07 SCSUG Austin TX10/17/07 10/19/07 WUSS San Francisco CA10/28/07 10/30/07 MWSUG Des Moines IA11/04/07 11/06/07 SESUG Hilton Head SC11/11/07 11/14/07 NESUG Baltimore MD

Interactions Between the DATA Step and External Files 46

Interactions … External Files Defining the Appearance of the Data

Formatted Input (and Output)Column Pointers – Relative

DATA SAS_Conf07; RETAIN pt_loc 18 ; INFILE 'c:\sasconf\sascon07.dat'; INPUT Start_Dt mmddyy8. +1 End_Dt mmddyy8. +1 ConfName $char16. +1 ConfLoc $char16. ; FILE 'c:\sasconf\sascon07a.dat'; PUT Start_Dt date7. +10 ConfName $char16. +(-25) End_Dt date7. +(pt_loc) ConfLoc $char16. ;RUN;

Basic example on INPUT statement.

More advanced example on PUT

statement.

Interactions Between the DATA Step and External Files 47

Interactions … External Files Defining the Appearance of the Data

Formatted Input (and Output)Column Pointers – Absolute

DATA SAS_Conf07; RETAIN pt_loc 36 ; INFILE 'c:\sasconf\sascon07.dat'; INPUT @ 1 Start_Dt mmddyy8. @ 10 End_Dt mmddyy8. @ 19 ConfName $char16. @ 36 ConfLoc $char16. ; FILE 'c:\sasconf\sascon07b.dat'; PUT @ 1 Start_Dt mmddyy8. @ 19 ConfName $char16. @ 10 End_Dt mmddyy8. @ pt_loc ConfLoc $char16.;RUN;

Basic example on INPUT statement.

More advanced example on PUT

statement.

Interactions Between the DATA Step and External Files 48

Interactions … External Files Defining the Appearance of the Data

Formatted Input (and Output)Column Pointers – Absolute – Character

DATA SAS_Conf07; RETAIN pt_loc "LOC:" ; INFILE DATALINES ; INPUT @ 1 Start_Dt mmddyy8. @ '07 ' ConfName $char16. @ 10 End_Dt mmddyy8. @ pt_loc ConfLoc $char16. ;DATALINES;04/16/07 04/19/07 SAS Global Forum LOC:Orlando FL06/03/07 06/06/07 PharmaSUG LOC:Denver CO09/16/07 09/18/07 PNWSUG LOC:Seattle WA09/30/07 10/02/07 SCSUG LOC:Austin TX11/04/07 11/06/07 SESUG LOC:Hilton Head SC;

Even more advanced example

Interactions Between the DATA Step and External Files 49

Interactions … External Files Defining the Appearance of the Data

Formatted Input (and Output)Column Pointers – Absolute – Character

Only use a pointer with a character string on INPUT. Try it with PUT, and you get:ERROR: The @'CHARACTER_STRING' or @CHARACTER_VARIABLE specification is valid on the PUT statement only in conjunction with the FILE ODS option. The execution of the DATA STEP is being terminated.

Interactions Between the DATA Step and External Files 50

Interactions … External Files Defining the Appearance of the Data

Formatted Input (and Output)Line Pointers

Two types of Column Pointers

• Relative Column Pointers /

• Absolute Column Pointers #

Interactions Between the DATA Step and External Files 51

Interactions … External Files Defining the Appearance of the Data

Formatted Input (and Output)Line Pointers – Absolute

DATA SAS_Conf07; RETAIN line_loc 2 ; INFILE DATALINES ; INPUT #1 Start_Dt mmddyy8. #line_loc End_Dt mmddyy8. +1 ConfName $char16. #(line_loc+1) ConfLoc $char16. ;… … … DATALINES;04/16/0704/19/07 SAS Global ForumOrlando FL09/16/0709/18/07 PNWSUG … … …

Interactions Between the DATA Step and External Files 52

Interactions … External Files Defining the Appearance of the Data

Formatted Input (and Output)Line Pointers – Relative

DATA SAS_Conf07; … … … FILE 'c:\sasconf\conftbl.dat' ;

PUT @ 1 ConfName $char16. @ 21 Start_Dt date7. " to“ / @ 1 ConfLoc $char16. @ 21 End_Dt date7. / 28*'-';

DATALINES;

… … …;

SAS Global Forum 16APR07 to

Orlando FL 19APR07

----------------------------

PNWSUG 16SEP07 to

Seattle WA 18SEP07

----------------------------

SCSUG 30SEP07 to

Austin TX 02OCT07

----------------------------

SESUG 04NOV07 to

Hilton Head SC 06NOV07

----------------------------

Interactions Between the DATA Step and External Files 53

Interactions … External Files Defining the Appearance of the Data

Column Input (and Output)

• List each variable with its start and end position.

Interactions Between the DATA Step and External Files 54

Interactions … External Files Defining the Appearance of the Data

Column Input (and Output)

• List each variable with its start and end position.

• No flexibility to use variables to specify start and end column.

• Cannot specify formats.

Interactions Between the DATA Step and External Files 55

Interactions … External Files Defining the Appearance of the Data

Column Input (and Output)Working Code

1 DATA SAS_Conf07;2 INFILE 'c:\sasconf\sascon07.dat' ;3 INPUT Start_DtC $ 1 - 84 End_DtC $ 10 – 175 ConfName $ 19 – 346 ConfLoc $ 36 - 51 ;7 Start_Dt = INPUT( Start_DtC, mmddyy8. ) ;8 End_Dt = INPUT( End_DtC, mmddyy8. ) ;9 RUN;

NOTE: The data set WORK.SAS_CONF07 has 8 observations and 6 variables.

Interactions Between the DATA Step and External Files 56

Interactions … External Files Defining the Appearance of the Data

Column Input (and Output)Non-Working Code – Flexible Columns

40 DATA SAS_Conf07;41 RETAIN NamePos 19 ;42 INFILE 'c:\sasconf\sascon07.dat' ;43 INPUT Start_DtC $ 1 - 844 End_DtC $ 10 – 1745 ConfName $ NamePos – 34 -- 22ERROR 22-322: Expecting a name.46 ConfLoc $ 36 - 51 ;

ERROR: Missing numeric suffix on a numbered variable list (NamePos-NAME).…

Interactions Between the DATA Step and External Files 57

Interactions … External Files Defining the Appearance of the Data

Column Input (and Output)Non-Working Code – Flexible Formats

61 DATA SAS_Conf07;62 INFILE 'c:\sasconf\sascon07.dat' ;63 INPUT Start_Dt mmddyy8. $ 1 - 8 - 22ERROR 22-322: Expecting a name.64 End_DtC $ 10 – 1765 ConfName $ NamePos – 3466 ConfLoc $ 36 - 51 ;

Interactions Between the DATA Step and External Files 58

Interactions … External Files Defining the Appearance of the Data

Column Input (and Output)Non-Working Code – Flexible Formats

81 DATA SAS_Conf07;82 INFILE 'c:\sasconf\sascon07.dat' ;83 INPUT Start_Dt $ 1 - 8 mmddyy8. -------- 22ERROR 22-322: Expecting a name.84 End_DtC $ 10 – 1785 ConfName $ NamePos – 3486 ConfLoc $ 36 - 51 ;

Interactions Between the DATA Step and External Files 59

Interactions … External Files Defining the Appearance of the Data

Column Input (and Output)Non-Working Code – Flexible Formats

174 FORMAT Start_Dt End_Dt DATE7. ;175 FILE LOG ;176 PUT Start_Dt 1 - 8177 End_Dt 10 – 17178 ConfName 19 – 34179 ConfLoc $ 36 - 51 ;180 RUN;

17272 17275 SAS Global Forum Orlando FL 17320 17323 PharmaSUG Denver CO 17425 17427 PNWSUG Seattle WA 17439 17441 SCSUG Austin TX 17456 17458 WUSS San Francisco CA 17467 17469 MWSUG Des Moines IA 17474 17476 SESUG Hilton Head SC

Interactions Between the DATA Step and External Files 60

Interactions … External Files Defining the Appearance of the Data

Named Input (and Output)

• Input must be in form:Fieldname=value

• INPUT statement canonly contain Namedinput if Named input is requested.

Interactions Between the DATA Step and External Files 61

Interactions … External Files Defining the Appearance of the Data

Named Input (and Output)

• Input must be in form:Fieldname=value

• INPUT statement canonly contain Namedinput once Named input usage has begun in that statement.

Named input is more flexible /

traditional when using the PUT

statement.

Interactions Between the DATA Step and External Files 62

Interactions … External Files Defining the Appearance of the Data

Named Input (and Output)

• Input must be in form:Fieldname=value

• Variables do not have to be found in the same order as on the INPUT statement.

Interactions Between the DATA Step and External Files 63

Interactions … External Files Defining the Appearance of the Data

Named Input (and Output)

• Input must be in form:Fieldname=value

• Variables do not even have to be on the INPUT statement! (Variables defined with LENGTH or ATTRIB statement will be identified and processed, as well.)

Interactions Between the DATA Step and External Files 64

Interactions … External Files Defining the Appearance of the Data

Named Input (and Output)Example using INPUT statement

File: 'c:\sasconf\param.dat‘

85 DATA Params;86 INFORMAT ProcessDt MMDDYY8.;87 INFILE 'c:\sasconf\param.dat';88 INPUT @ 1 RunType $CHAR4.89 Analyst= $CHAR3. ;90 RUN;

PROD ProcessDt=12/02/06 Analyst=ATK

Interactions Between the DATA Step and External Files 65

Interactions … External Files Defining the Appearance of the Data

Named Input (and Output)Example using PUT statement

92 DATA _NULL_;93 SET Params;94 FILE LOG;95 PUT RunType=96 ProcessDt= DATE9. Analyst= ;97 RUN;

RunType=PROD ProcessDt=02DEC2006 Analyst=ATK

Interactions Between the DATA Step and External Files 66

Interactions … External Files Defining the Appearance of the Data

Named Input (and Output)Example using PUT statement

99 DATA _NULL_;100 SET Params;101 FILE LOG;102 PUT _ALL_ ;103 RUN;

ProcessDt=17137 RunType=PROD Analyst=ATK_ERROR_=0 _N_=1

Interactions Between the DATA Step and External Files 67

Interactions … External Files Defining the Appearance of the Data

Null Input (and Output)

• The 5th type of input?

• No line pointers,no column pointers,no formats,not even any variables!

Interactions Between the DATA Step and External Files 68

"As I was going up the stair, I met a man who was not there, He wasn't there again today, I wish that man would go away."

Hugh Mearns“… He must be with the CIA."

MAD Magazine

Interactions … External Files Defining the Appearance of the Data

Interactions Between the DATA Step and External Files 69

Interactions … External Files Defining the Appearance of the Data

Null Input (and Output)

Do youremember this

slide from earlier in the

presentation?

Interactions Between the DATA Step and External Files 70

Interactions … External Files Defining the Appearance of the Data

Null Input (and Output)

There is another form of column pointer known as the “trailing @”.

As expected, it is found at the end of a line on an INPUT or PUT statement.

Interactions Between the DATA Step and External Files 71

Interactions … External Files Defining the Appearance of the Data

Null Input (and Output)

Trailing @.

Prevents line and column pointers from automatically moving to the start of next line after completion of INPUT statement.

Line pointer is not moved until next INPUT statement, or next iteration of DATA step.

Interactions Between the DATA Step and External Files 72

Interactions … External Files Defining the Appearance of the Data

Null Input (and Output)

Double Trailing @.

Prevents line and column pointers from automatically moving to the next line after completion of INPUT statement.

Line pointer is not moved until next INPUT statement, or next iteration of DATA step.

Interactions Between the DATA Step and External Files 73

Interactions … External Files Defining the Appearance of the Data

Null Input (and Output)

So … how do you restore the default behavior of the column and line pointer?

You guessed it … a Null INPUT / PUT statement!

Interactions Between the DATA Step and External Files 74

Interactions … External Files Defining the Appearance of the Data

Null Input (and Output)

DATA SAS_Conf07; ARRAY ConfDt (2) Start_Dt End_Dt ; INFILE 'c:\sasconf\sascon07.dat' ; INPUT @ 19 ConfName $char16. @ 36 ConfLoc $char16. @ 1 @ ; DO i = 1 TO 2 ; INPUT ConfDt(i) mmddyy8. +1 @ ; END ; INPUT ;

… … …

Interactions Between the DATA Step and External Files 75

Interactions … External Files Defining the Appearance of the Data

Null Input (and Output)

DATA SAS_Conf07;… … …

PUT @ 1 ConfName $CHAR16. @ 18 Start_Dt mmddyy8. @ 27 End_Dt mmddyy8. @ ; IF MONTH( Start_Dt) ^= MONTH( End_Dt ) THEN PUT ' Span Month' ; ELSE PUT ;RUN ;

Interactions Between the DATA Step and External Files 76

Interactions … External Files Practical application

ASIDE: A little trick to quickly transition from input to output …

193 DATA _NULL_;194 INFILE DATALINES ;195 INPUT ;196 FILE PRINT ;197 PUT _INFILE_;198 DATALINES;

NOTE: 8 lines were written to file PRINT.

No variables listed on

INPUT statement.

Special variable _INFILE_ listed on

PUT statement.Add decision logic for

conditional output.

Interactions Between the DATA Step and External Files 77

Interactions … External Files Practical application

Q: So, which technique is best?

A: That depends on the data.

(You knew I was going to say that.)

Interactions Between the DATA Step and External Files 78

Reading Sequential files:usually Formatted input.

Reading parameters:sometimes Named input,sometimes List input

sometimes Formatted input

Interactions … External Files Practical application

Interactions Between the DATA Step and External Files 79

Writing Sequential files:usually Formatted input.

Writing parameters and debugs:often Named input,sometimes List input

sometimes Formatted input

Interactions … External Files Practical application

?

Interactions Between the DATA Step and External Files 80

Reading / writing directly to / from spreadsheet:• Can use DDE, which is outside the scope of

this presentation.

Using a text file as an interface:• Spreadsheet can read and write to CSV files.• SAS can read and write CSV files, too!

CSV = “Comma Separated Value

Interactions … External Files Practical application

Interactions Between the DATA Step and External Files 81

Reading / writing directly to / from spreadsheet:• Use DSD option on INFILE statement.

– Two or more consecutive delimiters denote missing values. (Normally, extra delimiters are ignored.)

– Default delimiter is comma. (Normally, default delimiter is a space.) This can still be overridden with DELIMITER= option.

DSD = “Delimiter Separated Data”

Interactions … External Files Practical application

Interactions Between the DATA Step and External Files 82

Reading / writing directly to / from spreadsheet:• Use DSD option on INFILE statement.

– Character values are assumed to be enclosed in double-quotes. (Use ~ format modifier if you need to keep the double-quotes as part of the data.)

DSD = “Delimiter Separated Data”

Interactions … External Files Practical application

Interactions Between the DATA Step and External Files 83

Interactions … External Files Defining the Appearance of the Data

04/16/07,04/19/07,"SAS Global Forum","Orlando FL"06/03/07,06/06/07,"PharmaSUG","Denver CO"09/16/07,09/18/07,"PNWSUG","Seattle WA"09/30/07,10/02/07,"SCSUG","Austin TX"10/17/07,10/19/07,"WUSS","San Francisco CA"10/28/07,10/30/07,"MWSUG","Des Moines IA"11/04/07,11/06/07,"SESUG","Hilton Head SC"11/11/07,11/14/07,"NESUG","Baltimore MD"

DSD – Sample Data

File: 'c:\sasconf\sascon07.csv'

Interactions Between the DATA Step and External Files 84

Interactions … External Files Defining the Appearance of the Data

DSD – Wrong Way

17 DATA SAS_Conf07;18 INFILE 'c:\sasconf\sascon07.csv' DSD;19 INPUT Start_Dt mmddyy8.20 End_Dt mmddyy8.21 ConfName $CHAR16.22 ConfLoc $CHAR16.;23 FILE 'c:\sasconf\testout2.csv' DSD;24 PUT ConfName $CHAR16.25 ConfLoc $CHAR16.26 Start_Dt date7.27 End_Dt date7.;28 RUN;

Interactions Between the DATA Step and External Files 85

Interactions … External Files Defining the Appearance of the Data

DSD – Wrong Way

NOTE: Invalid data for End_Dt in line 1 9-16.RULE: ----+----1----+----2----+----3----+---4— 04/16/07,04/19/07,"SAS Global Forum”,O …Start_Dt=16APR2007 End_Dt=. ConfName=7,"SAS Global Fo ConfLoc=rum","Orlando FL _ERROR_=1 _N_=1NOTE: Invalid data for End_Dt in line 2 9-16. 06/03/07,06/06/07,"PharmaSUG","Denver CO“Start_Dt=03JUN2007 End_Dt=. ConfName=7,"PharmaSUG","D ConfLoc=enver CO" _ERROR_=1 _N_=2

etc. etc. etc.

Interactions Between the DATA Step and External Files 86

DSD – Wrong Way

'c:\sasconf\testout1.csv'

7,"SAS Global Forum","Orlando FL16APR07 .7,"PharmaSUG","Denver CO" 03JUN07 .7,"PNWSUG","Seattle WA" 16SEP07 .7,"SCSUG","Austin TX" 30SEP07 .7,"WUSS","San Francisco CA" 17OCT07 .7,"MWSUG","Des Moines IA" 28OCT07 .7,"SESUG","Hilton Head SC" 04NOV07 .7,"NESUG","Baltimore MD" 11NOV07 .

Interactions … External Files Defining the Appearance of the Data

Interactions Between the DATA Step and External Files 87

Interactions … External Files Defining the Appearance of the Data

DSD – Right Way

44 DATA SAS_Conf07;45 INFILE 'c:\sasconf\sascon07.csv' DSD;46 INFORMAT Start_Dt End_Dt mmddyy8.47 ConfName ConfLoc $CHAR16.;48 FORMAT Start_Dt End_Dt date7. ;49 INPUT Start_Dt50 End_Dt 51 ConfName 52 ConfLoc ;53 FILE 'c:\sasconf\testout2.csv' DSD;54 PUT ConfName 55 ConfLoc 56 Start_Dt57 End_Dt ;58 RUN;

Interactions Between the DATA Step and External Files 88

DSD – Right Way

'c:\sasconf\testout2.csv'

SAS Global Forum,Orlando FL,16APR07,19APR07PharmaSUG,Denver CO,03JUN07,06JUN07PNWSUG,Seattle WA,16SEP07,18SEP07SCSUG,Austin TX,30SEP07,02OCT07WUSS,San Francisco CA,17OCT07,19OCT07MWSUG,Des Moines IA,28OCT07,30OCT07SESUG,Hilton Head SC,04NOV07,06NOV07NESUG,Baltimore MD,11NOV07,14NOV07

Interactions … External Files Defining the Appearance of the Data

Interactions Between the DATA Step and External Files 89

Interactions … External Files Practical application

Q: What will the system do ifyou ask for more variablesthan you actually have onyour input data?

A: Whatever you tell it to do.

(You knew I was going to say that, too.)

Interactions Between the DATA Step and External Files 90

INFILE options – End of Input Line

FLOWOVERAllows INPUT to read next data record. NOTE: This is the default option / behavior.

Watch for following in SASLOG:NOTE: SAS went to a new line when INPUT

statement reached past the end of a line.

Interactions … External Files Practical application

Interactions Between the DATA Step and External Files 91

INFILE options – End of Input Line

MISSOVERStops INPUT from reading new data record, remaining variables set to missing.

Watch for following in SASLOG:(nothing – no extra NOTEs, WARNINGs,

or ERRORs will be produced)

Interactions … External Files Practical application

Interactions Between the DATA Step and External Files 92

INFILE options – End of Input Line

TRUNCOVERStops INPUT from reading new data record, remaining variables set to missing.

Watch for following in SASLOG:(nothing – no extra NOTEs, WARNINGs,

or ERRORs will be produced)

Interactions … External Files Practical application

Interactions Between the DATA Step and External Files 93

INFILE options – End of Input Line

MISSOVER vs. TRUNCOVERThe difference between these two options is how they handle values that are in the middle of being read at the end of the line.

MISSOVER sets it to a missing value. TRUNCOVER “settles for what it can get”, and stores as much as is present.

Interactions … External Files Practical application

Interactions Between the DATA Step and External Files 94

INFILE options – End of Input Line

STOPOVERStops INPUT from reading new data record, stops execution of DATA step, sets _ERROR_ = 1.

Watch for following in SASLOG:ERROR: INPUT statement exceeded record

length. INFILE <infile> OPTION STOPOVER specified.

Interactions … External Files Practical application

Interactions Between the DATA Step and External Files 95

INFILE options – End of Input Line

SCANOVER Searches next line for contents of the “@<string>” value.

Watch for following in SASLOG:NOTE: SAS went to a new line when INPUT

@'CHARACTER_STRING' scanned past the end of a line.

Interactions … External Files Practical application

Interactions Between the DATA Step and External Files 96

INFILE options – End of Input Line

TURNOVER• Meeting planners term for moving one

group out of a room and another group in.• Catering term for a type of pastry.

Interactions … External Files Practical application

Interactions Between the DATA Step and External Files 97

Interactions … External FilesConclusion

• 50 minute presentation can only scratch the surface of getting data into SAS and putting data out of SAS.

• You can learn more from a few minutes of independent trial and error than from sitting and listening to me for 50 minutes!

Interactions Between the DATA Step and External Files 98

Interactions … External FilesConclusion

For further information …

[email protected]

[email protected]

Interactions Between the DATA Step and External Files 99

Interactions … External FilesConclusion

Special thanks to the late John Tenniel (1820-1914) for providing the graphics used in this presentation.

Interactions Between the DATA Step and External Files 100

Interactions … External FilesConclusion

SAS and

all other SAS Institute Inc. product

or service names are registered trademarks

or trademarks of SAS Institute Inc. in the USA

and other countries.

indicates USA registration.

My lawyer