easy come, easy go — interactions between the data step and external files andrew t. kuligowski,...
TRANSCRIPT
Easy Come, Easy Go — Interactions between the
DATA Step and External Files
Andrew T. Kuligowski,
The Nielsen Company
Interactions Between the DATA Step and External Files 2
Interactions … External FilesIntroduction
• Introduction• Defining the external data source• Describing the appearance of the
data• Practical Application
• Conclusion
Interactions Between the DATA Step and External Files 3
Interactions … External FilesIntroduction
This presentation will provide basic answers to two questions:
• How do I get my data into SAS for analysis?
• How do I get my data out of SAS?
Interactions Between the DATA Step and External Files 4
Interactions … External FilesIntroduction
Process of moving data from an external file to SAS, and from SAS to an external source, are mirror images of each other.
(Almost)
Interactions Between the DATA Step and External Files 5
Interactions … External FilesIntroduction
You need to know the answer to two questions in order to code properly.
• Where is this external source / destination? (and, is there anything significant about this file?)
• What does the data look like coming in / going out?
Interactions Between the DATA Step and External Files 6
Interactions … External FilesIntroduction
"The time has come," the Walrus said,"To talk of many things:Of shoes--and ships--and sealing-wax--Of cabbages--and kings--And why the sea is boiling hot--And whether pigs have wings."
Lewis Carroll "I am the Walrus."
John Lennon
Interactions Between the DATA Step and External Files 7
Interactions … External FilesDefining the External Data Source
Two statements that will identify an external file to DATA step (and possibly provide a little information about its form).
• INFILE statement
• FILE statement
Interactions Between the DATA Step and External Files 8
Interactions … External FilesDefining the External Data Source
Most straightforward usage – hardcode the file name right in the statement.
DATA SAMPLE ; /* full file name - under Windows */ INFILE 'c:\sasconf\sasconf.dat';
Self-documenting code!Requires coding change to point
to a different dataset.
PLUS
MINUS
Interactions Between the DATA Step and External Files 9
Interactions … External FilesDefining the External Data Source
Variation – use a macro variable.
%LET FILENM = c:\sasconf\sasconf.dat;DATA SAMPLE ; /* macro file name - under Windows */ INFILE “&FILENM”;
Of course, this isn’t much more flexible. However, with a little creative coding, this approach can be made very flexible.
Interactions Between the DATA Step and External Files 10
Interactions … External FilesDefining the External Data Source
More flexibility with FILEVAR statement.
%LET FileNm = %STR(SASCONF);DATA SAMPLE ; RETAIN fn “&FileNm" ; File_in = TRIM( fn ) || ".dat" ; File_out = TRIM( fn ) || ".out" ; INFILE dummy FILEVAR=File_in ; FILE dummy2 FILEVAR=File_out ;
Can be hardcoded, or it can change from obs to obs.
Interactions Between the DATA Step and External Files 11
Interactions … External FilesDefining the External Data Source
More flexibility with FILEVAR statement.
%LET FileNm = %STR(SASCONF);DATA SAMPLE ; RETAIN fn “&FileNm" ; File_in = TRIM( fn ) || ".dat" ; File_out = TRIM( fn ) || ".out" ; INFILE dummy FILEVAR=File_in ; FILE dummy2 FILEVAR=File_out ;
Can be hardcoded, or it can change from obs to obs.
File Reference
Set to “dummy” to show these are not actually used by the
routine when FILEVAR used, but different from each other to show they
must be unique.
Interactions Between the DATA Step and External Files 12
Interactions … External FilesDefining the External Data Source
Defining a File Reference(1) In the Operating System:
//SAMPDATA DD DSN=SAS.GLOBAL.FORUM.SAMPDAT,// DISP=SHR
Example uses MVS – shows why File References are also called “DDNAMES”.
Interactions Between the DATA Step and External Files 13
Interactions … External FilesDefining the External Data Source
Defining a File Reference(2) Using the FILENAME statement:
FILENAME confdata 'c:\sasconf\sasconf.dat';
Example uses Windows.
Interactions Between the DATA Step and External Files 14
Interactions … External FilesDefining the External Data Source
Defining a File Reference(2) Using the FILENAME statement(2a) Special options for FILENAME:
FILENAME <fileref> CLEAR;
Remove / release specified File Ref.FILENAME <fileref> LIST;
Print a list of active File Refs to SASLOG.
Interactions Between the DATA Step and External Files 15
Interactions … External FilesDefining the External Data Source
Defining a File Reference(2) Using the FILENAME statement(2a) Special options for FILENAME:(2ai) Special pseudo File Reference:
FILENAME _ALL_ LIST;FILENAME _ALL_ CLEAR;
Perform action on all active File Refs
Interactions Between the DATA Step and External Files 16
Interactions … External FilesDefining the External Data Source
Defining a File Reference(3) Special “Aggregate File
Reference”MVS: Partition Dataset
FILENAME confdata 'userid.sasconf.data'; DATA SAS_Conf07;
INFILE confdata(sascon07);
Interactions Between the DATA Step and External Files 17
Interactions … External FilesDefining the External Data Source
Defining a File Reference(3) Special “Aggregate File
Reference”MVS: Partition DatasetUnix: DirectoryWindows: Directory FILENAME confdata 'c:\sasconf';
DATA SAS_Conf07;
INFILE confdata(sascon07);
Interactions Between the DATA Step and External Files 18
Interactions … External FilesDefining the External Data Source
Other related functions …FILEREF RetCd2 = FILEREF( "confdat2" );Return Code Meaning 0 File Reference is assigned>0 (+) File Reference not currently assigned<0 (-) File Reference assigned, but file it
refers to does not (currently) exist.
Interactions Between the DATA Step and External Files 19
Interactions … External FilesDefining the External Data Source
Other related functions …FILEEXIST IF FILEEXIST("c:\sasconf\sasconf2.dat")Return Code Meaning 0 External File does not exist 1 External File does exist
Interactions Between the DATA Step and External Files 20
Interactions … External FilesDefining the External Data Source
Other related functions …FEXIST IF FEXIST("confdat2" ) THEN DO;Return Code Meaning 0 External File does not exist 1 External File does exist
Interactions Between the DATA Step and External Files 21
Interactions … External FilesDefining the External Data Source
Specialized File Ref. – Input Data Only
DATALINES• Data can be found instream at end of
DATA step, denoted by DATALINES statement.
• Instream data concluded by semicolon.
Interactions Between the DATA Step and External Files 22
Interactions … External FilesDefining the External Data Source
Specialized File Ref. – Input Data Only
DATALINESDATA SAS_Conf07; INFILE DATALINES; /* optional for DATALINES */ INPUT @ 1 Start_Dt MMDDYY8. @ 10 End_Dt MMDDYY8. @ 19 ConfName $CHAR16. @ 36 ConfLoc $CHAR16. ;DATALINES;04/16/07 04/19/07 SAS Global Forum Orlando FL09/16/07 09/18/07 PNWSUG Seattle WA09/30/07 10/02/07 SCSUG Austin TX11/04/07 11/06/07 SESUG Hilton Head SC
;
Interactions Between the DATA Step and External Files 23
Interactions … External FilesDefining the External Data Source
Specialized File Ref. – Input Data OnlyDATALINES• Assumed to be
card-image (80 bytes)• Can override with system
option NOCARDIMAGE.
Interactions Between the DATA Step and External Files 24
Interactions … External FilesDefining the External Data Source
Specialized File Ref. – Input Data OnlyDATALINES• Alias = CARDS.• Can also use DATALINES4
(or CARDS4) if data mightcontain semicolon.
Interactions Between the DATA Step and External Files 25
Interactions … External FilesDefining the External Data Source
Specialized File Ref. – Input Data OnlyDATALINES vs. CARDS18 DATA _NULL_;19 SET SAS_Conf07;20 FILE CARDS;NOTE: The file CARDS cannot be opened for UPDATE processing.21 PUT _N_= ConfName ;22 RUN;NOTE: The SAS System stopped processing this step because of errors.
Interactions Between the DATA Step and External Files 26
Interactions … External FilesDefining the External Data Source
Specialized File Ref. – Input Data OnlyDATALINES vs. CARDS62 DATA _NULL_;63 SET SAS_Conf07;64 FILE DATALINES;65 PUT _N_= ConfName ;66 RUN;_N_=1 SAS Global Forum… … … … _N_=8 NESUGNOTE: There were 8 observations read from the data set WORK.SAS_CONF07.
Where’s the error message?
Interactions Between the DATA Step and External Files 27
Interactions … External FilesDefining the Appearance of the Data
Two questions (from early in this talk):Where is the data / where is it going?
What does the data look like?
5 types of INPUT and PUT, although SAS only counts 4 of them as separate types.• LIST input
• FORMATTED input
• COLUMN input
• NAMED input
• NULL input
Interactions Between the DATA Step and External Files 28
Interactions … External FilesDefining the Appearance of the Data
•Two questions (from early in this talk): Where is the data / where is it going?
• What does the data look like?
•5 types of INPUT and PUT, although SAS only counts 4 of them as separate types.• LIST input
• FORMATTED input
• COLUMN input
• NAMED input
• NULL input
Most references in
presentation will be to INPUT
for brevity’s sake, although
most are also valid for PUT.
Interactions Between the DATA Step and External Files 29
Interactions … External Files Defining the Appearance of the Data
LIST Input (and Output)
• Simply insert variables instatement.
• No descriptive informationnecessary (except “$”denoting character vars.)
• Blank space separates values.
Interactions Between the DATA Step and External Files 30
Interactions … External Files Defining the Appearance of the Data
LIST Input (and Output)Example of non-working codeDATA SAS_Conf07; INFILE DATALINES ; INPUT Start_Dt End_Dt ConfName $ ConfLoc $ ;DATALINES;04/16/07 04/19/07 SAS Global Forum Orlando FL06/03/07 06/06/07 PharmaSUG Denver CO09/16/07 09/18/07 PNWSUG Seattle WA09/30/07 10/02/07 SCSUG Austin TX10/17/07 10/19/07 WUSS San Francisco CA10/28/07 10/30/07 MWSUG Des Moines IA11/04/07 11/06/07 SESUG Hilton Head SC11/11/07 11/14/07 NESUG Baltimore MD;
PROBLEM: Dates have specialized formats.
Interactions Between the DATA Step and External Files 31
Interactions … External Files Defining the Appearance of the Data
LIST Input (and Output)Example of non-working codeDATA SAS_Conf07; INFORMAT Start_Dt End_Dt mmddyy8. ; INFILE DATALINES ; INPUT Start_Dt End_Dt ConfName $ ConfLoc $ ;DATALINES;04/16/07 04/19/07 SAS Global Forum Orlando FL06/03/07 06/06/07 PharmaSUG Denver CO09/16/07 09/18/07 PNWSUG Seattle WA09/30/07 10/02/07 SCSUG Austin TX10/17/07 10/19/07 WUSS San Francisco CA10/28/07 10/30/07 MWSUG Des Moines IA11/04/07 11/06/07 SESUG Hilton Head SC11/11/07 11/14/07 NESUG Baltimore MD;
SOLUTION: Use INFORMAT statement.
Interactions Between the DATA Step and External Files 32
Interactions … External Files Defining the Appearance of the Data
LIST Input (and Output)Example of non-working codeDATA SAS_Conf07; INFORMAT Start_Dt End_Dt mmddyy8. ; INFILE DATALINES ; INPUT Start_Dt End_Dt ConfName $ ConfLoc $ ;DATALINES;04/16/07 04/19/07 SAS Global Forum Orlando FL06/03/07 06/06/07 PharmaSUG Denver CO09/16/07 09/18/07 PNWSUG Seattle WA09/30/07 10/02/07 SCSUG Austin TX10/17/07 10/19/07 WUSS San Francisco CA10/28/07 10/30/07 MWSUG Des Moines IA11/04/07 11/06/07 SESUG Hilton Head SC11/11/07 11/14/07 NESUG Baltimore MD;
PROBLEM: Values longer than 8 characters.
Interactions Between the DATA Step and External Files 33
Interactions … External Files Defining the Appearance of the Data
LIST Input (and Output)Example of non-working codeDATA SAS_Conf07; INFORMAT Start_Dt End_Dt mmddyy8. ; LENGTH ConfName $ 16 ConfLoc $ 16. ; INFILE DATALINES ; INPUT Start_Dt End_Dt ConfName $ ConfLoc $ ;DATALINES;04/16/07 04/19/07 SAS Global Forum Orlando FL06/03/07 06/06/07 PharmaSUG Denver CO09/16/07 09/18/07 PNWSUG Seattle WA09/30/07 10/02/07 SCSUG Austin TX10/17/07 10/19/07 WUSS San Francisco CA10/28/07 10/30/07 MWSUG Des Moines IA11/04/07 11/06/07 SESUG Hilton Head SC11/11/07 11/14/07 NESUG Baltimore MD;
SOLUTION: UseLENGTH statement.
Interactions Between the DATA Step and External Files 34
Interactions … External Files Defining the Appearance of the Data
LIST Input (and Output)Example of non-working codeDATA SAS_Conf07; INFORMAT Start_Dt End_Dt mmddyy8. ; LENGTH ConfName $ 16 ConfLoc $ 16. ; INFILE DATALINES ; INPUT Start_Dt End_Dt ConfName $ ConfLoc $ ;DATALINES;04/16/07 04/19/07 SAS Global Forum Orlando FL06/03/07 06/06/07 PharmaSUG Denver CO09/16/07 09/18/07 PNWSUG Seattle WA09/30/07 10/02/07 SCSUG Austin TX10/17/07 10/19/07 WUSS San Francisco CA10/28/07 10/30/07 MWSUG Des Moines IA11/04/07 11/06/07 SESUG Hilton Head SC11/11/07 11/14/07 NESUG Baltimore MD;
PROBLEM: Embedded Blanks in character strings.
Interactions Between the DATA Step and External Files 35
Interactions … External Files Defining the Appearance of the Data
LIST Input (and Output)Example of non-working codeDATA SAS_Conf07; INFORMAT Start_Dt End_Dt mmddyy8. ; LENGTH ConfName $ 16 ConfLoc $ 16. ; INFILE DATALINES ; INPUT Start_Dt End_Dt ConfName $ & ConfLoc $ & ;DATALINES;04/16/07 04/19/07 SAS Global Forum Orlando FL06/03/07 06/06/07 PharmaSUG Denver CO09/16/07 09/18/07 PNWSUG Seattle WA09/30/07 10/02/07 SCSUG Austin TX10/17/07 10/19/07 WUSS San Francisco CA10/28/07 10/30/07 MWSUG Des Moines IA11/04/07 11/06/07 SESUG Hilton Head SC11/11/07 11/14/07 NESUG Baltimore MD;
SOLUTION: Use “&” format modifiers.
Interactions Between the DATA Step and External Files 36
LIST Input (and Output)QUICK ASIDE – Format Modifiers
& Ampersand Permit single embedded blanks in character variables.
: Colon Ignore its default 8-character maximum on character variables.
? Question Mark
Bypass error processing should invalid data be expected and acceptable to the application - for example, attempting to read a character field into a numeric variable. Suppress the “Invalid Data” message, allowing the other indications of an error to be dealt with normally.
?? Double Question Mark
Bypass error processing AND conceal every indication of an error. The “Invalid Data” message and the echoing of the input line containing the bad data are omitted from the SASLOG, and the _ERROR_ variable is not set to “true” (numeric 1).
~ Tilde Read delimiters within quoted character values as characters, and retains the quotation marks.
Interactions … External Files Defining the Appearance of the Data
Interactions Between the DATA Step and External Files 37
Interactions … External Files Defining the Appearance of the Data
LIST Input (and Output)Example of non-working codeDATA SAS_Conf07; INFORMAT Start_Dt End_Dt mmddyy8. ; LENGTH ConfName $ 16 ConfLoc $ 16. ; INFILE DATALINES ; INPUT Start_Dt End_Dt ConfName $ & ConfLoc $ & ;DATALINES;04/16/07 04/19/07 SAS Global Forum Orlando FL06/03/07 06/06/07 PharmaSUG Denver CO09/16/07 09/18/07 PNWSUG Seattle WA09/30/07 10/02/07 SCSUG Austin TX10/17/07 10/19/07 WUSS San Francisco CA10/28/07 10/30/07 MWSUG Des Moines IA11/04/07 11/06/07 SESUG Hilton Head SC11/11/07 11/14/07 NESUG Baltimore MD;
SOLUTION: Use “&” format modifiers.SOLUTION: Move CONFLOC variable 1 position to the right.
Interactions Between the DATA Step and External Files 38
Interactions … External Files Defining the Appearance of the Data
LIST Input (and Output)Example of non-working codeDATA SAS_Conf07; INFORMAT Start_Dt End_Dt mmddyy8. ; LENGTH ConfName $ 16 ConfLoc $ 16. ; INFILE DATALINES ; INPUT Start_Dt End_Dt ConfName $ & ConfLoc $ & ; ConfName = TRANSLATE( ConfName, ' ', '-' ); ConfLoc = TRANSLATE( ConfLoc, ' ', '-' );DATALINES;04/16/07 04/19/07 SAS Global Forum Orlando-------FL06/03/07 06/06/07 PharmaSUG Denver--------CO09/16/07 09/18/07 PNWSUG Seattle-------WA09/30/07 10/02/07 SCSUG Austin--------TX10/17/07 10/19/07 WUSS San Francisco-CA10/28/07 10/30/07 MWSUG Des Moines----IA11/04/07 11/06/07 SESUG Hilton Head---SC
SOLUTION: Use “&” format modifiers.SOLUTION: Move CONFLOC variable 1 position to the right.SOLUTION: Insert dashes to separate city from state, change them back to blanks.
Interactions Between the DATA Step and External Files 39
Interactions … External Files Defining the Appearance of the Data
LIST Input (and Output)Example of non-working codeDATA SAS_Conf07; INFORMAT Start_Dt End_Dt mmddyy8. ; LENGTH ConfName $ 16 ConfLoc $ 16. ; INFILE DATALINES ; INPUT Start_Dt End_Dt ConfName $ & ConfLoc $ & ; ConfName = TRANSLATE( ConfName, ' ', '-' ); ConfLoc = TRANSLATE( ConfLoc, ' ', '-' );DATALINES;04/16/07 04/19/07 SAS Global Forum Orlando-------FL06/03/07 06/06/07 PharmaSUG Denver--------CO09/16/07 09/18/07 PNWSUG Seattle-------WA09/30/07 10/02/07 SCSUG Austin--------TX10/17/07 10/19/07 WUSS San Francisco-CA10/28/07 10/30/07 MWSUG Des Moines----IA11/04/07 11/06/07 SESUG Hilton Head---SC
SOLUTION: Use “&” format modifiers.SOLUTION: Move CONFLOC variable 1 position to the right.SOLUTION: Insert dashes to separate city from state, change them back to blanks.
WAIT A MINUTE!!!
Am I really suggesting that
we change the data to
accommodate the
weaknesses of the routine?
Interactions Between the DATA Step and External Files 40
Interactions … External Files Defining the Appearance of the Data
Formatted Input (and Output)
• Can specify informat / format for variables.(What does each variable look like?)
• Can specify starting position in record.(Where is each variable on each record?)
Interactions Between the DATA Step and External Files 41
Interactions … External Files Defining the Appearance of the Data
Formatted Input (and Output)Informats and Formats
Handles length andappearance of data
Interactions Between the DATA Step and External Files 42
Interactions … External Files Defining the Appearance of the Data
Formatted Input (and Output)Informats and Formats
5 types of Informat / Format:
• Character
• Numeric
• Date / Time
• Column Binary
• User-Defined
Not covered inthis presentation.Not covered inthis presentation.PROC FORMAT
Interactions Between the DATA Step and External Files 43
Interactions … External Files Defining the Appearance of the Data
Formatted Input (and Output)Informats and Formats
Formats can be assigned with:
• INFORMAT statement(or FORMAT statement)
• ATTRIB statement, using INFORMAT=FORMAT=
• Specifying on INPUT / PUT
Interactions Between the DATA Step and External Files 44
Interactions … External Files Defining the Appearance of the Data
Formatted Input (and Output)Column Pointers
Two types of Column Pointers
• Relative Column Pointers +
• Absolute Column Pointers @
Interactions Between the DATA Step and External Files 45
Interactions … External Files Defining the Appearance of the Data
Formatted Input (and Output)Column Pointers
File: 'c:\sasconf\sascon07.dat'
04/16/07 04/19/07 SAS Global Forum Orlando FL06/03/07 06/06/07 PharmaSUG Denver CO09/16/07 09/18/07 PNWSUG Seattle WA09/30/07 10/02/07 SCSUG Austin TX10/17/07 10/19/07 WUSS San Francisco CA10/28/07 10/30/07 MWSUG Des Moines IA11/04/07 11/06/07 SESUG Hilton Head SC11/11/07 11/14/07 NESUG Baltimore MD
Interactions Between the DATA Step and External Files 46
Interactions … External Files Defining the Appearance of the Data
Formatted Input (and Output)Column Pointers – Relative
DATA SAS_Conf07; RETAIN pt_loc 18 ; INFILE 'c:\sasconf\sascon07.dat'; INPUT Start_Dt mmddyy8. +1 End_Dt mmddyy8. +1 ConfName $char16. +1 ConfLoc $char16. ; FILE 'c:\sasconf\sascon07a.dat'; PUT Start_Dt date7. +10 ConfName $char16. +(-25) End_Dt date7. +(pt_loc) ConfLoc $char16. ;RUN;
Basic example on INPUT statement.
More advanced example on PUT
statement.
Interactions Between the DATA Step and External Files 47
Interactions … External Files Defining the Appearance of the Data
Formatted Input (and Output)Column Pointers – Absolute
DATA SAS_Conf07; RETAIN pt_loc 36 ; INFILE 'c:\sasconf\sascon07.dat'; INPUT @ 1 Start_Dt mmddyy8. @ 10 End_Dt mmddyy8. @ 19 ConfName $char16. @ 36 ConfLoc $char16. ; FILE 'c:\sasconf\sascon07b.dat'; PUT @ 1 Start_Dt mmddyy8. @ 19 ConfName $char16. @ 10 End_Dt mmddyy8. @ pt_loc ConfLoc $char16.;RUN;
Basic example on INPUT statement.
More advanced example on PUT
statement.
Interactions Between the DATA Step and External Files 48
Interactions … External Files Defining the Appearance of the Data
Formatted Input (and Output)Column Pointers – Absolute – Character
DATA SAS_Conf07; RETAIN pt_loc "LOC:" ; INFILE DATALINES ; INPUT @ 1 Start_Dt mmddyy8. @ '07 ' ConfName $char16. @ 10 End_Dt mmddyy8. @ pt_loc ConfLoc $char16. ;DATALINES;04/16/07 04/19/07 SAS Global Forum LOC:Orlando FL06/03/07 06/06/07 PharmaSUG LOC:Denver CO09/16/07 09/18/07 PNWSUG LOC:Seattle WA09/30/07 10/02/07 SCSUG LOC:Austin TX11/04/07 11/06/07 SESUG LOC:Hilton Head SC;
Even more advanced example
Interactions Between the DATA Step and External Files 49
Interactions … External Files Defining the Appearance of the Data
Formatted Input (and Output)Column Pointers – Absolute – Character
Only use a pointer with a character string on INPUT. Try it with PUT, and you get:ERROR: The @'CHARACTER_STRING' or @CHARACTER_VARIABLE specification is valid on the PUT statement only in conjunction with the FILE ODS option. The execution of the DATA STEP is being terminated.
Interactions Between the DATA Step and External Files 50
Interactions … External Files Defining the Appearance of the Data
Formatted Input (and Output)Line Pointers
Two types of Column Pointers
• Relative Column Pointers /
• Absolute Column Pointers #
Interactions Between the DATA Step and External Files 51
Interactions … External Files Defining the Appearance of the Data
Formatted Input (and Output)Line Pointers – Absolute
DATA SAS_Conf07; RETAIN line_loc 2 ; INFILE DATALINES ; INPUT #1 Start_Dt mmddyy8. #line_loc End_Dt mmddyy8. +1 ConfName $char16. #(line_loc+1) ConfLoc $char16. ;… … … DATALINES;04/16/0704/19/07 SAS Global ForumOrlando FL09/16/0709/18/07 PNWSUG … … …
Interactions Between the DATA Step and External Files 52
Interactions … External Files Defining the Appearance of the Data
Formatted Input (and Output)Line Pointers – Relative
DATA SAS_Conf07; … … … FILE 'c:\sasconf\conftbl.dat' ;
PUT @ 1 ConfName $char16. @ 21 Start_Dt date7. " to“ / @ 1 ConfLoc $char16. @ 21 End_Dt date7. / 28*'-';
DATALINES;
… … …;
SAS Global Forum 16APR07 to
Orlando FL 19APR07
----------------------------
PNWSUG 16SEP07 to
Seattle WA 18SEP07
----------------------------
SCSUG 30SEP07 to
Austin TX 02OCT07
----------------------------
SESUG 04NOV07 to
Hilton Head SC 06NOV07
----------------------------
Interactions Between the DATA Step and External Files 53
Interactions … External Files Defining the Appearance of the Data
Column Input (and Output)
• List each variable with its start and end position.
Interactions Between the DATA Step and External Files 54
Interactions … External Files Defining the Appearance of the Data
Column Input (and Output)
• List each variable with its start and end position.
• No flexibility to use variables to specify start and end column.
• Cannot specify formats.
Interactions Between the DATA Step and External Files 55
Interactions … External Files Defining the Appearance of the Data
Column Input (and Output)Working Code
1 DATA SAS_Conf07;2 INFILE 'c:\sasconf\sascon07.dat' ;3 INPUT Start_DtC $ 1 - 84 End_DtC $ 10 – 175 ConfName $ 19 – 346 ConfLoc $ 36 - 51 ;7 Start_Dt = INPUT( Start_DtC, mmddyy8. ) ;8 End_Dt = INPUT( End_DtC, mmddyy8. ) ;9 RUN;
NOTE: The data set WORK.SAS_CONF07 has 8 observations and 6 variables.
Interactions Between the DATA Step and External Files 56
Interactions … External Files Defining the Appearance of the Data
Column Input (and Output)Non-Working Code – Flexible Columns
40 DATA SAS_Conf07;41 RETAIN NamePos 19 ;42 INFILE 'c:\sasconf\sascon07.dat' ;43 INPUT Start_DtC $ 1 - 844 End_DtC $ 10 – 1745 ConfName $ NamePos – 34 -- 22ERROR 22-322: Expecting a name.46 ConfLoc $ 36 - 51 ;
ERROR: Missing numeric suffix on a numbered variable list (NamePos-NAME).…
Interactions Between the DATA Step and External Files 57
Interactions … External Files Defining the Appearance of the Data
Column Input (and Output)Non-Working Code – Flexible Formats
61 DATA SAS_Conf07;62 INFILE 'c:\sasconf\sascon07.dat' ;63 INPUT Start_Dt mmddyy8. $ 1 - 8 - 22ERROR 22-322: Expecting a name.64 End_DtC $ 10 – 1765 ConfName $ NamePos – 3466 ConfLoc $ 36 - 51 ;
…
Interactions Between the DATA Step and External Files 58
Interactions … External Files Defining the Appearance of the Data
Column Input (and Output)Non-Working Code – Flexible Formats
81 DATA SAS_Conf07;82 INFILE 'c:\sasconf\sascon07.dat' ;83 INPUT Start_Dt $ 1 - 8 mmddyy8. -------- 22ERROR 22-322: Expecting a name.84 End_DtC $ 10 – 1785 ConfName $ NamePos – 3486 ConfLoc $ 36 - 51 ;
…
Interactions Between the DATA Step and External Files 59
Interactions … External Files Defining the Appearance of the Data
Column Input (and Output)Non-Working Code – Flexible Formats
174 FORMAT Start_Dt End_Dt DATE7. ;175 FILE LOG ;176 PUT Start_Dt 1 - 8177 End_Dt 10 – 17178 ConfName 19 – 34179 ConfLoc $ 36 - 51 ;180 RUN;
17272 17275 SAS Global Forum Orlando FL 17320 17323 PharmaSUG Denver CO 17425 17427 PNWSUG Seattle WA 17439 17441 SCSUG Austin TX 17456 17458 WUSS San Francisco CA 17467 17469 MWSUG Des Moines IA 17474 17476 SESUG Hilton Head SC
Interactions Between the DATA Step and External Files 60
Interactions … External Files Defining the Appearance of the Data
Named Input (and Output)
• Input must be in form:Fieldname=value
• INPUT statement canonly contain Namedinput if Named input is requested.
Interactions Between the DATA Step and External Files 61
Interactions … External Files Defining the Appearance of the Data
Named Input (and Output)
• Input must be in form:Fieldname=value
• INPUT statement canonly contain Namedinput once Named input usage has begun in that statement.
Named input is more flexible /
traditional when using the PUT
statement.
Interactions Between the DATA Step and External Files 62
Interactions … External Files Defining the Appearance of the Data
Named Input (and Output)
• Input must be in form:Fieldname=value
• Variables do not have to be found in the same order as on the INPUT statement.
Interactions Between the DATA Step and External Files 63
Interactions … External Files Defining the Appearance of the Data
Named Input (and Output)
• Input must be in form:Fieldname=value
• Variables do not even have to be on the INPUT statement! (Variables defined with LENGTH or ATTRIB statement will be identified and processed, as well.)
Interactions Between the DATA Step and External Files 64
Interactions … External Files Defining the Appearance of the Data
Named Input (and Output)Example using INPUT statement
File: 'c:\sasconf\param.dat‘
85 DATA Params;86 INFORMAT ProcessDt MMDDYY8.;87 INFILE 'c:\sasconf\param.dat';88 INPUT @ 1 RunType $CHAR4.89 Analyst= $CHAR3. ;90 RUN;
PROD ProcessDt=12/02/06 Analyst=ATK
Interactions Between the DATA Step and External Files 65
Interactions … External Files Defining the Appearance of the Data
Named Input (and Output)Example using PUT statement
92 DATA _NULL_;93 SET Params;94 FILE LOG;95 PUT RunType=96 ProcessDt= DATE9. Analyst= ;97 RUN;
RunType=PROD ProcessDt=02DEC2006 Analyst=ATK
Interactions Between the DATA Step and External Files 66
Interactions … External Files Defining the Appearance of the Data
Named Input (and Output)Example using PUT statement
99 DATA _NULL_;100 SET Params;101 FILE LOG;102 PUT _ALL_ ;103 RUN;
ProcessDt=17137 RunType=PROD Analyst=ATK_ERROR_=0 _N_=1
Interactions Between the DATA Step and External Files 67
Interactions … External Files Defining the Appearance of the Data
Null Input (and Output)
• The 5th type of input?
• No line pointers,no column pointers,no formats,not even any variables!
Interactions Between the DATA Step and External Files 68
"As I was going up the stair, I met a man who was not there, He wasn't there again today, I wish that man would go away."
Hugh Mearns“… He must be with the CIA."
MAD Magazine
Interactions … External Files Defining the Appearance of the Data
Interactions Between the DATA Step and External Files 69
Interactions … External Files Defining the Appearance of the Data
Null Input (and Output)
Do youremember this
slide from earlier in the
presentation?
Interactions Between the DATA Step and External Files 70
Interactions … External Files Defining the Appearance of the Data
Null Input (and Output)
There is another form of column pointer known as the “trailing @”.
As expected, it is found at the end of a line on an INPUT or PUT statement.
Interactions Between the DATA Step and External Files 71
Interactions … External Files Defining the Appearance of the Data
Null Input (and Output)
Trailing @.
Prevents line and column pointers from automatically moving to the start of next line after completion of INPUT statement.
Line pointer is not moved until next INPUT statement, or next iteration of DATA step.
Interactions Between the DATA Step and External Files 72
Interactions … External Files Defining the Appearance of the Data
Null Input (and Output)
Double Trailing @.
Prevents line and column pointers from automatically moving to the next line after completion of INPUT statement.
Line pointer is not moved until next INPUT statement, or next iteration of DATA step.
Interactions Between the DATA Step and External Files 73
Interactions … External Files Defining the Appearance of the Data
Null Input (and Output)
So … how do you restore the default behavior of the column and line pointer?
You guessed it … a Null INPUT / PUT statement!
Interactions Between the DATA Step and External Files 74
Interactions … External Files Defining the Appearance of the Data
Null Input (and Output)
DATA SAS_Conf07; ARRAY ConfDt (2) Start_Dt End_Dt ; INFILE 'c:\sasconf\sascon07.dat' ; INPUT @ 19 ConfName $char16. @ 36 ConfLoc $char16. @ 1 @ ; DO i = 1 TO 2 ; INPUT ConfDt(i) mmddyy8. +1 @ ; END ; INPUT ;
… … …
Interactions Between the DATA Step and External Files 75
Interactions … External Files Defining the Appearance of the Data
Null Input (and Output)
DATA SAS_Conf07;… … …
PUT @ 1 ConfName $CHAR16. @ 18 Start_Dt mmddyy8. @ 27 End_Dt mmddyy8. @ ; IF MONTH( Start_Dt) ^= MONTH( End_Dt ) THEN PUT ' Span Month' ; ELSE PUT ;RUN ;
Interactions Between the DATA Step and External Files 76
Interactions … External Files Practical application
ASIDE: A little trick to quickly transition from input to output …
193 DATA _NULL_;194 INFILE DATALINES ;195 INPUT ;196 FILE PRINT ;197 PUT _INFILE_;198 DATALINES;
NOTE: 8 lines were written to file PRINT.
No variables listed on
INPUT statement.
Special variable _INFILE_ listed on
PUT statement.Add decision logic for
conditional output.
Interactions Between the DATA Step and External Files 77
Interactions … External Files Practical application
Q: So, which technique is best?
A: That depends on the data.
(You knew I was going to say that.)
Interactions Between the DATA Step and External Files 78
Reading Sequential files:usually Formatted input.
Reading parameters:sometimes Named input,sometimes List input
sometimes Formatted input
Interactions … External Files Practical application
Interactions Between the DATA Step and External Files 79
Writing Sequential files:usually Formatted input.
Writing parameters and debugs:often Named input,sometimes List input
sometimes Formatted input
Interactions … External Files Practical application
?
Interactions Between the DATA Step and External Files 80
Reading / writing directly to / from spreadsheet:• Can use DDE, which is outside the scope of
this presentation.
Using a text file as an interface:• Spreadsheet can read and write to CSV files.• SAS can read and write CSV files, too!
CSV = “Comma Separated Value
Interactions … External Files Practical application
Interactions Between the DATA Step and External Files 81
Reading / writing directly to / from spreadsheet:• Use DSD option on INFILE statement.
– Two or more consecutive delimiters denote missing values. (Normally, extra delimiters are ignored.)
– Default delimiter is comma. (Normally, default delimiter is a space.) This can still be overridden with DELIMITER= option.
DSD = “Delimiter Separated Data”
Interactions … External Files Practical application
Interactions Between the DATA Step and External Files 82
Reading / writing directly to / from spreadsheet:• Use DSD option on INFILE statement.
– Character values are assumed to be enclosed in double-quotes. (Use ~ format modifier if you need to keep the double-quotes as part of the data.)
DSD = “Delimiter Separated Data”
Interactions … External Files Practical application
Interactions Between the DATA Step and External Files 83
Interactions … External Files Defining the Appearance of the Data
04/16/07,04/19/07,"SAS Global Forum","Orlando FL"06/03/07,06/06/07,"PharmaSUG","Denver CO"09/16/07,09/18/07,"PNWSUG","Seattle WA"09/30/07,10/02/07,"SCSUG","Austin TX"10/17/07,10/19/07,"WUSS","San Francisco CA"10/28/07,10/30/07,"MWSUG","Des Moines IA"11/04/07,11/06/07,"SESUG","Hilton Head SC"11/11/07,11/14/07,"NESUG","Baltimore MD"
DSD – Sample Data
File: 'c:\sasconf\sascon07.csv'
Interactions Between the DATA Step and External Files 84
Interactions … External Files Defining the Appearance of the Data
DSD – Wrong Way
17 DATA SAS_Conf07;18 INFILE 'c:\sasconf\sascon07.csv' DSD;19 INPUT Start_Dt mmddyy8.20 End_Dt mmddyy8.21 ConfName $CHAR16.22 ConfLoc $CHAR16.;23 FILE 'c:\sasconf\testout2.csv' DSD;24 PUT ConfName $CHAR16.25 ConfLoc $CHAR16.26 Start_Dt date7.27 End_Dt date7.;28 RUN;
Interactions Between the DATA Step and External Files 85
Interactions … External Files Defining the Appearance of the Data
DSD – Wrong Way
NOTE: Invalid data for End_Dt in line 1 9-16.RULE: ----+----1----+----2----+----3----+---4— 04/16/07,04/19/07,"SAS Global Forum”,O …Start_Dt=16APR2007 End_Dt=. ConfName=7,"SAS Global Fo ConfLoc=rum","Orlando FL _ERROR_=1 _N_=1NOTE: Invalid data for End_Dt in line 2 9-16. 06/03/07,06/06/07,"PharmaSUG","Denver CO“Start_Dt=03JUN2007 End_Dt=. ConfName=7,"PharmaSUG","D ConfLoc=enver CO" _ERROR_=1 _N_=2
etc. etc. etc.
Interactions Between the DATA Step and External Files 86
DSD – Wrong Way
'c:\sasconf\testout1.csv'
7,"SAS Global Forum","Orlando FL16APR07 .7,"PharmaSUG","Denver CO" 03JUN07 .7,"PNWSUG","Seattle WA" 16SEP07 .7,"SCSUG","Austin TX" 30SEP07 .7,"WUSS","San Francisco CA" 17OCT07 .7,"MWSUG","Des Moines IA" 28OCT07 .7,"SESUG","Hilton Head SC" 04NOV07 .7,"NESUG","Baltimore MD" 11NOV07 .
Interactions … External Files Defining the Appearance of the Data
Interactions Between the DATA Step and External Files 87
Interactions … External Files Defining the Appearance of the Data
DSD – Right Way
44 DATA SAS_Conf07;45 INFILE 'c:\sasconf\sascon07.csv' DSD;46 INFORMAT Start_Dt End_Dt mmddyy8.47 ConfName ConfLoc $CHAR16.;48 FORMAT Start_Dt End_Dt date7. ;49 INPUT Start_Dt50 End_Dt 51 ConfName 52 ConfLoc ;53 FILE 'c:\sasconf\testout2.csv' DSD;54 PUT ConfName 55 ConfLoc 56 Start_Dt57 End_Dt ;58 RUN;
Interactions Between the DATA Step and External Files 88
DSD – Right Way
'c:\sasconf\testout2.csv'
SAS Global Forum,Orlando FL,16APR07,19APR07PharmaSUG,Denver CO,03JUN07,06JUN07PNWSUG,Seattle WA,16SEP07,18SEP07SCSUG,Austin TX,30SEP07,02OCT07WUSS,San Francisco CA,17OCT07,19OCT07MWSUG,Des Moines IA,28OCT07,30OCT07SESUG,Hilton Head SC,04NOV07,06NOV07NESUG,Baltimore MD,11NOV07,14NOV07
Interactions … External Files Defining the Appearance of the Data
Interactions Between the DATA Step and External Files 89
Interactions … External Files Practical application
Q: What will the system do ifyou ask for more variablesthan you actually have onyour input data?
A: Whatever you tell it to do.
(You knew I was going to say that, too.)
Interactions Between the DATA Step and External Files 90
INFILE options – End of Input Line
FLOWOVERAllows INPUT to read next data record. NOTE: This is the default option / behavior.
Watch for following in SASLOG:NOTE: SAS went to a new line when INPUT
statement reached past the end of a line.
Interactions … External Files Practical application
Interactions Between the DATA Step and External Files 91
INFILE options – End of Input Line
MISSOVERStops INPUT from reading new data record, remaining variables set to missing.
Watch for following in SASLOG:(nothing – no extra NOTEs, WARNINGs,
or ERRORs will be produced)
Interactions … External Files Practical application
Interactions Between the DATA Step and External Files 92
INFILE options – End of Input Line
TRUNCOVERStops INPUT from reading new data record, remaining variables set to missing.
Watch for following in SASLOG:(nothing – no extra NOTEs, WARNINGs,
or ERRORs will be produced)
Interactions … External Files Practical application
Interactions Between the DATA Step and External Files 93
INFILE options – End of Input Line
MISSOVER vs. TRUNCOVERThe difference between these two options is how they handle values that are in the middle of being read at the end of the line.
MISSOVER sets it to a missing value. TRUNCOVER “settles for what it can get”, and stores as much as is present.
Interactions … External Files Practical application
Interactions Between the DATA Step and External Files 94
INFILE options – End of Input Line
STOPOVERStops INPUT from reading new data record, stops execution of DATA step, sets _ERROR_ = 1.
Watch for following in SASLOG:ERROR: INPUT statement exceeded record
length. INFILE <infile> OPTION STOPOVER specified.
Interactions … External Files Practical application
Interactions Between the DATA Step and External Files 95
INFILE options – End of Input Line
SCANOVER Searches next line for contents of the “@<string>” value.
Watch for following in SASLOG:NOTE: SAS went to a new line when INPUT
@'CHARACTER_STRING' scanned past the end of a line.
Interactions … External Files Practical application
Interactions Between the DATA Step and External Files 96
INFILE options – End of Input Line
TURNOVER• Meeting planners term for moving one
group out of a room and another group in.• Catering term for a type of pastry.
Interactions … External Files Practical application
Interactions Between the DATA Step and External Files 97
Interactions … External FilesConclusion
• 50 minute presentation can only scratch the surface of getting data into SAS and putting data out of SAS.
• You can learn more from a few minutes of independent trial and error than from sitting and listening to me for 50 minutes!
Interactions Between the DATA Step and External Files 98
Interactions … External FilesConclusion
For further information …
Interactions Between the DATA Step and External Files 99
Interactions … External FilesConclusion
Special thanks to the late John Tenniel (1820-1914) for providing the graphics used in this presentation.