lec5 student
Post on 08-Apr-2018
230 Views
Preview:
TRANSCRIPT
8/6/2019 Lec5 Student
http://slidepdf.com/reader/full/lec5-student 1/28
Concatenating SAS Data Sets
8/6/2019 Lec5 Student
http://slidepdf.com/reader/full/lec5-student 2/28
2
Define concatenation.
Use the SET statement in a DATA step toconcatenate two or more SAS data sets.
Use the RENAME= data set option to change thenames of variables.
Use the SET and BY statements in a DATA step tointerleave two or more SAS data sets.
Objectives
8/6/2019 Lec5 Student
http://slidepdf.com/reader/full/lec5-student 3/28
3
DATA SAS-data-set ;SET SAS-data-set1 SAS-data-set2 . . .;<other SAS statements >
RUN;
Concatenating SAS Data SetsUse the SET statement in a DATA step to concatenate
SAS data sets.
8/6/2019 Lec5 Student
http://slidepdf.com/reader/full/lec5-student 4/28
4
You can read any number of SAS data sets with a single
SET statement.
SAS data sets
jan
feb
mar
data work.qtr1;set work.jan work.feb
work.mar;
run;
janfeb
mar
work.qtr1
...
Concatenating SAS Data Sets
8/6/2019 Lec5 Student
http://slidepdf.com/reader/full/lec5-student 5/28
5
Two SAS data sets, na1 and na2, contain data for newly
hired navigators. Concatenate the data sets into a newdata set named newhires.
Name
Gender
JobCodeTORRES M
NA1
LANG
F
NA1
SMITH F
NA1
Name
Gender JobCodeLISTER M
NA2
TORRES F NA2
na1 na2
Business Task
8/6/2019 Lec5 Student
http://slidepdf.com/reader/full/lec5-student 6/28
6
Name
Gender JobCodeTORRES M
NA1
LANG
F NA1
SMITH F NA1
LISTER
M
NA2
TORRES F NA2
newhires
Concatenating SAS Data Sets: Execution
When SAS reaches end-of-file on the last data set,
DATA step execution ends.
8/6/2019 Lec5 Student
http://slidepdf.com/reader/full/lec5-student 7/287
Two SAS data sets, fa1 and fa2, contain data for newly
hired flight attendants. Concatenate the data sets into anew data set named newfa.
Name Gender JobCode
KENT F FA1PATEL M FA1JONES F FA1
Name JCode Gender
LOPEZ FA2 FGRANT FA2 F
fa1 fa2
Business Task
8/6/2019 Lec5 Student
http://slidepdf.com/reader/full/lec5-student 8/288
Name Gender JobCode JCode
KENT F FA1PATEL M FA1JONES F FA1LOPEZ F FA2GRANT F FA2
data newfa;set fa1 fa2;
run;
Concatenating SAS Data Sets: Execution
Name Gender JobCodeKENT F FA1PATEL M FA1JONES F FA1
Name JCode GenderLOPEZ FA2 FGRANT FA2 F
newfa
...
fa1 fa2
8/6/2019 Lec5 Student
http://slidepdf.com/reader/full/lec5-student 9/289
You can use a RENAME= data set option to change the
name of a variable.
General form of the RENAME= data set option:
SAS-data-set (RENAME=(old-name-1=new-name-1
old-name-2=new-name-2 .
.
.
old-name-n=new-name-n ))
The RENAME= Data Set Option
8/6/2019 Lec5 Student
http://slidepdf.com/reader/full/lec5-student 10/2810
Name Gender JobCode Name Gender JobCode
KENT F FA1PATEL M FA1
JONES F FA1
LOPEZ F FA2
GRANT F FA2
data newfa;set fa1 fa2(rename=(JCode=JobCode));run;
Name Gender JobCodeKENT F FA1PATEL M FA1JONES F FA1
Name JCode GenderLOPEZ FA2 FGRANT FA2 F
newfa
The RENAME= Data Set Option
...
fa1 fa2
8/6/2019 Lec5 Student
http://slidepdf.com/reader/full/lec5-student 11/2811
Use the SET statement with a BY statement in a DATA
step to interleave SAS data sets.
General form of a DATA step interleave:
DATA SAS-data-set ;
SET SAS-data-set1 SAS-data-set2 . . .;BY BY-variable ;<other SAS statements >
RUN;
Interleaving SAS Data Sets
8/6/2019 Lec5 Student
http://slidepdf.com/reader/full/lec5-student 12/2812
Interleaving SAS data sets simply concatenates SAS data
sets so the observations in the resulting data set are inorder.
Interleaving SAS Data Sets
data work.allemp;
set ia.miamiemp
ia.parisemp
ia.romeemp; by ID;
run;
work.allemp
ID Salary
109 36000171 54000
ID Salary083 87000
217 42000
ID Salary059 60000
154 88000
ia.miamiemp
ia.parisemp
ia.romeemp
ID Salary059 60000083 87000109 36000
154 88000171 54000217 42000
c08s1d4 ...
8/6/2019 Lec5 Student
http://slidepdf.com/reader/full/lec5-student 13/2813
Name Gender JobCode
GRANT F FA2
JONES F FA1
KENT F FA1
LOPEZ F FA2
PATEL M FA1
data newfa;
set fa1 fa2(rename=(JCode=JobCode)); by Name;run;
newfa
Interleaving SAS Data Sets
Name Gender JobCodeJONES F FA1KENT F FA1PATEL F FA1
Name JCode GenderGRANT FA2 FLOPEZ FA2 F
...
fa1 fa2
8/6/2019 Lec5 Student
http://slidepdf.com/reader/full/lec5-student 14/28
Merging SAS Data Sets
8/6/2019 Lec5 Student
http://slidepdf.com/reader/full/lec5-student 15/2815
Prepare data for merging using the SORT procedureand data set options.
Merge SAS data sets on a single common variable.
Objectives
8/6/2019 Lec5 Student
http://slidepdf.com/reader/full/lec5-student 16/2816
Merging SAS Data SetsUse the MERGE statement in a DATA step to join
corresponding observations from two or more SAS datasets.
DATA SAS-data-set ;MERGE SAS-data-sets;
BY BY-variable(s);<other SAS statements>
RUN;
8/6/2019 Lec5 Student
http://slidepdf.com/reader/full/lec5-student 17/28
17
SAS data sets
data compare; merge costs sales goals taxes; by Month;
run;
sales goalscosts taxes
compare
goals taxescosts sales
Merging SAS Data SetsYou can read any number of SAS data sets with a single
MERGE statement.
8/6/2019 Lec5 Student
http://slidepdf.com/reader/full/lec5-student 18/28
18
International Airlines iscomparing monthly salesperformance to monthly salesgoals.
The sales and goals data arestored in separate SAS datasets.
Business Task
8/6/2019 Lec5 Student
http://slidepdf.com/reader/full/lec5-student 19/28
19
To calculate the difference between revenues and goals, the
performance and goals data sets must be merged.
Month Sales
1 2118223
2 1960034
Month Goal
1 2130000
2 1920000
ia.performance ia.goals
Match-merge the data sets by Month and compute the
difference between the variable
values for Sales and Goal.
Month Sales Goal Difference
1 2118223 2130000 -11777
2 1960034 1920000 40034
ia.compare
Business Task
8/6/2019 Lec5 Student
http://slidepdf.com/reader/full/lec5-student 20/28
20
Merge two data sets to acquire the names of the German
crew who are scheduled to fly next week.
ia.gercrew ia.gersched
To match-merge the data sets by EmpID, the data sets
must be ordered by EmpID.
EmpID FlightNum
E04064 5105
E00632 5250E01996 5501
EmpID LastName
E00632 STRAUSS
E01483 SCHELL-HAUNGSE01996 WELLHAEUSSER
E04064 WASCHK
Business Task
8/6/2019 Lec5 Student
http://slidepdf.com/reader/full/lec5-student 21/28
21
data work.nextweek;
merge ia.gercrew work.gersched; by EmpID;
run;
EmpID FlightNum
E00632 5250
E01996 5501
E04064 5105
EmpID LastName
E00632 STRAUSS
E01483 SCHELL-HAUNGS
E01996 WELLHAEUSSER
E04064 WASCHK
work.
gersched
ia.
gercrew
EmpID LastName FlightNum
E00632 STRAUSS 5250
E01483 SCHELL-HAUNGS
E01996 WELLHAEUSSER 5501
E04064 WASCHK 5105
work.nextweek
...
Merging SAS Data: Execution
8/6/2019 Lec5 Student
http://slidepdf.com/reader/full/lec5-student 22/28
22
Exclude from the data set crew members who are not
scheduled to fly next week.
ia.gercrew work.gersched EmpID LastName
E00632 STRAUSS
E01483 SCHELL-HAUNGS
E01996 WELLHAEUSSER
E04064 WASCHK
EmpID FlightNum
E00632 5250
E01996 5501
E04064 5105
Eliminating Nonmatches
8/6/2019 Lec5 Student
http://slidepdf.com/reader/full/lec5-student 23/28
23
General form of the IN= data set option:
Use the IN= data set option to determine which data
set(s) contributed to the current observation.
Variable is a temporary numeric variable that hastwo possible values:
SAS-data-set (IN=variable )
0 indicates that the data set did not contribute to thecurrent observation.
1 indicates that the data set did contribute to the currentobservation.
The IN= Data Set Option
8/6/2019 Lec5 Student
http://slidepdf.com/reader/full/lec5-student 24/28
24
data work.combine;
merge ia.gercrew(in=InCrew)
work.gersched(in=InSched); by EmpID;
run;
EmpID LastName
E00632 STRAUSS
E01483 SCHELL-HAUNGSE01996 WELLHAEUSSER
E04064 WASCHK
PDV
EmpID FlightNum
E00632 5250
E01996 5501E04064 5105
work.gersched
ia.gercrew
EmpID LastName FlightNum InCrew InSched
E00632 STRAUSS 5250 1 1
DD
...
The IN= Data Set Option
8/6/2019 Lec5 Student
http://slidepdf.com/reader/full/lec5-student 25/28
25
Other MergesIn addition to one-to-one merges, the DATA step merge
works with many other kinds of data combinations:one-to-many unique BY values are in one
data set and duplicatematching BY values are in the
other data set.
many-to-many duplicate matching BY valuesare in both data sets.
8/6/2019 Lec5 Student
http://slidepdf.com/reader/full/lec5-student 26/28
26
work.two
X Z1 A1
1 A2
2 B1
3 C13 C2
work.one
X Y Z
1 A A1
1 A A2
2 B B1
3 C C1
3 C C2
work.threedata work.three;
merge work.one work.two;
by X;run;
...
X Y1 A
2 B
3 C
One-to-Many Merging
8/6/2019 Lec5 Student
http://slidepdf.com/reader/full/lec5-student 27/28
27
work.three
X Y
1 A1
1 A2
2 B1
2 B2
X Z
1 AA1
1 AA2
1 AA3
2 BB1
2 BB2
data work.three; merge work.one work.two;
by X;run;
X Y Z
1 A1 AA11 A2 AA2
1 A2 AA3
2 B1 BB1
2 B2 BB2
...
Many-to-Many Mergingwork.twowork.one
8/6/2019 Lec5 Student
http://slidepdf.com/reader/full/lec5-student 28/28
28
Exercises
1. Concatenating SAS Data Sets2. Merging SAS Data Sets
3. Identifying Data Set Contributors
top related