lec5 student

Post on 08-Apr-2018

230 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

8/6/2019 Lec5 Student

http://slidepdf.com/reader/full/lec5-student 1/28

Concatenating SAS Data Sets

8/6/2019 Lec5 Student

http://slidepdf.com/reader/full/lec5-student 2/28

2

Define concatenation.

Use the SET statement in a DATA step toconcatenate two or more SAS data sets.

Use the RENAME= data set option to change thenames of variables.

Use the SET and BY statements in a DATA step tointerleave two or more SAS data sets.

Objectives

8/6/2019 Lec5 Student

http://slidepdf.com/reader/full/lec5-student 3/28

3

DATA SAS-data-set ;SET SAS-data-set1 SAS-data-set2 . . .;<other SAS statements >

RUN;

Concatenating SAS Data SetsUse the SET statement in a DATA step to concatenate

SAS data sets.

8/6/2019 Lec5 Student

http://slidepdf.com/reader/full/lec5-student 4/28

4

You can read any number of SAS data sets with a single

SET statement.

SAS data sets

jan

feb

 mar

data work.qtr1;set work.jan work.feb

work.mar;

run;

janfeb

 mar

work.qtr1

...

Concatenating SAS Data Sets

8/6/2019 Lec5 Student

http://slidepdf.com/reader/full/lec5-student 5/28

5

Two SAS data sets, na1 and na2, contain data for newly

hired navigators. Concatenate the data sets into a newdata set named newhires.

 Name

Gender

JobCodeTORRES M 

 NA1

LANG

F

 NA1

SMITH F

 NA1

 Name

Gender JobCodeLISTER M 

 NA2

TORRES F NA2

na1 na2

Business Task

8/6/2019 Lec5 Student

http://slidepdf.com/reader/full/lec5-student 6/28

6

 Name

Gender JobCodeTORRES M 

 NA1

LANG

F NA1

SMITH F NA1

LISTER 

 M 

 NA2

TORRES F NA2

newhires

Concatenating SAS Data Sets: Execution

When SAS reaches end-of-file on the last data set,

DATA step execution ends.

8/6/2019 Lec5 Student

http://slidepdf.com/reader/full/lec5-student 7/287

Two SAS data sets, fa1 and fa2, contain data for newly

hired flight attendants. Concatenate the data sets into anew data set named newfa.

 Name Gender JobCode

KENT F FA1PATEL M  FA1JONES F FA1

 Name JCode Gender

LOPEZ FA2 FGRANT FA2 F

fa1 fa2

Business Task

8/6/2019 Lec5 Student

http://slidepdf.com/reader/full/lec5-student 8/288

 Name Gender JobCode JCode

KENT F FA1PATEL M  FA1JONES F FA1LOPEZ F FA2GRANT F FA2

data newfa;set fa1 fa2;

run;

Concatenating SAS Data Sets: Execution

 Name Gender JobCodeKENT F FA1PATEL M  FA1JONES F FA1

 Name JCode GenderLOPEZ FA2 FGRANT FA2 F

newfa

...

fa1 fa2

8/6/2019 Lec5 Student

http://slidepdf.com/reader/full/lec5-student 9/289

You can use a RENAME= data set option to change the

name of a variable.

General form of the RENAME= data set option:

SAS-data-set (RENAME=(old-name-1=new-name-1

old-name-2=new-name-2 .

.

.

old-name-n=new-name-n ))

The RENAME= Data Set Option

8/6/2019 Lec5 Student

http://slidepdf.com/reader/full/lec5-student 10/2810

 Name Gender JobCode Name Gender JobCode

KENT F FA1PATEL M  FA1

JONES F FA1

LOPEZ F FA2

GRANT F FA2

data newfa;set fa1 fa2(rename=(JCode=JobCode));run;

 Name Gender JobCodeKENT F FA1PATEL M  FA1JONES F FA1

 Name JCode GenderLOPEZ FA2 FGRANT FA2 F

newfa

The RENAME= Data Set Option

...

fa1 fa2

8/6/2019 Lec5 Student

http://slidepdf.com/reader/full/lec5-student 11/2811

Use the SET statement with a BY statement in a DATA

step to interleave SAS data sets.

General form of a DATA step interleave:

DATA SAS-data-set ;

SET SAS-data-set1 SAS-data-set2 . . .;BY BY-variable ;<other SAS statements >

RUN;

Interleaving SAS Data Sets

8/6/2019 Lec5 Student

http://slidepdf.com/reader/full/lec5-student 12/2812

Interleaving SAS data sets simply concatenates SAS data

sets so the observations in the resulting data set are inorder.

Interleaving SAS Data Sets

data work.allemp;

set ia.miamiemp

ia.parisemp

ia.romeemp; by ID;

run;

work.allemp

ID Salary

109 36000171 54000

ID Salary083 87000

217 42000

ID Salary059 60000

154 88000

ia.miamiemp

ia.parisemp

ia.romeemp

ID Salary059 60000083 87000109 36000

154 88000171 54000217 42000

c08s1d4 ...

8/6/2019 Lec5 Student

http://slidepdf.com/reader/full/lec5-student 13/2813

 Name Gender JobCode

GRANT F FA2

JONES F FA1

KENT F FA1

LOPEZ F FA2

PATEL M  FA1

data newfa;

set fa1 fa2(rename=(JCode=JobCode)); by Name;run;

newfa

Interleaving SAS Data Sets

 Name Gender JobCodeJONES F FA1KENT F FA1PATEL F FA1

 Name JCode GenderGRANT FA2 FLOPEZ FA2 F

...

fa1 fa2

8/6/2019 Lec5 Student

http://slidepdf.com/reader/full/lec5-student 14/28

Merging SAS Data Sets

8/6/2019 Lec5 Student

http://slidepdf.com/reader/full/lec5-student 15/2815

Prepare data for merging using the SORT procedureand data set options.

Merge SAS data sets on a single common variable.

Objectives

8/6/2019 Lec5 Student

http://slidepdf.com/reader/full/lec5-student 16/2816

Merging SAS Data SetsUse the MERGE statement in a DATA step to join

corresponding observations from two or more SAS datasets.

DATA SAS-data-set ;MERGE SAS-data-sets; 

BY BY-variable(s);<other SAS statements> 

RUN;

8/6/2019 Lec5 Student

http://slidepdf.com/reader/full/lec5-student 17/28

17

SAS data sets

data compare; merge costs sales goals taxes; by Month;

run;

sales goalscosts taxes

compare

goals taxescosts sales

Merging SAS Data SetsYou can read any number of SAS data sets with a single

MERGE statement.

8/6/2019 Lec5 Student

http://slidepdf.com/reader/full/lec5-student 18/28

18

International Airlines iscomparing monthly salesperformance to monthly salesgoals.

The sales and goals data arestored in separate SAS datasets.

Business Task

8/6/2019 Lec5 Student

http://slidepdf.com/reader/full/lec5-student 19/28

19

To calculate the difference between revenues and goals, the

 performance and goals data sets must be merged.

 Month Sales

1 2118223

2 1960034

 Month Goal

1 2130000

2 1920000

ia.performance ia.goals

Match-merge the data sets by Month and compute the

difference between the variable

values for Sales and Goal.

 Month Sales Goal Difference

1 2118223 2130000 -11777

2 1960034 1920000 40034

ia.compare

Business Task

8/6/2019 Lec5 Student

http://slidepdf.com/reader/full/lec5-student 20/28

20

Merge two data sets to acquire the names of the German

crew who are scheduled to fly next week.

ia.gercrew ia.gersched 

To match-merge the data sets by EmpID, the data sets

must be ordered by EmpID.

EmpID FlightNum 

E04064 5105

E00632 5250E01996 5501

EmpID LastName

E00632 STRAUSS

E01483 SCHELL-HAUNGSE01996  WELLHAEUSSER 

E04064  WASCHK 

Business Task

8/6/2019 Lec5 Student

http://slidepdf.com/reader/full/lec5-student 21/28

21

data work.nextweek;

 merge ia.gercrew work.gersched; by EmpID;

run;

EmpID FlightNum 

E00632 5250

E01996 5501

E04064 5105

EmpID LastName

E00632 STRAUSS

E01483 SCHELL-HAUNGS

E01996  WELLHAEUSSER 

E04064  WASCHK 

work.

gersched 

ia.

gercrew

EmpID LastName FlightNum 

E00632 STRAUSS 5250

E01483 SCHELL-HAUNGS

E01996  WELLHAEUSSER  5501

E04064  WASCHK  5105

work.nextweek

...

Merging SAS Data: Execution

8/6/2019 Lec5 Student

http://slidepdf.com/reader/full/lec5-student 22/28

22

Exclude from the data set crew members who are not

scheduled to fly next week.

ia.gercrew work.gersched EmpID LastName

E00632 STRAUSS

E01483 SCHELL-HAUNGS

E01996  WELLHAEUSSER 

E04064  WASCHK 

EmpID FlightNum 

E00632 5250

E01996 5501

E04064 5105

Eliminating Nonmatches

8/6/2019 Lec5 Student

http://slidepdf.com/reader/full/lec5-student 23/28

23

General form of the IN= data set option:

Use the IN= data set option to determine which data

set(s) contributed to the current observation.

Variable is a temporary numeric variable that hastwo possible values:

SAS-data-set (IN=variable )

0 indicates that the data set did not contribute to thecurrent observation.

1 indicates that the data set did contribute to the currentobservation.

The IN= Data Set Option

8/6/2019 Lec5 Student

http://slidepdf.com/reader/full/lec5-student 24/28

24

data work.combine;

 merge ia.gercrew(in=InCrew)

work.gersched(in=InSched); by EmpID;

run;

EmpID LastName

E00632 STRAUSS

E01483 SCHELL-HAUNGSE01996  WELLHAEUSSER 

E04064  WASCHK 

PDV

EmpID FlightNum 

E00632 5250

E01996 5501E04064 5105

work.gersched 

ia.gercrew

EmpID LastName FlightNum InCrew InSched 

E00632 STRAUSS 5250 1 1

DD

...

The IN= Data Set Option

8/6/2019 Lec5 Student

http://slidepdf.com/reader/full/lec5-student 25/28

25

Other MergesIn addition to one-to-one merges, the DATA step merge

works with many other kinds of data combinations:one-to-many unique BY values are in one

data set and duplicatematching BY values are in the

other data set.

many-to-many duplicate matching BY valuesare in both data sets.

8/6/2019 Lec5 Student

http://slidepdf.com/reader/full/lec5-student 26/28

26

work.two

X Z1 A1

1 A2

2 B1

3 C13 C2

work.one

X Y Z

1 A   A1

1 A   A2

2 B B1

3 C C1

3 C C2

work.threedata work.three;

 merge work.one work.two;

 by X;run;

...

X Y1 A 

2 B

3 C

One-to-Many Merging

8/6/2019 Lec5 Student

http://slidepdf.com/reader/full/lec5-student 27/28

27

work.three

X Y

1 A1

1 A2

2 B1

2 B2

X Z

1 AA1

1 AA2

1 AA3

2 BB1

2 BB2

data work.three; merge work.one work.two;

 by X;run;

X Y Z

1 A1  AA11 A2  AA2

1 A2  AA3

2 B1 BB1

2 B2 BB2

...

Many-to-Many Mergingwork.twowork.one

8/6/2019 Lec5 Student

http://slidepdf.com/reader/full/lec5-student 28/28

28

Exercises

1. Concatenating SAS Data Sets2. Merging SAS Data Sets

3. Identifying Data Set Contributors

top related