1 EPIB 698C Lecture 3 Raul Cruz-Cano Fall 2011


EPIB 698C Lecture 3

Raul Cruz-Cano Fall 2011


Creating and Redefining Variables

• You can create and redefine variables with assignment statements of the form: variable = expression;

Type of expression      Example
Numeric constant        Age = 10;
Character constant      Gender = 'Female';
An existing variable    Age = age_at_baseline;
Addition                Age = age_at_baseline + 10;

Vector Notation


Home gardener's data

• Gardeners were asked to estimate the pounds they harvested of four crops: tomatoes, zucchini, peas and grapes. Here is the data:

Name    Tomato  Zucchini  Peas  Grapes
Gregor  10      2         40    0
Molly   15      5         10    1000
Luther  50      10        15    50
Susan   20      0         .     20

• Tasks:
   - add a new variable group with a value of 14;
   - add a variable type to indicate home gardener;
   - create a new variable zucchini_1 equal to zucchini*10;
   - derive the total pounds of crops for each gardener;
   - derive the % of tomatoes for each gardener


Home gardener's data

DATA homegarden;
   INFILE 'C:\garden.txt';
   INPUT Name $ 1-7 Tomato Zucchini Peas Grapes;
   Group = 14;
   Type = 'home';
   Zucchini_1 = Zucchini * 10;
   Total = Tomato + Zucchini_1 + Peas + Grapes;
   PerTom = (Tomato / Total) * 100;
RUN;


Home gardener's data

• Check the log window for the note: "Missing values were generated as a result of performing an operation on missing values."

• The last gardener (Susan) has a missing value for peas, so Total and PerTom, which are calculated from peas, are also set to missing.


SAS functions

• SAS has over 400 functions, with the following general form:

Function-name (argument, argument, …)

• All functions must have parentheses, even if they do not require any arguments.

• Example:
   X = INT(LOG(10));
   Mean_score = MEAN(score1, score2, score3);

• The MEAN function returns the mean of its non-missing arguments. This differs from adding the arguments and dividing by their number, which returns a missing value if any argument is missing.
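As a rough illustration of the difference between MEAN and a hand-computed average, the sketch below calculates both for two records. The data set name mean_demo and the two inline records are made up for this example and are not part of the lecture data.

```sas
/* Sketch only: mean_demo and its two records are illustrative assumptions. */
DATA mean_demo;
   INPUT score1 score2 score3;
   Mean_fn     = MEAN(score1, score2, score3);     /* averages the non-missing scores */
   Mean_manual = (score1 + score2 + score3) / 3;   /* missing if any score is missing */
   DATALINES;
80 90 100
80 . 100
;
RUN;

PROC PRINT DATA=mean_demo;
RUN;
```

For the second record, Mean_fn is 90 (the average of 80 and 100), while Mean_manual is missing and the log shows the "Missing values were generated" note.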


Common Functions And Operators

• Functions:
   ABS: absolute value
   EXP: exponential
   LOG: natural logarithm
   MAX and MIN: maximum and minimum
   SQRT: square root
   SUM: sum of variables
   Example: SUM(OF x1-x10, x21)

• Arithmetic operators: +, -, *, /, ** (exponentiation; not ^)


More SAS functions

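Function name   Example                                  Result
Max             Y = MAX(1, 3, 5);                        Y = 5
Round           Y = ROUND(1.236, 0.01);                  Y = 1.24
Sum             Y = SUM(1, 3, 5);                        Y = 9
Length          a = 'my cat'; Y = LENGTH(a);             Y = 6
Trim            a = 'my '; b = 'cat'; Y = TRIM(a)||b;    Y = 'mycat'

Note: the second argument of ROUND is the rounding unit, so ROUND(1.236, 0.01) rounds to the nearest hundredth.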


Selected date functions

Function   Description                                                 Example              Result
TODAY      Returns the current date                                    X = TODAY();         Today's date
QTR        Returns the quarter of the year from a SAS date value       X = QTR(366);        1
MONTH      Returns the month from a SAS date value                     X = MONTH(366);      1
DAY        Returns the day of the month from a SAS date value          X = DAY(369);        4
MDY        Returns a SAS date value from month, day and year inputs    X = MDY(1, 1, 60);   0


Working with SAS Date

• A SAS date is a numeric value equal to the number of days since Jan. 1, 1960. For example:

Date            SAS date value
Jan. 1, 1959    -365
Jan. 1, 1960    0
Jan. 1, 1961    366
Jan. 1, 2003    15706
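The values in the table can be checked with a short DATA step. This is only a sketch: the data set name date_demo and the variable names are invented for illustration.

```sas
/* Sketch only: date_demo and its variable names are illustrative assumptions. */
DATA date_demo;
   d1959 = MDY(1, 1, 1959);    /* -365                                  */
   d1960 = MDY(1, 1, 1960);    /* 0                                     */
   d1961 = MDY(1, 1, 1961);    /* 366, because 1960 was a leap year     */
   d2003 = '01JAN2003'd;       /* date literal, stored as 15706         */
   shown = d2003;
   FORMAT shown DATE9.;        /* same number, displayed as 01JAN2003   */
RUN;

PROC PRINT DATA=date_demo;
RUN;
```

A format changes only how a date is displayed. The stored value stays numeric, so dates can be subtracted or compared like any other number.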


Example: pumpkin carving contest data

• The data contain each contestant's name, age, type of pumpkin (carved or decorated), date of entry and the scores from 5 judges. The values sit in fixed columns:

Alicia Grossman  13 c 10-28-2003 7.8 6.5 7.2 8.0 7.9
Matthew Lee       9 D 10-30-2003 6.5 5.9 6.8 6.0 8.1
Elizabeth Garcia 10 C 10-29-2003 8.9 7.9 8.5 9.0 8.8
Lori Newcombe     6 D 10-30-2003 6.7 5.6 4.9 5.2 6.1
Jose Martinez     7 d 10-31-2003 8.9 9.510.0 9.7 9.0
Brian Williams   11 C 10-29-2003 7.8 8.4 8.5 7.9 8.0

• We will derive the mean score using the MEAN function
• Transform the values of Type to upper case
• Get the day of the month from the SAS date


Example: pumpkin carving contest data

DATA contest;
   INFILE 'C:\pumpkin.txt';
   INPUT Name $16. Age Type $ @23 Date MMDDYY10.
         (Scr1 Scr2 Scr3 Scr4 Scr5) (4.1);
   AvgScore = MEAN(Scr1, Scr2, Scr3, Scr4, Scr5);
   DayEntered = DAY(Date);
   Type = UPCASE(Type);
RUN;


Using IF-THEN statement

• The IF-THEN statement is used for conditional processing. Example: you want to derive mean test scores for female students but not for male students, so the mean is computed only when gender = 'F'.

• Syntax: IF condition THEN action;
   Eg: IF gender = 'F' THEN mean_score = MEAN(scr1, scr2);


Using IF-THEN statement

List of logical comparison operators

Logical comparison          Mnemonic   Symbol
Equal to                    EQ         =
Not equal to                NE         ^= or ~=
Less than                   LT         <
Less than or equal to       LE         <=
Greater than                GT         >
Greater than or equal to    GE         >=
Equal to one in a list      IN

Note: A missing numeric value is treated as smaller than any number, so a comparison such as Age < 20 is true when Age is missing.


Using IF-THEN statement

• Example: We have data containing the following information on subjects:

Age  Gender  Midterm  Quiz  FinalExam
21   M       80       B-    82
20   F       90       A     93
35   M       87       B+    85
48   F       80       C     76
59   F       95       A+    97
15   M       88       C     93

• Task: Group the students by age (<20, [20-40), [40-60), >=60)


DATA conditional;
   INPUT Age Gender $ Midterm Quiz $ FinalExam;
   DATALINES;
21 M 80 B- 82
20 F 90 A 93
35 M 87 B+ 85
48 F 80 C 76
59 F 95 A+ 97
15 M 88 C 93
;

DATA new1;
   SET conditional;
   IF Age < 20 THEN AgeGroup = 1;
   IF 20 <= Age < 40 THEN AgeGroup = 2;
   IF 40 <= Age < 60 THEN AgeGroup = 3;
   IF Age >= 60 THEN AgeGroup = 4;
RUN;


Multiple conditions with AND and OR

• Syntax: IF condition1 AND condition2 THEN action;
• Eg:
   IF age < 40 AND gender = 'F' THEN group = 1;
   IF age < 40 OR gender = 'F' THEN group = 2;


IF-THEN statement, multiple conditions

• Example: We have data containing the following information on subjects:

Age  Gender  Midterm  Quiz  FinalExam
21   M       80       B-    82
20   F       90       A     93
35   M       87       B+    85
48   F       80       C     76
59   F       95       A+    97
15   M       88       C     93

• Task: Group the students by age (<40, >=40) and gender


DATA new1;
   SET conditional;
   IF age < 40 AND gender = 'F' THEN group = 1;
   IF age >= 40 AND gender = 'F' THEN group = 2;
   IF age < 40 AND gender = 'M' THEN group = 3;
   IF age >= 40 AND gender = 'M' THEN group = 4;
RUN;


• Note: A missing numeric value is treated as smaller than any number, so a condition such as Age < 20 is true when Age is missing.

• Example: group age into age groups when some ages are missing

Age  Gender  Midterm  Quiz  FinalExam
21   M       80       B-    82
20   F       90       A     93
.    M       87       B+    85
48   F       80       C     76
59   F       95       A+    97
.    M       88       C     93
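One possible way to work through this example is sketched below. The data set names conditional_miss and new_miss and the variable NaiveGroup are assumptions made for illustration; the guard on missing ages is one option among several, not necessarily the approach used in the lecture.

```sas
/* Sketch only: data set names and NaiveGroup are illustrative assumptions. */
DATA conditional_miss;
   INPUT Age Gender $ Midterm Quiz $ FinalExam;
   DATALINES;
21 M 80 B- 82
20 F 90 A 93
. M 87 B+ 85
48 F 80 C 76
59 F 95 A+ 97
. M 88 C 93
;
RUN;

DATA new_miss;
   SET conditional_miss;
   /* Unguarded: a missing Age satisfies Age < 20, so NaiveGroup becomes 1 */
   IF Age < 20 THEN NaiveGroup = 1;
   /* Guarded: exclude missing ages from the lowest group */
   IF Age NE . AND Age < 20 THEN AgeGroup = 1;
   IF 20 <= Age < 40 THEN AgeGroup = 2;
   IF 40 <= Age < 60 THEN AgeGroup = 3;
   IF Age >= 60 THEN AgeGroup = 4;
RUN;

PROC PRINT DATA=new_miss;
RUN;
```

In the printed output the two subjects with missing ages have NaiveGroup = 1 but a missing AgeGroup. Only the lowest range needs the guard, because a missing value can never satisfy a condition with a lower bound such as 20 <= Age.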


Multiple actions with Do, end

• Syntax:
   IF condition THEN DO;
      action1;
      action2;
   END;

• Eg:
   IF age <= 20 THEN DO;
      group = 1;
      exam_date = "Monday";
   END;


IF-THEN statement, with multiple actions

• Example: We have data containing the following information on subjects:

Age  Gender  Midterm  Quiz  FinalExam
21   M       80       B-    82
20   F       90       A     93
35   M       87       B+    85
48   F       80       C     76
59   F       95       A+    97
15   M       88       C     93

• Task: Group the students by age, and assign a test date based on the age group
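A minimal sketch of this task follows. The age cutoff of 40, the day names and the names new2, AgeGroup and ExamDate are assumptions chosen for illustration, since the slide does not state them.

```sas
/* Sketch only: the cutoff, day names and variable names are illustrative assumptions. */
DATA conditional;
   INPUT Age Gender $ Midterm Quiz $ FinalExam;
   DATALINES;
21 M 80 B- 82
20 F 90 A 93
35 M 87 B+ 85
48 F 80 C 76
59 F 95 A+ 97
15 M 88 C 93
;
RUN;

DATA new2;
   SET conditional;
   LENGTH ExamDate $ 9;          /* long enough for the longest day name below */
   IF Age < 40 THEN DO;          /* two actions for the younger group */
      AgeGroup = 1;
      ExamDate = 'Monday';
   END;
   IF Age >= 40 THEN DO;         /* two actions for the older group */
      AgeGroup = 2;
      ExamDate = 'Wednesday';
   END;
RUN;

PROC PRINT DATA=new2;
RUN;
```

The LENGTH statement matters here: without it, the first assignment would fix the length of ExamDate at 6 characters and 'Wednesday' would be truncated.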


IF-THEN/ELSE statement

• Syntax:
   IF condition1 THEN action1;
   ELSE IF condition2 THEN action2;
   ELSE IF condition3 THEN action3;

• The IF-THEN/ELSE statement has two advantages over a series of IF-THEN statements:
   (1) It is more efficient and uses less computing time, because once a condition is true the remaining ELSE conditions are not evaluated.
   (2) The ELSE logic ensures that your groups are mutually exclusive, so that no observation is put into more than one group.


IF-THEN/ELSE statement

DATA new1;
   SET conditional;
   IF Age < 20 THEN AgeGroup = 1;
   ELSE IF Age >= 20 AND Age < 40 THEN AgeGroup = 2;
   ELSE IF Age >= 40 AND Age < 60 THEN AgeGroup = 3;
   ELSE IF Age >= 60 THEN AgeGroup = 4;
RUN;

DATA contest;
   INFILE 'C:\pumpkin.txt';
   INPUT Name $16. Age Type $ @23 Date MMDDYY10.
         (Scr1 Scr2 Scr3 Scr4 Scr5) (4.1);
   /* Contestants aged 10 or under are averaged over the first two scores only */
   IF Age <= 10 THEN mean_score = MEAN(Scr1, Scr2);
   ELSE mean_score = MEAN(Scr1, Scr2, Scr3, Scr4, Scr5);
   AvgScore = MEAN(Scr1, Scr2, Scr3, Scr4, Scr5);
   DayEntered = DAY(Date);
   Type = UPCASE(Type);
RUN;


The IN operator

• If you want to test whether a value is one of several possible choices, you can join multiple comparisons with OR, like this:
   IF grade = 'A' OR grade = 'B' OR grade = 'C' THEN pass = 'yes';

• An alternative is to use the IN operator. The values in the list can be separated by blanks or commas:
   IF grade IN ('A' 'B' 'C') THEN pass = 'yes';
   IF grade IN ('A', 'B', 'C') THEN pass = 'yes';
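The IN operator can be tried on a few records as in the sketch below. The data set name grades, the student names and the ELSE branch that assigns 'no' are assumptions added for illustration.

```sas
/* Sketch only: the grades data set and its three records are illustrative assumptions. */
DATA grades;
   INPUT Name $ Grade $;
   LENGTH Pass $ 3;                                  /* room for 'yes' and 'no' */
   IF Grade IN ('A', 'B', 'C') THEN Pass = 'yes';
   ELSE Pass = 'no';
   DATALINES;
Ann A
Bob D
Cy C
;
RUN;

PROC PRINT DATA=grades;
RUN;
```

Ann and Cy get Pass = 'yes' and Bob gets Pass = 'no'. Character comparisons are case sensitive, so a grade entered as 'a' would not match 'A' unless it is first converted with UPCASE.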


Simplifying programs with Arrays

• A SAS array is a collection of elements (usually SAS variables) that lets you write SAS statements referencing the whole group of variables.

• Arrays are defined using the ARRAY statement:
   ARRAY name (n) variable-list;

   name: a name you give to the array
   n: the number of variables in the array

• Eg: ARRAY store (4) macys sears target costco;

   store(1) is the variable macys
   store(2) is the variable sears


Simplifying programs with Arrays

• A radio station is conducting a survey asking people to rate 10 songs. The rating is on a scale of 1 to 5, with 1 = Do not like the song and 5 = Like the song.

• If the listener does not want to rate a song, he or she enters a "9" to indicate a missing value.

• Here is the data with location, listener's age and ratings for the 10 songs:

Albany         54 4 3 5 9 9 2 1 4 4 9
Richmond       33 5 2 4 3 9 2 9 3 3 3
Oakland        27 1 3 2 9 9 9 3 4 2 3
Richmond       41 4 3 5 5 5 2 9 4 5 5
Berkeley       18 3 4 9 1 4 9 3 9 3 2

• We want to change 9 to missing values (.)


Simplifying programs with Arrays

DATA songs;
   INFILE 'F:\radio.txt';
   INPUT City $ 1-15 Age domk wj hwow simbh kt aomm libm tr filp ttr;
   ARRAY song (10) domk wj hwow simbh kt aomm libm tr filp ttr;
   /* Recode 9 to missing in each of the 10 song variables */
   DO i = 1 TO 10;
      IF song(i) = 9 THEN song(i) = .;
   END;
RUN;


Using shortcuts for lists of variable names

• When writing SAS programs, we often need to write a list of variable names. When a data set has many variables, a shortcut for lists of variable names is helpful.

• Numbered range list: variables whose names start with the same characters and end with consecutive numbers can be written as a numbered range list.

• Eg: these two statements are equivalent:
   INPUT cat8 cat9 cat10 cat11;
   INPUT cat8 - cat11;


Using shortcuts for lists of variable names

• Name range list: a name range list depends on the internal order, or position, of the variables in a SAS data set. This is determined by the order in which the variables appear in the DATA step.

• Eg:
   DATA new;
      INPUT x1 x2 y2 y3;
   RUN;

• Then the internal order is: x1 x2 y2 y3
• The shortcut for this variable list is x1 -- y3 (two hyphens)
• PROC CONTENTS with the POSITION option can be used to find out the internal order
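The name range list can be tried out as in the sketch below. The single data record and the variable Total are made up for illustration. The slide names the POSITION option; VARNUM is used here on the assumption that a recent SAS release is in use, where it lists the variables in their internal order.

```sas
/* Sketch only: the data record and Total are illustrative assumptions. */
DATA new;
   INPUT x1 x2 y2 y3;
   Total = SUM(OF x1--y3);   /* name range list: x1, x2, y2, y3 */
   DATALINES;
1 2 3 4
;
RUN;

/* List the variables by position in the data set rather than alphabetically */
PROC CONTENTS DATA=new VARNUM;
RUN;
```

Total is 10 for this record. Because Total is created after y3, it sits last in the internal order and is not part of the range x1--y3.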


Using shortcuts for lists of variable names

DATA songs;
   INFILE 'C:\radio.txt';
   INPUT City $ 1-15 Age domk wj hwow simbh kt aomm libm tr filp ttr;
   ARRAY new (10) Song1 - Song10;   /* numbered range list: creates Song1 to Song10 */
   ARRAY old (10) domk -- ttr;      /* name range list: the 10 original ratings */
   DO i = 1 TO 10;
      IF old(i) = 9 THEN new(i) = .;
      ELSE new(i) = old(i);
   END;
   AvgScore = MEAN(OF Song1 - Song10);
RUN;