Hash, Little Baby. Some examples of SAS programming when hash objects are really helpful

Hash, Little Baby… Some practical examples when SAS hash objects are really helpful Dmitry Shopin Data Analyst BC Centre for Excellence in HIV/AIDS Vancouver SAS User Group, 28 May 2014

Upload: dmitry-shopin

Post on 10-Jul-2015


Page 1: Hash, Little Baby. Some examples of SAS programming when hash object are really helpful

Hash, Little Baby…

Some practical examples when SAS hash objects are really helpful

Dmitry Shopin

Data Analyst

BC Centre for Excellence in HIV/AIDS

Vancouver SAS User Group, 28 May 2014

Page 2

What Are They

SAS Component Objects:

• Hash

• Hash Iterator

• Java Object

• Logger

• Appender

Page 3

What Are They Exactly

[Diagram: a DATASET / PDV row (key, var1, var2, var3, …) and a HASH OBJECT row (key, var1, var2), linked by "Request" and "Return"]

Page 4

How To Work With Hash Objects

• Declare It

• Define It

• Load It

• Access/Change It

Using Dot.Notation:

A=Object.Attribute or RC=Object.Method(tag1:'value1', …)
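The four steps above can be sketched in one small data step. This is only an illustration and is not taken from the talk: the dataset work.codes and the variables code, label and n are made up for the example.

```sas
/* Hypothetical lookup table, invented for this sketch */
data work.codes;
  length code $8 label $20;
  input code $ label $;
  datalines;
A Alpha
B Beta
;
run;

data _null_;
  length code $8 label $20;      /* host variables must exist in the PDV */
  declare hash h();              /* 1. declare it                        */
  h.defineKey('code');           /* 2. define it: the key ...            */
  h.defineData('label');         /*    ... and the data                  */
  h.defineDone();
  do until (eof);                /* 3. load it                           */
    set work.codes end=eof;
    rc = h.add();
  end;
  n = h.num_items;               /* 4. access it: an attribute ...       */
  code = 'B';
  rc = h.find();                 /*    ... and a method returning a code */
  put n= rc= label=;
  stop;                          /* one pass is enough                   */
run;
```

The PUT statement writes the number of items, the return code of find() and the retrieved label to the log.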

Page 5

Hash Objects Classic: Look Up. 1/2

[Diagram: the Patients dataset and the HA lookup dataset are combined into Pat_HA]

Page 6

Hash Objects Classic: Look Up. 2/2

    data pat_ha;
      if _N_=1 then do;
        length ha 8 ha_name $20;     /* creating variables */
        declare hash h();            /* creating the hash: key HA, data HA_NAME */
        h.defineKey('ha');
        h.defineData('ha_name');
        h.defineDone();
        do until(eof);               /* loading the dictionary (HA, HA_NAME) */
          set ha end=eof;
          h.add();
        end;
      end;
      set patient;
      /* a failed look-up must not carry over the previous HA_NAME */
      if h.find() ne 0 then call missing(ha_name);
    run;

Page 7

Interaction between hash object and data

[Diagram in three panels, each with a DATA row (key, var1, var2, var3, …) and a HASH row (key, var1, var2):

• var1 = … changes the value in the data only

• h.find() copies var1 and var2 from the hash object into the data

• h.add() and h.replace() copy var1 and var2 from the data into the hash object]
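The direction of each transfer can be checked with a tiny experiment. This is a minimal sketch, not part of the original slides; the variable names key and var1 simply mirror the diagram labels.

```sas
data _null_;
  length key $1 var1 8;
  declare hash h();
  h.defineKey('key');
  h.defineData('var1');
  h.defineDone();

  key = 'a'; var1 = 1;
  rc = h.add();          /* data -> hash: the hash now holds a -> 1 */

  var1 = 99;             /* an assignment changes the PDV only      */
  rc = h.find();         /* hash -> data: var1 is overwritten       */
  put 'after find: ' var1=;

  var1 = 99;
  rc = h.replace();      /* data -> hash: the hash now holds a -> 99 */
  call missing(var1);
  rc = h.find();         /* hash -> data again                       */
  put 'after replace and find: ' var1=;
  stop;
run;
```

The first PUT shows the value stored by add(), the second the value stored by replace().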

Page 8

Hash Objects Advantages

• Memory resident

• Direct-addressing

• Natural way to sort/get distinct values
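The last point can be illustrated with an ordered hash object: by default, loading a dataset keeps one item per key, and output() writes the items in key order. A minimal sketch, assuming a made-up dataset work.visits_demo with a numeric id; neither dataset name comes from the talk.

```sas
/* Hypothetical input with repeated, unsorted ids */
data work.visits_demo;
  input id @@;
  datalines;
3 1 2 3 1
;
run;

data _null_;
  if 0 then set work.visits_demo;   /* puts ID into the PDV, reads nothing */
  declare hash h(dataset: 'work.visits_demo', ordered: 'a');
  h.defineKey('id');
  h.defineData('id');
  h.defineDone();                   /* loads the data, one item per id     */
  h.output(dataset: 'work.distinct_ids');  /* written in ascending id order */
  stop;
run;
```

No PROC SORT with NODUPKEY is needed here; the trade-off is that all distinct keys have to fit in memory.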

Page 9

Case 1. Dictionary-based replacement. 1/2

Page 10

Case 1. Dictionary-based replacement. 2/2

    data address;
      /* Declares hash object during the 1st iteration only */
      if _N_=1 then do;
        /* Adds variables with all attributes to PDV, but never reads their values */
        if 0 then set dict;
        /* Loads the hash object from the dictionary */
        dcl hash h(dataset:'dict');
        h.defineKey('type');
        h.defineData('abbr');
        h.defineDone();
      end;
      set address;
      do i=1 to 99;
        /* Grabs a word */
        call scan(street_addr, i, position, length);
        if not position then leave;
        /* Extracts a word and puts it into the key variable */
        type=substr(street_addr,position,length);
        rc=h.find();
        /* If found, replaces a word with corresponding abbreviation */
        if rc=0 then substr(street_addr,position,length)=abbr;
      end;
      drop rc type abbr i position length;
    run;

Page 11

Case 2. Multiple counts. 1/2

[Diagram: Episodes of illness + Visits. For each episode of illness, three questions: # of visits? # of visits? # of visits?]

Page 12

Case 2. Multiple counts. 2/2

    data _NULL_;
      if _N_=1 then do;
        length before during after 8;
        dcl hash h();
        h.defineKey('id');
        h.defineData('id','start','end','before','during','after');
        h.defineDone();
        /* Loads data with illness periods */
        do until(eof1);
          set epi end=eof1;
          h.add();
        end;
      end;
      set visits end=eof2;
      rc=h.find();
      /* If current patient found in hash object, increments corresponding counter in data */
      if rc=0 then do;
        select;
          when(visit<start) before+1;
          when(start<=visit<=end) during+1;
          when(end<visit) after+1;
          otherwise;
        end;
        /* Updates current hash object's record */
        h.replace();
      end;
      /* Outputs hash object as a dataset after the last visit has been processed */
      if eof2 then h.output(dataset:'counts');
    run;

Page 13

Case 3. Find Some – Take All. 1/4

Tests

All tests of patients with 2+ tests >50

Page 14

Case 3. Find Some – Take All. 2/4

Hash object with multiple records per key (ID):

    id   vl    date
    1    120   1-Jan-10
    1    50    10-Mar-10
    1    200   17-Jul-10
    1    43    28-Feb-11
    1    40    4-Aug-11
    2    50    13-Apr-12
    2    55    19-Sep-12
    2    45    25-Dec-12
    2    45    21-Jan-13
    3    200   14-Feb-09
    3    230   31-May-09

Hash object with unique IDs:

    id
    1
    2
    3

Page 15

Case 3. Find Some – Take All. 3/4

    data _NULL_;
      if _N_=1 then do;
        if 0 then set tests;

        /* Hash object with multiple data per key. Loads data right away */
        dcl hash h_test(dataset:'tests', multidata:'yes');
        h_test.defineKey('id');
        h_test.defineData(all:'yes');     /* Uses all variables */
        h_test.defineDone();

        /* Hash object with unique IDs as a key. No need for DefineData() */
        dcl hash h_id(dataset:'tests');
        h_id.defineKey('id');

        /* Iterator for the hash object with unique IDs */
        dcl hiter iter_id('h_id');

        h_id.defineDone();
      end;

Page 16

Case 3. Find Some – Take All. 4/4

      …
      /* Finds the first patient, using the iterator of the hash object with unique IDs */
      rc=iter_id.first();

      do while(rc=0);
        /* Finds this patient in the multidata hash object */
        rc2=h_test.find();

        /* Iterates through all visits of the current patient,
           leaving when 2 found or no more visits */
        i=0;
        do while(rc2=0 and i<2);
          if vl>50 then i+1;
          rc2=h_test.find_next();
        end;

        /* If less than 2 visits, deletes all visits from the multidata hash object */
        if i<2 then h_test.remove();

        /* Finds the next patient */
        rc=iter_id.next();
      end;

      h_test.output(dataset:'high_2VL');
    run;

Page 17

Case 4. Breadth First Tree Search. 1/5

[Diagram: a graph of people: John, David, Ken, Chris, Elena, Adam, Fred, Berta, Mary, Peter]

Page 18

Case 4. Breadth First Tree Search. 2/5

Adjacency list (“edges”)

Connected components (“clusters”)

Pages 19 to 26

Case 4. Breadth First Tree Search. 3/5

[Animation over eight slides: the graph of John, David, Ken, Chris, Elena, Adam, Fred, Berta, Mary and Peter, shown together with a "Vertices" list and a "Queue" list]

Page 27

Case 4. Breadth First Tree Search. 4/5

    data _null_;
      /* Hash object for Vertices, with iterator */
      dcl hash V();
      V.defineKey('name');
      V.defineData('name','cluster');
      dcl hiter Vi('V');
      V.defineDone();

      /* Hash object for Edges */
      dcl hash E(dataset:'Connections', multidata:'y');
      E.defineKey('name');
      E.defineData('name','friend');
      E.defineDone();

      /* Hash object for Queue, with iterator */
      dcl hash Q(ordered:'y');
      Q.defineKey('qnum','name');
      Q.defineData('qnum','name');
      dcl hiter Qi('Q');
      Q.defineDone();

      /* Loading the unique names */
      do until(eof);
        set Connections end=eof;
        call missing(cluster);
        rc=V.add();     /* rc absorbs the non-zero code for names already loaded */
      end;

Page 28

      rc1=Vi.first();
      do while(rc1=0);
        /* Selecting next name to start new cluster, when queue is empty */
        if missing(cluster) then do;
          qnum=1; Q.add();
          n+1; cluster=n; V.replace();
          /* Dequeueing all names in queue one-by-one until it's empty */
          rc2=Qi.first();
          do while(rc2=0);
            qnum=qnum+Q.num_items-1;
            /* Enqueueing all connections of dequeued name */
            rc3=E.find();
            do while(rc3=0);
              name=friend;
              rc4=V.find();
              if rc4=0 and missing(cluster) then do;
                qnum+1; Q.add();
                cluster=n; V.replace();
              end;
              rc3=E.find_next();
            end;
            Qi.first(); Qi.delete(); Q.remove();
            Qi=_new_ hiter('Q');
            rc2=Qi.first();
          end;
        end;
        rc1=Vi.next();
      end;
      V.output(dataset:'clusters');
      stop;     /* all the work is done in the first iteration */
    run;

Page 29

Hash More!

?