Google SAS Search

Add to Google

Monday, July 26, 2010

Thanks LabSug!

Thank you Los Angeles Basin SAS User Group for listening to my talk on SAS MACRO: Beyond The Basics.

The room provided by RAND was excellent. Everything was well organized and the day went without a hitch. The audience was very engaged and had great questions (definitely not the sleepy group I was expecting right after lunch!).

Overall I had a great time, and look forward to getting the chance to speak again!

Monday, July 19, 2010

I have a data set of sales data by day. Unfortunately the names of the columns represent the dates. In order to work with the data, I need to transform the data set so each day represents an observation.

The data set looks something like this:


store _010_05_01 _010_05_02 _010_05_03 ....
1 8 5 6
2 6 9 3
3 7 9 8
....


As you can see, I have a store variable and several variables that are named by the date and contain the number of sales for that day. I need it to look like this:

store date sales
1 5/1/10 8
1 5/2/10 5
1 5/3/10 6
2 5/1/10 6
2 5/2/10 9
2 5/3/10 3
....


Unfortunately this is a pretty common problem when you receive data from vendors who aren't sure how you are going to work with it. Luckily the transformation is pretty easy:


1) data byDay;
2) set myData;
3) array t[*] $ _010: ;
4) do i = 1 to dim(t);
5) sales = t[i];
6) date = vName( t[i] );
7) output;
8) end;
9) run;


This little data step creates an array to hold our date variables. (3)We use a little syntax sugar to keep from having to type out all the variable names _010:. The colon tells the SAS compiler to list out all the variables that have the _010 prefix. (4)Then we loop through all the elements in our t[] array. We use the vName() function to get the name of the variable that t[] is referring to (6). And finally we output once for each iteration of our loop (7).

Obviously if your date columns aren't named uniformly (_010:) then things won't work as nicely. And you would likely want to add a KEEP statement to just keep the variables you are creating along with any others you might be interested in.

If you made it this far, you may have noticed I am not quite finished with my transformation. Line (6) assigns the value of date to something like '_010_05_01'. Not quite what we wanted. We need to turn that into a SAS date. I need to substr() the first two characters off to remove them (_0) and then input() the result using the appropriate informat. So line (6) should really be:


6) date = input( substr( vName( t[i] ), 3), yymmdd8. );


A little more complicated, but hopefully still understandable.

Hopefully you find this useful! Comment with any comments/observations.