I have a data set of sales data by day. Unfortunately the names of the columns represent the dates. In order to work with the data, I need to transform the data set so each day represents an observation.
The data set looks something like this:
store _010_05_01 _010_05_02 _010_05_03 ....
1 8 5 6
2 6 9 3
3 7 9 8
....
As you can see, I have a store variable and several variables that are named by the date and contain the number of sales for that day. I need it to look like this:
store date sales
1 5/1/10 8
1 5/2/10 5
1 5/3/10 6
2 5/1/10 6
2 5/2/10 9
2 5/3/10 3
....
Unfortunately this is a pretty common problem when you receive data from vendors who aren't sure how you are going to work with it. Luckily the transformation is pretty easy:
1) data byDay;
2) set myData;
3) array t[*] $ _010: ;
4) do i = 1 to dim(t);
5) sales = t[i];
6) date = vName( t[i] );
7) output;
8) end;
9) run;
This little data step creates an array to hold our date variables. (3)We use a little syntax sugar to keep from having to type out all the variable names _010:. The colon tells the SAS compiler to list out all the variables that have the _010 prefix. (4)Then we loop through all the elements in our t[] array. We use the vName() function to get the name of the variable that t[] is referring to (6). And finally we output once for each iteration of our loop (7).
Obviously if your date columns aren't named uniformly (_010:) then things won't work as nicely. And you would likely want to add a KEEP statement to just keep the variables you are creating along with any others you might be interested in.
If you made it this far, you may have noticed I am not quite finished with my transformation. Line (6) assigns the value of date to something like '_010_05_01'. Not quite what we wanted. We need to turn that into a SAS date. I need to substr() the first two characters off to remove them (_0) and then input() the result using the appropriate informat. So line (6) should really be:
6) date = input( substr( vName( t[i] ), 3), yymmdd8. );
A little more complicated, but hopefully still understandable.
Hopefully you find this useful! Comment with any comments/observations.
Man! I did not know about the ":" part in creating arrays. This will reduce lots of specific coding in my work. Best thing I've Thx Thx Thx!
ReplyDelete