Tuesday, May 29, 2007

Where Did the Observation Come From?

Here is a little snippet of code I created to address the problem of assigning a value to a variable based on which data set an observation came from in a data step. Here is an example:

Suppose I have a whole bunch of data sets, each representing a different country. I want to set a lot of them in one data step and create one resulting data set with a variable called language. In order to create the language variable correctly, I need to know which data set each observation is coming from. Typically we would use the IN= option on each data set to create a flag and then check that flag using if/then logic.


data selectedCountries;
  set
    chile(in=chile)
    china(in=china)
    costa_rica(in=costa)
    egypt(in=egypt)
    fiji(in=fiji)
    turkey(in=turkey)
    usa(in=usa)
    saudi_arabia(in=saudi)
  ;

  if chile then language = 'SPANISH';
  else if china then language = 'CHINESE';
  else if costa then language = 'SPANISH';
  /* ...and so on, one else-if branch for each remaining data set */
run;

One of the major problems with this approach is that it does not scale well. The more countries you set, the longer and more error-prone your if/then logic becomes.

Here is a slightly more elegant solution that uses arrays and variable information functions. You still use the IN= option on each data set, but you name the IN= variable after the value you want to assign. Then you create an array of all those IN= variables. Finally, you loop through the array and check each variable's boolean value. If it is true, you assign your new variable the value returned by the vname() function, which is the name of the IN= variable itself.

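data selectedCountries;
  /* vname() returns a long string by default, so fix the length up front */
  length language $ 7;
  set
    chile(in= SPANISH)
    china(in= CHINESE)
    costa_rica(in= SPANISH)
    egypt(in= ARABIC)
    fiji(in= ENGLISH)
    turkey(in= TURKISH)
    usa(in= ENGLISH)
    saudi_arabia(in= ARABIC)
  ;
  array names[*] SPANISH CHINESE ARABIC ENGLISH TURKISH;
  do i = 1 to dim(names);
    /* the flag that is set names the language for this observation */
    if names[i] eq 1
      then language = vname( names[i] );
  end;
  drop i;  /* keep the loop counter out of the output data set */
run;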
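
For readers on SAS 9.2 or later, another route to the same result is the INDSNAME= option on the SET statement, which stores the name of the contributing data set in a variable. This is a different technique from the array approach above, offered only as a rough sketch: the $lang format and the variable name source are hypothetical, and the code assumes all of the country data sets live in the WORK library.

```sas
/* Hypothetical lookup format: maps a data set name to a language. */
proc format;
  value $lang
    'CHILE', 'COSTA_RICA'   = 'SPANISH'
    'CHINA'                 = 'CHINESE'
    'EGYPT', 'SAUDI_ARABIA' = 'ARABIC'
    'FIJI', 'USA'           = 'ENGLISH'
    'TURKEY'                = 'TURKISH'
    other                   = ' ';
run;

data selectedCountries;
  length source $ 41 language $ 7;
  set chile china costa_rica egypt fiji turkey usa saudi_arabia
      indsname=source;
  /* source holds libref.member, such as WORK.CHILE; keep the member name */
  language = put(scan(source, 2, '.'), $lang.);
run;
```

With this layout, adding a country means adding it to the SET statement and to the format, with no change to the assignment logic.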

Wednesday, May 23, 2007

Saving Time

When I was a kid, my brother, sister, and I spent a lot of time in my father's dental lab. This gave us a unique opportunity to learn how to get things done in a time-sensitive production environment. The more business he got and the more successful his practice became, the more demanding his lab work became. He spent a lot of time working in the lab perfecting techniques and efficiency. We kids would hang out in his dental lab looking for things to do, and he would hand out miscellaneous tasks to us (sadly, he kept the N2O locked away from us). As we got older and more proficient at working the lathe, drill, sand blaster, oven, etc., we would get more critical tasks. Spending time with Dad meant spending time learning how to get things done in a fast-paced, hands-on environment.

One thing Dad would always repeat to us is how important it is to get things done "quickly and correctly."

Just getting things done quickly won't cut it. And believe it or not, just getting things done correctly doesn't cut it either. Not if you have other steps in the process or customers waiting on you to complete your task. In order to have time in this life for things other than work, it helps to learn how to get things done both quickly and correctly.

Most people think that working quickly means producing sloppy work. But actually, you can get things done quickly with FEWER mistakes. The trick is to separate tasks into two categories: things that should be done very quickly, and things that should be done very correctly. When you get good at cutting down the time it takes you to do the miscellaneous tasks, you can spend more time getting the critical tasks done correctly. This type of thinking translates very well to programming. It has probably helped my career more than any other single piece of advice I have received.

So as you spend your day programming, think to yourself, "what are the non-critical tasks that I am having to do and how can I minimize them?" Believe it or not, with just a few small changes you can find yourself getting a lot more done.

Here is an example of a change that I have recently incorporated. If you are like me, you probably have a few folders on your hard drive that you are constantly having to access. Throughout my day I am constantly typing something like "c:\my data\reports\ad hoc\" into Save As and Open dialog boxes, Windows Explorer, etc. In Windows you can create an environment variable to stand in for the path. So in my example I might create a Windows environment variable named R (for reports) that has the value "c:\my data\reports\ad hoc\". Now I can just type %R% to navigate to that folder. This saves time and frees my mind to focus on tasks more critical than navigating Windows Explorer.

I believe I got that tip from http://www.lifehack.org/. It's a great site full of useful tips for minimizing the clutter so you can focus on getting things done quickly and correctly.

Thursday, May 17, 2007

LRECL

Here is a SAS trick that is especially useful for Windows users. By default, SAS on Windows writes external files with a logical record length of 256. This means that if you are creating a flat file with records (lines) longer than 256 characters, the lines are going to wrap. You can tell SAS exactly how long to make the record length on the filename statement. The option is lrecl= (logical record length) and it looks like this:

filename myFile "c:\some directory\some file.txt" LRECL= 400;

Then you can write lines to that file that are up to 400 characters long without fear of the line wrapping.
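
As a quick way to see the option in action, here is a minimal sketch that writes a single 300-character line to that file. The data step, the variable name longLine, and the 300-character length are illustrative assumptions, not part of the original tip.

```sas
filename myFile "c:\some directory\some file.txt" lrecl=400;

data _null_;
  file myFile;
  length longLine $ 300;
  longLine = repeat('x', 299);  /* 'x' plus 299 repeats = 300 characters */
  put longLine;
run;
```

Opening the file afterward should show the 300 characters on one line. With the 256 default in effect instead, the same put statement would be expected to split the record.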