Google SAS Search

Add to Google

Wednesday, December 21, 2005

Quantum missing

The other day I was pulling into one of my local surfing spots to watch one of the larger swells of the season roll through and I got a call on the cell phone from my wife. After exchanging the usual pleasentries she asked me if I knew how to get the maximum missing value out of a set of missing values. WTF?
For reasons that only really really smart people can comprehend, her organization sometimes uses different "values" of missing to enumerate different reasons for the missing value. Are you with me? Something along the lines of:


select( q1a )
when('ONE') q1s = 1 ;
when('TWO') q1s = 2 ;
when('DK') q1s = .d ;
when('RF') q1s = .r ;
otherwise q1s = . ;
end;


Ostensibly this is used for creating the variables that will eventually be used in statistics.
Cause as missings, they are automatically excluded by proc calculations; but you can still see "why" it's missing. I have to say, I don't know if it's a neat-o trick or a classic abuse of language potential (just cause you CAN do it doesn't mean it's a GOOD idea), but my wife is much more pragmatic than I and didn't care to hear my theoretical musings on storing multiple values in what most of us would consider a single value-- missing. She just wanted to know: did I have an answer?

Well, the max() function returns a missing value if any of it's arguments are missing so that was no good. But as luck would have it the max operator will treat a missing value as a real value. (Which seems quite opposite of the sum() function and + operator. . .) And not only that, but the SAS documentation actually includes an order of missing values. Going smallest to largest:
._
.
.A - .Z

So the simple answer to her complicated question was:

maxMissingValue = m1 <> m2 <> m3 ; 


Of course, I did not know the answer when she called and at the time I was much more interested in the overhead bombs that were breaking outside. I mean, these were some seriously mind numbing big waves. . . So I made some lame joke about how you can't know the missing value until you collapse the missing value potential wave into a singularity event through observation, blah blah blah, uh I have no idea honey.

Thursday, December 01, 2005

Goto Memories

December already?! Seems like only yesterday I was dressing my little baby up for her first Halloween. And now we've already entered the *most wonderful time of the year*. A lot of people complain about the stores setting up their Christmas displays too early. Me? I like it. Bring on the displays! I like Christmas. I like winter. I like the holidays. I just wish I could get over this head cold. You know when you get a cold and it seems like your nose will never return to normal? Like you can't even remember what it was like to not have a cough? I hate having a cold.

When I was a kid my brother and I saved all our money one summer and bought ourselves a Commodore 64. That was the best Christmas ever. I would stay up all night programming in BASIC to get a smiley face sprite to bounce around the screen. I had to save my programs to cassette tape because we hadn't bought a disk drive yet. They were really expensive back then. I wish I had one of those cassette tapes now. If you played it back in the stereo it would make this weird analog warbly noise. I'd love to see what my BASIC code looked like. It'd probably be incomprehensible to me now. I remember I used to like cramming as many statements as possible onto one line and I think I used a lot of GOTO statements. Terrible, terrible bad habits for a 10 year old to be picking up!

So where am I going with this? This is a SAS blog after all, not a commodore blog. I was supposed to share some nugget of wisdom about programming in SAS but instead started rambling about BASIC and GOTO statements. Must be the cough syrup.

I still use GOTO statements today. Do you? They can come in quite handy for SAS/MACRO. Consider the following code. I know it's not a new technique, but it's useful and worth sharing.


* just two little data sets to work with;
data base;
input key;
datalines;
1
2
3
4
;

data newRecords;
input key;
datalines;
1
2
;

*************************************;

%macro earlyTermination();

%* Suppose you wanted to merge some data and see if
there were any records that didnt match. Then you
want to do some processing on those non-matching records.
Otherwise if there were no non-matches you dont want
to do any more processing.;

proc sort data = base;
by key;
run;

proc sort data = newRecords;
by key;
run;

data nonMatches;
merge base(in=base)
newRecords(in=newRecords);
by key;
if newRecords and not base then output;
run;

%* check to see if there are any records in nonMatches;
%let dsid = %sysfunc( open( nonMatches ) );
%let nobs = %sysfunc( attrn( &DSID, nobs ) );
%let rc = %sysfunc( close( &DSID ) );

%if &NOBS = 0 %then %goto done;

%* otherwise do some processing with the
nonMatches
.
.
.;

%put There were &NOBS non-matching records;

%done:
%mend earlyTermination;

%earlyTermination;


In this case we goto an empty label. But there could have been some statements after %DONE. It's important to note however, that the %DONE label will be executed NO MATTER WHAT.

Happy programming.

Tuesday, November 29, 2005

fEqual() Compares Floats For Equality

Here's a SAS function I wrote in C using SAS/TOOLKT that addresses the floating point equality problem previously discussed here.

It is pretty straightforward and uses the algorithm based on this SAS TS Note.

%MACRO FUZZCOMP(X,Y,EPS=1E-12);
(ABS(&X-&Y) LE &EPS*MAX(ABS(&X),ABS(&Y)))
%MEND;


You can find it and a couple other functions I have written here.

As an aside, is there any interest out there in knowing how to write user-written functions in C for the SAS System using SAS/TOOLKT? I know SAS/TOOLKT isn't the *sexiest* SAS product (that would be JMP), but it can be quite useful writing specific functions in a lower-level language like C. If there is any interest I could put something formal together for a paper or even just put a tutorial up on the web.

Thursday, November 10, 2005

De-dupe In Excel

A lot of SAS programmers have to deliver data in Excel every once in a while. Sometimes, after you've gotten the data into Excel you find that you need to get rid of duplicates. Here's how to do it in Excel:

1) Be sure your columns are named.

2) Highlight the columns you want to use as your sort key (the ones you would use in your BY statement for PROC SORT).

3) From the drop-down menu go to Data->Filter->Advanced Filter.

4) Excel will automatically select the range you have highlighted. You should see a little checkbox that says "Unique Records Only." Check that.

5) Hit OK.

Wednesday, November 02, 2005

What Every Computer Scientist Should Know About Floating-Point Arithmetic

So I've been thinking about floating point numbers recently. Mostly I've been thinking about comparing floating numbers for _relative_ equality. I know this certainly isn't a new issue for most, especially if you have worked with a "lower" level language like C/C++, but for the average SAS programmer it may come as a surprise that 7.4 may not = 7.4! In fact the rules of real numbers dictate that 7.4 can never = 7.4 since they are both approximations ( or shorthand ) for an infinetly precise number with decimal places stretching from here to Mars and back again ad infinitum.

In the general world we don't really care that much about floating point inequality because our precision, or more specifically lack-of-precision, makes it a moot point.

But in the world of computers and real numbers precision is always an issue. As anyone who has been unfortunate enough to write code such as this has (painfully) learned:


// add .10 cents rebate to the customers account till they have reached
// the rebate maximum
// called by perVisit() function

const float REBATE_MAXIMUM = 2.5; // $2.50 rebate max

void addRebate( Customer &c)
{
if ( c.accumulatedRebate == REBATE_MAXIMUM ) return;
else
{
// add the ten cents to their account and update their accumulated rebate
// so they do not go over
c.account += .1;
c.accumulatedRebate += .1;
}
}

Now who's going to explain to the CEO why all the 3rd quarter revenue got eliminated in massive rebates? GULP.

Hopefully you recognize the error in the above code? Since the values being compared are floats they are not really 2.50 but really something closer to 2.50000000007 or 2.5000000000001 or well _anything_ once you get past the signifigant digits of 2.50.

But the above code is C++ and as a SAS programmer you don't have to worry about those kinds of hairy details right? Try this code from Data Savant Consulting(which has a nice page discussing this very issue):


data _null_;
x = 7.3;
x = x+ 0.1;
y = 7.4;
if x = y then put "Duh! of course they are equal.";
else put "Doooh! " x " and " y " are not equal!!";
run;

Then go read this!
What Every Computer Scientist Should Know About Floating-Point Arithmetic
Or if you don't have the time to grind through that, just remember comparing floats for equality is not usually a good idea.

Friday, October 28, 2005

The Sum Function

Today's function is very straightforward. It adds numbers.

num = sum( argument, argument, ... );

So straightforward in fact, that some of you new to SAS may be wondering why you might even need such a simple function? There's certainly nothing wrong with the old "+ sign" right? Well, the sum() function can do something that the plus sign cannot do. And that is treat missing values as if they are 0. That can be a very important distinction if there is the possibility of adding variables that may contain missing values.

Consider the following data step:


data _null_;
a = 1;
b = 2;
c = .; * our missing value;

r1 = a + b + c;
r2 = sum( a, b, c );
put r1=;
put r2=;
run;

The value of r1 will be missing since the plus operator returns missing if one of it's arguments is missing. The value for r2 will be 3 since it treats the missing value as if it were a 0

Friday, October 07, 2005

Another SAS Function Friday

Friday again already?! Doesn't it seem like time speeds up around Autumn? Something about the shortening days, the changing weather, the new TV line-up, the expectation of the holiday season soon approaching... I dunno, maybe it's just me?

For some crazy reason all this ruminating on days shortening kinda reminds me of one of my favorite SAS functions: intnx(). How's that for a weak tie-in? :)

Intnx() is used to increment a SAS date/time/datetime value by a given interval and returns a SAS date/time/datetime value. The following syntax should be enough to get you started, refer to SAS documentation for more details:

dt = intnx( 'INTERVAL', dateTime, increment <,alignment> );
Where dt is the date/time/datetime value returned,
INTERVAL is a time interval (WEEK, MONTH, HOUR, etc),
dateTime is a SAS date/time/datetime value,
increment is a positive or negative integer which specifies the number of intervals to shift the value,
and alignment controls the position of the shifted value within the interval (BEGINNING|MIDDLE|END|SAMEDAY). Default is beginning.

Got it? How about an example:


data _null_;
* take todays date and shift it forward two months;
thisDay = today();
forward2Months = intnx('MONTH', thisDay, 2, 'SAMEDAY');
put forward2Months= mmddyy10.;
run;

We specified SAMEDAY as the fourth argument instead of letting it default to BEGINNING which would have given us a date of 12/01/2005 instead of 12/07/2005.

Ready to test out your new function? I've always been confused about what day is the first day of the week? A quick google search gets me a very infomative page which states:

The Bible clearly makes the Sabbath the last day of the week, but does not share how that corresponds to our 7 day week. Yet through extra-biblical sources it is possible to determine that the Sabbath at the time of Christ corresponds to our current 'Saturday.' Therefore it is common Jewish and Christian practice to regard Sunday as the first day of the week (as is also evident from the Portuguese names for the week days). However, the fact that, for example, Russian uses the name "second" for Tuesday, indicates that some nations regard Monday as the first day.

In international standard ISO-8601 the International Organization for Standardization (ISO) has decreed that Monday shall be the first day of the week.


So for you SAS trivia buffs out there, figure out which day SAS considers to be the first day of the week. Does it follow the Judeo-Christian standard of Sunday? Or bend to the ISO-8601 standard of Monday?

Happy Friday!

Wednesday, October 05, 2005

A Word Counting Function For SAS

After some tinkering and toying around, I was finally able to negotiate the programming obstacle course known as SAS/TOOLKIT(R). SAS/TOOLKIT allows you to create user-written procedures, formats, informats and functions for the SAS system. I stuck to just writing a function.

The function is written in C and compiles to a dll file that you put into a SAS -PATH directory. Then you can use it just like any base SAS function. You do not need to have SAS/TOOLKIT in order to use the compiled dll file. So feel free to download the dll and use the function.

The dll file is hosted on my company site pelicanprogramming.com at
http://www.pelicanprogramming.com/sas/wcount.zip

Save the dll file after you download the zip. You can either save it to one of your SAS -PATH directories or update your SAS config file. If you don't know how to (or don't feel comfortable) updating your SAS config file, just save the dll file to
C:\Program Files\SAS\SAS 9.1\core\sasexeThis is where SAS finds most of it's own modules so adding this one in shouln't hurt anything. ;)

Oh yeah, it only works with SAS 9.13 on Windows PC SAS. Sorry SAS 8ers. Sorry Unixers.

The function is named wcount() and it counts the number of words in a string. The syntax is:

int = wcount( string <,char> );

Where int is the number of words,
string is the character string you are counting the words in,
and char is an optional second argument to specify the word delimiter. The default delimiter is a space. Actually wcount() defines a space as any "white" space: space, tab, carriage-return, newline, vertical tab and form-feed.

An example:


data _null_;
wordCount = wCount('this is my test string ');
put wordCount=;
run;

This would replace the following traditional SAS code:

data _null_;
* count the number of words in a string;
string = "this is my string of words";
wordCount = 1;
do while ( scan( string, wordCount) ^= '');
wordCount+1;
end;
wordCount+(-1);
put wordCount=;
run;

So, if you've got version 9 PC SAS and don't mind messing around a little bit, download the file and test it out. If you have any specific problems you can find my e-mail address in the readMe file included in the zip.

Friday, September 30, 2005

SAS Function Friday

In an attempt to increase my blog posting rate I will be highlighting a SAS function every (hopefully) Friday. At least ever Friday that I have access to a computer.

There's so many SAS functions, where do we start? Do we go for one of the more obscure ones in order to start things off with a little razzle dazzle? Something like CALL PEEK or CALL POKE. Cause it's always useful to know the address of your SAS variable. Actually, it can be very useful, just not very often. How about something a little more pedestrian?

Everybody knows all about the COMPRESS() function, right? It's not very razzle dazzle but it's darn useful. Compress() can be used to get rid of......
specific CHARACTERS in a character string!

Notice I did not say spaces. Although, a lot of times it's used just to get rid of spaces. That's the default for the optional second parameter.


data _null_;
x = 'get rid of spaces';
x = compress( x );

y = 'get_rid of spaces_and_ underscores';
y = compress( y, ' _' );
put x = ;
put y = ;
run;

One question you should always have when approaching a SAS character function is what is the length of the return value? In other words, if you create a new variable, what is it's length going to be?

data _null_;
x = "Woohoo! It's Friday!";
y = compress( x );
run;

What is the length of y? Eight (the "default" for a SAS variable)? Eighteen (the length of Woohoo!It'sFriday with the spaces comressed out)? Twenty (the length of the variable x)?

Of course you knew the answer is 20. Woohoo! smart reader.

Wednesday, September 28, 2005

Nifty Informat

What do you do if you are reading a date from a file and the owner of the file suddenly decides to change the format of the date?

Use the new anydtdte. informat.

From the SAS documentation:
Reads and extracts date values from DATE, DATETIME, DDMMYY, JULIAN, MMDDYY, MONYY, TIME, YYMMDD, or YYQ informat values


Nice.

Monday, September 26, 2005

A New SAS Blog To Check Out

It used to be that you could not find many resources for SAS on the web. Except for SAS-L and the odd university stuff, there just wasn't much out there. Thankfully, that's changing. For the SAS novice there is now more SAS related material on-line than ever before. Here's a new SAS blog that's aimed at spreading some SAS knowledge. Take a moment and check it out. And maybe even show some support with a comment or two. . . :)

SAS programming from scratch - by STANSI

Monday, August 29, 2005

Function WishList

I am looking for ideas for functions that you would like in SAS but currently do not exist. I have a couple of ideas such as:

Count the number of words in a string (or items in a list).

How about a function to quote the words in a character string?

A function to determine if a number is prime or not?

Would you like a function that returns an MD5 hash from its input?

And I looked through a couple of the SASWARE ballots and found some contenders.
Such as, provide a function that returns the ordinal of a word in a string, such as
WORDINDEX("abc de def", "def")=3

Do you have any other ideas? If so, please let me know.

Tuesday, August 16, 2005

Macro comments

%macro aQuestion;

%* this code doesn't have any problems does it?;

%put Why are macro comments handled so badly?;
%put I mean, we are at version 9 here people.;
%put I wonder what version number will handle macro comments correctly?;
%put Version 10, 12, 14...?;
%put I wonder at what version number I will finally remember to stop using contractions in my comments!;

%mend aQuestion;

%aQuestion;

Tuesday, August 09, 2005

BLM

Recently I got an email from someone that thought I might be interested in an article about the SAS Institute in some online industry newsletter. I might have signed up and read the article if the executive summary didn't read like it was written by a 12 year-old making fun of her dad:

"SAS: Striving to Sustain Leadership
by P.J. Jakovljevic
SAS Institute has been successful, moving beyond a business intelligence. Lately, it has lately focused on sustaining its technology leadership, expanding in some vertical markets, and becoming more attentive to the low-end market."

Again, I couldn't bring myself to actually create a login to read it, but this other executive summary on the same site seems like it could be worth the read. If only to convince myself that most people in this industry are truly bat crazy.
"BLM: Buzzword Life Cycle Management
William Sheppard - August 6, 2005

Executive Summary

The IT industry is alive with buzzwords. The management of buzzwords represents a significant area of improvement for both the buzzword users (BU, for example vendors, analyst and consultants) and buzzword consumers (BC, mostly end users). Buzzword life cycle management (BLM) is a proven discipline being applied to this crying need within the software industry..."

My IT skillz must be going soft cause I've never taken part in the buzzword life cycle discipline. But I've got a hunch that waterfalls are involved.

Check it out.

Thursday, July 28, 2005

Homegrown Solution

I finally got around to putting together a little utility to encrypt SAS script/source files. Of course, since SAS won't put out any API's or SDK it doesn't hook into the system directly. You just use an unnamed pipe on a filename statement to use it. The concept is pretty straightforward.

The utility is called Fugu. You can read more about it and download it here. I did a quick search of google before writing it and couldn't find a utility that is small, fast, easy to use and writes to STDOUT. Maybe someone else will find it useful?

Currently I am only offering up a windows binary version, but I also put the source code up there so you can compile it for your own platform. I did compile it using gcc on my linux box and it worked fine.

Speaking of APIs and SDKs, does anyone have the version 6 SAS/TOOLKIT book? I looked all over the place and couldn't find anything useful in online documentation. It is all additional notes to the version 6 book. So, if you've got the book and it's collecting dust, I'll pay for the shipping...

Wednesday, July 13, 2005

An Answer?

If anyone is interested, to keep hardcoded usernames and passwords from sitting in your SAS/Connect script, the SAS Institute recommends putting macro vars into the connect script and assigning those macro vars in a compiled data step. Straightforward and easy to use.

Check it out:
FAQ #1800

As an aside, if you've got > 1800 frequently asked questions, can you really describe them as being frequently asked? Someone out there is asking a lot of questions. Frequently.

Wednesday, June 22, 2005

find . -exec fgrep -i "passw" '{}' \; -print

Do you use SAS/Connect? I do. I think it works really well. I've used it for many years in many situations and never found it coming up short. I would even go so far as to say it's one of the few things in SAS that is straightforward, stable and a pleasure to use.

Do you hardcode usernames/passwords into your connect scripts? Sometimes circumstances dictate it. Do you use pass-through SQL? Do you use SAS/Access libname statements? Do you think it might be a bad thing to have usernames/passwords sitting around in code?

This is something I've thought about before, and it just came up at work recently so I'm thinking about it again. As far as I know there is no facility in SAS to encrypt/decrypt script files during the SAS session (in this case, I am considering a "script file" to be anything not compiled: base sas, connect script, config files, etc). Does anyone know if there is such a mechanism?

I think I could write something to accomplish this, though it would be a little kludgy since there are no api hooks into the SAS system internals. But then, what's a little kludge between SAS programmers?

Tuesday, June 14, 2005

Pre-Cambrian Code

How old is the SAS code you are working with? I am working with an application that was built from an application that was built from an application, etc. A lot of times you can tell the code was just taken from what was previously working and slapped into the newer version. And there are some crazy fossils to be found in that old code. Take this one for example:

proc sortt data =

This was showing up in just enough places to get me curious. It couldn't just be a common misspelling, could it? Nope.

Now the question is: Should I leave it there for someone else to find in a couple of years, or change it?

Monday, May 16, 2005

Bad Design

So here I am happily working along in SAS/AF updating an old V6 frame: Make some buttons, a couple listboxes, look up successful return codes for assigning a libref/fileref/opening a data set/ opening a directory/ etc, display a couple rows from a sas data set. Easy, easy, check, easy. Display rows, no problem. I'll just use the new-in-v8 (and somewhat obtuse) model/viewer relationship with a table viewer control. Drop the table viewer on my frame, go to the properties frame to adjust the attributes (that one always cracks me up), stick the SAS data set model on there, associate it with my viewer via the viewers.model attribute/property. No problem. Compile. Run. Click my View Items button. Go get more coffee as SAS does some crazy dog slooooooooooooooow operations to display the data set in the viewer. What the heck?

Quick search on support.sas.com and...

Slow performance initializing data with SAS Data Set Model
If you are displaying a data set with a large number of columns (ie. 100
or more) via a SAS Data Set model that is attached to a Table Viewer or
Form Viewer control in a SAS/AF FRAME entry, you may see slow
performance when initializing the viewer. This is a design issue and
may be alleviated if you instead display your data set through the
Version 6 legacy classes, Data Table or Data Form respectively.


Now I don't have any say in the number of columns in the displayed data set. And I agree, it smacks of bad design somewhere along the line when a SAS data set grows outrageously longitudanal (it's called normalizing people.)

But to call using V8 objects instead of legacy V6 objects bad design? Seems kinda weak if you ask me.

Wednesday, April 13, 2005

noFmtError For Me

You know those little SAS options that come in super useful about once every nine months, but you can never remember the name cause you almost never need to use it? So you spend half a day pulling your hair out trying to remember what the little bugger is called and scrolling through the millions of results that the SAS online doc returns from your futile keyword search (another topic for another day).

Well, here's one that came up today:

options fmtError|nofmtError;

This tells SAS not to worry if it can't find a format that a data set variable is associated with.

I'll end up using this when someone passes me a data set they have created, but I don't have access to the formats they are using.

Tuesday, April 05, 2005

SHELLacked

A couple months ago the company I do a lot of work for got a nifty new Unix server to replace the old one. Woohoo! It's fast. Screamin' fast. Now we can be even more productive than ever!

Except it doesn't have bash. Ugh. Unix without bash is like chocolate syrup without the chocolate. I don't really know what that is, but it certainly isn't good.

Something strange happens to my fingers at the standard prompt. They forgt how to tpye. And my memory goes all funny and I can't remember the names of directories and files. It's nearly impossible for me to get anything done. And multiple requests to sysAdmins have gotten me nowhere.
Dear Sysadmin: My tyoing sux. Can you plez install bash?

Dear Sysadmin: This Korn is killing me. Bash please?

Dear Sysadmin: The tab key seems to have died. Could you find a way to make it Bourne Again?

Thursday, March 17, 2005

Call Scan() Routine

Here's a nice new SAS function/call routine thingy that I used recently:
call scan()

It works kinda like the scan() function, except it gives the index and length of the word. I was needing to look through text for a particular keyword and then I wanted to find other keywords within five words on either side. Say I'm looking for FIRE. And I want to know if HOUSE or CAR appears within five words. Before, I would have done this type of thing in PERL using it's sweet hash tables. But with call scan() I can get the position of a word, use substrn() to check for my keyword and then loop back and forth scan()ning for my other words.

Nifty.

Monday, March 07, 2005

Kicking The Macro Habit

I do a lot of work with SAS macro. I use it daily, but I must admit, I find the resulting code kludgy and ugly. Being able to write code that writes code is indeed powerful, but it comes with a cumbersome price: all those % signs, ampersands, the semi-colon that will sometimes get gobbled up by the compiler, local and global symbol tables, %sysfunc() to get to base functions, all those wacky quoting functions %NRBSTR( aaaarrrrrgggh! ), limited debugging, etc etc etc.

Plus, I have met many, many otherwise competent programmers that just can't seem to quite _get_ full scale macro programming. Like pointers to pointers in C, at some point the thread just unravels.

Now don't get my wrong. I think macro is a valuable skillset, and has a valuable place in your programming toolbox, but for large, complex applications I am always thinking there's got to be a better way.

SCL comes to mind. You can use it to write base sas code to the program stack just like macro, with the added benefit that you can manipulate and recall the stack. Imagine getting to recall all the base sas code that your big complicated macro produced instead of having to parse your log. There's other benefits too: built in data types arrays and lists; SCL debugging and compiling; much cleaner syntax, and clearly defined submit blocks; variable locality using declare, just to name a few.

So does anybody use SCL this way? I've been thinking about it for a while, but I haven't yet. Using SAS catalogs instead of text source files would be a drawback for me. Are there any other drawbacks?

Maybe those programmers that don't get large scale macro programming just have more attuned programming sensibilites than me. They see all those %'s and &'s and their refined programming brain wanders away in disgust. Maybe I will code my next large application in SCL from the beginning instead of turning to macro out of habit. But old habits are hard to break.

Monday, February 28, 2005

Validating Flat Files

There are still a lot of flat files out there. And there are a lot of SAS programmers who have to deal with them. And unfortunately, there are a lot of people who create flat files that don't understand their structure, or lack thereof.

I worked for a company that had a lot of vendors providing flat files of data. These vendors thought nothing of randomly expanding/shrinking columns and shifting data all over the place. Their DB was able to handle it, no problem. But when the flat file would get passed to us it would cause all kinds of headaches. What to do?

Check the dates. Try to get a date at the end of the record, read it in using a date informat and then check the result. If anything gets shifted around you either get a date that is very unrealistic or missing. Sure it's pretty simplistic, and not totally foolproof, but every programmer that saw it understood it. And more importantly, because it took 2 minutes to code, everybody used it and incorporated it into their own code. And suddenly life was better.

Tuesday, February 22, 2005

What's missing?

Here's some syntax I ran across today that I hadn't seen before:

data something;
set somethingElse( where= (someVar is not missing) );
run;

Usually I am skeptical of new syntax that duplicates the functionality of existing syntax. It causes language bloat and ends up adding unneeded complexity. But I think this is pretty useful. It looks a lot better than the alternative '' or ' ' or . for missing. And it solves ambiguities between numeric and character missings. I only have one complaint:

where= (someVar is not null)

Language bloat. I hate language bloat. Is the keyword null the same as the keyword missing?

Why do I hate language bloat?

If you are trying to figure out data discrepancies in a large system written by others and you run across both keywords, you will have to spend some time trying to figure out if SAS does indeed handle them the same by either writing test cases or looking through documentation.

Wednesday, February 16, 2005

Windows XP Tweak

If you use Windows XP, go here:
http://www.microsoft.com/windowsxp/downloads/powertoys/xppowertoys.mspx

And definetly download CmdHere.exe.

It allows you to open up a DOS command prompt from windows explorer and the command prompt automatically points to the selected folder.

Quite possibly the most useful thing you will do with XP the whole day.

Monday, February 14, 2005

Bandaids

Today I am working on V6 to V8 issues. The client I am working with has a whole system written in SAS: AF, BASE, STAT, pretty much a mixed bag.
And man oh man, the V6 to V8 conversion has been a major source of headaches. Here's what we're working on:
AF screens capture and validate user input, some BASE SAS code is created and gets run using the AT command. The code does some
subsetting and then creates the reports the user asked for. Pretty standard stuff. Here's the problem: the proc gcharts use the NOZEROS option
to keep the proc from creating empty bar charts. In V6 the option causes the proc to fail if a group can't be drawn and then it's on to the next step. In V8, the option now causes a syntax error(!!!??) when a group can't be drawn. SAS sets obs = 0, checks syntax, and doesn't run any of the subsequent steps. Which means some of the reports don't get created.

Is this little change documented anywhere? Can anyone explain to me why an option should cause SAS to behave as if there were a syntax error?

The fix:
options noSyntaxCheck; I'll give you five minutes to find that in the documentation.

Ding! Time's up. Don't feel too bad if you can't find it. I couldn't either. I don't know how I was ever productive before google.

Tuesday, February 08, 2005

FTP- Back To Basics

Okay, let's be honest here. The more you learn and the more specialized you become as a programmer, the more of the 'basics' you tend to forget. There are a couple tools I believe every programmer should have at their fingertips and basic FTP is one of them. Have you met the programmer who can't move a file around without SAS/Connect proc upload/download or some GUI FTP client? I have, and believe me, you don't want to be that programmer!

So, let's review File Transfer Protocol.

First of all, get to a command prompt. At your prompt type:
> ftp
This will start the FTP client. In FTP you open and close a connection to another machine. For my example the remote machine will be called fat_tuesday.
> open fat_tuesday
Now give your username and password.
Once you are connected to the remote machine you can put files onto the remote machine or get files onto your local machine. The local machine is ALWAYS referring to the machine you launched ftp from, not where you happen to be sitting! To put a file from my local machine to my fat_tuesday server I type:
> put myFile.txt
To get a file from the remote machine to my local machine I type:
> get someFile.txt
When I am done I can close the connection:
close fat_tuesday
And close the ftp client:
> bye

That's the basics. Here's some other useful commands, and remember you can always type help in ftp to get a list of commands.

mput -- multiple file put, can glob filenames (mput *.jpeg)
mget -- multiple file get, can glob filenames (mput *.jpeg)
prompt -- turns confirmation on/off for multiple file actions
bin -- change the file type to binary
cd -- change directory on the remote machine
lcd -- local change directory
ls -- list file (or use dir)

Monday, February 07, 2005

Valid Complaints?

Say you are programming away in SAS and you need to switch every occurence of a word with another word. You look at the SAS functions and happily see TRANWRD().

From the documentation: TRANWRD(source, target, replacement).

No problem; you stick it in your data step and it's on to the next thing. And so next you find that you need to get rid of some bad characters. You already know you can compress() the little buggers out (or maybe you didn't know? Compress() can be used to compress any character- not just spaces! There's an optional third argument where you can specify which character(s) you want to remove.) But that presents a little problem since it may cause some wordsToRunTogether. And you don't want that.

So you look in your documentation and find the TRANSLATE() function replaces specific characters in a character expression. Woohoo we are saved! We can replace bad characters with spaces. But check this out: TRANSLATE(source,to-1,from-1<,...to-n,from-n>)

Translate to from.
Tranwrd from to.

It leaves me scratching my head. Why?

And since I brought up the trandwrd() function and complaints, I really should point out that tranwrd() DOES NOT replace WORDS. But rather a pattern of characters. What!? If you want to translate words with the tranwrd() function you have to provide your own spaces.
* get rid of preposition FOR;
myText = tranwrd(myText, 'FOR', '''); * oops, FOREST turns into EST;

myText = tranwrd(myText, ' FOR ', '''); * now translating the word FOR;