You are at the gym running on a treadmill listening to your iPhone through the earbuds. You're having one of those good, easy runs. Maybe you're listening to an interesting podcast like WNYC's Radio Lab, or some good running music. You're focused. You've got a nice rhythm going and your body feels relaxed. You push the pace a little more than normal. Sweat is dripping off you. Endorphins begin flooding your system. You have declared war on holiday snacks and Christmas dinners. And now on this treadmill you are winning the war. Suddenly your iPhone stops playing. You are pulled out of your running reverie and glance down at your now-beeping phone. Voice control has been activated. You fumble for the controls. You stumble on the treadmill. The pace is too fast for multi-tasking. You slow the pace down on the treadmill and flip the phone back to iPod. The phone takes on a life of its own, pausing and clicking and going back to voice control again. You try to run without it, but it's just not the same. Your running rhythm won't return. Everything hurts. You're exhausted. You hit stop on the treadmill and step off defeated.
Apparently there is a design flaw on the iPhone earbuds where moisture causes the little clicker to send random clicker signals to the iPhone.
A design flaw on an Apple product! And Apple is good at design.
So now that I've gotten my preliminary decisions out of the way in the previous post, I can start working on designing my app/site. Luckily Rails encourages agile development which is all about procrastinating on difficult design decisions. Rather than try to design your entire project at the beginning, you take a very minimalist approach and assume the details will reveal themselves as you go.
My minimalist design will consist of two pieces: a short paragraph describing the site, and a piece of paper with drawings of the layout of the pages.
The description of the site:
The site will display SAS jobs. Job seekers will be able to search the jobs by zip code. Recruiters will be able to post jobs. The jobs will expire after a certain amount of time. There will also be some admin pages to control Recruiters and the Jobs they post.
Now I'm taking out a piece of paper and drawing some rough sketches of the main pages.... Done.
From the description I see there are three types of users with three main actions: job seekers find, recruiters post and admins admin. I also see that there are only two models that I need in my database: recruiters and job postings.
That's pretty much it for the design. Now I can begin coding.
Thursday, December 17, 2009
Design
Friday, December 11, 2009
Some Preliminaries
As I mentioned in an earlier post, I am going to build a new site dedicated to SAS jobs. Before I actually start coding the site, there are some preliminary decisions that need to be made. I have already decided to use Ruby on Rails for the framework. I have just started learning it and so far I like what I have seen. I have also decided on my host already. Although nearly every host says they support hosting RoR, most don't do a very good job. I have test driven heroku.com with another small site I built and they do a great job. They will essentially host everything for free while you mock it up and you can buy more resources if you ever need to. They are also an exclusive RoR host so they have tailored the develop, test, deploy environment to Rails' unique agile nature. Choosing heroku.com as my provider also forces me to use git as my version control. Heroku seamlessly uses git as part of the workflow and it works very nicely.
Now that I have gotten the preliminary decisions out of the way I will be able to develop some simple use cases and models and start coding my app.
Oh, and I will be doing all of the development on my MacBook running OS X with Xcode 3 and using MySQL as the development database.
Below are the books that I am using to help guide me along the learning path. I have all three books on my desk and I can say with confidence that all three are very good.
Books:
Wednesday, December 09, 2009
Some Fun with SAS and Perl Regular Expressions
This post assumes you have a little understanding of how regular expressions work and specifically how SAS implements regular expressions. I recently did something like this and thought it would be good to share. Suppose you have a program that searches through a big text field for a specific word. That's pretty easy to code and you can even get away with just using a simple indexW() function. The problem is when you look at the text field on your report, your eyes glaze over as you scan for the word to make sure you are capturing the correct output. If only there was some easy way to make the word stand out from its neighbors.
I used the prxchange() function to search for a pattern and then replace it with another pattern. In this case, I am outputting HTML so I can wrap my search word in <b> tags. First I will give a little example code, then I will break down what the code is doing and finally show some easy improvements. For the sake of clarity and brevity, I am only showing the code that highlights the search word. I am not showing the code that subsets the data based on the search term.
Example 1:
data _null_;
input text $80.;
put "The text before matching " text= ;
/* wrap every occurrence of battery in <b> tags */
text = prxchange('s/(battery)/<b>$1<\/b>/', -1, text);
put "The text after matching " text= //;
datalines;
This battery is dead.
Batteries are in the box.
;
run;
Output in the log:
The text before matching text=This battery is dead.
The text after matching text=This <b>battery</b> is dead.
The text before matching text=Batteries are in the box.
The text after matching text=Batteries are in the box.
Looking at the code above, you can see that the only interesting thing happening is the prxchange() function. The prxchange function takes a regular expression as its first argument. The regular expression uses a substitution syntax with a generic look of
s/(something to look for)/replacement text using numbered capture buffers/
So in my example above, the word (or pattern really) I am looking for is battery. I put () around it to specify that it's the first capture buffer: $1. Then I wrap $1 with bold tags. You can see I had to escape the / in the closing tag because / is the delimiter that separates the parts of the substitution expression. So my regular expression is:
s/(battery)/<b>$1<\/b>/
and reads as: look for the pattern 'battery', store it in $1 and substitute it with <b>$1</b>.
The second parameter to the prxchange() function is the number of times to search and replace. A value of -1 tells the function to keep searching the source, finding and replacing every occurrence until it gets to the end of the source. The third parameter, text, tells the function what text source to search.
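To see what the second parameter does, here is a small sketch that runs the same substitution with 1 and with -1 on a sentence containing the word twice. The variable names first_only and every are just ones I made up for the illustration.

```sas
data _null_;
  length text first_only every $120;
  text = 'Old battery out, new battery in.';
  /* times = 1: only the first match is replaced */
  first_only = prxchange('s/(battery)/<b>$1<\/b>/', 1, text);
  /* times = -1: every match is replaced */
  every = prxchange('s/(battery)/<b>$1<\/b>/', -1, text);
  put first_only=;
  put every=;
run;
```

With 1, only the first battery should come back wrapped in <b> tags. With -1, both should.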
Make sense?
Now there are a couple of things that can easily be added to the regular expression to make the code a little more robust and efficient. First of all, we only need the regular expression compiled once, not on every loop of the data step. As I read the SAS documentation, a constant pattern like ours is already compiled only once, so this matters mostly when the pattern is stored in a variable. Either way, we can add the /o option to the end of the regular expression to tell it to just compile it once:
s/(battery)/<b>$1<\/b>/o
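Another way to make the compile-once intent explicit, sketched here as an alternative and not as what my original program did, is to compile the pattern yourself with prxparse() on the first iteration and hand the returned id to prxchange(). The variable name rx_id is my own.

```sas
data _null_;
  /* compile the pattern on the first iteration only and keep the id */
  retain rx_id;
  if _n_ = 1 then rx_id = prxparse('s/(battery)/<b>$1<\/b>/');
  input text $80.;
  /* prxchange accepts the compiled id in place of the pattern text */
  text = prxchange(rx_id, -1, text);
  put text=;
  datalines;
This battery is dead.
Batteries are in the box.
;
run;
```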
Also, our regular expression is caSe SensiTive. We can tell it to ignore case by adding the ignore case option (/i) to the end of the regular expression:
s/(battery)/<b>$1<\/b>/oi
Now it will match battery, Battery, BATTERY, etc.
But wait! We also want to match Batteries. What to do? We could shorten our regular expression to:
s/(batter)/<b>$1<\/b>/oi
But that would match batter and batter is a liquid mixture, usually based on one or more flours combined with liquids such as water, milk or beer. That's definitely not what we are looking for. We want to search for batter followed by one or more [a-z] characters:
s/(batter[a-z]+)/<b>$1<\/b>/oi
Now our example code looks like:
data _null_;
input text $80.;
put "The text before matching " text= ;
/* match batter plus one or more letters, ignoring case */
text = prxchange('s/(batter[a-z]+)/<b>$1<\/b>/oi', -1, text);
put "The text after matching " text= //;
datalines;
This battery is dead.
Batteries are in the box.
Do not eat the cookie batter before it is cooked.
;
run;
Output in the log:
The text before matching text=This battery is dead.
The text after matching text=This <b>battery</b> is dead.
The text before matching text=Batteries are in the box.
The text after matching text=<b>Batteries</b> are in the box.
The text before matching text=Do not eat the cookie batter before it is cooked.
The text after matching text=Do not eat the cookie batter before it is cooked.
And finally, you sharp SAS coders probably don't want to hardcode the search term. More likely it would be stored in a variable and then you could construct the regular expression like you would any other text variable:
mySearch = 'batter';
rx = "s/(" ||
mySearch ||
"[a-z]+)/<b>$1<\/b>/oi";
Or something like that. Also, you can search for more than one thing. Just enclose each pattern in () and refer to them as $1, $2, etc. Play around with it. Have fun. Thanks for reading!
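Here is a rough sketch that puts those last two ideas together: the pattern is built from a variable and it uses two capture buffers. The second buffer (is|are) and the <i> tags are only there for illustration. I use cats() to glue the pieces together and strip() to drop the trailing blanks that pad the rx variable.

```sas
data _null_;
  length text $120 rx $100;
  input text $80.;
  mySearch = 'batter';  /* hypothetical search term */
  /* $1 = the search word, $2 = the verb that follows it */
  rx = cats('s/(', mySearch, '[a-z]+) (is|are)/<b>$1<\/b> <i>$2<\/i>/oi');
  text = prxchange(strip(rx), -1, text);
  put text=;
  datalines;
This battery is dead.
Batteries are in the box.
;
run;
```

The search word should come back wrapped in <b> tags and the verb after it in <i> tags. Because of the /o option the pattern is compiled from the first value of rx, so this sketch assumes the search term does not change from row to row.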
Wednesday, December 02, 2009
Side Projects
A few years ago I created a site that lets users donate their SAS expertise by creating online documentation. Users could put in their own example code, explain potential pitfalls, share tips, etc. I coded it all by hand in Perl with MySQL, JavaScript and HTML. I even included some nifty AJAX for logging in, etc. And it had a trendy name too: iDoc. As in "I document" verb, or "Internet Documentation" noun. After a lot of programming and evenings with O'Reilly books, I felt that it was ready to be released to the world. I tentatively exposed the URL and wrote an introductory email to SAS-L. The response was....
virtual silence.
Ho Hum. Crickets chirping. Nothing. Well, there was one person who railed against my decision to not be cross-browser compatible. Specifically, the site worked well with IE, not so well with others. As all of you who have developed anything more complex than the most generic HTML page know, cross-browser compatibility is a nightmare. To describe it as a pain in the ass is a disservice to donkeys. I digress...
Anyways, the _site_ was a failure. But the _project_ was a success in that it taught me a bunch of stuff that I was able to incorporate into my day-to-day programming. And the things I learned have dovetailed into other side projects.
Fast forward to today. Today I am starting another side project. It will be written in Ruby on Rails. I bought a Rails book two months ago and have written only one little site so far. But I like it. It's clean, it's fast to develop with and it's easy to learn. I don't even know Ruby. I'll be learning that along the way too. I'll try to share what I learn as I go. I'll tag the posts with something like "Side Project" to differentiate them from the normal SAS postings.
The side project I am starting today is a SAS specific job site. I know there are others out there and some good ones too, but I think there might be enough room for one more. And even if the site fails, the time will be well spent learning a new language and platform.
If you could, pop over to the comments and let me know if hearing about a side project written in Ruby on Rails is the least bit interesting to you. As much as I like the idea of sharing what I learn, I don't want to fill a SAS programming blog with a bunch of posts about a topic nobody cares about. Thanks! -s