Here is a little snippet of SAS code I wrote to solve a common problem: assigning a value to a variable based on which data set an observation came from in a data step. Here is an example:
Suppose I have a whole bunch of data sets, each representing a different country. I want to read many of them with a single SET statement in one data step and create one resulting data set with a variable called language. To create the language variable correctly, I need to know which data set each observation comes from. Typically you would use the IN= data set option to create a flag for each data set and then check those flags with if/then logic.
data selectedCountries;
  set
    chile(in=chile)
    china(in=china)
    costa_rica(in=costa)
    egypt(in=egypt)
    fiji(in=fiji)
    turkey(in=turkey)
    usa(in=usa)
    saudi_arabia(in=saudi)
  ;
  /* Each IN= flag is 1 when the current observation came from that data set */
  if chile then language = 'SPANISH';
  else if china then language = 'CHINESE';
  else if costa then language = 'SPANISH';
  /* ...and so on, one else-if branch per remaining country */
run;
The major problem with this approach is that it does not scale well. Every country you add to the SET statement needs its own IN= flag and its own else-if branch, so the if/then logic gets longer and easier to get wrong.
Here is a slightly more elegant solution that uses an array and a variable information function. You still use the IN= option on each data set, but you name the IN= variable after the value you want to assign. Then you create an array of all those IN= variables. Finally, you loop through the array and check each element's boolean value. If it is true (1), you assign your new variable the name of that array element, which the vname() function returns. One caveat: the value has to be a valid SAS variable name, so a value containing spaces would not work as written.
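data selectedCountries;
  set
    chile(in=SPANISH)
    china(in=CHINESE)
    costa_rica(in=SPANISH)
    egypt(in=ARABIC)
    fiji(in=ENGLISH)
    turkey(in=TURKISH)
    usa(in=ENGLISH)
    saudi_arabia(in=ARABIC)
  ;
  /* One array element per distinct IN= variable (that is, per language) */
  array names[*] SPANISH CHINESE ARABIC ENGLISH TURKISH;
  do i = 1 to dim(names);
    /* The flag that is set tells us the language; vname() returns its name */
    if names[i] eq 1
      then language = vname( names[i] );
  end;
  drop i; /* keep the loop counter out of the output data set */
run;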
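If you want to try this without real country data, here is a minimal sketch. The two tiny data sets, the city variable, and the output name languageDemo are made up purely for illustration; only the IN=, array, and vname() logic comes from the step above.

```sas
/* Hypothetical sample data: two one-row country data sets */
data chile;
  city = 'Santiago';
run;

data fiji;
  city = 'Suva';
run;

/* Same technique as above, cut down to two data sets */
data languageDemo;
  set
    chile(in=SPANISH)
    fiji(in=ENGLISH)
  ;
  array names[*] SPANISH ENGLISH;
  do i = 1 to dim(names);
    /* Copy the name of whichever IN= flag is set into language */
    if names[i] eq 1
      then language = vname( names[i] );
  end;
  drop i;
run;

/* Inspect the result: one row per city, with its language */
proc print data=languageDemo;
run;
```

Because language is not given a length before vname() assigns to it, it ends up much wider than these values need. Adding a length statement for language before the loop would be one way to keep it compact.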