Using Formats to Recode a Variable - SAS Programming

Now let's get just a little fancy with code that's a little more useful and only a little more difficult to understand. In this example, you do not use a DATA step to create a new grouped variable as you did in the last example; instead, you use a FORMAT statement in a PROC step to achieve the desired grouping. First the code, then the explanation.

Example

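PROC FORMAT;
   /* Map ranges of SCORE values to grade labels */
   VALUE SCOREFMT 0-64='Fail'
                  65-69='Low Pass'
                  70-79='Pass'
                  80-89='High Pass'
                  90-HIGH='Honors';
RUN;
PROC FREQ DATA=GRADES;
   TITLE 'Example';
   TABLES SCORE;
   /* PROC FREQ counts the formatted values of SCORE */
   FORMAT SCORE SCOREFMT.;
RUN;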

You first need to set up a SAS output format by using the FORMAT procedure (this is covered in greater depth in Chapter 11, "PROC FORMAT"). Any value that falls in a range of values on the left of the = sign is displayed as the label on the right.

It's like setting up an output translation table. (In this example, you create a format called SCOREFMT.) You then use the FORMAT statement in the FREQ procedure, which produces the frequency distribution, to assign the output format SCOREFMT to the variable SCORE. Since the format you are using is a grouped format, the end result is the desired grouping, or collapsing, of the original data values into your desired groups.
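To try the example end to end you need a GRADES data set, which the text does not show. The following is a minimal sketch that assumes GRADES holds a single numeric variable SCORE; the ten score values are invented purely for illustration.

```sas
/* Hypothetical sample data: the original GRADES data set is not shown */
DATA GRADES;
   INPUT SCORE @@;          /* @@ reads several values from one line */
   DATALINES;
58 64 67 70 72 75 81 88 90 93
;
RUN;

/* The grouped format from the example */
PROC FORMAT;
   VALUE SCOREFMT 0-64    = 'Fail'
                  65-69   = 'Low Pass'
                  70-79   = 'Pass'
                  80-89   = 'High Pass'
                  90-HIGH = 'Honors';
RUN;

/* Frequency counts on the formatted (grouped) values */
PROC FREQ DATA=GRADES;
   TABLES SCORE;
   FORMAT SCORE SCOREFMT.;
RUN;
```

With these invented values, the table has five rows instead of ten distinct scores: Fail 2, Low Pass 1, Pass 3, High Pass 2, and Honors 2.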

This works because PROC FREQ groups data values based on their formatted value (if there is one). You can also place a FORMAT statement in the DATA step used to create a SAS data set. This permanently associates a particular format with a variable in the data set.

In this example, you could have associated the format SCOREFMT with the variable SCORE in the DATA step code (which is not shown in this example). Of course, the format SCOREFMT would have to be created in a FORMAT procedure before the DATA step.

The format SCOREFMT would then automatically be associated with the variable SCORE in all procedures that operate on the data set GRADES; you would not need to include the FORMAT statement in the procedure code. In either case, the results from the PROC FREQ code are frequency counts on the score categories as shown below:

Output from Example - Using Formats to Recode a Variable
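The DATA step alternative described above might look like the following sketch. The score values are hypothetical, since the original DATA step is not shown; the point is only where the FORMAT statement sits.

```sas
/* The format must exist before the DATA step that refers to it */
PROC FORMAT;
   VALUE SCOREFMT 0-64    = 'Fail'
                  65-69   = 'Low Pass'
                  70-79   = 'Pass'
                  80-89   = 'High Pass'
                  90-HIGH = 'Honors';
RUN;

/* Hypothetical data; FORMAT here stores the association in the data set */
DATA GRADES;
   INPUT SCORE @@;
   FORMAT SCORE SCOREFMT.;
   DATALINES;
58 64 67 70 72 75 81 88 90 93
;
RUN;

/* No FORMAT statement needed: SCORE is already grouped by SCOREFMT */
PROC FREQ DATA=GRADES;
   TABLES SCORE;
RUN;

/* A FORMAT statement with no format name removes the association
   for this step only, giving counts on the raw scores */
PROC FREQ DATA=GRADES;
   TABLES SCORE;
   FORMAT SCORE;
RUN;
```

Keep in mind that a format created this way is stored in the WORK library by default, so it has to be re-created (or saved permanently with the LIBRARY= option of PROC FORMAT) before the data set is used in a later session.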

This example also allows us to point out other danger zones to avoid, both having to do with setting up ranges in the VALUE statement of PROC FORMAT. First of all, always make sure that the boundaries of the ranges you set up do not overlap. If they do, PROC FORMAT will let you know about it, but it's better to be careful in the first place. Also, make sure that there are no "cracks" into which a value can fall. In other words, make sure that the ranges cover every possible value.
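One way to find out whether any values are slipping through the cracks is to add a catch-all range with the OTHER keyword. The sketch below is an illustration only: the format name SCORECHK, the labels 'Missing' and 'Unclassified', and the data values are all assumptions, not part of the original example.

```sas
/* Hypothetical checking format: same ranges plus catch-all labels */
PROC FORMAT;
   VALUE SCORECHK 0-64    = 'Fail'
                  65-69   = 'Low Pass'
                  70-79   = 'Pass'
                  80-89   = 'High Pass'
                  90-HIGH = 'Honors'
                  .       = 'Missing'
                  OTHER   = 'Unclassified';  /* anything no range matched */
RUN;

/* Invented data containing a non-integer and a missing score */
DATA GRADES;
   INPUT SCORE @@;
   DATALINES;
58 64.5 67 . 93
;
RUN;

/* MISSING keeps the missing value in the table as its own row */
PROC FREQ DATA=GRADES;
   TABLES SCORE / MISSING;
   FORMAT SCORE SCORECHK.;
RUN;
```

Here the score 64.5 is counted under 'Unclassified', which signals that the ranges leave a gap between 64 and 65.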

In this example, SCORE is an integer, and the code covers all possible values. If, however, SCORE were a computed value that could be a non-integer, you would have to modify the VALUE statement to eliminate the "cracks." A score of 64.5, for instance, falls between the ranges 0-64 and 65-69 and would receive no label. If the scores were not integers, you could rewrite the PROC FORMAT statements like this:

PROC FORMAT;
   /* -< excludes the upper endpoint, so each range ends where the next begins */
   VALUE SCOREFMT 0-<65='Fail'
                  65-<70='Low Pass'
                  70-<80='Pass'
                  80-<90='High Pass'
                  90-HIGH='Honors';
RUN;
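To check the boundaries, you can apply both versions of the format to a few test values with the PUT function and write the results to the log. This is a sketch under stated assumptions: the format names INTFMT and CONTFMT, the variable names, and the test scores are hypothetical and chosen only so that both versions can coexist in one session.

```sas
PROC FORMAT;
   /* Integer-only ranges, as in the first example */
   VALUE INTFMT  0-64    = 'Fail'
                 65-69   = 'Low Pass'
                 70-79   = 'Pass'
                 80-89   = 'High Pass'
                 90-HIGH = 'Honors';
   /* Ranges without gaps, as in the rewritten example */
   VALUE CONTFMT 0-<65   = 'Fail'
                 65-<70  = 'Low Pass'
                 70-<80  = 'Pass'
                 80-<90  = 'High Pass'
                 90-HIGH = 'Honors';
RUN;

/* Write each test score and its two labels to the SAS log */
DATA _NULL_;
   INPUT SCORE @@;
   INT_LABEL  = PUT(SCORE, INTFMT.);
   CONT_LABEL = PUT(SCORE, CONTFMT.);
   PUT SCORE= INT_LABEL= CONT_LABEL=;
   DATALINES;
64 64.5 65 89.9 90
;
RUN;
```

For the whole-number scores the two labels agree. For 64.5 and 89.9, CONT_LABEL is 'Fail' and 'High Pass' respectively, while INT_LABEL shows the raw number because no range in INTFMT matches it.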

All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd