Using Different Missing Values to Keep Track of High and Low Values - SAS Programming

You can easily extend range checking to provide additional information concerning out-of-range values. To do this, you use two alternate missing values, namely .H and .L. The use of numeric missing values other than a single period (.) is another underused SAS System feature. Besides the single period that specifies a missing value, there are 27 additional missing value designations: ._ (that's a period followed by an underscore) and .A through .Z. In this example, you use separate missing values to represent high and low out-of-range values. You are then able to produce counts of these out-of-range values and, at the same time, compute statistics without including them. Here is the program:

Example

PROC FORMAT;
   INVALUE SBPFMT LOW-<40  = .L
                  40-300   = _SAME_
                  301-HIGH = .H;
   INVALUE DBPFMT LOW-<10  = .L
                  10-150   = _SAME_
                  151-HIGH = .H;
   VALUE CHECK .H    = 'High'
               .L    = 'Low'
               .     = 'Missing'
               OTHER = 'Valid';
RUN;
DATA FORMAT9;
   INPUT @1 ID  $3.
         @4 SBP SBPFMT3.
         @7 DBP DBPFMT3.;
DATALINES;
001160090
002310220
003020008
004   080
005150070
;
PROC PRINT DATA=FORMAT9;
   TITLE 'Listing from Example 9';
RUN;
PROC FREQ DATA=FORMAT9;
   FORMAT SBP DBP CHECK.;
   TABLES SBP DBP / MISSING NOCUM;
RUN;
PROC MEANS DATA=FORMAT9 N MEAN MAXDEC=1;
   VAR SBP DBP;
RUN;

The output from this program follows.

Output from Example - Checking Ranges for Numeric Variables

You conveniently choose .H and .L as the special missing values for high and low out-of-range values. You alternatively could have chosen any of the 27 available missing values, such as .A and .B, to represent the low and high values. Values in the valid ranges are untouched because you specify _SAME_ for your informatted value.
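As a minimal sketch of that alternative, the informat below uses .A and .B in place of .L and .H. The informat name SBPALT is hypothetical and does not appear in the original program; the ranges are simply copied from SBPFMT.

```sas
PROC FORMAT;
   /* Hypothetical variant of SBPFMT:          */
   /* .A flags values below the valid range,   */
   /* .B flags values above the valid range.   */
   INVALUE SBPALT LOW-<40  = .A
                  40-300   = _SAME_
                  301-HIGH = .B;
RUN;
```

If you go this route, any output format used to label the values would need to map .A and .B rather than .L and .H.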

In the PROC FORMAT code, you also use a VALUE statement to create an output format so that you can label the appropriate missing values. You use this format to label the variables SBP and DBP in the PROC FREQ output, but not in the PROC PRINT listing, so that you can see the actual "internal" values stored in the SAS data set. Note that the missing SBP value for ID number 004 is truly missing, whereas the out-of-range missing values are stored as the appropriate special missing values.

In this example, you use the MISSING and NOCUM options on the TABLES statement of PROC FREQ. Without the MISSING option, the missing values would only be noted at the bottom of the table in the form "Frequency Missing = n", where n is the number of missing values. When the MISSING option is specified, the missing values are listed in the body of the frequency table, and the frequencies and percentages are calculated based on all the observations, including the ones with missing values. You use the NOCUM option to eliminate the Cumulative Frequency and the Cumulative Percent columns.
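To see the effect of the MISSING option for yourself, you could rerun the table without it. This is only a sketch for comparison, assuming the FORMAT9 data set and the CHECK format from the program above already exist.

```sas
PROC FREQ DATA=FORMAT9;
   FORMAT SBP DBP CHECK.;
   /* MISSING is omitted here, so .H, .L, and . are left out */
   /* of the table body and are only counted in the          */
   /* "Frequency Missing" line beneath each table.           */
   TABLES SBP DBP / NOCUM;
RUN;
```

In this form, the percentages are based on the nonmissing observations only, so the separate high and low counts are no longer visible.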

PROC MEANS is run to demonstrate that no distinction is made between different missing values when statistical calculations are made. A missing value by any other name is still missing.
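Although the statistical procedures lump all missing values together, you can still tell them apart in your own code. The DATA _NULL_ step below is a minimal sketch, assuming the FORMAT9 data set from the program above; the messages written to the log are illustrative wording only.

```sas
DATA _NULL_;
   SET FORMAT9;
   /* MISSING() is true for . as well as for .H and .L */
   IF MISSING(SBP) THEN PUT ID= 'SBP is missing (of some kind)';
   /* A direct comparison singles out one special missing value */
   IF SBP = .H THEN PUT ID= 'SBP was above the valid range';
   IF SBP = .L THEN PUT ID= 'SBP was below the valid range';
RUN;
```

The same comparisons work in a WHERE statement, for example WHERE SBP = .H, if you want to list only the observations with a high out-of-range value.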