In the previous example,you used formats to either display a translated version of a variable's values, or to create a new variable as a grouped version of another variable. Formats can also be used to aggregate data values together for group processing and presentation in procedures that operate on groups of values such as PROC FREQ,a procedure that counts the number (frequency) of occurrences for each formatted value of a variable.
This process of labeling or grouping data values using a format usually has no adverse side effects; however,unexpected results can occur in PROC FREQ, specifically when formatting a variable that contains missing values. Let's look at an example that demonstrates this rather subtle problem.
Output from Example - Encountering a Subtle Problem with Missing Values, Formats, and PROC FREQ
Notice that Output 2 shows the category MISCODED with a frequency count of 2 for the variable X. This represents the two data values that fall into the special format category called OTHER, namely the 5 in Observation 3,and the 3 in Observation 4. Next, look at the output for the variable Y.Notice that there is no longer a row with a label of MISCODED. Why not? The two values in Observations 3 and 4,namely the 5 and the period (missing), are neither 1 nor 2, and therefore should (you would think) be included in the OTHER category.Instead of this, however, the output shows Frequency Missing = 2 at the bottom. Even though there was only one observation that contained a missing value for the variable Y(Observation 4),PROC FREQ reports two missing values.
The reason for this is a little convoluted, but worth the struggle to understand.In this example,since missing values do not have their own separate range in the BADFMT. format, they automatically fall into the OTHER category.Unfortunately,the inclusion of missing values in this OTHER category declares the entire category as missing.The reason is that when a range of values,or a group of discrete values, are grouped together in a single format, all the values in the range or series are represented by the lowest value in the range or series.
Since missing is the lowest value possible,the inclusion of missing in a range, or series, of values translates all the values into missing for formatting purposes.This means that if any observation contains a missing value,all the observations that fall into the OTHER range are treated as missing.PROC FREQ determines the number of missing values after formatting has occurred. Most other procedures that work with grouped data values, such as PROC MEANS and PROC TABULATE,check for missing values before formatting.
SAS Programming Related Tutorials
|SASS (Syntactically Awesome Style sheets) Tutorial||R Programming language Tutorial|
SAS Programming Related Interview Questions
|Logistics Interview Questions||SAS Programming,SQL server Interview Questions|
|Clinical SAS Interview Questions||SASS (Syntactically Awesome Style sheets) Interview Questions|
|SQL Server Analysis Services (SSAS) Interview Questions||R Programming language Interview Questions|
|SAS DI Interview Questions||Advanced SAS Interview Questions|
|Base Sas Interview Questions||SAS Macro Interview Questions|
|Clinical Data Management Interview Questions|
Sas Programming Tutorial
Input And Infile
Set, Merge, And Update
Table Lookup Tools
Proc Means And Proc Uimivariate
All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd
Wisdomjobs.com is one of the best job search sites in India.