Applying an INFORMAT Statement to List Input - SAS Programming

When you use a simple list INPUT statement such as the one in Example 1, the default length for character variables is 8. This means that all character variables are created and stored with a length of 8 bytes.

This creates two potential problems. First, if you are reading short values, as you did for GENDER (1 byte), you are wasting storage space by using 8 bytes when 1 will suffice. Second, if you read character values longer than 8 bytes, the stored value will be truncated to 8. Another shortcoming of default list input is that it cannot read data that occur in common patterns, such as dates in MM/DD/YY format.
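The truncation problem is easy to reproduce. The sketch below is not part of the original examples: the data set name DEFAULTS and the two data lines are assumptions chosen for illustration. It reads LASTNAME and GENDER with simple list input, so both receive the default length of 8.

```sas
/* Illustrative sketch: simple list input with default lengths */
DATA DEFAULTS;
   INPUT ID LASTNAME $ GENDER $;
   DATALINES;
1 SMITH M
4 WASHINGTON F
;

/* PROC CONTENTS lists the stored length of each variable */
PROC CONTENTS DATA=DEFAULTS;
RUN;

/* PROC PRINT shows the stored values */
PROC PRINT DATA=DEFAULTS;
RUN;
```

PROC CONTENTS should report a length of 8 for both character variables, and the printed LASTNAME in the second observation should appear cut off after its eighth character.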

In this example, you can modify your list input by including an INFORMAT statement to define the patterns, or informats, in which certain raw data values occur. Let's read two additional variables, LASTNAME and DOB, which need special attention (and you can save some storage space as well). Here is the example:

Example

DATA INFORMS;
INFORMAT LASTNAME $20. DOB MMDDYY8. GENDER $1.;
INPUT ID LASTNAME DOB HEIGHT WEIGHT GENDER AGE;
FORMAT DOB MMDDYY8.;
DATALINES;
1 SMITH 1/23/66 68 144 M 26
2 JONES 3/14/60 78 202 M 32
3 DOE 11/26/47 62 99 F 45
4 WASHINGTON 8/1/70 66 101 F 22
;
PROC PRINT DATA=INFORMS;
TITLE 'Example 3.1';
RUN;

This code produces the following output:

Output from Example - Applying an INFORMAT Statement to List Input


Note that the order of the variables in the output is not the same as the order in the INPUT statement. When the SAS System builds a data set, it stores its variables in the order in which they are encountered in the DATA step. Since the first three variables encountered in the DATA step are LASTNAME, DOB, and GENDER (in the INFORMAT statement), they are the first three variables stored in the SAS data set. The other variables, ID, HEIGHT, WEIGHT, and AGE, follow the order in the INPUT statement.
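If the stored order is inconvenient for a report, one option is to list the variables explicitly when printing. The sketch below is an addition, not part of the original example, and it assumes the INFORMS data set has already been created. The VAR statement changes only the printed column order; the order stored in the data set is unaffected.

```sas
/* Print the columns in the same order as the INPUT statement */
PROC PRINT DATA=INFORMS;
   VAR ID LASTNAME DOB HEIGHT WEIGHT GENDER AGE;
   TITLE 'INFORMS printed in INPUT statement order';
RUN;
```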

Here the INFORMAT statement gives the following information about the patterns in which some of the raw data elements are found:

  • the length of LASTNAME can be up to 20 characters
  • the data for DOB are found in MM/DD/YY form
  • GENDER is only one character long.

The MMDDYY8. specification after DOB instructs the program to recognize these raw data in MM/DD/YY form and to translate and store them as SAS date values. You also use a FORMAT statement to associate an output pattern, or format, with DOB. If you didn't do this, the program would have printed the DOB variable as an unformatted SAS date value (the number of days since January 1, 1960).
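To see the effect of leaving out the FORMAT statement, a cut-down version of the example can be run. This is a sketch under assumptions: the data set name NOFORMAT is hypothetical, and only ID and DOB are read to keep it short.

```sas
/* Same informat as before, but no FORMAT statement for DOB */
DATA NOFORMAT;
   INFORMAT DOB MMDDYY8.;
   INPUT ID DOB;
   DATALINES;
1 1/23/66
2 3/14/60
;

/* DOB prints as a plain number rather than as a date */
PROC PRINT DATA=NOFORMAT;
   TITLE 'DOB without a FORMAT statement';
RUN;
```

Adding FORMAT DOB MMDDYY8.; to either the DATA step or the PROC PRINT step should restore the familiar date display.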

The DOB for SMITH, for example, would have printed as 2214. We cover the fascinating and mysterious world of SAS date values in depth in Chapter 6, "SAS Dates." (Bet you just can't wait!)

You could have accomplished the same goal as above by supplying your informats directly in the INPUT statement. This is called modified list input. You simply follow any variable name you wish to modify with a colon (:) and an informat.

The colon tells the program to read the next non-blank value it finds with the specified informat. The previous program could have been written as follows, yielding the same output as the previous example (except for the title and the order of the variables):

Example

DATA COLONS;
INPUT ID LASTNAME : $20. DOB : MMDDYY8.
HEIGHT WEIGHT GENDER : $1. AGE;
FORMAT DOB MMDDYY8.;
DATALINES;
1 SMITH 01/23/66 68 144 M 26
2 JONES 3/14/60 78 202 M 32
3 DOE 11/26/47 62 99 F 45
4 WASHINGTON 8/1/70 66 101 F 22
;
PROC PRINT DATA=COLONS;
TITLE 'Example 3.2';
RUN;

In this example, the SAS System

  • reads the second non-blank value it finds as the value for LASTNAME, but it allows up to 20 characters for the value instead of only the default eight characters.
  • reads the next non-blank value as DOB, but it realizes that the data being read is a date that occurs in MM/DD/YY form.
  • knows that the data for GENDER always occurs as a 1-byte value, and therefore does not use up an extra 7 bytes to save it.
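The role of the colon becomes clearer when it is left out. The sketch below is an illustration added here, not part of the original text, and the data set name NOCOLON is hypothetical. Without the colon, $20. is treated as formatted input, so SAS reads a fixed block of 20 columns instead of stopping at the next blank.

```sas
/* No colon before $20.: formatted input reads exactly 20 columns */
DATA NOCOLON;
   INPUT ID LASTNAME $20.;
   DATALINES;
1 SMITH 1/23/66 68 144 M 26
4 WASHINGTON 8/1/70 66 101 F 22
;

PROC PRINT DATA=NOCOLON;
   TITLE 'LASTNAME read without the colon modifier';
RUN;
```

In this version LASTNAME should pick up text from the fields that follow the name, which is why the colon modifier is needed when values are separated by blanks and vary in length.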
