Using Informat Lists and Relative Pointer Controls - SAS Programming

Repetition, repetition, repetition. Repetitious, isn't it? It definitely has its place in certain areas, such as teaching if it's done right, but the whole basis of "computing" in general is to let the machine do the repetitive work, right? Take a look at the following code (typical for a beginning SAS System programmer):

Example

DATA LONGWAY;
INPUT ID 1-3
Q1 4
Q2 5
Q3 6
Q4 7
Q5 8
Q6 9-10
Q7 11-12
Q8 13-14
Q9 15-16
Q10 17-18
HEIGHT 19-20
AGE 21-22;
DATALINES;
1011132410161415156823
1021433212121413167221
1032334214141212106628
1041553216161314126622
;
PROC PRINT DATA=LONGWAY;
TITLE 'Example ';
RUN;

The objective here is obviously to read data consisting of an ID number, answers to ten questions, and height and age for each subject. There is a better way. The SAS System provides the ability to read a repetitive series of data items by using a variable list and an informat list. The variable list contains the variables to be read; the informat list contains the informat(s) for these variables. The previous code can be rewritten, using a variable list and an informat list, as follows:

Example

DATA SHORTWAY;
INPUT ID 1-3
@4 (Q1-Q5)(1.)
@9 (Q6-Q10 HEIGHT AGE)(2.);
DATALINES;
1011132410161415156823
1021433212121413167221
1032334214141212106628
1041553216161314126622
;
PROC PRINT DATA=SHORTWAY;
TITLE 'Example ';
RUN;

The INPUT statement here works as follows: after reading in a value for ID, five values are read for variables Q1, Q2, Q3, Q4, and Q5. They are all read with a 1. informat, and they are all contiguous in the raw data. (A small digression is in order. A list of variables, all having the same base and each one having a sequential numeric suffix, can be abbreviated as BASE#-BASE#. In this case, Q1, Q2, Q3, Q4, and Q5 can be written as Q1-Q5. End of small digression.) After the last variable in the list, Q5, is read, a new list is initiated. This one consists of variables Q6-Q10, HEIGHT, and AGE. This time they are all read with a 2. informat. An alternate coding for the previous INPUT statement is:

INPUT @1 (ID Q1-Q10 HEIGHT AGE)(3. 5*1. 7*2.);

In this case there is only one variable list and one informat list, but the instructions are identical to the last example. The n* denotes how many times a particular informat is to be used; 5*1. means, "use the 1. informat five times, i.e., for the next five variables." Aside from the title, either of the previous two sets of code produces the following output.

Output from Example - Using Informat Lists and Relative Pointer Controls
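
For what it's worth, here is a minimal sketch of the single-list form dropped into a complete DATA step, using the same four data lines as above. The data set name ONELIST is made up for illustration; the INPUT statement itself is the alternate coding already shown.

```sas
/* Sketch only: ONELIST is a hypothetical data set name. */
DATA ONELIST;
  /* 3.   reads ID from columns 1-3
     5*1. reads Q1-Q5, one column each (columns 4-8)
     7*2. reads Q6-Q10, HEIGHT, and AGE, two columns each (columns 9-22) */
  INPUT @1 (ID Q1-Q10 HEIGHT AGE) (3. 5*1. 7*2.);
DATALINES;
1011132410161415156823
1021433212121413167221
1032334214141212106628
1041553216161314126622
;
RUN;

PROC PRINT DATA=ONELIST;
RUN;
```

For the first data line this should give ID=101, Q1-Q5 of 1, 1, 3, 2, 4, Q6-Q10 of 10, 16, 14, 15, 15, HEIGHT=68, and AGE=23, the same values the two earlier versions read.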

When using informat lists, the raw data do not have to be all contiguous and of the same type, as they are in the last example. Any informats can be used and intermixed, and blank spaces can be skipped with the relative +n pointer controls.
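
A small sketch of that idea, assuming a made-up layout: a 3-column numeric ID, one blank, a 1-column character code, two blanks, and a 2-column numeric height. The data set name MIXED, the variable GRADE, and the data values are all hypothetical.

```sas
/* Sketch only: MIXED, GRADE, and the data lines are invented for illustration. */
DATA MIXED;
  /* 3.  reads ID from columns 1-3
     +1  skips the blank in column 4
     $1. reads GRADE from column 5
     +2  skips the blanks in columns 6-7
     2.  reads HEIGHT from columns 8-9 */
  INPUT @1 (ID GRADE HEIGHT) (3. +1 $1. +2 2.);
DATALINES;
101 A  68
102 B  72
;
RUN;

PROC PRINT DATA=MIXED;
RUN;
```

The numeric and character informats sit side by side in one informat list, and the +n controls do nothing more than step over the blanks between fields.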

These pointer controls can be used anywhere in an INPUT statement and merely move the column pointer forward (+n) or backward (+(-n)) the designated number (n) of columns. Since there is no negative pointer control available, to go backward you actually have to advance a negative amount.
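
A backward move can be sketched roughly like this, assuming you want the same three columns read twice, once as a number and once as a character string. The data set name BACKUP and the variable IDCHAR are made up for the example.

```sas
/* Sketch only: BACKUP and IDCHAR are hypothetical names. */
DATA BACKUP;
  /* ID 3.      reads columns 1-3 and leaves the pointer at column 4
     +(-3)      "advances" the pointer -3 columns, back to column 1
     IDCHAR $3. rereads columns 1-3 as a character value */
  INPUT @1 ID 3. +(-3) IDCHAR $3.;
DATALINES;
101
102
;
RUN;

PROC PRINT DATA=BACKUP;
RUN;
```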

It looks silly at first, but it is logical. Suppose your 10 questions occur in five pairs, each pair consisting of a numerically answered question and then a characterly answered question (not really sure about "characterly," but you get the point). Suppose further that all pairs are separated from other pairs, and from other variables, by two spaces. The following code handles this situation:

Example

DATA PAIRS;
INPUT @1 ID 3.                /* 1 */
@6 (QN1-QN5)(1. +3)           /* 2 */
@7 (QC1-QC5)($1. +3)          /* 3 */
@26 (HEIGHT AGE)(2. +1 2.);   /* 4 */
DATALINES;
101  1A  3A  4B  4A  6A  68 26
102  1A  3B  2B  2A  2B  78 32
103  2B  3D  2C  4C  4B  62 45
104  1C  5C  2D  6A  6A  66 22
;
PROC PRINT DATA=PAIRS;
TITLE 'Example ';
RUN;

OK, what's happening here? It's really pretty straightforward. The INPUT statement performs the following tasks:

  • 1 Go to column 1 and read a 3-byte numeric field into the variable ID.
  • 2 Go to column 6 and repeat the following five times: read a 1-byte numeric field into a variable and then move forward 3 columns from the current position to get ready for the next variable in the list. Name the variables QN1-QN5.
  • 3 Go back to column 7 and repeat the following five times: read a 1-byte character field into a variable and then move forward 3 columns to get ready for the next variable in the list. Name the variables QC1-QC5.
  • 4 Go to column 26 and read a 2-byte field into the numeric variable HEIGHT. Advance the column pointer 1 column, and read another 2-byte field into the numeric variable AGE.

That's it. Powerful and efficient. The resulting output is as follows:

Output from Example - Using Informat Lists and Relative Pointer Controls

The key to using variable and informat lists is patterns. If you can arrange your data in repeating patterns, then these repetitions can be put to your advantage. Look for them.
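
As one more illustration of spotting a pattern, here is a hedged sketch that treats each question pair as the repeating unit instead of making two passes over the line. It assumes the same layout as the pairs example (pairs starting in column 6, two blanks between pairs); the data set name PAIRS2 is made up. Note that the variables end up in a different order in the data set (QN1, QC1, QN2, QC2, and so on).

```sas
/* Sketch only: PAIRS2 is a hypothetical data set name. */
DATA PAIRS2;
  /* The informat list (1. $1. +2) describes one pair:
     a 1-column number, a 1-column character, then two blanks.
     It has fewer items than the variable list, so it is reused
     from the start for each successive pair. */
  INPUT @1 ID 3.
        @6 (QN1 QC1 QN2 QC2 QN3 QC3 QN4 QC4 QN5 QC5) (1. $1. +2)
        @26 (HEIGHT AGE) (2. +1 2.);
DATALINES;
101  1A  3A  4B  4A  6A  68 26
102  1A  3B  2B  2A  2B  78 32
103  2B  3D  2C  4C  4B  62 45
104  1C  5C  2D  6A  6A  66 22
;
RUN;

PROC PRINT DATA=PAIRS2;
RUN;
```

Whether one pass or two reads better is a matter of taste; the two-pass version keeps the numeric and character variables grouped, while this one follows the order of the raw data.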


All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd