# Reading Data Arranged in Columns - SAS Programming

In addition to being able to read raw data values that are separated from each other by one or more spaces, the SAS System provides two methods of reading data values that are uniformly aligned in columns: column input and formatted input. Both provide the ability to read data from fixed locations in the input record, and both therefore expect to find the data in those locations.

Formatted input provides the additional feature of allowing you to read data that occur in other than standard numeric or character formats, but that feature is beyond the scope of this topic.

Column input and formatted input, as well as list input, can be freely intermixed within the same INPUT statement, as you will see in later examples in this chapter.

A column INPUT statement can be used to read lines of data that are aligned in uniform columns. With this method, the name of the variable being read is followed by the column, or column range (starting and ending columns), containing the data for that variable. If you are defining a character variable, the identifying "$" comes before the column numbers. Here is an example.

Example

DATA COLINPUT;
INPUT ID 1 HEIGHT 2-3 WEIGHT 4-6 GENDER $ 7 AGE 8-9;
DATALINES;
168144M23
278202M34
362 99F37
461101F45
;
PROC PRINT DATA=COLINPUT;
TITLE 'Example 5.1';
RUN;

This code produces the following output (identical to that displayed in Example 1 except for the title):

Output from Example - Reading Data Arranged in Columns

In this example, you do not leave any spaces between data values. You can if you wish, but unlike list input, it is not necessary to delimit the data values in any way. The column specifications in the INPUT statement provide instructions as to where to find the data values. Also, notice that we placed the value 99 (for the variable WEIGHT for observation 3) in columns 5-6 rather than in columns 4-5 as we did in previous examples.

Numbers placed right-most in a field are called right adjusted; this is the standard convention for numbers in most computer systems. You could have placed the 99 in columns 4-5 here as well because your instructions were to read the value for WEIGHT anywhere in columns 4-6. The SAS System correctly reads the value even if it is not right adjusted, but it is a good habit to right adjust numbers in general since other computer programs aren't quite as smart as SAS software.
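To illustrate the point about adjustment, here is a minimal sketch that reads the value 99 from columns 4-6 twice, once right adjusted and once left adjusted. The data set name ADJUST, the title, and the second data line (ID 5) are made up for illustration and do not come from the chapter's data. Both observations are expected to show a WEIGHT of 99.

```sas
DATA ADJUST;                  /* hypothetical data set name */
INPUT ID       1
      HEIGHT   2-3
      WEIGHT   4-6
      GENDER $ 7
      AGE      8-9;
DATALINES;
362 99F37
56299 F37
;
PROC PRINT DATA=ADJUST;
TITLE 'Right-adjusted and left-adjusted values';
RUN;
```

In the first data line the 99 sits in columns 5-6. In the second it sits in columns 4-5, followed by a blank in column 6. Because the column range 4-6 covers both placements, the INPUT statement does not need to change.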

Speaking of good habits, let's adopt another one. (If these habits truly yield more productive programming, then they will be easy to make and hard to break.) When using column and formatted input, it's worth the extra effort to code the variables in the INPUT statement in a uniform columnar fashion. It makes for easier code proofreading and maintainability. Here is another version of the previous program that will yield exactly the same output (except for the title):

Example

DATA COLINPUT;
INPUT ID       1
      HEIGHT   2-3
      WEIGHT   4-6
      GENDER $ 7
      AGE      8-9;
DATALINES;
168144M23
278202M34
362 99F37
461101F45
;
PROC PRINT DATA=COLINPUT;
TITLE 'Example 5.2';
RUN;

Notice that each variable name in the INPUT statement is on a separate line and that the column specifications all line up. This makes for a neater, easier-to-read program.

## Reading Selected Variables from Your Data

When you read data in columns, you indicate missing values by leaving the columns blank. You also have the freedom to skip any columns you wish and read only those variables of interest to you. If, for example, you wanted to read only ID and AGE from the previous data, you could use the following code:

Example

DATA COLINPUT;
INPUT ID 1 AGE 8-9;
DATALINES;
168144M23
278202M34
362 99F37
461101F45
;
PROC PRINT DATA=COLINPUT;
TITLE 'Example 5.3';
RUN;

This code produces the following output:

Output from Example - Reading Selected Variables from Your Data

## Reading Values in Different Order

In this example you did not eliminate any data from the lines of data, but you chose to read only part of each line, specifically columns 1 (ID) and 8-9 (AGE). When using column or formatted input, you can read data fields in any order you want. You do not have to read them from left to right in ascending column order. You can also read column ranges more than once, read parts of previously read ranges, or even read overlapping column ranges as different variables. The next set of code shows an example of reading the same data that you have been working with, but by jumping around the input record.

Example

DATA COLINPUT;
INPUT AGE 8-9 ID 1 WEIGHT 4-6 HEIGHT 2-3 GENDER $ 7;
DATALINES;
168144M23
278202M34
362 99F37
461101F45
;
PROC PRINT DATA=COLINPUT;
TITLE 'Example 5.4';
RUN;

This code produces the following output:

Output from Example - Reading Values in Different Order

Notice that the variables exist in the data set COLINPUT, and are therefore displayed in the output, in the same order in which they are read via the INPUT statement.
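The remark that overlapping column ranges can be read as different variables is not demonstrated by the examples in this section, so here is a minimal sketch of it. The data set name OVERLAP, the title, and the variable HTWT are hypothetical names introduced only for this illustration. HTWT re-reads columns 2-6 as a single character value after HEIGHT and WEIGHT have already been read from the same columns.

```sas
DATA OVERLAP;                 /* hypothetical data set name */
INPUT ID       1
      HEIGHT   2-3
      WEIGHT   4-6
      HTWT   $ 2-6            /* hypothetical variable: columns 2-6 again, as text */
      GENDER $ 7
      AGE      8-9;
DATALINES;
168144M23
278202M34
362 99F37
461101F45
;
PROC PRINT DATA=OVERLAP;
TITLE 'Reading overlapping column ranges';
RUN;
```

Each column specification sends the INPUT statement to the named columns regardless of what has already been read. HEIGHT and WEIGHT should therefore be numeric as before, while HTWT holds the same five characters as text.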