Reading Data from Multiple External Files - SAS Programming

Suppose you have two separate files,FILE1 and FILE2,each containing raw data for the same set of variables.You want to read all the data from both files and create a SAS data set from the aggregate.Here is one way to code it.

Example

DATA TWOFILES?
IF NOT LASTRECl THEN INFILE 'FILEl' END=LASTRECl;
ELSE INFILE 'FILE2';
INPUT ID AGE WEIGHT;
RUN;

The key to this program is the END= option in the first INFILE statement. It works like this: a temporary variable defined by the user is created and given the name specified after the = sign, in this case, LASTREC1.

This variable is given a value of 0 each time a record is read from FILE1 except when the last record is read; it is then set to 1.Each time the DATA step iterates, it checks the value of LASTREC1.

The phrase IF NOT LASTREC1 is equivalent to IF LASTREC1 NE 1.When this is true, (this will be true after each record is read from FILE1 except the last one),the first INFILE statement is executed.After the last record from FILE1 is read, LASTREC1 is set to 1.From then on,IF NOT LASTREC1 is false,and the second INFILE statement is executed(the ELSE condition).

A More Flexible Approach: Using a Variable to Indicate the
External Filename

The previous approach works fine,but there is another way—a little complicated,but pretty neat.You can include the names of the external files you wish to read as values in instream data lines.You accomplish this by using a FILEVAR= option in the INFILE statement.Here is the previous code rewritten using this alternate method:

Example

DATA TWOFILES;
INPUT EXTNAME $; 0
INFILE DUMMY FILEVAR=EXTNAME END=LA$TREC;
DO UNTIL (LASTREC=1);
INPUT ID AGE WEIGHT;
OUTPUT; 0
END;
DATALINES;
FILE1
FILE2
;

Here's how this program works.The first INPUT statement 0 reads a value for the variable EXTNAME from the data line following the DATALINES statement.The first value read is FILE1.

This is fed to the INFILE statement © via the F1LEVAR= option.The system now knows to read data from FTLE1.The DO loop © then processes each data line from the current source, FILE1.When the last record is read,the value of LASTREC is set to 1.

The DO loop then stops processing, and the DATA step starts over again, reading a new value (FILE2) in to EXTNAME.LASTREC is reinitialized to 0, and the DO loop starts up again, this time reading from FILE2.

When LASTREC is once again set to 1, the loop stops. Since there are no more raw data lines instream to read, the DATA step ends. Now,what about that DUMMY file specification and the OUTPUT statement? © The DUMMY is just that.The source of raw data for the INFILE statement is actually the value of the FILEVAR variable (EXTNAME in thisexample.) Since an INFILE statement needs a file specification,we supply a dummy, and call it DUMMY.Pretty clever,huh?.

The OUTPUT statement is necessary because data are only written to the SAS data set being built at the end of each iteration of a DATA step unless an OUTPUT statement is encountered.You need the OUTPUT statement here because this DATA step iterates only twice -- after the last record is read from each raw data file.Without the OUTPUT statement,this DATA step would only write two records to the data set you are building. Here is the general syntax for the DO UNTIL statement: DO UNTIL (condition);

SAS statements END;

The SAS statements in the DO UNTIL block are repeatedly executed until the condition (always placed in parentheses) is true.The DO UNTIL loop always executes at least once,regardless of the logical value of the condition. Another useful looping structure is DO WHILE.The syntax is identical to DO UNTIL.

The statements in the DO UNTIL loop only execute when the condition is true.If the condition is false, the DO WHILE loop does not execute at all. An easy way to remember the difference between these two looping statements is that the DO UNTIL statement tests the condition at the bottom of the loop while the DO WHILE statement tests at the top.


All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd DMCA.com Protection Status

SAS Programming Topics