# Dropping Unnecessary Variables When Building a SAS Data Set from Raw Data - SAS Programming

Many programmers neglect to drop unnecessary variables. Carrying these along places a double burden on resources. First, the data sets are larger, so more storage space is needed to hold them, temporarily or permanently. Second, since the data sets and their accompanying program data vectors (PDVs) are larger, more CPU time is needed to process them.

In this example you read in student answers to a test (a multiple-choice test on SAS programming, perhaps) and compute a raw and percentage score for each student. You want a data set containing only the raw and percentage scores for further analysis. The method you use, however, requires other intermediate variables to be created along the way. Compare the following two programs closely. The second has one very small addition that makes one very large difference.

Example – INEFFICIENT

DATA SCORE;
ARRAY KEY[5] $ 1;
ARRAY Q[5] $ 1;
RETAIN KEY1 'A' KEY2 'B' KEY3 'C' KEY4 'D' KEY5 'E';
INPUT (Q1-Q5)($1.);
* Reset RAW for each student because the sum statement below retains its value across observations;
RAW=0;
DO I=1 TO 5;
RAW+(Q[I]=KEY[I]);
END;
PERCENT=100*RAW/5;
DATALINES;
ABCDA
BBCAC
EBCAD
;

The variables KEY1-KEY5, Q1-Q5, and the DO loop counter I are key to the method used to derive the final scores, but they are extra baggage after they have served their purpose. Why keep them around?

Example – EFFICIENT

DATA SCORE;
ARRAY KEY[5] $ 1;
ARRAY Q[5] $ 1;
RETAIN KEY1 'A' KEY2 'B' KEY3 'C' KEY4 'D' KEY5 'E';
INPUT (Q1-Q5)($1.);
* Reset RAW for each student because the sum statement below retains its value across observations;
RAW=0;
DO I=1 TO 5;
RAW+(Q[I]=KEY[I]);
END;
PERCENT=100*RAW/5;
KEEP RAW PERCENT;
*or DROP KEY1-KEY5 Q1-Q5 I;
DATALINES;
ABCDA
BBCAC