Avoiding Unnecessary DATA Steps-1 - SAS Programming

The lesson in this first example is a simple one. Do not read data (either raw data or SAS system files) more often than necessary, and do not create more SAS data sets than necessary. Here is an example where one data set is read and another is created for no reason other than to be able to run a procedure without a DATA= option. Remember, without this option, the procedure operates on the most recently created data set.

Example – INEFFICIENT

DATA NEW;
SET OLD;
RUN;
PROC MEANS N MEAN MIN MAX;
VAR X Y Z;
RUN;

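Creating the NEW data set is totally unnecessary: the DATA step only copies OLD, so every observation is read and written once more before PROC MEANS reads it again.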

Example – EFFICIENT

PROC MEANS DATA=OLD N MEAN MIN MAX;
VAR X Y Z;
RUN;

Using the DATA= option with any SAS procedure allows you to use a previously stored SAS data set. You should use the DATA= option for all SAS procedures that operate on data sets (there are some that don't, like PROC OPTIONS), even when it is not necessary. This is good programming style and an easy habit to form.
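The following is a brief sketch of why the habit pays off. It assumes that OLD contains numeric variables X, Y, and Z, as in the examples above; the data set SUBSET and its WHERE condition are hypothetical and serve only to change which data set was created most recently.

```sas
/* Hypothetical intermediate step: SUBSET is now the most recently created data set */
DATA SUBSET;
   SET OLD;
   WHERE X > 0;
RUN;

/* No DATA= option: this summarizes SUBSET, not OLD */
PROC MEANS N MEAN MIN MAX;
   VAR X Y Z;
RUN;

/* DATA= option: the input is explicit, whatever was created in between */
PROC MEANS DATA=OLD N MEAN MIN MAX;
   VAR X Y Z;
RUN;
```

With DATA= in place, inserting or removing an intermediate DATA step does not silently change which data set a later procedure analyzes.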

If you choose to ignore this recommendation, you can still avoid an unnecessary DATA step by using the _LAST_= option in an OPTIONS statement. In the previous example, the following OPTIONS statement allows you to omit the DATA= option and still run PROC MEANS on data set OLD:

OPTIONS _LAST_ = OLD;

The reason this works is that _LAST_ represents the most recently created data set, which is the default argument for the DATA= option. If you change the value of _LAST_, you change the default value of DATA=. This seems like a roundabout, cumbersome way to avoid an easy, productive habit.
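Put together, the _LAST_= approach might look like the sketch below. It again assumes that OLD exists and contains X, Y, and Z. The PROC OPTIONS step is an optional addition, not part of the original example; it is included only to write the current value of _LAST_ to the log.

```sas
/* Make OLD the default input data set for procedures */
OPTIONS _LAST_=OLD;

/* Optional check: write the current value of the _LAST_= option to the log */
PROC OPTIONS OPTION=_LAST_;
RUN;

/* No DATA= option: PROC MEANS reads the data set named by _LAST_ */
PROC MEANS N MEAN MIN MAX;
   VAR X Y Z;
RUN;
```

Note that any data set created later resets _LAST_, so the OPTIONS statement has to be repeated wherever the default matters, whereas DATA=OLD states the input once, on the procedure that uses it.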


All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd
