Dropping Unnecessary Variables When Building a SAS Data Set by Setting an Existing One - SAS Programming

In the last example, you built a new SAS data set from raw data, using intermediate variables along the way that were dropped before the observations were written out. Often you also need to create a new SAS data set from an existing one by manipulating a subset of its variables, without needing all the variables in the existing data set. Here is one way to drop those unneeded variables from the new data set:

Example – INEFFICIENT

DATA NEW;
    SET OLD;
    /* (programming statements) */
    DROP X1-X20; /* DROP statement */
RUN;

This certainly looks good at first glance. After all, you are not keeping those extra X1-X20 variables. But you can do better.

Example – EFFICIENT

DATA NEW;
    SET OLD (DROP=X1-X20); /* DROP= data set option */
    /* (programming statements) */
RUN;

No, the difference is not the one line of code saved! The inefficient program uses a DROP statement, while the efficient program uses a DROP= data set option. DROP (or KEEP) statements are independent, stand-alone statements and do not take an equal sign before their variable lists. DROP= (or KEEP=) data set options are placed in parentheses following the data set names to which they refer, and they take the equal sign.
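The same syntax distinction applies to KEEP. The following is a minimal sketch of the two forms side by side; the data set names NEW1 and NEW2 and the variables ID and Y are hypothetical and assume that OLD contains ID and Y.

```sas
/* Sketch only: OLD, NEW1, NEW2, ID, and Y are hypothetical names. */

/* KEEP statement: a stand-alone statement, no equal sign */
DATA NEW1;
    SET OLD;
    KEEP ID Y;
RUN;

/* KEEP= data set option: in parentheses after the data set name, */
/* with an equal sign                                             */
DATA NEW2;
    SET OLD (KEEP=ID Y);
RUN;
```

With no other programming statements, both steps write only ID and Y to the new data set. When only a few variables are needed from a wide data set, listing them with KEEP is usually shorter than listing everything else with DROP.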

In the inefficient example, all the variables from the data set OLD are brought into memory (into the program data vector, or PDV) for each observation being built, and then the variables X1-X20 are dropped before the observation is written to the new data set. In the efficient program, variables X1 to X20 are never read into memory, and consequently are not written to the new data set. I/O processing is reduced, and the smaller PDV requires fewer CPU resources as well. These savings can be substantial, particularly for large data sets.
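One consequence is worth seeing in code: where the DROP= option is placed determines whether the dropped variables are available to the programming statements. The sketch below builds a small test version of OLD and then applies DROP= on the input data set and on the output data set. The names NEW_IN, NEW_OUT, ID, Y, and Z, and the test data itself, are assumptions made for illustration.

```sas
/* Hypothetical test data: 1,000 observations with X1-X20, ID, and Y */
DATA OLD;
    ARRAY X{20} X1-X20;
    DO ID = 1 TO 1000;
        DO I = 1 TO 20;
            X{I} = I;
        END;
        Y = ID * 2;
        OUTPUT;
    END;
    DROP I; /* loop counter is not needed in OLD */
RUN;

/* DROP= on the input data set: X1-X20 never enter the PDV, */
/* so the programming statements cannot reference them.     */
DATA NEW_IN;
    SET OLD (DROP=X1-X20);
    Z = Y + 1;
RUN;

/* DROP= on the output data set: X1-X20 are read into the PDV */
/* and can be used in calculations, but are not written out.  */
/* In this respect it behaves like the DROP statement.        */
DATA NEW_OUT (DROP=X1-X20);
    SET OLD;
    Z = Y + X1;
RUN;

/* Both new data sets contain only ID, Y, and Z */
PROC CONTENTS DATA=NEW_IN;
RUN;
PROC CONTENTS DATA=NEW_OUT;
RUN;
```

Use the input form when the programming statements do not need the dropped variables at all. If a calculation needs some of them, drop only the unused ones on the SET statement and drop the rest on the output side.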


All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd
