Updating a Master File from a Transaction File - SAS Programming

The UPDATE statement is designed to update a master data set from atransaction data set.An example of a master data set would be a data set containing all employee records or a data set of product numbers, descriptions, inventory levels,and prices.

A typical transaction data set would contain only the information to identify the record to be updated and the new updated information; for example,a product number and the new price.An update would be accomplished by substituting the new information in the transaction file for the old information in the master file.

One main difference between the UPDATE and MERGE statements is that if there is a missing value for a variable in the transaction data set,that variable is untouched in the master data set. Things can get messy when there are multiple values of the BY variable(s) in either the master data set,the transaction data set,or both, so we will stick to basic examples.

Suppose you have a master and a transaction data set, both of which contain the variables ID, DEPT, and SALARY.Contents of the data sets are as follows:

Updating a Master File from a Transaction File

You want to apply the transaction data set to the master data set to update the records. If you assume the records are already sorted by ID,you can write:


TITLE 'Updated Data Set - SEWMAST';

The syntax is obvious.The new data set created, NEWMAST,is the old data set, MASTER,updated with the data from TRANS.Output is as follows:

Output from Example-Updating a Master File from a Transaction File

Output from Example-Updating a Master File from a Transaction File

Notice that observations 1 and 4 in the original MASTER data set are unchanged since these ID's are not present in the TRANS data set.In observation 2, the value for SALARY in the TRANS data set (22,000)replaced the corresponding value in the MASTER data set (21,000).

The missing value for DEPT in observation 2 of the TRANS data set did not, however,replace the corresponding value for DEPT in the MASTER data set.In observation 3, both values in the TRANS data set replaced the corresponding values in the MASTER data set.TRANS data set value of RECORDS for variable DEPT replaced the originally missing value in the MASTER data set, while the missing TRANS data set value for SALARY did not replace the MASTER value (18,000).If a value of the BY variable (ID) is present in the transaction data set which is not present in the master data set, the new observation will be added to the new master data set.

For example,if MASTER and TRANS were as follows:


The resulting data set, NEWMAST, would contain:
Data set NEWMAST
1 PARTS 13,000
2 PERSON 22,000
3 SALES 24,000
4 EXEC 55,000
5 RECORDS 18,000
9 SUMMER 10,000

f a transaction data set has fewer variables than a master data set,the update process only affects the variables contained in the transaction data set and leaves the remainder of the variables in the master data set unchanged.

Suppose data set TRANS only held data for ID and SALARY, as follows:

data set TRANS only held data for ID and SALARY

The result would be that DEPT would be left unchanged, and SALARY would be updated.Data set NEWMAST

1 PARTS 13,000
2 PERSON 22,000
3 PARTS 24,000
4 EXEC 55,000
5 18,000
9 10,000

Notice that the new observation,ID=9, is added to NEWMAST.If a transaction data set has more variables than a master data set, the update process adds the additional variables to the updated data set.

Suppose data set TRANS contained data for a new variable, GENDER, in addition to updated data for old variables, as follows:The variable GENDER, along with its values, is added to NEWMAST as follows:Data set NEWMAST

1 PARTS 13,000 M
2 PERSON 22,000 F
3 SALES 24,000 M
4 EXEC 55,000 F
5 RECORDS 18,000 M

Until now, you have always been using a transaction data set called TRANS to update a master data set called MASTER,and you have been keeping the updated data in a new data set called NEWMAST.This is not absolutely necessary, however.The following code will update data set MASTER "in place":



There is always some risk of losing data when using this "in place" method of updating, and in a true production environment, you would probably want to use a backup scheme incorporated into your update process.


  1. You have three data sets,ONE,TWO,and THREE. Each data set contains the variables ID,SEX, DOB,and SALARY.In addition,data set TWO contains TAXRATE and WITHHOLD;data set THREE contains HEIGHT and
    WEIGHT.Some sample observations are shown below: Data set ONE
    1 M 10/21/46 70000
    2 F 11/01/55 68000
    7 M 01/27/59 47000

    Data set TWO

    3 F 01/01/33 78000 .36 23000
    5 M 03/07/29 37000 .25 9000

    Data set THREE

    4 M 10/23/49 65000 68 158
    6 F 07/04/65 55000 74 202

    Write a program to create a single SAS data set ALL which combines the observationsfrom data sets ONE,TWO,and THREE and contains the variables ID, DOB, and SALARY.A listing of the resulting data set is shown below:

    Data set ALL

    1 10/21/46 70000
    2 11/01/55 68000
    3 01/01/33 78000
    4 10/23/49 65000
    5 03/07/29 37000
    6 07/04/65 55000
    7 01/27/59 47000
  3. Using the data sets in Problem write a program to create a new SAS data set for those people born on or before January 1,1960,and who earn $50,000 or more per year. For this problem, assume that data set TWO uses the variable name IDNUM instead of ID. You may want to refer to "Example 2" in Chapter 6,"SAS Dates," to see how to use a SAS date literal before starting this problem.3-3.You have a SAS data set, MASTER, which contains records on all your employees.Included in the data set are ID,LASTNAME,FIRSTNAM,GENDER, and AGE.You want to find the name of an employee, but you can only remember that his last name ends in 'fly' and you know he is at least 40 years old.
  4. Data set MASTER
    12 Butterfly Roger M 57
    39 Cline Grove M 44
    23 McFly Clive M 42
    34 Lane Alice F 35
    44 Hopper fly Frank M 21
    77 Elfly Leslie M 64
    13 Kline Mary F 29

    a) Write a program that will list all the possible names meeting these criteria.

    b) Using the same data set,write a program to extract all last names that sound like "Klein" and with first names beginning with'G'followed by 4 other letters.(Yes,we know this sounds ridiculous but we want to give you some practice with the WHERE operators.)

  5. You have two SAS data sets.Data set DEMOG contains ID,DOB,and GENDER; data set SCORES contains SSN(which is equivalent to ID in data set DEMOG),IQ,and GPA (grade point average).Write a program to compute the mean IQ and GPA for each value of GENDER.Do this for all the data and then for employees born before January 1,1972.Some sample data are displayed below:(Please refer to Chapter 10 for examples using the MEANS procedure.)
  6. Data set demog

    Notes: 1. Data are not in ID order.
    2. There are some IDs that are in one file only.

  7. You have a MASTER file which contains PART (part number), NUMBER (number in stock), PRICE, and SIZE. The file is sorted by PART. You want to update this file as follows:
  8. For PART 222, you now have 15 in stock.
    For PART 123, you have a new price of $1,500.
    For PART 333, you have a new price of $2,000 and 20 in stock.
    Data set master

All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd DMCA.com Protection Status

SAS Programming Topics