Joining (Concatenating) Two Strings - SAS Programming

You have a SAS data set (NAMES) which contains the variables FIRST and LAST. As you can guess, these variables represent first and last names. You would like to create a new variable, NAME, which is the full name (first, a blank, and last). You need to join (in computer jargon, concatenate) the first name, a blank, and the last name. Using the concatenation operator (||), you can do this in a DATA step with the assignment NAME = TRIM(FIRST) || ' ' || LAST;



Since the concatenation operation does not remove blanks, you need to apply the TRIM function (which removes trailing blanks) to the first name before concatenating the single blank and the last name. Otherwise, the variable NAME could contain more than a single blank between the first and last names. For example, if FIRST was equal to Ron (followed by 5 blanks) and LAST was equal to Cody (followed by 4 blanks), the value of FIRST || ' ' || LAST would be Ron      Cody (6 blanks between the first and last names). The TRIM function removes all the trailing blanks in FIRST, so we need to concatenate the single blank between the first and last names ourselves.
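
As a minimal sketch of the step described above, the following DATA step builds NAME from FIRST and LAST. The output data set name NEWNAMES and the length of 41 are assumptions made for illustration (they suppose FIRST and LAST are at most 20 characters each); only NAMES, FIRST, LAST, and NAME come from the text.

```sas
DATA NEWNAMES;                        /* NEWNAMES is a hypothetical output name */
   SET NAMES;
   LENGTH NAME $ 41;                  /* assumed: 20 + 1 + 20 characters */
   NAME = TRIM(FIRST) || ' ' || LAST; /* trailing blanks of FIRST removed first */
RUN;
```

In SAS 9 and later, CATX(' ', FIRST, LAST) is another way to get the same result, since CATX strips leading and trailing blanks from each argument before inserting the delimiter.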


  1. You have a SAS data set, ORIG, which contains ID (5-digit numeric), SCORE, PROP (a proportion), and IQ. You want to create the following new variables:

      a. The natural (base e) log of SCORE.
      b. The arcsine of the square root of PROP.
      c. IQ rounded to the nearest 10 points.
      d. A character variable which is derived from the first two digits and the last two digits of the numeric variable ID. For example, if ID=12345, the new character variable would be the string 1245.

    Here are some sample data for you to work with:

    [Figure: sample data for data set ORIG]
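
    One possible approach to this problem is sketched below. The output data set name NEW and the new variable names (LOGSCORE, ARCPROP, IQ10, IDCHAR, IDPART) are hypothetical. The Z5. format is an assumption chosen so that an ID with fewer than 5 digits would be padded with leading zeros.

    ```sas
    DATA NEW;                                /* NEW is a hypothetical output name */
       SET ORIG;
       LENGTH IDPART $ 4;
       LOGSCORE = LOG(SCORE);                /* natural (base e) log */
       ARCPROP  = ARSIN(SQRT(PROP));         /* arcsine of the square root */
       IQ10     = ROUND(IQ, 10);             /* rounded to the nearest 10 */
       IDCHAR   = PUT(ID, Z5.);              /* numeric ID as a 5-character string */
       IDPART   = SUBSTR(IDCHAR, 1, 2) || SUBSTR(IDCHAR, 4, 2);
       DROP IDCHAR;
    RUN;
    ```

    With ID=12345, IDCHAR is the string 12345, so IDPART joins 12 and 45.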

  2. You have a SAS data set, SCORES, which contains the variables X1-X20, Y1-Y20. The X and Y variables may contain missing values. Write a SAS program to accomplish the following:

      a. Compute the sum of the non-missing values in variables X1-X20.
      b. Compute the mean of Y1-Y20. If there are 5 or more missing values of Y1-Y20, set the mean equal to a missing value.
      c. Find the minimum and maximum of the X's.
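
    A minimal sketch of one way to do this uses the descriptive statistic functions with an OF variable list. The output data set name SUMMARY and the result variable names are hypothetical.

    ```sas
    DATA SUMMARY;                            /* hypothetical output name */
       SET SCORES;
       SUMX = SUM(OF X1-X20);                /* SUM ignores missing values */
       /* Compute the mean only when fewer than 5 of the Y's are missing */
       IF NMISS(OF Y1-Y20) < 5 THEN MEANY = MEAN(OF Y1-Y20);
       ELSE MEANY = .;
       MINX = MIN(OF X1-X20);
       MAXX = MAX(OF X1-X20);
    RUN;
    ```

    The OF keyword matters here: without it, SUM(X1-X20) would be read as the difference X1 minus X20.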

  3. You have a raw data file called TEMPER which contains temperature measurements taken at one-hour intervals. Each raw data line contains several pairs of the variables HOUR (hour of the day) and TEMP (temperature). All temperatures are in degrees Fahrenheit unless they are written in the form nC (the number n followed by a C, no spaces), in which case they are expressed in degrees Celsius. In addition, a value of N was coded when a temperature was not obtained. Write a SAS program to read this data file, express all temperatures in degrees Fahrenheit, and convert each N to a numeric missing value.

    Hint: The conversion from Celsius to Fahrenheit is: F = (9/5)C + 32

    Some sample records from file TEMPER are as follows:

    [Figure: sample records from file TEMPER]
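
    The following is a sketch of one possible solution, not a definitive one. It assumes the HOUR and TEMP values on each line are separated by blanks. The path in the FILENAME statement, the output data set name TEMPS, and the working variable CTEMP are hypothetical.

    ```sas
    FILENAME TEMPER 'temper.txt';            /* hypothetical path to the raw file */

    DATA TEMPS;                              /* hypothetical output name */
       INFILE TEMPER;
       INPUT HOUR CTEMP $ @@;                /* @@ holds the line to read every pair */
       IF UPCASE(CTEMP) = 'N' THEN TEMP = .; /* N means no reading was obtained */
       ELSE IF INDEX(UPCASE(CTEMP), 'C') > 0 THEN
          /* Celsius: strip the C, convert to numeric, then to Fahrenheit */
          TEMP = 9/5 * INPUT(COMPRESS(CTEMP, 'Cc'), 8.) + 32;
       ELSE TEMP = INPUT(CTEMP, 8.);         /* already Fahrenheit */
       DROP CTEMP;
    RUN;
    ```

    Reading the temperature as a character value first is the key design choice: it lets the program inspect the value for N or C before converting it to a number.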

  4. You have a raw data file of phone numbers and want to verify that all the numbers are in the following format:

    (nnn)nnn-nnnn

    where n must be a digit 0-9. Extra spaces are permitted, but assume all numbers will be 15 characters or less. Have your program place the valid numbers in data set VALID, and the invalid numbers in data set INVALID.

    Hint: Our solution used the following SAS functions: INDEX and VERIFY, but there
    are many other possible solutions.

    Some sample raw data records are as follows:
    (988)463-4490 (valid)
    (241) 343-2233 (valid)
    456-5034 (invalid)
    (123)456-7890 (valid)
    (271)SH4-1234 (invalid)
    (592)2578362 (invalid)
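
    One of the many possible solutions is sketched below. It uses COMPRESS, SUBSTR, and VERIFY rather than the INDEX and VERIFY combination mentioned in the hint. The file name phones.txt is hypothetical, and the sketch assumes each record holds only the phone number (the valid/invalid annotations above are not part of the data).

    ```sas
    DATA VALID INVALID;
       INFILE 'phones.txt' TRUNCOVER;        /* hypothetical file name */
       INPUT PHONE $CHAR15.;
       PHONE = COMPRESS(PHONE);              /* remove all blanks */
       /* After blanks are removed, a valid number is exactly 13 characters:
          (nnn)nnn-nnnn, with digits everywhere except positions 1, 5, and 9 */
       IF LENGTH(PHONE) = 13
          AND SUBSTR(PHONE, 1, 1) = '('
          AND SUBSTR(PHONE, 5, 1) = ')'
          AND SUBSTR(PHONE, 9, 1) = '-'
          AND VERIFY(SUBSTR(PHONE, 2, 3) || SUBSTR(PHONE, 6, 3) ||
                     SUBSTR(PHONE, 10, 4), '0123456789') = 0
          THEN OUTPUT VALID;
       ELSE OUTPUT INVALID;
    RUN;
    ```

    VERIFY returns 0 when every character of its first argument appears in the second, so a result of 0 here means all ten remaining positions are digits.
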
  5. You have two SAS data sets, ONE and TWO. Data set ONE contains a variable called DATE1 which is a character variable in the form MM/DD/YY (i.e., DATE1 is not a SAS date, which is a numerical quantity, but rather a character string of length 8). Variables HEIGHT and WEIGHT are also contained in data set ONE. Data set TWO contains a character variable called DATE2 which is in the form ddMONyy. Data set TWO also contains the variable HR (heart rate). Write a program to merge these data sets by date, and assume that there are no duplicate dates in each of the data sets. Solve this problem two ways:

      a. Use the PUT function to create a new variable in one of the data sets so that the two data sets have an identical variable to use.
      b. Use an INPUT function to create a real SAS date variable in both data sets for merging.

    Some sample data are shown below:

    [Figure: sample data for data sets ONE and TWO]
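
    Both approaches can be sketched as follows. All data set names ending in _A or _B, and the variable DATE, are hypothetical. Approach (a) also assumes that DATE2 in data set TWO is stored as a 7-character string with an uppercase month (for example 01JAN95), since that is what the DATE7. format produces.

    ```sas
    /* (a) Build a ddMONyy character key in ONE that matches DATE2 in TWO */
    DATA ONE_A;
       SET ONE;
       LENGTH DATE2 $ 7;
       DATE2 = PUT(INPUT(DATE1, MMDDYY8.), DATE7.);
    RUN;
    PROC SORT DATA=ONE_A;           BY DATE2; RUN;
    PROC SORT DATA=TWO OUT=TWO_A;   BY DATE2; RUN;
    DATA BOTH_A;
       MERGE ONE_A TWO_A;
       BY DATE2;
    RUN;

    /* (b) Create a real (numeric) SAS date in both data sets */
    DATA ONE_B;
       SET ONE;
       DATE = INPUT(DATE1, MMDDYY8.);
       FORMAT DATE MMDDYY8.;
    RUN;
    DATA TWO_B;
       SET TWO;
       DATE = INPUT(DATE2, DATE7.);
       FORMAT DATE MMDDYY8.;
    RUN;
    PROC SORT DATA=ONE_B; BY DATE; RUN;
    PROC SORT DATA=TWO_B; BY DATE; RUN;
    DATA BOTH_B;
       MERGE ONE_B TWO_B;
       BY DATE;
    RUN;
    ```

    Approach (b) is usually the more robust of the two, because numeric SAS dates sort chronologically and do not depend on the case or padding of a character string.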

  6. You have a SAS data set called STOCKS which contains variables XXX and YYY (daily prices of two stocks). Write a SAS program to compute a moving average of these two stocks, taking 4-day intervals. For days 1 to 3, where you do not have 4 days of data, use as many of the previous days as you have to compute the average (i.e., for day 1, use the daily price; for day 2, use the average of day 1 and day 2, etc.). Some sample data values are shown:

    [Figure: sample data values for data set STOCKS]

    Hint: Use the LAGn family of functions.
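
    Following the hint, one way to sketch the moving average is shown below. The output data set name MOVAVG and the working variable names are hypothetical, and the sketch assumes XXX and YYY themselves have no missing values.

    ```sas
    DATA MOVAVG;                             /* hypothetical output name */
       SET STOCKS;
       /* Each LAGn call keeps its own queue, so call them on every
          observation rather than inside an IF statement */
       XLAG1 = LAG1(XXX);  XLAG2 = LAG2(XXX);  XLAG3 = LAG3(XXX);
       YLAG1 = LAG1(YYY);  YLAG2 = LAG2(YYY);  YLAG3 = LAG3(YYY);
       /* LAGn returns missing for the first n observations and MEAN
          ignores missing values, so days 1 to 3 average only the
          days that are available */
       AVGX = MEAN(XXX, XLAG1, XLAG2, XLAG3);
       AVGY = MEAN(YYY, YLAG1, YLAG2, YLAG3);
       DROP XLAG1-XLAG3 YLAG1-YLAG3;
    RUN;
    ```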

  7. You have a SAS data set SCORES, which contains an ID variable and a variable called STRING which holds five 1-digit scores. Write a SAS program to read this data set and create a new data set which contains an ID and five numeric variables X1 to X5, where the X's are each of the digits in STRING. Following are some sample data:

    [Figure: sample data for data set SCORES]

    Hint: You may want to use arrays. We provide solutions with and without them.
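
    A minimal sketch of the array-based approach is shown below. It assumes STRING is a character variable at least 5 characters long. The output data set name DIGITS and the index variable I are hypothetical.

    ```sas
    DATA DIGITS;                             /* hypothetical output name */
       SET SCORES;
       ARRAY X[5] X1-X5;
       DO I = 1 TO 5;
          /* Take the I-th character of STRING and read it as a 1-digit number */
          X[I] = INPUT(SUBSTR(STRING, I, 1), 1.);
       END;
       KEEP ID X1-X5;
    RUN;
    ```

    Without arrays, the same result comes from writing the assignment five times, once for each of X1 to X5, with the position in SUBSTR changed accordingly.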

All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd