CASE Teradata

Compatibility: ANSI

The CASE function provides an additional level of data testing after a row is accepted by the WHERE clause. The additional test allows for multiple comparisons on multiple columns with multiple outcomes. It also incorporates logic to handle a situation in which none of the values compares equal.

When using CASE, each row retrieved is evaluated once by every CASE function. Therefore, if two CASE operations are in the same SQL statement, each row has a column checked twice, or two different values each checked one time.

The basic syntax of the CASE follows:

The above syntax shows that multiple WHEN tests can be made within each CASE. The data value test continues from the first WHEN to the last WHEN or until it finds an equal condition. At that point, it does the THEN and exits the CASE logic by going directly to the END. If the CASE checks all values and does not find an equal condition in any of the WHEN tests, it does the optional ELSE logic and then proceeds to the END.

The ELSE portion of the CASE statement is the only component that is optional. If there is no ELSE and no equal conditions in the WHEN tests, it falls through to the END without doing anything.

It is a common practice to use the ELSE for the ability to provide an alternate value when the condition does not exist in any of the WHEN comparisons. A variation of this basic format is to use a literal value following the CASE and use column names in the WHEN.

Course table

This SELECT uses the Course table to show the basic CASE operation:

7 Rows Returned
Course table to show the basic CASE operation
Course table to show the basic CASE operation

The above answer set demonstrates three things. First, the CASE does a comparison on a numeric data value called Credits with a numeric literal and returns a character string. The SELECT may return numbers or character strings, regardless of their source.

Second, the output is sorted on the credits column and not the <CASE expression> display column. Therefore, the sequence will not change even if the words in the literal change in the SQL statement.

Lastly, the heading is not the phrase <CASE expression> because of the alias "Number of Credits". It is advisable to either alias or TITLE on columns with CASE expressions.

The above syntax is great for comparing equal conditions. However, we do not live in a perfect world and sometimes it is necessary to compare for unequal conditions.

Flexible Comparisons within CASE

When it is necessary to compare more than just equal conditions within the CASE, the format is modified slightly to handle the comparison. Many people prefer to use the following format because it is more flexible and can compare inequalities as well as equalities.

This is a more flexible form of the CASE syntax and allows for inequality tests:

The above syntax shows that multiple tests can be made within each CASE. The value stored in the column continues to be tested until it finds a true condition. At that point, it does the THEN portion and exits the CASE logic by going directly to the END.

If the CASE tests the value and does not find a true condition in any of the WHEN checks, it can do the ELSE portion of the logic and then proceed to the END. If there is no ELSE, it falls to the END without doing anything to the data. As a result, if this is a selected column, a null is returned. Worse yet, if this is in the SET portion of an UPDATE statement, the column is set to a null.

Remember that the ELSE portion of the CASE statement is optional. It can work without it, however, it is common to use that ability to have an alternate value if the condition does not exist in any WHEN comparisons.

The next SELECT is similar to the previous example and also demonstrates that you could compare for a numeric data value with a numeric literal and then return a character literal. The primary difference is that it uses the comparisons inside the WHEN and an ELSE for the default literal:

7 Rows ReturnedFlexible Comparisons within CASE

Both of the previous CASE statements used an equal comparison. In reality, when equal conditions are desired, the first form of the CASE is shorter to write, but not as flexible. The second format could easily be modified to check for inequalities and allows multiple columns to be tested in the same CASE.

Comparison Operators within CASE

In this section, we will investigate adding more power to the CASE statement. In the above examples, a literal value was returned. In most cases, it is necessary to return data. The returned value can come from a column name just like any selected column or a mathematical operation.

Additionally, the above examples used a literal ‘=’ as the comparison operator. The CASE comparisons also allow the use of IN, BETWEEN, NULLIF and COALESCE. In reality, the BETWEEN is a compound comparison. It checks for values that are greater than or equal to the first number and less than or equal to the second number.

The next example uses both formats of the CASE in a single SELECT with each one producing a column display. It also uses AS to establish an alias after the END:

10 Rows ReturnedComparison Operators within CASE

A word of caution is warranted here. Since the CASE stops comparing when it finds a true condition and inequalities are being used, it is important to sequence the comparisons carefully. The above CASE tests for Grade_pt IN (1,2,3) first. If the data is exactly one of these values, the THEN portion is used and the CASE is finished. Therefore, only decimal values and the 4.0 gets through all the subsequent WHEN tests.

Once all the decimal value possibilities have been eliminated in the two other WHEN tests, the only value left is 4.0. That is where the ELSE comes in automatically without requiring an additional test.

When I first saw an IN comparison with the CASE statement, I got very excited because an IN comparison is often seen with a subquery. However, I soon realized that subqueries are only located in the WHERE clause. Currently, subqueries cannot be used in a CASE statement.

CASE for Horizontal Reporting

Another interesting usage for the CASE is to perform horizontal reporting. Normally, SQL does vertical reporting. This means that every row returned is shown on the next output line of the report as a separate line. Horizontal reporting shows the output of all information requested on one line as columns instead of vertically as rows.

Previously, we discussed aggregation. It eliminates detail data and outputs only one line or one line per unique value in the non-aggregate column(s) when utilizing the GROUP BY. That is how vertical reporting works, one output line below the previous. Horizontal reporting shows the next value on the same line as the next column, instead of the next line.

Using the normal SELECT structure we return one row per unique value in the vertical format for the column named in the GROUP BY, the report appears as:

CASE for Horizontal Reporting

Using the next SELECT statement, we achieve the same information in a horizontal reporting format by making each value a column:

1 Row Returnedhorizontal reporting format by making each value a column

When using horizontal reporting, it is important that the column heading in the report indicate what the data represents. Normally, one of the selected columns identifies the origin of the data being reported in the vertical format. To accomplish this type of reporting, the number of the columns and the desired values must be known ahead of time. Therefore, it is not as flexible as the normal SQL statement.

The WHERE clause in the previous example is not required since the CASE will eliminate the NULL for missing class codes. Also, by using the WHERE, the NULL is not compared 4 times with each test producing an unknown result. As it is, every ‘FR’ row is compared 1 time, every ‘SO’ row is compared 2 times, every ‘JR’ row is compared 3 times and every ‘SR’ row is compared 4 times. Every comparison takes time. Therefore, it is best to eliminate as many comparisons as possible by eliminating the row in the WHERE clause. Likewise, if there are more seniors than freshmen, it is faster to compare the ‘SR’ first. This way, instead of testing each senior 4 times, they are only compared once. As a result, the CASE checks fewer values and the entire SELECT will execute much faster. Always think about the impact on performance when using special SQL features and look for opportunities to reduce comparisons. Remember, the tests are performed from the first WHEN through the last WHEN or until a true result is found.

Nested CASE Expressions

After becoming comfortable with the previous examples of the CASE, it may become apparent that a single check on a column is not sufficient for more complicated requests. When that is the situation, one CASE can be imbedded within another. This is called nestedCASE statements.

The CASE may be nested to check data in a second column in a second CASE before determining what value to return. It is common to have more than one CASE in a single SQL statement. However, it is powerful enough to have a CASE statement within a CASE statement.

So that the system can tell where each CASE starts and ends, the nested CASE statements must be imbedded in parentheses and each CASE must have its own END. The size of the SQL statement is more of the limiting factor than is the number of CASE statements in a SELECT.

Prior to V2R3, the CASE could only check one column. Although it is permissible to use different values, only one column per CASE comparison was allowed. To check multiple values, multiple CASE statements were imbedded within each other.

The first CASE tests the first value of one column and the nested CASE normally tests for another value of a different column. This is getting into an advanced technique and it will probably require some practice to get it working exactly as desired.

The next example of nested CASE statements provides a sample to begin coding your own:

4 Rows Returnednested CASE statements provides a sample to begin coding your own

The above nested CASE first compares the class code using the equality-checking format. Then, when the class code is equal to ‘JR’, it starts the literal with ‘Junior‘. Then it begins the nested CASE to test the grade point average. If the row is not for a junior it knows it must be a senior because the WHERE clause selects only juniors and seniors. It immediately tests the seniors GPA to finish the output literal.

For both juniors and seniors, the nested CASE tests the GPA compared to the literal value of 2, meaning that they are not doing well in school. If it is greater than 2, the value is then compared against 3.5. When it is less than 3.5, this means it is also greater than or equal to 2 since these rows failed the first test. These are passing grades. The only rows left are the ones containing a GPA greater than 3.5 that represent students doing very well in school.

There are two reasons why the WHERE clause is very important here. First, it speeds up the SELECT by eliminating all rows except juniors and seniors. Second and more importantly, without the WHERE, all students who are not juniors are assumed to be seniors.

Since there are freshman and sophomores, this is a bad thing and the CASE requires changes to make it correct. Since both the CASE and the WHERE provide testing, they can be written to work together and compliment each other. With the advent of V2R3, the need to imbed nested CASE statements has been reduced, but not eliminated.

The next CASE is equivalent to the one above without using nesting: SELECT Last_name
,CASE WHEN class_code = 'JR' AND grade_pt < 2
THEN 'Junior Failing'
WHEN class_code = 'JR' AND grade_pt < 3.5
THEN 'Junior Passing'
WHEN class_code = 'JR'
THEN 'Junior Exceeding'
WHEN class_code = 'SR' AND grade_pt < 2
THEN 'Senior Failing'
WHEN class_code = 'SR' AND grade_pt < 3.5
THEN 'Senior Passing'
ELSE 'Senior Exceeding'
END AS Current_Status
FROM Student_Table ;

When comparing the two CASE statements, these statements are true:

  1. It takes longer to code without nesting
  2. It takes 5 comparisons to separate juniors and seniors instead of
  3. Therefore, less efficient than nesting.

Many third-party tools generate this form of CASE, because they can create each WHEN without evaluating the totality of the processing being performed.

When additional data value comparisons are needed after the row has been read, there is now a powerful tool. The CASE statement adds IF functionality to the SELECT.

CASE used with the other DML

All of the examples have been using the SELECT. The good news is that it can be used with all four DML statements (SELECT, INSERT, UPDATE, and DELETE). This might be especially helpful when using the CASE within an UPDATE SQL statement.

Many times in this chapter, the answer set was sorted by the class code. Although this grouped them very well, it also put juniors ahead of sophomores. So, it has been determined that the easiest way to put sophomores first is to add a BYTEINT column to the Student table called class_no and store the values 1 through 4 to represent the class. This value is used by the sort, but most likely never displayed.

Now that the column exists within the table it needs data. Remember, when a table is altered with a new column, it contains a NULL value for all existing rows and needs to be populated.

The following UPDATE statements could be used to accomplish this:

Although this technique satisfies the requirements, four different UPDATE statements are needed. Each one locks the table for WRITE. Due to the WRITE lock, they cannot be executed concurrently and each one takes time to complete. Therefore, more time is taken away from the users, or it must be done at night and may interfere with normal batch processing.

As a result, it is best to accomplish this as fast as possible and in a single run if possible. Using the CASE, it is possible. The next UPDATE does the same processing as the 4 statements above:

This approach is faster because it only requires a single pass through all the rows instead of 4 separate passes. However, there are always two sides to every story. Since all rows are being updated at once, the Transient Journal must have enough space in DBC to store all of the before images. With the four different statements, the Transient Journal should require less space for each of the individual statements. The total space used is the same, but it is used at different times. If space is an issue on a system, the choice may not be based solely on efficiency. This is probably an indicator that more disk space needs to be added to the system.

All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd Protection Status

Teradata Topics