Ranking Data using RANK Teradata

Compatibility: Teradata Extension and ANSI

The Ranking function (RANK) permits a column to be evaluated and compared, either based on high or low order, against all other rows to create the output set. The order will be sorted by default in descending sequence of the ranking column, which correlates to descending rank.

This style of selecting the best and the worst has been available using SQL in the past. However, it was very involved and required extensive coding and logic in order to make it work. The new RANK function replaces all the elaborate SQL previously required to create this type of output.

The output of the RANK function is the highest or the lowest data values in the column, depending on the sort requested. A query can return a specified number of the "best" rows (highest values) or the "worst" rows (lowest values). These capabilities and output options will be demonstrated below.

Here is the syntax for RANK:

The next SELECT ranks all columns of the Sales table by the daily sales for all available days:


21 Rows Returned

columns of the Sales table by the daily sales for all available days
columns of the Sales table by the daily sales for all available days

In the above output, there were 21 rows. The highest RANK is 21, the lowest is 1 and everything seems correct. Although it is correct, notice the fact that product 1000 on October 2 and product 2000 on October 4 each had sales of $32,800.50. Therefore, they both rank the same as 16 with 17 not seen in the output. The two values tied for the 16th position.

QUALIFY to Find Top Best or Bottom Worse

The above report could have been created without the columns in the RANK function and RANK value. It is a list in descending sequence by the sales amount. With a small number of rows, the best and the worst is readily available. However, when there are hundreds or millions of rows, returning all the rows takes far too much time.

Instead, it is preferable to only return the rows desired. For instance, the best 20 or the worst 20 might be needed. Like the QUANTILE function, the RANK function uses a QUALIFY clause to allow for control of how many rows to output in the final result.

The following SELECT is the same as the above, but uses the QUALIFY to limit the output to the best 3 (highest values) rows:


3 Rows Returned

QUALIFY to Find Top Best or Bottom Worse

Displaying the value of the rank is optional. This is especially true when the data value being ranked is also displayed.

The next SELECT is the same as the above, with one exception. It uses the ASC to reverse the default sequence of DESC. Now, the worst (lowest) 3 values are returned:


3 Rows Returned

worst (lowest) 3 values are returned

The twist here is that the QUALIFY was not changed. It still checks for "<4" in this form of the query. So, it acts more like a row counter instead of a value comparison.

RANK with Reset Capabilities

There is a method available to reset a RANK function to provide the best or worst rank of a group of rows that contain a common value in another column. It uses the GROUP BY designation to specify a data value that, when it changes, causes the accumulation value to be reset back to zero.

As indicated above, the reset process takes place using either technique. The only difference is the syntax used to request and define the values.

Using RANK with GROUP BY

Compatibility: Teradata Extension

As previously seen, the RANK function permits a column to be ranked, either based on high or low order, against other rows. The GROUP BY can be used in conjunction with a RANK function to change the ranking function's scope. This allows a check on which items were in the top sales bracket for each store.

The following SELECT ranks the daily sales for each product using the GROUP BY and creates an alias for the RANK column to use in the QUALIFY to find the best 2 days:


6 Rows Returned

Using RANK with GROUP BY

All of the examples of the RANK function seen so far are based on daily sales. What if the RANK was requested for monthly sales instead?

The next SELECT gets data from a monthly view of the sales table for monthly activity:


6 Rows Returned

monthly view of the sales table for monthly activity



Face Book Twitter Google Plus Instagram Youtube Linkedin Myspace Pinterest Soundcloud Wikipedia

All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd DMCA.com Protection Status

Teradata Topics