Internal RANK operations Teradata

In the initial releases of RANK, Teradata read all participating rows via the WHERE clause into spool and then performed the ranking processing. On millions of rows, this technique is not terribly effective regarding CPU and space utilization.

In release V2R4, NCR has implemented First N Optimization for RANK processing. This means that the QUALIFY clause is used to determine the number of rows (N) on each AMP.

Therefore, each AMP returns only that many qualifying rows instead of all participating rows. Then, the AMPs aggregate the selected rows to determine the final ranking of the rows for return to the client. This can dramatically reduce the overall number of rows being read and compared.

The current way to determine whether or not this processing is being used is through the EXPLAIN. The phrases "all-AMP STAT FUNCTION" and "redistributed by hash code" are signs that it is working. The caveat is that either phrase can change in future releases. The main telltale sign of its use should be the improved performance characteristic.

Now, with that being said, there might be occasions when the client wishes for Teradata to disable the First N Optimization processing and examine all participating rows. To force this type of processing a dummy condition like (1=1) may be added to the QUALIFY.

The following demonstrates the syntax for using this technique: QUALIFY RANK(<column-name>) <= <literal-value> AND (1=1)


All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd DMCA.com Protection Status

Teradata Topics