Syntax of ORDER BY Teradata

The Teradata AMPs generally bring data back randomly unless the user specifies a sort. The addition of the ORDER BY requests a sort operation to be performed. The sort arranges the rows returned in ascending sequence unless you specifically request descending. One or more columns may be used for the sort operation. The first column listed is the major sort sequence. Any subsequent columns specified are minor sort values in the order of their appearance in the list.

The syntax for using an ORDER BY:

SEL[ECT] <column-name>
[,<column-name>]
FROM <table-name>
ORDER BY { <column-name> | <relative-column-number> } [ ASC | DESC ]
;

In Teradata, if the sequence of the rows being displayed is important, then an ORDER BY should be used in the SELECT. Many other databases store their data sequentially by the value of the primary key. As a result, the data will appear in sequence when it is returned. To be faster, Teradata stores it differently.

Teradata organizes data rows in ascending sequence on disk based on a row ID value, not the data value. This is the same value that is calculated to determine which AMP should be responsible for storing and retrieving each data row.

When the ORDER BY is not used, the data will appear vaguely in row hash sequence and is not predictable. Therefore, it is recommended to use the ORDER BY in a SELECT or the data will come back randomly. Remember, everything in Teradata is done in parallel, this includes the sorting process.

The next SELECT retrieves all columns and sorts by the Grade point average:

SELECT *
FROM Student_Table
WHERE Grade_Pt > 3
ORDER BY Grade_Pt ;
4 Rows returnedod-by-gd-point

Notice that the default sequence for the ORDER BY is ascending (ASC), lowest value to highest. This can be over-ridden using DESC to indicate a descending sequence as shown using the following SELECT:

SELECT *
FROM Student_Table
WHERE Grade_Pt > 3
ORDER BY Grade_Pt DESC;
4 Rows returneddescending sequence as shown using the following SELECT

As an alternative to using the column name in an ORDER BY, a number can be used. The number reflects the column's position in the SELECT list. The above SELECT could also be written this way to obtain the same result:

SELECT *
FROM Student_Table
WHERE Grade_Pt > 3

ORDER BY 5 DESC ;

In this case, the grade point column is the fifth column in the table definition because of its location in the table and the SELECT uses * for all columns. This adds flexibility to the writing of the SELECT. However, always watch out for the ability words, like flexibility because it adds another ability word: responsibility. When using the column number, if the column that is used for the sort is moved to another location in the select list, a different column is now used for the sort. Therefore, it is important to be responsible to change the list and the number in the ORDER BY.

Many times it is necessary that the value in one column needs to be sorted within the sequence of a second column. This technique is said to have a major sort column or key and one or more minor sort keys.

The first column listed in the ORDER BY is the major sort key. Likewise, the last column listed is the most minor sort key within the sequence. The minor keys are referred to as being sorted within the major sort key. Additionally, some columns can ascend while others descend. This SELECT sorts two different columns: the last name (minor sort) ascending(ASC), within the class code (major sort) descending (DESC):

SELECT Last_Name
,Class_Code
,Grade_Pt
FROM Student_Table
ORDER BY 2 DESC, 1 ;
10 Rows returnedORDER BY 5 DESC

Notice, in the above statement, the use of relative column numbers instead of column names in the ORDER BY for the sort. The numbers 2 and 1 were used instead of Class_Code and Last_Name. When you select columns and then use numbers in the sort, the numbers relate to the order of the columns after the keyword SELECT. When you SELECT * (all columns in the table) then the sort numbers reflect the order of columns within the table.

An additional capability of Teradata is that a column can be used in the ORDER BY that is not selected. This is possible because the database uses a tag sort for speed and flexibility. In other words, it builds a tag area that consists of all the columns specified in the ORDER BY as well as the columns that are being selected.

layout of row  in a spool

Although it can sort on a column that is not selected, the sequence of the output may appear to be completely random. This is because the sorted value is not seen in the display.

Additionally, within a Teradata session the user can request a Collation Sequence and a Code Set for the system to use. By requesting a Collation Sequence of EBCDIC, the sort puts the data into the proper sequence for the IBM mainframe system. Therefore, is the automatic default code set when connecting from the mainframe.

Likewise, if a user were extracting to a UNIX computer, the normal code set is ACSII. However, if the file is transferred from UNIX to a mainframe and converted there, it is in the wrong sequence. When it is known ahead of time that the file will be used on a mainframe but extracted to a different computer, the Collation Sequence can be set to EBCDIC. Therefore, when the file code set is converted, the file is in the correct sequence for the mainframe without doing another sort.

Like the Collation Sequence, the Code Set can also be set. So, a file can be in EBCDIC sequence and the data in ASCII or sorted in ASCII sequence with the data in EBCDIC. The final use of the file needs to be considered when making this choice.


All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd DMCA.com Protection Status

Teradata Topics