DELETE Command Teradata

The DELETE statement has one function and that is to remove rows from a table. A status is the only returned value from the database; no rows are returned to the user. One of the fastest things that Teradata does is to remove ALL rows from a table.

The reason for its speed is that it simply moves all of the sectors allocated to the table onto the free sector list in the AMP's Cylinder Index. It is the fast path and there is no OOPS command, unless the explicit transaction has not yet completed. In that case, a ROLLBACK statement can be issued to undo the delete operation before a COMMIT. Otherwise, the rows are gone and it will take either a backup tape or a BEFORE image in the Permanent Journal to perform a manual rollback. Be Very CAREFUL with DELETE.

The basic syntax for the DELETE statement follows:

DEL[ETE] [FROM ] <table-name> [ AS <alias-name> ] [ WHERE condition ] [ ALL ] ;

Note Using DEL instead of DELETE is not ANSI compliant. Also, if the optional keyword ALL is used, it must be the last word in the statement.

The syntax for a DELETE statement to remove all rows is very easy to write:

DELETE FROM <table-name> [ ALL ] ;

Since the FROM and the ALL are optional, and the DELETE can be abbreviated, the next command still removes all rows from a table and executes exactly the same as the above statement:

DEL <table-name> ;

In the earlier releases of Teradata, the ALL was required to delete all rows. Now, ANSI rules say that ALL is the default for all rows. The ALL is optional and with or without it, all rows are deleted. Make sure that the intent really is to delete all rows! Teradata can delete one billion rows in a heartbeat.

Normally, removing all rows from a table is not the intent. Therefore, it is a common practice for a WHERE clause to limit the scope of the DELETE operation to specific rows. Usually, it is the oldest data that is removed.

As seen previously, the following command deletes all of the rows from My_table.

DELETE FROM My_table;

Whereas, the next DELETE command only removes the rows that contained a date value less than 1001231 (December 31, 2000) in Column6 (DATE, data type) and leaves all rows newer than or equal to the date:

DELETE FROM My_table WHERE Column6< 1001231 ;

Many times in a data warehouse the previous format of the DELETE statement can accomplish most of the processing to remove old rows. It is also commonplace to use the above statement in MultiLoad.

Sometimes it is desirable to delete rows from one table based on their existence in or by matching a value stored in another table. For example, you may be asked to give a raise to all people in the Awards Table. To access these rows from another table for comparison, a subquery or a join operation can be used, as seen in either of these two formats:

The subquery syntax for the DELETE statement follows:

The join syntax for DELETE statement follows:

Unlike a join performed in a SELECT, it is not necessary to use a FROM clause. If an alias is established on the table and then the statement references the actual table name, the resulting join is a Cartesian product and probably not what was intended.

To remove rows from My_table using another table called Control_del_tbl the next DELETE uses a subquery operation to accomplish the operation:

DELETE FROM My_table WHERE Column2 IN ( SELECT Column2 FROM Control_del_tbl WHERE Column3> 5000 AND Column4 IS NULL ) ;

To remove the same rows from My_table using a join with the table called Control_del_tbl, the following is another technique to accomplish the same DELETE operation as the subquery above:

The previous statement could also be written using the format below. However, an alias cannot be used with this format:

Fast Path DELETE

The Fast Path DELETE always occur when the WHERE clause is omitted.

However, most of the time, it is not desirable to delete all of the rows. Instead, it is more practical to remove older rows to make room for newer rows or periodically purge data rows beyond the scope of business requirements.

For instance, the table is supposed to contain twelve months worth of data and it is now month number thirteen. It is now time to get rid of rows that are older than twelve months.

As soon as a WHERE clause is used in a DELETE, it must take the slow path to delete the rows. This simply means that it must log or journal a copy of each deleted row. This is to allow for the potential that the command might fail. If that should happen, Teradata can automatically put the deleted rows back into the table using a ROLLBACK. As slow as this additional processing makes the command, it is necessary to insure data integrity.

To use the Fast Path, a technique is needed that eliminates the journal logging. The trick is again to use a Fast Path INSERT / SELECT. Which means, we insert the rows that need to be kept into an empty table.

All three of the following transactions remove the same rows from My_table, as seen in a previous DELETE. (repeated here):

Normal Path Processing for the DELETE (uses the Transient Journal):

DELETE FROM My_table WHERE Column6< 1001231 ;

There are three different methods for using Fast Path Processing in BTEQ for a DELETE. The first method uses an INSERT/SELECT. It will be fast, but it does require privileges for using the appropriate DDL. It also requires that additional PERM space be available for temporarily holding both the rows to be kept and all of the original rows at the same time.

This next method also uses an INSERT/SELECT and will be fast. It does not require privileges for using any DDL. It probably will not be faster than the first method, since the rows must all be put back into the original table. However, the table is empty and the Fast Path will be used:

Both of these require additional PERM space for temporarily holding the rows to be kept andall of the original rows at the same time. Additionally, it is essential that all statements complete successfully, or none of them complete. This is the definition of a transaction only when using BTEQ. Don't forget that the reason this is one transaction is because the semicolon is on the same line as the next DML statement. So, don't place the semi-colon at the end of the line because this ends the transaction prematurely.

This last INSERT/SELECT covered here uses a Global Temporary Table, which was introduced in Teradata with V2R3 and covered in the Temporary Tables chapter. It is also fast and does not require privileges for using any DDL. However, there is some Data Dictionary involvement to obtain the definition for the Global Temporary Table, but it does not need DDL and its space comes from TEMP space, not from PERM.

The next INSERT/SELECT uses a Global temporary table to prepare for the single transaction to copy My_table in BTEQ:

It requires that TEMPORARY space be available for temporarily holding the rows to be kept and all of the original rows at the same time. A Volatile Temporary table could also be used. Its space comes from spool. However, it requires a CREATE statement to build the table, unlike Global Temporary tables.

If you are not using BTEQ, these statements can be used in a macro. This works because macros always execute as a transaction.

There are many operational considerations to take into account when deciding whether or not to use a Fast Path operation and which one to use. Always consider recovery and data integrity when performing any type of data maintenance. Also consider how large the table is that is being manipulated. The larger the table the more advantageous it might be to use the Fast Path. But remember, to test it on non-production data. Please be careful of the risks when using production data.


All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd DMCA.com Protection Status

Teradata Topics