Sets of Data - Firebird

A DML query defines a logical collection of data items arranged in order from left to right in one or more columns, known as a set. A query may confine a set’s specification to being a single row or it may consist of multiple rows. In a multi-row set, the column arrangement of one row is identical to all others in the set.

People often casually refer to sets as “queries.” They are not synonymous.

A Table Is a Set

A table is a set whose complete specification can be accessed in the system tables by the database server when the table is referred to by a query from a client. The Firebird server performs its own internal queries on the system tables to extract the metadata it needs in order to execute client queries.

Output Sets

A common use of a query statement beginning with the keyword SELECT is to output a set to the client application, for the purpose of displaying data to users. The terms datasetAnd recordset are synonymous with output set. The output set may be in no particular row order, or it can be delivered as a sorted set, as specified in an ORDER BY clause.

For example, the following query will output a set of three columns from TABLEA, containing every row that matches the conditions specified in the WHERE clause. The rows will be sorted so that the row with the lowest value in COL1 will appear first:

SELECT COL1, COL2, COL3 FROM TABLEA WHERE COL3 = 'Mozart' ORDER BY COL1;

If no WHERE clause is provided, the set will contain every row from TABLEA, not just those that have the value 'Mozart' in COL3.

If all columns of TABLEA are wanted, then the symbol * can optionally be used instead of itemizing the columns of the set, for example:

SELECT * FROM TABLEA;

defines an output set consisting of all columns and all rows from TABLEA.

Cardinality and Degree

A term you will sometimes encounter with respect to sets—including tables—is cardinality. Cardinality describes the number of rows in a set, which might be a table or an output set. More loosely, you may encounter cardinality of a row or cardinality of a key value, referring to the position of a row in an ordered output set.

The term used to refer to the number of columns in a set is degree. By extension, you may encounter a phrase like the degree of a column, meaning its position in the left-to-right order of the columns in the set.

Input Sets

Just as the SELECT .. . FROM part of a retrieval query specifies a set for an output or cursor operation, so the other DML statements (INSERT, UPDATE, and DELETE) specify input sets identifying the data upon which the operation is to be performed.

An INSERT query’s input set is specified by identifying a single table and a left-to-right ordered list of identifiers for columns that are to receive input values in the subsequent VALUES() clause. The VALUES() clause must supply a list of values that corresponds exactly to the input set’s order and the data types of its members.

The following example defines an input set consisting of three columns from TABLEB. The constants in the VALUES() clause will be checked to ensure that there are exactly three values and that they are of the correct data types:

INSERT INTO TABLEB(COLA, COLB, COLC) VALUES(99, 'Christmas 2004', '2004-12-25');

INSERT has an alternative syntax where the VALUES() clause is replaced by selecting a set from one or more other tables, real or virtual. INSERT statements are discussed in more detail in the section “The INSERT Statement.”

An UPDATE query defines its input set by identifying a single table and listing one or more input columns, together with their new values, in a SET clause. It identifies the target rows using a WHERE clause:

UPDATE TABLEB SET COLB ='Labor Thanksgiving Day (Japan)', COLC = '2002-23-11' WHERE ...;

UPDATE statements are discussed in more detail in the section “The UPDATE Statement.”

A DELETE query cannot specify columns in its input set—it is implicitly always * (all columns). It identifies the target rows using a WHERE clause. For example, the following query will delete every row where COLC (a DATE type) is earlier than December 13, 1999:

DELETE FROM TABLEB WHERE COLC < '1999-12-13';

DELETE statements are discussed in more detail in the section “The DELETE Statement.”

Output Sets As Input Sets

Grouped or Aggregated Queries

SQL has an important feature that uses an input set formed from an output set generated within the same SELECT query: the GROUP BY clause. Instead of the output set from the column list being passed to the client, it is passed instead to a further stage of processing, where the GROUP BY clause causes the data to be partitioned into one or more nested groups. Each partition typically extracts summaries by way of expressions that aggregate numeric values at one or more levels.

Streams and Rivers for JOINs

When columns in one table are linked to columns in another table to form joined sets, the input column sets from each table are known as streams. The output of two joined streams is referred to as a river. When joins involve more than two tables, streams and rivers become the input for further joining of streams and rivers until the final river becomes the output of the query.

A group of Firebird engine routines known as the optimizer subjects a join query specification and the indexes available to a cost-evaluation process referred to as optimization. The optimizer generates a plan, which is a “best-cost” map of the course the engine will take when generating that final “output river” on subsequent executions of the statement.

Refer to the “Optimization Topic” section at the end of that chapter for a discussion of the optimizer and query plans.

Cursor Sets

A SELECT statement can define a set that is not output to the client at all, but remains on the server to be operated on by a server-side cursor. The cursor itself is a pointer that the application instructs to fetch rows one by one, on demand, by progressing forward through a prepared SELECT statement.

A named cursor declaration, which must be compiled in advance in ESQL or specified through an API structure in DSQL, includes a client-side statement variable that points to the statement. The client application is responsible for ensuring that the prepared statement is available for assignment at the appropriate point in execution of the program.

PSQL has a commonly used language extension that takes the following form:

FOR SELECT <any valid select specification> INTO <list of pre-declared local variables> DO BEGIN < optionally operate on variables >; SUSPEND; END

This is an unnamed cursor

syntax.

The objective of a cursor operation is to use data from the cursor set and “do something” consistently for each cursor set row processed. It can be an efficient way to batch-process sets, where the process performs one or more tasks on the current cursor row, inside a single iteration of a loop through the cursor set. For example, at each iteration, data items could be taken from the current cursor set row and used for inserting new rows into a specified input set.

Another statement can optionally link to an opened cursor, using WHERE CURRENT OF <cursor-name > instead of a search condition, to perform positioned updates or deletes on the rows “bookmarked” by the cursor set.

Implementing Cursors

Firebird provides several methods for implementing cursors:

  • Embedded SQL (ESQL) provides the DECLARE CURSOR statement for implementing the pre-compiled named cursor declaration.
  • To use a cursor, a dynamic application must pre-declare it using the function isc_dsql_set_cursor_name. While some data access components realize this interface, many do not.
  • The PSQL language provides syntax to operate on both named and unnamed cursors locally inside a stored procedure or trigger.

Nested Sets

Nested sets are formed using a special kind of SQL expression syntax known as a subquery. This kind of expression takes the form of a bracketed SELECT statement that can be embedded into the column specification list of the main query to output a single, runtime output value per row or, in a WHERE clause, as part of a search condition. The allowed SELECT syntaxes vary according to the context in which the subquery is used.

In the following simple example, a query on TABLEA uses a nested subquery to include a runtime output column named DESCRIPTION derived from COLX in TABLEB:

SELECT COL1, COL2, COL3, (SELECT COLA FROM TABLEB WHERE COLX='Espagnol') AS DESCRIPTION FROM TABLEA WHERE ... ;

Notice the use of the keyword AS to mark the identifier (DESCRIPTION) of the derived column. It is optional, but recommended for code clarity.

It is very common for an embedded query to be correlated to the main query. A correlated subquery is one whose WHERE clause is linked to one or more values in the outer query’s output list or other values in the underlying tables of the outer query. A correlated subquery is similar in effect to an inner join and can sometimes be used to good effect in place of a join.


All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd DMCA.com Protection Status

Firebird Topics