PART I - USING DB2 IN AN APPLICATION PROGRAM - IBM Mainframe

DB2 application development is the construction of application programs with SQL statements embedded in it. The programming languages or host languages as they are known, currently supported are APL2, Assembler H. IBM BASIC, OS/VS COBOL, VS COBOL II, C/370, FORTRAN, LISP and, PL/I. DB2 application programs can be run in the execution environments like MVS batch using Call Attachment Facility (CAF), TSO batch, DL/I batch, CICS, IMS/DC, IMS/BMP, TSO Interactive, etc. This book uses COBOL as the principal host language, because it is most widely used. But of course the ideas are fairly general and translate into other host languages with only comparatively minor differences.

Embedded SQL

Embedded SQL is a method of combining the computing power of a programming language and the database manipulation capabilities of SQL. Embedded SQL statements are SQL statements written in line with the program source code of the host language. The embedded SQL statements are parsed by an embedded SQL preprocessor and replaced by host-language calls to a code library. The output from the preprocessor is then compiled by the host compiler. This allows programmers to embed SQL statements in programs written in any number of languages such as: C/C++, COBOL and Fortran.

The ANSI SQL standards committee defined the embedded SQL standard in two steps: a formalism called Module Language was defined, then the embedded SQL standard was derived from Module Language. The SQL standard defines embedding of SQL as embedded SQL and the language in which SQL queries are embedded is referred to as the host language. A popular host language is C. The mixed C and embedded SQL is called Pro*C in Oracle and Sybase database management systems. In the PostgreSQL database management system this precompiler is called ECPG. Other embedded SQL precompilers are Pro*Ada, Pro*COBOL, Pro*FORTRAN, Pro*Pascal, and Pro*PL/I. This is presented in the following table:

SQL Statements

SQL Statements

Embedding SQL statements in the application program make the programming easy and the programmer is freed from the tedious jobs of opening, closing of files, coding the file descriptions etc.

The main differences between the conventional programming using flat file programming and DB2 programming is summarized in the following table:

Differences between Conventional Flat File and DB2 Programming

Differences between Conventional Flat File and DB2 Programming

All embedded SQL statements must be enclosed between delimiters. The delimiters for COBOL are 'EXEC SQL' and 'END-EXEC. These delimiters should be coded for each and every SQL statement and must start in column 12.

SQL statement and must start in column 12

The following are some practices/guidelines for embedded SQLs, which will make you DB2 application program, perform better and will improve the maintainability.

  • Comment each SQL statement. Make liberal use of comments to document the nature and purpose of each SQL statement used in the program.
  • Include SQLCA. A structure called SQLCA or SQL Communication Area should be included in any DB2 application program. This is accomplished by coding the following statements in your WORKING-STORAGE section:

WORKING-STORAGE section

The SQLCA contains fields, which are used for communicating information describing the success or failure of the execution of an embedded SQL. The most important among them is the SQLCODE, which contains the return code passed by DB2 to the application program. The return code provides information about the execution of the last SQL statement. A value of zero indicates successful execution, a positive value indicates successful execution but with an exception and a negative value indicates failure during execution. By knowing the value of the SQL code the problem of failure can be pinpointed and this is a very useful tool in debugging.

  • Check the SQLCODE immediately after each SQL statement. SQLCODE as explained above contains information about the success or failure of the last SQL statement executed. Code a mechanism to check the SQLCODE and take the necessary action immediately after every SQL statement.
  • Standardize your Installation's error routine. If possible for all applications in your organization, try using a standard error-handling paragraph. The program, when encountered with an error condition, that is a non-zero value for the SQLCODE should pass the value and control to the error-handling paragraph. The error-handling paragraph can be as complex and precise as you want. Depending on the SQLCODE different processing can occur, for example you may not want to abend the program for every non-zero SQLCODE, so you can include the logic to tolerate those SQLCODEs. The error-handling paragraph can be coded as a copybook and can be included in all the DB2 application programs. This will reduce the coding effort and will improve maintainability of the program.
  • Avoid using WHENEVER. SQL has an error trapping statement called WHENEVER that you can embed in your application program. When the WHENEVER statement is processed, it applies to all subsequent SQL statements issued by the application program and will direct the program to branch to the error-handling paragraph or to continue processing based on the SQLCODE returned for the statement. For example the following statement will direct the program to continue processing when a SQLCODE of +100 is encountered:

continue processing when a SQLCODE of  100 is encountered

But it is always better to avoid the use of WHENEVER statement. It is always safe to code specific SQLCODE checks after each SQL statement and process accordingly.

  • Name DB2 programs, plans, packages and variables cautiously. Do not use the characters DB2, SQL, DSN, DSQ, the SQL reserved words and the SAA database reserved words. These words should be avoided because DB2 is too generic and could be confused with a DB2 system component. Since SQLCA fields are prefixed with SQL, using these letters with another variable name can cause confusion with SQL^A fields. IBM uses three character prefixes DSN to name DB2 system programs and DSQ to the name of QMF system programs. If SQL reserved words are used for host variables and are not preceded by a colon, an error is returned. However these words should not be used in the program, plan or host variables to avoid confusion. Although the use of SAA database reserved words will not result in an error, you should avoid their use to eliminate confusion. Moreover, these words are candidates for future status as SQL reserved words when functionality is added to DB2.

Host Variables, Structures and Null indicators

A host variableis an area of storage allocated by the host language and referenced in the SQL statement. The host variables are defined and named using the host language syntax. In COBOL they must be defined in the DATA DIVISION of the program in the WORKING-STORAGE section or the LINKAGE section. Moreover, when you use the INCLUDE verb to specify the host variable they should be delimited by the EXEC SQL and END-EXEC. When using the host variables in the SQL statements prefix them with a colon(:).

Example

Host Variables, Structures and Null indicators

When the same variable is referenced outside the SQL statement in the COBOL program, do not use the colon, because it will result in compilation errors.

Host variables are means of moving data from the program to DB2 and from DB2 to the program. Data can be read from the file, placed into the host variable and used to modify a DB2 table through SQL. For data retrieval host variables are used to house the retrieved data. Host variables are also used to change the value of the predicates in the WHERE clauses.

DCLGEN, a DB2 utility, produces host variables named as the same as the columns. Use of DCLGEN reduces the coding effort. But when using tables with the same column names, edit the DCLGEN output tp make the host variable names unique. This confusion can be avoided if you prefix the column names with an abbreviated table identifier, while table definition.

In addition to host variables SQL statements can use host structures to enable SQL statements to specify a single structure for storing all retrieved columns. A host structure is thus a COBOL group level data area composed of host variables for all columns to be returned by a given select statement.Example A COBOL variable is defined in the working storage section as:

COBOL variable

This when used in the SQL statements should be prefixed with a colon as follows:

SQL statements should be prefixed

SQL statements should be prefixed

This populates the host variables for all columns defined under the DCLEMPLOYEE group-level data area.

Before you select or insert a column that can be set to null, it must have an indicator variable, called null indicator, for it. You can also use null indicator variables with UPDATE statement to set columns to null. A third and very important use of null indicators (which we have discussed earlier) is when any column is retrieved using the column functions AVG, SUM, MAX and MIN the only exception being COUNT which does not returns the value 0 if no rows meeting the criteria are found.

The null indicator variable is separate form both the columns to which it pertains and the host variable for that column. To determine the value of any nullable column, a host variable and an indicator column are required. The host variable contains the value of the column when it is not null. The indicator variable contains one of the following values to indicate a column's null status:

  • A negative number indicates that the column has been set to null.
  • The value of 2 indicates that the column has been set to null as a result of a data conversion error.
  • A positive or zero value indicates that the column is not null.
  • If the column is defined as CHARACTER data type and is truncated on retrieval because the host variable is not large enough, the indicator variable contains the original length of the truncated column.

You can use null indicator variables with host variables in the following situations:

    • SET clause of the UPDATE statement.
    • VALUES clause of the INSERT statement.
    • INTO clause of the SELECT or FETCH statement.

Use a null indicator variable when referencing a nullable column. Failure to do so will result in a -305 SQLCODE. If you fail to check the null status of the column being retrieved, your program might continue to execute, but with unpredictable results.

Cursors

There are two types of embedded SQL SELECT statements, singleton SELECTS and cursor SELECTs. SQL statements operate on a set of data and return a set of data. Host language programs, on the other hand, operate on a row at a time. A singleton SELECJ simply a SQL SELECT statement that returns a single row. The singleton SELECT differs from the ordinary SELECT statement in that it contains an INTO clause. The INTO clause is where you code your host variables that accept the data returned by DB2. But if such a SELECT statement returns more than on row, the values of the first row is placed in the host variable and DB2 issues an SQLCODE of -811. So in your application program, if the SELECT will return more than one row then use must use Cursors.

DB2 uses cursors to navigate through a set of rows returned by an embedded SQL SELECT statement. A cursor can be compared to a pointer. The programmer declares a cursor and defines the SQL statement for the cursor. After that you can use the cursor like a sequential file. The cursor is opened, rows are fetched from the cursor, one row at a time and at the end of processing the cursor is closed. The four operations that must be performed for the successful working of the cursor are:

  • DECLARE - This statement defines the cursor, gives a name to it and assigns an SQL statement to it. The DECLARE statement does not execute the SQL statement but merely defines it.
  • OPEN - This readies the cursor for row retrieval. OPEN is an executable statement. It reads the SQL search fields, executes the SQL statement and sometimes builds the result table.
  • FETCH - This statement returns data from the result table one row at a time to the host variables. If the result table is not built at the OPEN time, it is built during FETCH.
  • CLOSE - Releases all resources used by the cursor.

Example

Cursors

When coding embedded SQLs using cursors the following guidelines will improve the performance and maintainability of the program:

  • Declare as-many cursors as needed. There is no limit on the number of cursors that can be used in a program.
  • Avoid using certain cursors for modification. A cursor cannot be used for updates or deletes if the DECLARE CURSOR statement includes a UNION, DISTINCT, GROUP BY, ORDER BY or HAVING clauses or if the DECLARE CURSOR statement includes Joins, Subqueries, Correlated subqueries, tables in read-only mode, tables in utility mode, or read-only views.
  • Include only the columns that are being updated.
  • Always use FOR UPDATE OF when updating with a cursor. Although it is not necessary, it is always a good practice to code FOR UPDATE OF clause, in the DECLARE CURSOR statement used for deleting rows. This clause will lock the row before it is being deleted, thus preventing some other user form accessing it, thus ensuring data integrity.
  • Use WHERE CURRENT OF to delete single rows using a cursor. Use where CURRENT of clause on UPDATE and DELETE statements that are meant to modify only a single row. Failure in doing this will result in the modification or deletion of all the rows that are being processed.
  • Avoid the use of FOR UPDATE OF clause on non updateable cursors. Do not code FOR UPDATE of clause on cursors that access read-only data.
  • Open cursors before fetching.
  • Initialize host variables before-opening the cursor.
  • Explicitly close the cursors. Even though DB2 closes all open cursors at the end of the program, explicitly close the cursors using the CLOSE statement; otherwise you will be holding resources, which will affect the performance.
  • Use the WITH HOLD option to retain cursor position. Whenever a COMMIT is issued all open cursors are automatically closed unless the WITH HOLD option is coded for the cursor. However this option is available with only DB2 V2.3 and later releases. The WITH HOLD option is coded in the DECLARE CURSOR statement as follows:

DECLARE CURSOR statement

WITH HOLD prevents subsequent COMMITs from destroying the intermediate results table from the SELECT statement, thereby saving positioning within the cursor. The WITH HOLD option is not available for cursors coded in CICS programs.

Batch Programming Guidelines

  • Favour Clustered Access. Whenever sequential access to t able is needed, process the table row in cluster sequence.
  • Use LOCK TABLE with caution. Issuing a LOCK TABLE command locks all table; in the tablespace containing the table specified. It holds all locks until COMMIT or DEALLOCATION is done. This reduces concurrent access to all tables in the tablespace. But if judiciously used, LOCK TABLE can significantly decrease an application program's processing time. If a significant Number of page locks are taken during program execution, the addition of LOCK TABLE eliminates-page locks and replaces them with table or tablespace locks thus reducing overheads and enhancing performance.
  • Periodically COMMIT in batch update programs. Issue COMMIT statements in all medium to large batch update programs. A COMMIT will save all the modification that you have done so far, since the last COMMIT, thus ensuring data integrity and recoverability. Thus issuing COMMITS periodically will improve performance and will reduce the amount of rework. For example if a program has completed nearly 90% of the work and due to some reason it abends, then if you have not committed the work that you have done the program will have to do the entire job once again.
  • Make the programs re-startable. In time-critical applications, DB2 batch programs that modify table data should be re-startable if there is a system error. To make a batch program re-startable, you first create a DB2 table to control the checkpoint and restart processing for all DB2 update programs. A checkpoint is data written by an application program during its execution and that identifies the status and extent of processing. This is usually accomplished by storing the primary key of the table row being processed. The program must update the primary key as it processes before each COMMIT point. During restart processing, the primary key information is read, enabling the program to continue from where it left off.

Line Programming Guidelines

  • Limit the number of pages retrieved. To get better performance limit the number of pages retrieved or modified to 10. As the number of pages increases the response time reduces try to keep it minimum.
  • Limit on-line joins. When joining rows, limit transactions to no more than 25 rows total. This total applies to all cursors and SELECTS statements issued by the transaction, not issued per statement. To reduce data sorting avoid, if possible, GROUP BY, ORDER BY, DISTINCT and UNION clauses unless indexes are available.
  • Issue COMMITS before displaying data. Always issue a CICS SYNCPOINT, TSO COMMIT or IMS CHKP before sending information to a terminal. .
  • Modularize transactions. Design separate transactions for selecting, updating, inserting and deleting rows. This minimizes page locking, maximizes modular program design and improves performance.
  • Minimize cascading DELETES. Cascading DELETES reduces performance and response times.
  • Use TIMESTAMP for sequencing. Fop columns, consider using TIMESTAMP data types instead of sequentially assigned numbers. Timestamps can be generated by using the special register CURRENT TIMESTAMP and has the same basic functionality as a sequentially assigned number, without the requirement of designing a table to assign sequential numbers. The timestamps will be in serial order unless updates occur across multiple time zones. Although duplicate timestamps can be generated, if two transactions are entered at the same microsecond which is a very rare possibility. You can eliminate this possibility by coding a unique index on the timestamp column and checking for a -803 (duplicate index) SQLCODE. The only drawback is the size of the timestamp data type and the difficulty in remembering the key value. In these cases timestamp cannot be used.
  • Do not INSERT into empty tables. Doing so in an on-line environment causes multiple l/Os when updating indexes and causes index page splits. So if rows are to be inserted into an empty table either format the empty table by pre-filling it with index keys that can be updated on-line instead of inserted into (this reduces i/o and eliminates index page splitting, because index is not updated), or partition the table so that inserts are grouped into separate partitions (this does not reduce i/o but limit index page splits).
  • Increase concurrent on-line access. Limit deadlocks and time-outs by coding applications to increase their concurrency. One option is to code all transactions to access tables in the same order, that is do not try to access an employee table in the alphabetic order of department name in one transaction, by the department number in another, etc. Try to have a single sequential access strategy. Another option is to update and delete using WHERE CURRENT OF cursor instead of independent UPDATES and DELETES.


All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd DMCA.com Protection Status

IBM Mainframe Topics