CICS RECOVERY CONCEPTS - IBM Mainframe

When developing an application system, it is necessary to examine the processing performed by each program to determine what type of recovery processing, if any, is needed for that program. The recovery facilities provided by CICS are designed to restore specified resources to their state before the failure occurred. System designers must consider whether recovery should be provided for each resource used by an application program, such as a data file or a temporary storage queue (TSQ). Automatic recovery processing is performed only for resources that the system administrator has defined as recoverable. The resources that can be designated as recoverable are:

  • Data files and databases used by CICS application programs
  • Intrapartition transient data queues (TDQs)
  • Auxiliary TSQs
  • Terminal messages associated with VTAM terminals
  • BMS paging and routing data stored in TSQs.

When a task fails, CICS performs recovery processing as part of its task termination duties. If the entire system crashes, recovery processing is performed when CICS is restarted. In either case, the goal of recovery processing is to undo the processing performed by one or more tasks, restoring recoverable resources to a pre-failure state from which processing can continue.

Two conditions must be met for CICS to perform recovery processing for a resource used by a failed task:

  1. The resource must be defined as recoverable in the system table defining the resource.
  2. The transaction that was used to initiate the task must be defined as recoverable in the Program Control Table (PCT).

The PCT entry for a recoverable transaction indicates that dynamic transaction backout (DTB) should be performed if a task initiated by that transaction abends. In addition, tasks accessing DL/I databases can be designated for automatic restart. It is the combination of a recoverable transaction modifying a recoverable resource that causes CICS to save the data necessary for recovery from a task or system failure.

Recovery is performed for a sequence of operations known as a Logical Unit of Work (LUW). An LUW represents processing that is logically tied together. That is, all operations in the group must finish before the whole set of operations can be considered complete from a data integrity point of view. When processing for an LUW fails to complete, any modifications made during that LUW are backed out. In other words, all recoverable resources are restored to the state they were in before the LUW started.
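The interplay between the two conditions and LUW backout can be illustrated with a small model. The following is only a sketch in Python, not CICS code: the names Resource, Task, write, syncpoint and backout are hypothetical, and the "log" is a plain in-memory list of before-images rather than anything resembling the real CICS logging mechanism.

```python
class Resource:
    """A named store (for example a file or a TSQ), flagged recoverable or not."""

    def __init__(self, name, recoverable):
        self.name = name
        self.recoverable = recoverable
        self.data = {}


class Task:
    """Models one task and the before-images saved for its current LUW."""

    def __init__(self, transaction_recoverable):
        self.transaction_recoverable = transaction_recoverable
        self.log = []  # entries: (resource, key, key_existed, old_value)

    def write(self, resource, key, value):
        # A before-image is saved only when BOTH conditions hold:
        # recoverable transaction AND recoverable resource.
        if self.transaction_recoverable and resource.recoverable:
            existed = key in resource.data
            self.log.append((resource, key, existed, resource.data.get(key)))
        resource.data[key] = value

    def syncpoint(self):
        # End of the LUW: changes stand, saved before-images are discarded.
        self.log.clear()

    def backout(self):
        # Undo the current LUW, most recent change first.
        while self.log:
            resource, key, existed, old_value = self.log.pop()
            if existed:
                resource.data[key] = old_value
            else:
                resource.data.pop(key, None)


# Hypothetical resources: one recoverable, one not.
orders = Resource("ORDERS", recoverable=True)
scratch = Resource("SCRATCH", recoverable=False)

task = Task(transaction_recoverable=True)
task.write(orders, "A1", "first item")
task.syncpoint()                      # first LUW is complete
task.write(orders, "A2", "second item")
task.write(scratch, "note", "temp")
task.backout()                        # second LUW fails and is backed out
```

In this model, the backout removes the "A2" record from the recoverable resource, leaves "A1" in place because it belongs to an LUW that already completed, and leaves the non-recoverable resource unchanged because no before-image was ever saved for it.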

Sync points identify the beginning and end of an LUW. CICS automatically generates a sync point at task initiation and again at task completion. An application program can also request sync points explicitly, using the EXEC CICS SYNCPOINT command.

As an example of a task that may need recovery processing, consider a transaction that collects order information, creates item records on an open order file, and maintains inventory information. The process of creating an item record and updating the inventory file could be considered a logical unit of work. Both file operations must be performed for an item's processing to be complete.

If a task performing this order processing terminates abnormally, it is possible that the inventory and order files are no longer in sync because only one of them was updated. For the information stored in the two files to be consistent, it is desirable to return the modified resources to their state prior to the beginning of the abending LUW. CICS recovery modules perform this process automatically when the resource and the transaction are defined as recoverable.
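The order-processing scenario can be walked through with a short simulation. This is a hedged sketch in Python, not a CICS program: process_order and AbendError are hypothetical names, the "files" are an ordinary list and dictionary, and the sync point is modeled by copying their contents at the start of each item's LUW.

```python
import copy


class AbendError(Exception):
    """Stands in for an abnormal task termination."""


def process_order(order_file, inventory, items):
    """Treat each item as one LUW: write the item record, then update inventory."""
    for item_id, quantity in items:
        # Sync point: remember the state at the start of this LUW.
        saved_orders, saved_inventory = copy.deepcopy((order_file, inventory))
        try:
            order_file.append((item_id, quantity))      # first file operation
            if inventory.get(item_id, 0) < quantity:
                raise AbendError(f"insufficient stock for {item_id}")
            inventory[item_id] -= quantity              # second file operation
        except AbendError:
            # Back out only the failing LUW; earlier LUWs stay in place.
            order_file[:] = saved_orders
            inventory.clear()
            inventory.update(saved_inventory)
            raise


order_file = []
inventory = {"bolt": 10, "nut": 1}
try:
    process_order(order_file, inventory, [("bolt", 4), ("nut", 5)])
except AbendError:
    pass  # the task has ended abnormally; inspect the files afterwards
```

The second item fails after its order record has been written but before inventory is updated, which is exactly the out-of-sync situation described above. After the backout, the order file holds only the "bolt" record and the inventory shows 6 bolts and 1 nut, so the two files agree again.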


All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd