Generation Data Groups (GDG) - IBM Mainframe

In z/OS, it is possible to catalog successive updates or generations of related data, which are called generation data groups (GDGs).

A generation data group (GDG) is a collection of historically related non-VSAM data sets that are arranged in chronological order. Each data set within a GDG is called a generation or generation data set (GDS).

Within a GDG, the generations can have like or unlike DCB attributes and data set organizations. If the attributes and organizations of all generations in a group are identical, the generations can be retrieved together as a single data set.

Advantages to grouping related data sets include:

  • All of the data sets in the group can be referred to by a common name.
  • The operating system is able to keep the generations in chronological order.
  • Outdated or obsolete generations can be automatically deleted by the operating system.

Generation data sets have sequentially ordered absolute and relative names that represent their age. The operating system's catalog management routines use the absolute generation name. Older data sets have smaller absolute numbers. The relative name is a signed integer that identifies a generation by its position: (0) is the latest generation, (-1) is the next to the latest, and so forth.

For example, the data set name LAB.PAYROLL(0) refers to the most recent data set of the group; LAB.PAYROLL(-1) refers to the second most recent data set; and so forth. The relative number can also be used to catalog a new generation (+1). A generation data group (GDG) base is allocated in a catalog before the generation data sets are cataloged. Each GDG is represented by a GDG base entry.
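
The mapping from relative to absolute names can be illustrated with a small Python model. This is only a sketch, not how catalog management is implemented: the function name resolve_relative and its arguments are hypothetical, and it assumes that absolute numbers increase by one per generation and that every generation uses the GaaaaVnn qualifier with version 00.

```python
def resolve_relative(base, cataloged, relative):
    """Return the absolute data set name for a relative reference.

    base      -- GDG base name, for example "LAB.PAYROLL"
    cataloged -- absolute generation numbers, oldest first (assumed model)
    relative  -- 0 for the latest, -1 for the one before it, +1 for a new one
    """
    if relative > 1:
        raise ValueError("this sketch only models (+1) for new generations")
    if relative == 1:
        # (+1) names a generation that is about to be cataloged.
        number = cataloged[-1] + 1 if cataloged else 1
    else:
        index = len(cataloged) - 1 + relative
        if index < 0:
            raise LookupError(f"no generation ({relative}) is cataloged")
        number = cataloged[index]
    return f"{base}.G{number:04d}V00"


generations = [1, 2, 3]
print(resolve_relative("LAB.PAYROLL", generations, 0))   # LAB.PAYROLL.G0003V00
print(resolve_relative("LAB.PAYROLL", generations, -1))  # LAB.PAYROLL.G0002V00
print(resolve_relative("LAB.PAYROLL", generations, 1))   # LAB.PAYROLL.G0004V00
```

In this model the relative number is simply an offset from the end of the chronologically ordered list, which is why older generations always end up with smaller absolute numbers.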

For new non-system-managed data sets, if you do not specify a volume and the data set is not opened, the system does not catalog the data set. New system-managed data sets are always cataloged when allocated, with the volume assigned from a storage group.

A GDG must be created before any data sets can be made a part of it. To create a GDG, the following must be specified to the operating system:

  • Name of the GDG
  • The number of generations that are to be retained
  • Whether or not the oldest generation is to be uncataloged once the limit for the number of generations that are to be retained is reached.
  • Whether or not all generations are to be uncataloged once the limit for the number of generations that are to be retained is reached.
  • Whether or not an entry for a data set that is deleted from a GDG is to be uncataloged and physically deleted from the volume that it resides on.
  • Whether or not an entry for a data set that is deleted from a GDG is to only be uncataloged (and not physically deleted) from the volume that it resides on.
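
The effect of the retention choices in the list above can be sketched with a simplified Python model. The function add_generation and its arguments are hypothetical names introduced only for illustration. The empty flag mirrors the EMPTY/NOEMPTY option described later in this article, and the model assumes that generation numbers start at 1 and increase by one.

```python
def add_generation(cataloged, limit, empty=False):
    """Catalog one more generation and apply the retention limit.

    Returns (new_catalog, rolled_off), where both are lists of absolute
    generation numbers ordered oldest first.
    """
    next_number = cataloged[-1] + 1 if cataloged else 1
    if len(cataloged) < limit:
        # Still room under the limit: nothing is uncataloged.
        return cataloged + [next_number], []
    if empty:
        # All existing generations are uncataloged; only the new one remains.
        return [next_number], list(cataloged)
    # Only the oldest generation is uncataloged.
    return cataloged[1:] + [next_number], cataloged[:1]


print(add_generation([1, 2, 3], limit=3))              # ([2, 3, 4], [1])
print(add_generation([1, 2, 3], limit=3, empty=True))  # ([4], [1, 2, 3])
```

Whether the rolled-off generations are also physically deleted from their volumes is a separate choice, which this sketch deliberately leaves out.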

When a GDG is created, it does not yet contain any data sets. Each data set that is added to it must be of the same type. A model containing the parameter information that applies to all data sets added to the GDG must be specified to the system. Once a model has been established, the system must be informed each time a data set is to be added to the GDG, and it must be able to identify the generation number of each data set within the GDG. The system must also be informed when:

  • A data set within the GDG is to be deleted.
  • A data set within the GDG is to be deleted even though its retention period has not expired.
  • Only the index of the GDG is to be deleted.
  • The entire GDG, that is, the index and all related data sets, is to be deleted.

Before a GDG is created, an index that defines its name along with other features must be created and cataloged. The IDCAMS utility is used to create this index. The DEFINE GDG statement conveys the information about the index to the IDCAMS utility, using the following sub-parameters:

  • NAME - This is coded on the DEFINE GDG statement and is used to specify the name of the GDG that is to be created. Its syntax is NAME(name), where name can be from 1 to 35 characters long.
  • LIMIT - This is coded on the DEFINE GDG statement and is used to specify the maximum number of generations that a GDG may contain. Its syntax is LIMIT(number), where number can range from 1 to 255. The limit can be changed after the GDG is established by using the IDCAMS ALTER command.
  • EMPTY/NOEMPTY - The EMPTY/NOEMPTY sub-parameters are mutually exclusive. EMPTY specifies that all existing generations of the GDG are to be uncataloged once the limit of possible generations within the GDG has been reached. NOEMPTY, the default, specifies that only the oldest generation of the GDG is to be uncataloged when the limit is reached.
  • SCRATCH/NOSCRATCH - These sub-parameters are also mutually exclusive. SCRATCH specifies that when the entry of a data set in a GDG is removed from the index, the data set is also physically deleted from the volume that it resides on. NOSCRATCH, the default, indicates that the data set should only be uncataloged, not physically deleted from the volume.

Example
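
As an illustration of how the sub-parameters fit together, the following Python sketch assembles the text of a DEFINE GDG statement and checks the NAME and LIMIT ranges given above. The helper define_gdg is hypothetical and is not part of IDCAMS or any IBM tool. The GDG name LAB.PAYROLL and the limit of 5 are example values only.

```python
def define_gdg(name, limit, empty=False, scratch=False):
    """Build the text of an IDCAMS DEFINE GDG statement (illustrative only)."""
    if not 1 <= len(name) <= 35:
        raise ValueError("NAME must be 1 to 35 characters long")
    if not 1 <= limit <= 255:
        raise ValueError("LIMIT must be between 1 and 255")
    keywords = [
        f"NAME({name})",
        f"LIMIT({limit})",
        "EMPTY" if empty else "NOEMPTY",
        "SCRATCH" if scratch else "NOSCRATCH",
    ]
    # A trailing hyphen continues an IDCAMS command onto the next line.
    separator = " -\n" + " " * 14
    return "  DEFINE GDG (" + separator.join(keywords) + ")"


print(define_gdg("LAB.PAYROLL", 5, scratch=True))
```

The generated text would be supplied as SYSIN input to a job step that executes IDCAMS (EXEC PGM=IDCAMS).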

No more than 255 data sets can exist within one GDG. All rules that apply to coding data sets apply equally to data sets within GDGs. The only difference is that a generation number must be coded in parentheses after the data set name in the DSN parameter, for example DSN=LAB.PAYROLL(+1). GDGs must be cataloged. GDGs must reside on tape or a direct access device. The DSN and UNIT parameters must be coded for all new generation data sets. The DISP parameter must specify CATLG for all new generation data sets, for example DISP=(NEW,CATLG).

Features of GDGs

GDGs are characterized by the following:

  • All data sets within the GDG share the same base name.
  • The generation number of a data set within a GDG is automatically assigned by the operating system when it is created. The system records the number in the following format:
    1. GaaaaVnn, where 'G' is for generation, 'aaaa' is the absolute sequence number of the generation, 'V' is for version, and 'nn' is the version number. The absolute sequence number can range from 0001 to 9999 and the version number from 00 to 99. The version number defaults to 00.
  • Data sets within a GDG can be referenced by their relative generation number or by the actual data set name assigned by the operating system.
  • Generation 0 always refers to the current generation, and generation -1 always refers to the generation just before the current one. Generation +1 always refers to the next generation after the current one.
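
The GaaaaVnn format can be made concrete with a short Python sketch that splits an absolute generation name into its parts. The function split_generation_name is a hypothetical helper written for this illustration, and the data set name used is an example only.

```python
import re

# Final qualifier of an absolute generation name: G, 4 digits, V, 2 digits.
GENERATION_QUALIFIER = re.compile(r"^G(\d{4})V(\d{2})$")


def split_generation_name(dsname):
    """Split 'BASE.GaaaaVnn' into (base, generation number, version number)."""
    base, _, qualifier = dsname.rpartition(".")
    match = GENERATION_QUALIFIER.match(qualifier)
    if not base or match is None:
        raise ValueError(f"{dsname!r} does not end in a GaaaaVnn qualifier")
    return base, int(match.group(1)), int(match.group(2))


print(split_generation_name("LAB.PAYROLL.G0003V00"))  # ('LAB.PAYROLL', 3, 0)
```

Because the qualifier is fixed-width and zero-padded, sorting absolute names alphabetically also sorts them chronologically, as long as the generation number has not wrapped around.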

Advantages of GDGs

The record keeping (which generation number should be assigned to new data sets, and which data set is removed when the limit is reached) is the responsibility of the operating system, not the application programmer. Furthermore, GDGs provide a convenient method of relating data sets to one another and of automatically discarding data sets that are outdated.


All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd