Overview of Storage Organization - IBM Mainframe

Management of data stored on auxiliary devices is one of the main functions of an operating system. Data management encompasses the storing, cataloging, organizing and retrieval of data. Data on these devices can be stored in several organizations, for example:

  • Physical sequential
  • Partitioned
  • Indexed sequential
  • Direct

Some of the access methods used to store and retrieve information in these organizations are the Basic sequential access method (BSAM), Queued sequential access method (QSAM), Basic partitioned access method (BPAM) and Basic direct access method (BDAM).

Virtual storage access method (VSAM)

Virtual storage access method (VSAM) is an IBM disk file storage access method, first used in the OS/VS1, OS/VS2 Release 1 (SVS) and Release 2 (MVS) operating systems, later used throughout the Multiple Virtual Storage (MVS) architecture and now in z/OS. Originally a record-oriented file system, VSAM comprises four data set organizations: Key Sequenced Data Set (KSDS), Relative Record Data Set (RRDS), Entry Sequenced Data Set (ESDS) and Linear Data Set (LDS). The KSDS, RRDS and ESDS organizations contain records, while the LDS organization (added later to VSAM) simply contains a sequence of pages with no intrinsic record structure, for use as a memory-mapped file.

In its official documentation, IBM uses the term data set as a synonym for file. It uses the term direct access storage device (DASD) rather than disk because it supported other devices similar to disk drives.

VSAM records can be of fixed or variable length. They are organized in fixed-size blocks called Control Intervals (CIs), and then into larger divisions called Control Areas (CAs). Control Interval sizes are measured in bytes — for example 4 kilobytes — while Control Area sizes are measured in disk tracks or cylinders. Control Intervals are the units of transfer between disk and computer so a read request will read one complete Control Interval. Control Areas are the units of allocation so, when a VSAM data set is defined, an integral number of Control Areas will be allocated.
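The arithmetic behind these two units can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are invented for this example, the byte offset is counted from the start of the data set, and the figure of 180 control intervals per control area is hypothetical, since the real value depends on the device and the Control Interval size.

```python
import math


def ci_for_offset(byte_offset: int, ci_size: int) -> tuple[int, int]:
    """Return (control interval number, offset inside that CI).

    A read for any byte in the CI transfers the whole CI, so the first
    value identifies the unit that would be read from disk.
    """
    return divmod(byte_offset, ci_size)


def control_areas_needed(total_cis: int, cis_per_ca: int) -> int:
    """Space is allocated in whole control areas, so round up."""
    return math.ceil(total_cis / cis_per_ca)


CI_SIZE = 4096     # 4 kilobytes, the example size used in the text
CIS_PER_CA = 180   # hypothetical value, for illustration only

print(ci_for_offset(10_000, CI_SIZE))          # (2, 1808)
print(control_areas_needed(400, CIS_PER_CA))   # 3
```

In this model a request for byte 10,000 reads the third Control Interval (number 2, counting from 0), and a data set that needs 400 Control Intervals is given three complete Control Areas.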

The Access Method Services utility program IDCAMS is commonly used to manipulate ("delete and define") VSAM data sets.

Custom programs can access VSAM data sets through Data Definition (DD) statements in Job Control Language (JCL), via dynamic allocation, or in online regions such as those of the Customer Information Control System (CICS).

Both IMS/DB and DB2 are implemented on top of VSAM and use its underlying data structures.

VSAM—Benefits and Drawbacks

A review of the major benefits of VSAM shows why it is often preferred over other access methods:

  • Quick and efficient retrieval of data owing to a compact and efficient index
  • Records can be accessed randomly by key or by address, or sequentially. Although conventional data organizations also support random access in indexed or direct files, VSAM stands out in that random access is possible even in its sequential organization, by means of the relative byte address (RBA). This will be discussed at greater length in a later section.
  • Insertion of records is made easier by the use of imbedded free space
  • Deletion of records physically deletes the records, thus ensuring the reclamation of such free space within a data set
  • Concurrent usage of VSAM data sets is possible across partitions, regions, address spaces and systems
  • Data security is enforceable at different levels by password protection. For instance, you could have read access but not update or vice versa
  • VSAM data sets are device and operating system independent. This means they are portable across systems.

On the other hand, the major drawbacks of VSAM are:

  • VSAM increases the disk space requirements of systems. VSAM offers capabilities such as partial self-reorganization that make modifiable data sets more efficient, but to take advantage of them, free space must deliberately be left in the data set. For read-only data sets, no free space needs to be left.
  • Except for read-only data sets, the integrity of VSAM data sets in a cross-system or cross-region shared environment must be controlled by the user.

All said and done, the benefits outweigh the drawbacks in most situations, which accounts for the popularity of VSAM in application systems.

Types of VSAM Data Sets

The different types of organization for VSAM data sets are:

  • Entry-sequenced data set (ESDS)
  • Key sequenced data set (KSDS)
  • Relative record data set (RRDS)
  • Linear data set (LDS)

These organizations are very similar to their non-VSAM counterparts (physical sequential, indexed sequential and direct files), but they also provide more advanced and powerful features. Let us take a look at each of them in detail.

Entry-Sequenced Data Set

Figure: Entry-Sequenced Data Set

As in a physical sequential file, the records in an ESDS are sequenced in the order in which they are written into the data set. New records are always added to the end of the data set. An entry-sequenced data set is shown in the above figure.

The term data component will be explained in the next section. An ESDS has more update and access flexibility than a physical sequential file. A record can be updated in place in an ESDS, although its length cannot be changed. Random access to records is possible by RBA, as mentioned earlier. Records, however, cannot be deleted. The records can be of variable length, and there is no imbedded free space, as illustrated above.
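These ESDS rules can be mimicked with a small in-memory model. The sketch below is not how VSAM stores data on disk: the class and method names are invented for this example, and the RBA is simplified to the byte offset of a record in one contiguous buffer, ignoring control intervals and control information.

```python
class EntrySequencedDataSet:
    """Toy in-memory model of an ESDS (illustrative only)."""

    def __init__(self):
        self._data = bytearray()   # records stored back to back
        self._lengths = {}         # RBA -> length of the record at that RBA

    def append(self, record: bytes) -> int:
        """Add a record at the end and return its relative byte address."""
        rba = len(self._data)
        self._data += record
        self._lengths[rba] = len(record)
        return rba

    def read(self, rba: int) -> bytes:
        """Random access: fetch the record that starts at this RBA."""
        return bytes(self._data[rba:rba + self._lengths[rba]])

    def update(self, rba: int, record: bytes) -> None:
        """Update in place; the record length must stay the same."""
        if len(record) != self._lengths[rba]:
            raise ValueError("an ESDS record cannot change length")
        self._data[rba:rba + len(record)] = record

    def browse(self):
        """Yield records in the order in which they were written."""
        for rba in sorted(self._lengths):
            yield self.read(rba)

    # There is deliberately no delete method: ESDS records cannot be deleted.


esds = EntrySequencedDataSet()
first = esds.append(b"ALPHA")
second = esds.append(b"BRAVO-LONGER")   # variable-length records are allowed
esds.update(first, b"OMEGA")            # same length, so this is accepted
print(esds.read(second))                # b'BRAVO-LONGER'
print(list(esds.browse()))              # [b'OMEGA', b'BRAVO-LONGER']
```

Attempting `esds.update(first, b"TOO LONG")` raises a `ValueError` in this model, which mirrors the rule that an update cannot change the record length.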

Key Sequenced Data Set

A key-sequenced data set has each of its records identified by a key. (The key of each record is simply a field in a predefined position within the record.) Each key must be unique in the data set.

When the data set is initially loaded with data, or when new records are added, the logical order of the records depends on the collating sequence of the key field. This also fixes the order in which you retrieve records when you browse through the data set.

To find the physical location of a record in a KSDS, VSAM creates and maintains an index. This relates the key of each record to the record's relative location in the data set. When you add or delete records, this index is updated accordingly.

With DFSMS/MVS 1.4 and later releases, a data set can be larger than 4 GB if it is defined with extended format and extended addressability in the storage class. CICS supports, in both RLS and non-RLS mode, KSDS data sets that are defined with these extended attributes.

As the name implies, in this data set organization, records are sequenced on a key field. With the value of a key field, you can randomly access a record in a KSDS. A KSDS consists of two components:

  • The data component, which contains the user data, including the key field
  • The index component, which contains pointers (addresses) to the location of the record to which each key belongs

Both components are illustrated in the following figure.

Figure: Index and Data Components

The key field is normally just a small portion of the entire data record and hence the index component is much smaller compared to the data component. The simplest analogy to the index component is to the index of a book, which is short and can be searched quickly to find the physical location of a particular topic. The index and data components together are called the base cluster.

Records are stored in a KSDS in physical sequence of the primary key field. Both random and sequential retrieval and deletion of records are possible. Free space specified during the allocation of the KSDS is left at regular intervals during the initial load of the data set. This free space helps keep the data component in physical sequence in spite of random insertions. After many such insertions and deletions, however, the records are bound to fall out of physical sequence. Even then, the index component, with its pointers into the data component, keeps the records in logical sequence. Occasional reorganizations put the records in a KSDS back into physical sequence. Records, which can also be of variable length, can be updated in place and physically deleted, and the resulting free space can be reused for other insertions.
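The split between physical and logical sequence can be illustrated with a toy model. The sketch below is a deliberate simplification and not the VSAM implementation: the class name, method names and sample records are invented for this example, the data component is a plain list in arrival order, and free space and control intervals are ignored. It shows only that a sorted index is enough to keep records in key order and to find a record by key.

```python
import bisect


class KeySequencedDataSet:
    """Toy model of a KSDS: a sorted index over an unordered data component."""

    def __init__(self, key_offset: int, key_length: int):
        self.key_offset = key_offset
        self.key_length = key_length
        self._data = []        # data component: records in arrival order
        self._keys = []        # index component: keys in collating sequence
        self._position = {}    # key -> position of the record in _data

    def _key_of(self, record: str) -> str:
        """The key is a field at a predefined position within the record."""
        return record[self.key_offset:self.key_offset + self.key_length]

    def insert(self, record: str) -> None:
        key = self._key_of(record)
        if key in self._position:
            raise KeyError(f"duplicate key {key!r}")
        self._data.append(record)
        self._position[key] = len(self._data) - 1
        bisect.insort(self._keys, key)      # the index stays sorted

    def read(self, key: str) -> str:
        """Random access by key through the index."""
        return self._data[self._position[key]]

    def delete(self, key: str) -> None:
        """Remove the record and its index entry."""
        self._data[self._position.pop(key)] = None
        self._keys.remove(key)

    def browse(self):
        """Yield records in logical (key) sequence."""
        for key in self._keys:
            yield self.read(key)


ksds = KeySequencedDataSet(key_offset=0, key_length=4)
for record in ("0300CAROL", "0100ALICE", "0200BOB"):
    ksds.insert(record)                     # arrival order is not key order
ksds.delete("0200")
print(ksds.read("0100"))                    # 0100ALICE
print(list(ksds.browse()))                  # ['0100ALICE', '0300CAROL']
```

Although the records arrive out of key order, browsing returns them in key sequence because the index, not the physical position, determines the order.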

It is possible to access the records in a KSDS through keys other than the primary key. Such keys are called alternate keys, and they can be non-unique. For instance, in a payroll system where employee number is the unique primary key, you can have the employee name as an alternate key.
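The payroll example can be sketched as a simple lookup table. The sample data and the function name are invented for this illustration, and the dictionary only shows the idea of mapping a non-unique alternate key to the primary keys that share it. It says nothing about how VSAM stores an alternate index.

```python
from collections import defaultdict

# Hypothetical payroll records: employee number (primary key) -> employee name
employees = {
    1001: "SMITH",
    1002: "JONES",
    1003: "SMITH",
}


def build_alternate_index(records: dict) -> dict:
    """Map each alternate key value to the primary keys that carry it."""
    alternate = defaultdict(list)
    for primary_key in sorted(records):
        alternate[records[primary_key]].append(primary_key)
    return dict(alternate)


name_index = build_alternate_index(employees)
print(name_index["SMITH"])   # [1001, 1003]
print(name_index["JONES"])   # [1002]
```

Because the name is not unique, one alternate key value can lead to several records, whereas each employee number leads to exactly one.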

Relative Record Data Set

A relative record data set has records that are identified by their relative record number (RRN). The first record in the data set is RRN 1, the second is RRN 2, and so on.

Records in an RRDS can be fixed or variable length records, and the way in which VSAM handles the data depends on whether the data set is a fixed or variable RRDS. A fixed RRDS has fixed-length slots predefined to VSAM, into which records are stored. The length of a record on a fixed RRDS is always equal to the size of the slot. VSAM locates records in a fixed RRDS by multiplying the slot size by the RRN (which you supply on the file control request), to calculate the byte offset from the start of the data set.
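The offset calculation for a fixed RRDS can be sketched as follows. This is a simplified model with an invented function name. It assumes the slots are packed back to back with no control information between them, and it assumes that the multiplier is RRN minus 1, so that the first record (RRN 1) starts at offset 0.

```python
def slot_offset(rrn: int, slot_size: int) -> int:
    """Byte offset of a slot from the start of the data set (simplified)."""
    if rrn < 1:
        raise ValueError("relative record numbers start at 1")
    return (rrn - 1) * slot_size   # assumption: RRN 1 sits at offset 0


print(slot_offset(1, 80))   # 0
print(slot_offset(3, 80))   # 160
```

No search is involved, which is one reason why a fixed RRDS generally performs better than a variable RRDS that has to consult an index.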

A variable RRDS, on the other hand, can accept records of any length up to the maximum for the data set. In a variable RRDS VSAM locates the records by means of an index.

A fixed RRDS generally offers better performance. A variable RRDS offers greater function.

Each slot occupies a fixed position and is identified by its position relative to the first slot of the data set. Each slot may or may not contain an actual record. This is illustrated in the following figure.

Figure: Relative Record Data Set Organization

Shaded slots in the figure contain records, while the others are empty. The application needs an algorithm that maps each record to the RRN of the slot where the record is stored. In a company with 10,000 employees whose employee numbers range from 1 to 10,000, for example, the employee number can be used directly as the RRN. A fixed RRDS has no index component. There are also no alternate indexes. Records can be inserted, deleted and updated randomly or sequentially. When a record is deleted, its slot is freed but remains in the same place. Because the slots are fixed in length, the records in a fixed RRDS are also fixed in length.
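The behavior of slots can be modeled with a small class. This is a sketch under stated assumptions rather than the VSAM implementation: the class name, method names and sample records are invented, the slots are held in a Python list, and the employee number is used directly as the RRN, as in the example above.

```python
class FixedRRDS:
    """Toy model of a fixed RRDS: numbered slots that are filled or empty."""

    def __init__(self, slot_count: int, slot_size: int):
        self.slot_size = slot_size
        self._slots = [None] * slot_count   # None marks an empty slot

    def _index(self, rrn: int) -> int:
        """Convert an RRN (starting at 1) to a list position."""
        if not 1 <= rrn <= len(self._slots):
            raise IndexError(f"RRN {rrn} is outside the data set")
        return rrn - 1

    def write(self, rrn: int, record: bytes) -> None:
        """Insert or update the record in the given slot."""
        if len(record) != self.slot_size:
            raise ValueError("record length must equal the slot size")
        self._slots[self._index(rrn)] = record

    def read(self, rrn: int):
        """Return the record in the slot, or None if the slot is empty."""
        return self._slots[self._index(rrn)]

    def delete(self, rrn: int) -> None:
        """Free the slot; it keeps its position in the data set."""
        self._slots[self._index(rrn)] = None

    def occupied(self) -> list:
        """RRNs of the slots that currently hold a record."""
        return [i + 1 for i, slot in enumerate(self._slots) if slot is not None]


rrds = FixedRRDS(slot_count=5, slot_size=8)
rrds.write(2, b"EMP00002")    # employee number 2 goes into slot 2
rrds.write(4, b"EMP00004")
rrds.delete(2)
print(rrds.occupied())        # [4]
print(rrds.read(2))           # None
```

After the deletion, slot 2 is empty but still exists, so a later write to RRN 2 reuses the same position.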

Linear Data Set (LDS)

Though we will go into the internal organization of VSAM data sets in a later section, suffice it to say here that VSAM requires certain control fields in each unit of VSAM storage to keep track of how many records there are, how long each record is, how much free space is available, and so on. An LDS is a data set that does not contain any such control information. It is very similar to an ESDS without the control information. This kind of data set organization is primarily used by DB2, a relational database management system, details of which are beyond the scope of this book.

Internal Organization

In a physical sequential data set, records are grouped into blocks, and each block may have one or more records. VSAM uses a very similar unit of record storage called a control interval (CI). A control interval may contain one or more records. If a record's length is greater than the length of the CI, it can span multiple control intervals. Internally, a control interval is made up of one or more physical blocks, but this is transparent to applications. A control interval is the smallest unit of information transferred between the storage devices and the buffers.

Control intervals are part of a larger storage structure called a control area (CA). A control area may consist of multiple control intervals. This is shown in the following figure.

Figure: Control Areas and Control Intervals


All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd
