DB2 OBJECTS - IBM Mainframe

In previous sections we have mentioned tables and databases. These are examples of DB2 objects. (DB2 objects are databases, tables, tablespaces, indexes, indexspaces, views, and storage groups.) An object is anything you can create or manipulate with SQL information about DB2 objects is stored in the system tables of the DB2 Catalog. With the exception of indexspaces, the data definition language component of SQL is used to create or manipulate these objects. Indexspaces are automatically built any time an index is created.

Databases

A database is a collection of logically related objects. That is a collection of stored tables that are related together in some way, together with their associated indexes and the various spaces that contain those tables and indexes. It thus consists of a set of tablespaces and index spaces.

A DB2 database consists of tables, their indexes, and the spaces (tablespaces and indexspaces) that contain them. DB2 maintains a structure called a DBD (database descriptor) for each database, as well as catalog entries associated with each database. But a DB2 database is not associated with specific datasets; this

makes it different from previous DBMSs. In older systems, when you created a database, you defined what datasets were included in H. In DB2, databases are usually created for operational convenience only; the datasets (tablespaces and indexspaces) the database contains can be specified later, and the specification can be changed as needed. When you create a DB2 database, there are no associated datasets. It is only when you create a tablespace in that database that the database becomes associated with physical storage. In contrast, a table cannot be created unless a tablespace and its associated physical storage already exist.

Implementation of Physical Storage in DB2

Implementation of Physical Storage in DB2

You will find that it is common practice to create different databases for each application's tables. This helps to minimize the effect of one application on another. For example, if you're creating or deleting an object in a database, no other user may create or delete an object in that database until you have finished. This is because DB2 locks-the DBD for a database (there is one per database) when its objects are being created or deleted. If many applications were combined in a single database, the creation or deletion of an object for any one application would mean that the other application(s) would have to wait. Users within the same application can also interfere with each other, but it is likely that changes to an application's tables will be made by one person or at least be coordinated by those doing the changing. Also, by grouping related objects in the same database, you can also make all of them available or unavailable for access with the DB2 commands START and STOP, which operate at the database level.

The database is the unit of stop and start. Or in other words the system operator can make a specific database available or unavailable for processing using the START and STOP command. So the related tables are put in the same database for operational convenience. Tables can be moved form one database to another without any logical impact on the user or application programs.

Tablespaces

A table space can be considered as a logical address space on secondary storage that is used to hold one or more stored tables. As the amount of data in those tables grows, storage will be acquired from the appropriate storage group and added to the table space. One table space can be up to approximately 64 billion bytes in size, and there is effectively no limit to the number of table spaces in a database, nor the number of databases.

Tables are assigned to tablespaces. A tablespace contains one or more VSAM ESDS datasets. In this way, the data from a table is assigned to physical storage on DASD. Each tablespace is physically divided into equal units called pages. Each page, which may contain 4K or 32K bytes, holds one or more rows of a table and is the unit of I/O. You must use the larger page size if any row in the table is more than 4K bytes long. However, 32K pages are the exception, not the rule. The size of the tablespace's pages is based on the page size of the bufferpool specified in the tablespace's creation statement. Bufferpools are areas of virtual storage that DB2 uses to store data temporarily. The rows of a table are physically stored as records on a page. A record is always fully contained within a page. Each record is made up of control information to identify the row and the table to which it be-longs along with a field for 'ach column of the table.

The table space is the storage unit for recovery and reorganization purposes, that is, it is the unit that can be recovered via the RECOVER utility or reorganized using the REORG utility. If the table space is very large, then recovery and reorganization can take a very long time. DB2 therefore provides a facility to partition the table space into smaller units. Recovery and reorganization can be done on each partition independently, rather than the entire table space.

Tablespaces come in three flavors—partitioned, simple, and segmented. A partitioned tablespace is used for very large tables and may only contain one table. The table is divided among the tablespace's partitions, with each partition stored as a separate VSAM ESDS dataset. When you define the tablespace, you specify what key ranges will go into which partitions. Partitioning enables you to put different parts of a table on different device types. Thus, frequently accessed data can be put on faster devices. Also, partitioning may break up a very large table into more manageable pieces, which can be important for recovery. A simple tablespace normally consists of only one VSAM ESDS and may contain one or more tables. However, DB2 has some major limitations in its support of multiple tables in a simple tablespace. For instance, you cannot perform a sequential read (called a tablespace scan) of one table without reading the entire tablespace; this increases I/O and decreases performance. As a result, database designers normally recommend only one table per simple tablespace. The only exception might be when several very small tables that are generally used together occupy the same tablespace.

To rectify these limitations, IBM introduced a new type of tablespace with version 2.1, a segmented tablespace. It was specifically designed to support multiple tables in a single tablespace. All the segments of a segmented tablespace are stored on one VSAM dataset. Each segment is a group of 4 to 64 pages. The segment size is defined when the tablespace is created. In a segmented tablespace, each segment contains rows for one table. Segmented tablespaces provide improved performance, concurrency, and space management for multiple tables in a single tablespace. Enhancements include faster table scans and the ability to lock a single table within the tablespace. With segmented tablespaces, related tables can be efficiently placed together in one tablespace, resulting in fewer tablespaces. Fewer tablespaces can simplify database administration. Also, MVS places a limit of about 1600 on the datasets that any one job (such as the DB2 Data Manager) can have allocated at any one time. And, since each open dataset uses a certain amount of virtual storage, reducing the number of datasets can improve performance. Segmented tablespaces can have performance advantages over single tablespaces even when, the tablespace contains only one table. In fact, IBM now recommends that all tables not requiring partitioning be placed in segmented tablespaces.

Stored Tables

A stored table is the stored representation of the base table. It consists of a set of stored records, one for each data row in the base table. Each stored record will be wholly contained within a single page. But one stored table can spread across multiple pages and one page can contain stored records from multiple stored tables.

A stored record is not identical to the corresponding row of the base table. Instead, it consists of a byte string containing a record prefix, which contains some control information and 'n' fields, where n is the number of columns in the table. Each field consists of a length prefix, a null indicator prefix if nulls are allowed, and an encoded form of the actual data value

Indexes

An index is an ordered set of pointers to the data in a DB2 table. There is one physical order to the rows in a table and it is determined by DB2. You will see in a moment how you can specify a physical order with the CREATE INDEX statement, but you can never completely control or easily know the exact physical order chosen by the DBMS. Without an index, finding a particular row would require a scan of the entire table. Indexes provide an alternate means of access to the data, greatly decreasing the I/O and processing time required. Users create indexes on tables, but programs or users accessing DB2 data cannot explicitly specify that an index be used; this can only be specified by DB2's Optimizer, which makes that choice at BIND time.

An index is a separate physical entity. It is based on one or more columns of a table and can be created any time after the table is built. In practice, indexes are most often created at table creation time because this is more efficient operationally. A table may have zero to many indexes associated with it. A partitioned table, however, must have at least one index. This index is called the partitioning index and is used to define the scope of each partition and thereby assign the rows of the table to their respective partitions.

In addition to improving data retrieval performance, indexes can be created as UNIQUE. This means that DB2 will not allow two rows to be inserted into a table if the result would be duplicate index values. Earlier,we learned that in the relational model, every table has a primary key, consisting of one or more columns, which ensure that each row in a table is unique. When created, a unique index makes it possible to support the relational model's concept of a unique primary key.

An index may also be created as a CLUSTERing index. When DB2 sees that you have specified a clustering index, it physically stores the rows in order according to the values in the column(s) you have specified as the clustering index. You can specify ascending or descending order. If you create the customer table with a clustering index on the column CUSTSOCSEC, when you insert your rows of data, DB2 will attempt to insert them in order based on the values in CUSTSOCSEC. If it has stored the records for customers 111111111, 222222222, and 444444444 and then attempts to insert that of 333333333, it tries to find free space between 222222222 and 444444444. If no space is available, DB2 puts it elsewhere. DB2 has an elaborate algorithm, which it uses to put a row as close as possible to where it belongs. When DB2 has been unable to put 5 percent of the rows in order, it flags the index as unclustered in the Catalog. This is useful information for the Optimizer because an unclustered index is a more expensive access path than a clustered index. When you do a REORG, DB2 can then put the rows into sequential order.

Since a clustering index defines the way data is actually stored in a table, there can be only one clustering index per table. If no clustering index is defined for a table, DB2 still attempts to load data in order according to the first index defined. In this case, however, it will not reorder the rows at REORG time, so it is always better to explicitly choose the CLUSTER option. If a table is partitioned, it must have a clustering index on the columns that are used to specify the key ranges of the partitions. This index, the partitioning index, is used to divide the rows among the partitions.

Indexes in DB2 are based on a structure known as B-tree. A B-tree is a multilevel, tree structured index with the property that the tree is always balanced, which means that ail the leaf entries in the structure are equidistant from the root of the tree and this property is maintained as new entries are inserted into or existing entries are deleted from the tree. As a result the index provides a uniform and predictable performance for retrieval operations. Details of how this is achieved are a very complex and are beyond the scope of this book.

Indexspaces

An index space is to an index what a table space is to a table. However, the correspondence between indexes and index spaces are always one-to-one, there are no data definition statements for index spaces, instead the necessary index space parameters are specified in the corresponding index definition statements. Thus there is no CREATE INDEXSPACE but only CREATE INDEX, which will automatically create the corresponding index space. Index spaces are always 4KB in size. The unit of locking can be less than one page, for example a quarter page. This is not possible with table spaces where the locking can be done only at the page level. Like table spaces, index spaces can be recovered and reorganized independently. An index space that contains the required index for a partitioned table space is itself considered to be partitioned; all other index spaces are simple or non-partitioned.

Storage Groups

Storage groups are a named collections of direct access volumes, all of the same device type. Each table space and each index space normally has an associated storage group. When storage is needed for the space or partition, it is taken form the specified storage group. Within each storage group, spaces and partitions are stored using VSAM linear data sets. DB2 uses VSAM for such things as direct access space management, data set cataloging, and physical transfer of pages into and out of memory. But the space handling within pages is handled by DB2 and not VSAM and VSAM indexing is not used at all.

We have talked about DB2 data being stored in VSAM ESDS datasets. Now let us look at the storage group that DB2 provides to simplify the job of defining those datasets. DBAs can use IDCAMS to explicitly define datasets and to specify the DASD volumes on which the datasets should be placed. Alternatively, the responsibility for defining the datasets may be left to DB2 itself through the use of storage groups. A storage group is a set of one or more DASD volumes. When a tablespace or indexspace is assigned to a given storage group, DB2 creates the necessary VSAM dataset places it on one of the DASD volumes associated with that storage group. Clearly, if a DBA defines the datasets, he or she has control over which volumes to use. If you delegate the responsibility for the definition to DB2, you still have that control. The difference is that with storage groups, DB2 does the VSAM dataset creation for you on a volume in that storage group. In many installations the convention is to assign each volume to its own storage group so that the DBA indicates a specific volume by naming a storage group.

Until DB2 version 2.1, DBAs preferred to define their own VSAM, ESDS datasets rather than use storage groups. While ^storage groups) removed the burden of defining and deleting datasets, they were considered inflexible and were therefore seldom used. They were particularly troublesome when data had to be migrated to a different DASD. DB2 now has an Alter Storage facility, which provides more flexibility and makes the use of storage groups a more attractive alternative.

View

A view is another way to present data, a different way to look at it. Views are derived from base tables, or from other views. Unlike base tables, which represent physically stored data, views are virtual tables that have no associated physical storage. A view can be derived from more than one base table or even from other views.

You can access data from a view as you would from any base table. When DB2 receives your request, it finds the statement stored in SYSIBM.SYSVIEWS and retrieves the specified subset (that is, the view) of the base table(s). In the case of a view (let's call it View-A) of a view (View-B), DB2 first retrieves the subset specified by View-B, and then, from that subset, retrieves the subset specified by View-A. The process is transparent. You can deal with views as if they were any other base table.

Views provide several benefits. A view may be used as a security mechanism, which allows the user to access only a portion of the table. Complicated queries can be stored as views so that to the end user, a request may appear to be very simple, although it actually performs a complex operation. Views can also minimize the program modifications that may be required when base tables change.

If your program accesses only three columns from a table, you may use a view that includes only those columns. That way, you will not need to change your program when columns are added to the table. The view is still valid, as is your program.

Bufferpool

Bufferpools are areas of virtual storage used by DB2 during the execution of an application program or an interactive SQL request to store pages of a table or an index temporarily. When you need access to a row of a table, the page containing that row is read from the DASD and brought into a buffer. If data is changed, that buffer must be written back to the tablespace on DASD. If the data needed is already in a buffer, you have immediate access to it without waiting for it to be retrieved from DASD. The result is quicker performance. DB2 has four different bufferpools to choose from—BPO, BPI, BP2 and BP32K. BP32K is only used with tablespaces containing 32K pages; the others are used for 4K pages. At installation time the number of pages in each bufferpool is specified. Three different bufferpools of 4K pages are provided. However, most shops do not use BP1 and BP2.

IBM recommends that only BPO be used because the efficiency gained by giving DB2 one large bufferpool to manage outweighs any advantage you might attempt to achieve by defining smaller bufferpools and assigning them, for instance, to heavily used tables or indexes. IBM says, in' effect, that DB2 is better at managing buffers than are DBAs or systems programmers. However, you do have the option of assigning BPI and BP2 to particular tables or indexes if you wish. If, for instance, you have some online data, which must have fast response time, you might assign it to its own bufferpool so that it always stays in virtual storage. This might degrade DB2's overall performance, but the performance of this particular data, which would be might be worth it.


All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd DMCA.com Protection Status

IBM Mainframe Topics