# The Fault Diagnosability Infrastructure - Oracle 11g

Although we discuss several diagnostic and repair tools in this chapter, the real contribution of Oracle Database 11g is in providing a unified framework for diagnosing and resolving problems: the new Automatic Diagnostic Repository.

In previous releases, the diagnostic data was spread over different file systems and wasn’t consolidated in any way. Thus, when you had to troubleshoot a critical error or upload various log files, trace files, and core dumps to the Oracle Support folks, you had to gather all this material from disparate sources and upload it via FTP. In Oracle Database 11g, there is a new infrastructure for database diagnosability, the heart of which is the Automatic Diagnostic Repository (ADR).

The ADR is a dedicated repository on your file system, used for storing both traditional diagnostic data sources, such as the alert log, trace files, and dump files, and new types of diagnostic data, such as the Health Monitor reports.

All diagnostic and repair tools rely on the ADR to resolve problems. For example, when a critical problem occurs in the database, it causes Oracle to create an incident and to file away all diagnostic data for this incident in the ADR. Using the new incident packaging framework, you can then send this diagnostic data to Oracle Support and resolve the problem.

Similarly, the Health Monitor, which is the new proactive diagnostic checking framework, stores the results of its checks in the ADR. The Data Recovery Advisor uses this data to repair the problem.

In releases before Oracle Database 11g, you use the initialization parameters background_dump_dest, core_dump_dest, and user_dump_dest to specify where the database should store all diagnostic data, including the all-important alert log file. Starting with the Oracle Database 11g release, all database diagnostic data is stored in the new ADR.

The great benefit of the ADR is that you can access the diagnostic data for a database even when the database is down or can’t be opened for general access because of problems.

The ADR stores the diagnostic data not only for all database instances but also for all other Oracle components and features, such as Automatic Storage Management (ASM), Cluster Ready Services (CRS), and others. The ADR uses a consistent diagnostic data format across these various products.

The ADR provides the following benefits in serving as the repository for all the databases’ diagnostic data:

• A unified directory structure
• Consistent diagnostic data formats for diagnostic data not only from multiple Oracle instances but also from multiple products
• The same set of tools to analyze diagnostic data across instances

The ADR consists of all the familiar Oracle database diagnostic files, such as the following:

• Trace files
• Core and dump files

In addition to the previously mentioned diagnostic files, the ADR contains new Oracle Database 11g release diagnostic data such as the following:

• Incident packages
• SQL test cases
• Data repair records

We advise DBAs to quickly transition to the ADR. DBAs are accustomed to Oracle’s Optimal Flexible Architecture and the standard set of directories under $ORACLE_BASE/admin/$ORACLE_SID.

The ADR’s new directory structures and the newly implemented XML alert log file will take a little time for DBAs to get used to. On a positive note, the ADR implements a standard layout no matter who originally set up the server, so DBAs will find the same look and feel for directory structures as they move from one server to another.

## Problems and Incidents

There are two new diagnostic-related concepts in Oracle Database 11g around which the entire fault diagnosability infrastructure revolves: problems and incidents. We’ll take a minute to define these critical terms before we wade into the details of the fault diagnosability infrastructure:

• Any critical error in the database is defined as a problem. Typically these include familiar Oracle database errors denoted as ORA-600 errors and errors such as ORA-04031 (out of shared pool memory). All metadata concerning a database problem is stored in the ADR. Each problem is assigned a problem key, which helps identify and describe that problem. The problem key contains the Oracle error number and error argument values. Here’s an example (part of the output from a show incident command in the adrci tool, which we explain later in this chapter):
• An incident is a one-time occurrence of a particular problem. Thus, if the same problem occurs multiple times, you’ll have one problem and many incidents to denote the multiple occurrences of that problem. A frequently occurring problem is denoted by a large number of incidents. Each incident has its own incident ID, as shown in the previous example.

The database does three things when a particular incident occurs:

• It creates an alert for that incident and assigns the appropriate level of severity for that alert.
• It makes an entry regarding the incident in the alert log.
• It gathers and stores the relevant diagnostic data for that incident in the appropriate subdirectory in the ADR.

You can’t disable the automatic incident creation for critical errors. A problem is created automatically when the first incident occurs. Once the last incident is removed from the ADR, the problem metadata is deleted as well.
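The one-problem, many-incidents relationship and the lifecycle rule just described can be pictured with a small in-memory model. This is only an illustrative sketch, not an Oracle API: the class `ToyRepository`, its method names, and the problem key string `"ORA 4031"` are all hypothetical, chosen just to show that the first incident creates the problem and that removing the last incident removes the problem.

```python
class ToyRepository:
    """Toy model of problems and incidents (hypothetical, not the real ADR)."""

    def __init__(self):
        # Maps a problem key to the set of incident IDs recorded under it.
        self.problems = {}
        self._next_id = 1

    def record_incident(self, problem_key):
        """Record one occurrence of a problem and return its incident ID."""
        incident_id = self._next_id
        self._next_id += 1
        # The first incident for a key implicitly creates the problem.
        self.problems.setdefault(problem_key, set()).add(incident_id)
        return incident_id

    def remove_incident(self, problem_key, incident_id):
        """Remove an incident; drop the problem once no incidents remain."""
        incidents = self.problems[problem_key]
        incidents.remove(incident_id)
        if not incidents:
            del self.problems[problem_key]


repo = ToyRepository()
first = repo.record_incident("ORA 4031")   # hypothetical problem key
second = repo.record_incident("ORA 4031")  # same problem, new incident
print(len(repo.problems), len(repo.problems["ORA 4031"]))  # 1 2

repo.remove_incident("ORA 4031", first)
repo.remove_incident("ORA 4031", second)
print("ORA 4031" in repo.problems)  # False
```

Two occurrences of the same error yield one problem with two incident IDs, and the problem entry disappears only after both incidents are removed.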

The ADR limits the dumping of diagnostic data to a certain number of incidents under any one problem. Two retention policies dictate how long the ADR retains the diagnostic data it accumulates for various incidents:

• The “incident metadata retention policy” determines how long the ADR retains incident metadata. The default retention period is one year.
• The “incident files and dumps retention policy” determines how long the ADR retains dump files for the incidents. The default retention period is one month.

You can change either or both of the retention policies using the Incident Package Configuration link on the Support Workbench page in Database Control. The background process MMON automatically purges expired ADR data.
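To make the effect of the two retention policies concrete, the following sketch checks which kinds of data for an incident would be past their retention period on a given date. It is a simplified illustration, not how MMON decides what to purge: the function name `purge_status` is hypothetical, and the defaults of "one year" and "one month" are approximated here as 365 and 30 days.

```python
from datetime import datetime, timedelta

# Defaults from the text, approximated in days (an assumption of this sketch).
METADATA_RETENTION = timedelta(days=365)  # incident metadata: "one year"
FILES_RETENTION = timedelta(days=30)      # incident files and dumps: "one month"


def purge_status(incident_created, now,
                 metadata_retention=METADATA_RETENTION,
                 files_retention=FILES_RETENTION):
    """Report which data for an incident is older than its retention period."""
    age = now - incident_created
    return {
        "files_expired": age > files_retention,
        "metadata_expired": age > metadata_retention,
    }


now = datetime(2024, 6, 1)
print(purge_status(datetime(2024, 5, 20), now))  # 12 days old
print(purge_status(datetime(2024, 3, 1), now))   # 92 days old
print(purge_status(datetime(2023, 1, 1), now))   # 517 days old
```

Under these assumed periods, the 12-day-old incident keeps everything, the 92-day-old incident has lost its dump files but keeps its metadata, and the 517-day-old incident is past both periods.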

## Incident Packaging Service

In Oracle Database 11g, all diagnostic data relating to a particular error is tagged with that error’s incident number. Thus, when you have to send diagnostic data to Oracle Support, you don’t have to go rummaging through your trace files, dump files, and alert logs to see which files you must send. You can now automatically gather and package all diagnostic data and files concerning a critical error and send them to Oracle Support in the form of a ZIP file. This feature is called the Incident Packaging Service (IPS).

Besides automatically identifying the required files for problem resolution, IPS also lets you customize the package by adding other information as well as diagnostic data for related incidents. You’ll see the IPS in action in the section “Packaging Incidents” later in this chapter.

## Structure and Location of the ADR

One of the ways in which the ADR provides great diagnostic help is by always being available for problem diagnosis, since it is located outside the database. Thus, following a database crash, you can access the ADR without any hindrance. The database creates the ADR by default; the only thing you need to specify is its location. Use the new initialization parameter diagnostic_dest to specify the root directory for the ADR, as shown here:

diagnostic_dest = /u05/app/oracle

This root directory for the ADR is called the ADR base. Oracle will create an ADR even if you omit the diagnostic_dest initialization parameter. In such a case, the database will create the ADR in one of the following locations:

• If you set the ORACLE_BASE environment variable, the diagnostic_dest default value is set to the same directory.
• If you haven’t set the ORACLE_BASE variable, Oracle will set the diagnostic_dest parameter value to $ORACLE_HOME/log by default.

In an Oracle Real Application Clusters (RAC) environment, you can set a node’s ADR base either on local storage or on shared storage.

The ADR stores diagnostic data for all Oracle products. The ADR allocates a separate home directory for each instance of each Oracle product. Thus, a single ADR base can contain multiple ADR homes, each pointing to a different Oracle instance. Each ADR home is the root directory for all the diagnostic files for a database instance or any other Oracle product or component. The location of an ADR home is shown by the following directory path:

ADR_base/diag/product_type/product_id/instance_id/

For example, if you set the diagnostic_dest parameter to /u05/app/oracle, the ADR home for an Oracle database with an identical SID and database name of prod1 would be as follows:

/u05/app/oracle/diag/rdbms/prod1/prod1/

In the previous example, product_type is rdbms since you’re dealing with a database. In each ADR home directory, you’ll find subdirectories where Oracle stores diagnostic data for that instance. The following are the important subdirectories in the ADR home directory:

• alert: Contains the alert log for the instance (in XML format)
• cdump: Contains the core files
• hm: Contains Health Monitor reports
• incident: Contains subdirectories for each incident, containing all trace dumps for an incident
• incpkg: Contains the incident packages you create for uploading to Oracle Support
• ir: Contains the incident reports for the instance
• trace: Stores user session trace files

You can query the V$DIAG_INFO view to find out where all the ADR-related locations are.

The V$DIAG_INFO view also shows the number of active problems and incidents in the database.
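The rules above for locating the ADR base and an ADR home can be summarized in a short sketch. This is only an illustration of the path layout described in this section, not an Oracle utility: the function names `default_adr_base` and `adr_home` are hypothetical, the environment is passed in as a plain dictionary, and the ORACLE_HOME value in the usage lines is a made-up path.

```python
from pathlib import PurePosixPath


def default_adr_base(env):
    """Pick the ADR base the way the text describes when diagnostic_dest is unset."""
    oracle_base = env.get("ORACLE_BASE")
    if oracle_base:
        # ORACLE_BASE is set: the ADR base defaults to that directory.
        return PurePosixPath(oracle_base)
    # Otherwise the default is $ORACLE_HOME/log.
    return PurePosixPath(env["ORACLE_HOME"]) / "log"


def adr_home(adr_base, product_type, product_id, instance_id):
    """Build ADR_base/diag/product_type/product_id/instance_id."""
    return PurePosixPath(adr_base) / "diag" / product_type / product_id / instance_id


# The example from the text: a database with SID and database name prod1.
print(adr_home("/u05/app/oracle", "rdbms", "prod1", "prod1"))
# /u05/app/oracle/diag/rdbms/prod1/prod1

# Hypothetical environment with only ORACLE_HOME set.
print(default_adr_base({"ORACLE_HOME": "/opt/oracle/home"}))
# /opt/oracle/home/log
```

Because each product instance gets its own home under the same base, calling `adr_home` with a different product type or instance ID yields a sibling directory under the same diag tree.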

Now that you have learned about the basic structure of the new fault diagnosability infrastructure, you can turn your attention to investigating and resolving problems by using this diagnostic framework.