CaseStudy: Creating the GLOBAL XWorkspace From Alternative Sources - OLAP

This example shows how to create an analytic workspace by acquiring data from relational tables and flat files. It uses Global data simply because you are already familiar with this data set;

These are the basic steps:

  1. Create the GLOBALX user and a default tablespace.
  2. Create a star schema in GLOBALX.
  3. Create OLAP Catalog metadata for the GLOBALX star schema that defines all of the dimensions, levels, hierarchies, attributes, and measures.
  4. Define the GLOBALX_AW user and default tablespace.
  5. Create the GLOBALX standard form analytic workspace.
  6. Modify the GLOBALX analytic workspace, such as redefining composites and setting the segment size.
  7. Populate the Price cube from relational tables using the OLAP DML SQL command.
  8. Populate the Units cube from a flat file using the OLAP DML File Reader commands.
  9. Aggregate the data.
  10. Enable the GLOBALX analytic workspace for use by the BI Beans.

Designing and Implementing the GLOBALX Star Schema

Because Global data is already stored in the GLOBAL star schema, GLOBALX can simply mimic its design for the Price and Units cubes. The only difference is that while GLOBAL is populated with data, GLOBALX contains empty tables.

GLOBALX Schema Diagram

GLOBALX Schema Diagram

Procedure: Creating the GLOBALX Sample Schema

Take these steps to create the sample GLOBALX schema:

  1. Create the GLOBALX user and a default tablespace. Sample scripts are shown in "SQL Scripts for Defining Users and Tablespaces".
  2. Create the SQL scripts listed in "SQL Scripts for the GLOBALX Star Schema".
  3. Log in to SQL*Plus or a similar SQL command processor as the GLOBALX user.
  4. Execute the scripts using the SQL @ command.
  5. After the scripts execute without errors, issue a SQL COMMIT statement.

Creating OLAP Catalog Metadata for the GLOBALX Schema

The metadata for the GLOBALX star schema can be generated by any available method: the OLAP Management tools in Oracle Enterprise Manager, the OLAP Bridge in Oracle Warehouse Builder, or the CWM2 PL/SQL package. This example arbitrarily uses the CWM2 packages.

Take these steps to create OLAP Catalog metadata for the GLOBALX schema:

  1. Create the SQL scripts listed in "SQL Scripts for OLAP Catalog Metadata".
  2. Log in to SQL*Plus or a similar SQL command processor as the GLOBALX user.
  3. Issue these SQL commands so that you can see the full report from the metadata validator, both on the screen and saved to a file:
    The buffer for server output holds a maximum of 1,000,000 characters. If you are building a large application, you may need to control the size of the output with a combination of SET_ECHO_ON, SET_ECHO_OFF, BEGIN_LOG, and END_LOG commands.
  4. Execute the CWM2 scripts using the SQL @ command.
  5. After the scripts execute without errors, issue a SQL COMMIT statement.
  6. Examine the metadata in Analytic Workspace Manager, as shown in Figure.
    • Open Analytic Workspace Manager and connect as the GLOBALX user.
    • In the OLAP Catalog view, expand the Cubes, GLOBALX, and Relational Cubes folders.

GLOBALX Metadata Displayed in Analytic Workspace Manager

GLOBALX Metadata Displayed in Analytic Workspace Manager

Creating the GLOBALX Analytic Workspace

You can create the GLOBALX analytic workspace from the empty tables and OLAP Catalog metadata using any of the available methods: The Create Analytic Workspace wizard in Analytic Workspace Manager, the OLAP bridge in Oracle Warehouse Builder, or the DBMS_AWM PL/SQL procedures. This example uses the wizard to generate a script containing calls to DBMS_AWM, then modifies the script. Take these steps to create the GLOBALX analytic workspace.

  1. Create the GLOBALX_AW user with access rights to the GLOBALX tables and tablespaces, using a script like the one in "SQL Scripts for Defining Users and Tablespaces".
  2. Open Analytic Workspace Manager and log in to the database as the GLOBALX_AW user.
  3. From the Tools menu, choose Create Analytic Workspace Using Wizard.
  4. Make these choices in the wizard:
    • Specify Analytic Workspace page: Type GLOBALX as the workspace name, and select GLOBALX_AW as the schema.
    • Select Cubes page: Select all cubes in the GLOBALX relational schema.
    • Choose Data Loading Options page: Select Build analytic workspace and load dimensions and facts. Clear the Generate unique keys box.
      There is no data in the tables to load; however, this choice populates more catalogs in the analytic workspace than the other choices. The lack of data does not cause the build to fail.
    • Choose Advanced Storage and Naming Options page: Select Display the pages for setting the advanced storage options. Clear the Prefix measure names with cube names box.
      Because no data is available, the tools cannot determine the correct order of the dimensions, nor an appropriate segment size. You must provide this information or the analytic workspace will run slower than it should.
    • Create Composite Dimension and following pages: Create a composite for the Units Cube named UNITS_CUBE_COMPOSITE with the dimensions in this order: CUSTOMER PRODUCT CHANNEL. Omit TIME from the composite.
    • Specify Segment Width and Dimension Order page: For the Units Cube, specify the dimensions like this.
  5. TIME 85<CUSTOMER PRODUCT CHANNEL> 1000000
  6. Save the new analytic workspace.
  7. Open the GLOBALX analytic workspace in OLAP Worksheet, and make the following modifications:
    • Delete and redefine UNIT_COST_VARIABLE and UNIT_PRICE_VARIABLE so they are dimensioned by <TIME PRODUCT>.
      These 80% dense, two-dimensional measures will perform better without a composite.
    • Delete PRICE_CUBE_COMPOSITE.
    • Set the segment size on either one of the variables with a command like this:
      CHGDFN unit_price_variable SEGWIDTH 85 50 These settings reserve contiguous disk space for 85 time periods and 50 products. A single command changes the segment size on all measures in the same cube.
    • Set the segment size on the dimension attributes with the following commands:
    • Save these changes by issuing UPDATE and COMMIT commands.

Fetching the Price Cube From Relational Tables

The Price cube has two measures, UNIT_COST and UNIT_PRICE, with two dimensions, PRODUCT and TIME. In this example, the data is loaded manually from the GLOBAL star schema. However, it could be loaded from any form of relational tables using the method described here.

Take these steps to populate the Price cube in the GLOBALX analytic workspace:

  1. Open Analytic Workspace Manager and connect to the database as the GLOBALX_AW user.
  2. In the Object View, open the GLOBALX analytic workspace in read/write mode.
  3. Create the OLAP DML programs for fetching the PRODUCT and TIME dimension members.
    You can create and compile programs in the Object View or in OLAP Worksheet. You can execute programs, and view the contents of the objects they populated, only in OLAP Worksheet.
  4. Open OLAP Worksheet and execute the programs using the CALL command. CALL program_name
  5. After the programs run without error, check the contents of the target workspace objects to verify that they are populated correctly.
  6. Issue UPDATE and COMMIT commands to save the loaded data.
  7. Create and execute a data load program for the Price cube.
  8. After that program runs without error, check the data in the target variables
  9. Issue UPDATE and COMMIT commands to save the loaded data.

Loading Products From GLOBAL.PRODUCT_DIM

The GETPROD program shown in Example fetches data into the PRODUCT dimension, the member_parentrel relation and the long_description variable. Note that parent values must be added to the PRODUCT dimension before their children, otherwise an error will occur in populating the parent relation. Thus the SELECT statement lists the level columns from the highest level of aggregation to the lowest. The member_inhier Boolean variable is populated with values of 1 for true and 0 for false. The member_levelrel relation is also populated with text values that match the values of the levellist dimension.

Define the example program, then execute it with this command:

CALL getprod

OLAP DML Program for Loading Products From GLOBAL.PRODUCT_DIM

Example shows a selection of the data to verify that the load was successful.>

Exampe Viewing the PRODUCT Dimension and Attributes

Loading Time From GLOBAL.TIME_DIM

The program to fetch TIME members, shown in Example, is very similar to the previous program for fetching PRODUCT members. It differs only in the addition of time span and end date attributes.

However, TIME members must be sorted chronologically within levels in order to support time series analysis functions. Each row contains dimension members at every level, so the TIME dimension is populated with the levels completely mixed. Example shows a program that sorts the TIME dimension. It uses the SORT command to order the current, temporary status of the TIME dimension, saves this order in a valueset, then loops over the valueset with the MAINTAIN command to reorder the values permanently.

Define the example programs, then execute them with these commands:

CALL gettime CALL timesort

OLAP DML Program for Loading Time From GLOBAL.TIME_DIM

OLAP DML Program for Sorting TIME Dimension Members

The TIME dimension has too many members to list in its entirety, but selecting members by ancestry (as shown for PRODUCT) temporarily reorders the dimension. The results will show whether the objects were populated correctly, but not necessarily whether the members are sorted correctly. Example uses LIMIT commands that do not change the original order. The report shows the correct sort order.

Viewing the TIME Dimension and Attributes

Loading the PRICE Cube From PRICE_AND_COST_HISTORY_FACT

Example shows the program for fetching data into UNIT_PRICE_VARIABLE and UNIT_COST_VARIABLE. Note that the data must be loaded into the variables, not into the measuredef formulas, which have the same names as the logical measures. These are the definitions for these variables:

The ORDER BY clause in the DECLARE CURSOR SELECT statement sorts the rows so that PRODUCT (ITEM_ID) is the slower varying dimension and TIME (MONTH_ID) is the faster varying dimension. This organization corresponds to the order in which the values are stored in the workspace variables, as shown by their definitions. This sort order enables the data to be loaded as quickly as possible.

All of the dimension members must already exist in the analytic workspace. If a value is found without a match among the dimension members, then the program fails with an error.

Define the example program, then execute it with this command:

CALL getpricecube

OLAP DML Program to Load the PRICE Cube From PRICE_ AND_ COST_ HISTORY _ FACT

Unlike most measures, those from the Price cube are dense so that it is easy to check the data. The LIMIT commands in Example select members at all levels of the PRODUCT and TIME hierarchies. There is only data at the lowest levels, so the other levels are calculated on demand. Notice that the measuredef formulas are shown, not their underlying variables.

To make a quick check for any values in a variable, use the ANY function:

SHOW ANY(variable NE NA)

For example:

SHOW ANY(unit_price_variable NE NA)

A return value of YES indicates that at least one cell has data; a value of NO indicates that all cells are empty.

Validating the PRICE_CUBE Data Load

Validating the PRICE_CUBE Data Load

Loading the Units Cube From Flat Files

The Units cube has one measure, UNITS, and four dimensions, TIME, CUSTOMER, PRODUCT, and CHANNEL. The TIME and PRODUCT dimensions have already been added to the analytic workspace in "Fetching the Price Cube From Relational Tables", so unless additional dimension members are contained in the flat files, these two dimensions do not need to be maintained. However, the CUSTOMER and CHANNEL dimensions must be fully populated before loading the UNITS measure.
This example loads data from three flat files:

  • CHANNEL.DAT contains all CHANNEL dimension members and their attributes. It is equivalent to the CHANNEL_DIM dimension table in the GLOBAL star schema.
  • CUSTOMER.DAT contains all CUSTOMER dimension members and their attributes. It is equivalent to the CUSTOMER_DIM dimension table in the GLOBAL star schema.
  • UNITS.DAT contains the base-level data for the UNITS measure. It is equivalent to the UNITS_HISTORY_FACT fact table in the GLOBAL star schema.

The basic process for loading from flat files is the same as loading from relational tables, as described earlier in "Fetching the Price Cube From Relational Tables". The difference is only in the OLAP DML programs.

Loading Channels From CHANNELS.DAT

CHANNELS.DAT is a comma-delimited file as shown in Example . It has fields that correspond to the columns of the CHANNELS_DIM dimension table in the Global star schema:

Channel ID
Channel Description
All Channels ID
All Channels Description

With these fields, you can populate the CHANNEL dimension, the CHANNEL_LONG_DESCRIPTION attribute, the CHANNEL_PARENTREL relation, and the CHANNEL_LEVELREL relation. In addition, you can populate CHANNEL_INHIER and CHANNEL_LEVELREL with literal text during the data load.

CHANNELS.DAT Flat File

2,Direct Sales,1,All Channels 3,Catalog,1,All Channels 4,Internet,1,All Channels

Loading the dimension values is straightforward except for the All Channels dimension member (1), which appears only in the third field. It must be added to the CHANNEL dimension before it can be used as the parent of other dimension members in CHANNEL_PARENTREL. For this reason, the third field is read first as a dimension member that has no parent, and again as a parent value. Example shows the program for loading the data. Define the sample program, then execute it with this command:

CALL read_channels

OLAP DML Program for Loading Channels from CHANNELS.DAT

CHANNEL is a very small dimension with only four members, so you can review the results of the load without selecting a sample.

Example shows the results of the load.

Viewing the CHANNEL Dimension and Attributes

Loading Customers From CUSTOMERS.DAT

CUSTOMERS.DAT is a structured file, so that text columns are enclosed in double quotes. It has fields that correspond to the columns in the CUSTOMERS_DIM dimension table in the GLOBAL star schema:

Ship_To ID
Ship_To Description
Account ID
Account Description
Market Segment ID
Market Segment Description
Total Market ID
Total Market Description
Warehouse ID
Warehouse Description
Region ID
Region Description
All Customers ID
All Customers Description

Example shows the first six fields of a few sample records. It contains the same types of information as CHANNELS.DAT, so that all of the equivalent workspace objects are populated. The one significant difference is that the data supports two hierarchies.

CUSTOMERS.DAT Flat File

The load program for CUSTOMERS.DAT, like the one for CHANNELS.DAT, must read parent dimension members before their children. Field 13 contains the most aggregate level, All Customers ID, so it is loaded first. The program shown in Example 11–18 loads the parent members for the SHIPMENTS_ROLLUP hierarchy first, then the parent members for the MARKET_ROLLUP hierarchy. The base level, SHIP_TO, belongs to both hierarchies.

Define the example program, then execute it with this command:

CALL read_customers

OLAP DML Program for Reading CUSTOMERS.DAT

CUSTOMER is too large a dimension to show the complete results of the load. Example shows how to select a few base-level dimensions and their ancestors, so that you can check that the supporting objects were populated correctly.

Viewing the CUSTOMER Dimension and Attributes

Viewing the CUSTOMER Dimension and Attributes

Reading the UNITS_CUBE.DAT File

UNITS_CUBE.DAT contains just the Units measure with columns for each dimension key. Example shows several sample rows.

UNITS_CUBE.DAT Flat File

The data is written to the UNITS_VARIABLE variable, not to the UNITS formula. This is the definition of UNITS_VARIABLE:

DEFINE UNITS_VARIABLE VARIABLE DECIMAL <TIME UNITS_CUBE_COMPOSITE<CUSTOMERPRODUCT CHANNEL>>

Notice that it is dimensioned by UNITS_CUBE_COMPOSITE, but the incoming data is aligned with the base dimensions, as shown in Example. All four base dimensions are already populated.
Define the example program, then execute it with this command:

CALL read_units

OLAP DML Program For Reading UNITS_CUBE.DAT

Measures typically contain vast amounts of data but are quite sparse, so you must target specific cells to verify that the data was loaded correctly. You can do this by selecting a row or two from the source file and limiting the workspace dimensions to those values, as shown in Example.

Validating the UNITS_CUBE Data Load

Populating Additional Standard Form Metadata Objects

If you enable the GLOBALX analytic workspace for the BI Beans now, the dimension views will have many empty columns. For example, the view of the CHANNEL dimension has these empty columns:

The ___POP_FMLYREL and ___ORDR.HIERARCHIES programs populate the workspace objects that are displayed by these columns. Example shows the commands used for the CHANNEL dimension. Repeat these commands for the PRODUCT, CUSTOMER, and TIME dimensions.

You do not need to re-enable the GLOBALX workspace after populating these objects. The data is available through the views as soon as you commit the changes to the database.

OLAP DML Commands to Populate CHANNEL Metadata Objects

Using Tools with the GLOBALX Analytic Workspace

You can now use the Create and Deploy Aggregation Plan wizards and the enablers.
For refreshing the data, you must revise your data loading programs to access new data sources or to restrict the load to new time periods.


All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd DMCA.com Protection Status

OLAP Topics