Creating a Type 2 Dimension/Version Data Mapping - Informatica

The Type 2 Dimension/Version Data mapping filters source rows based on user-defined comparisons and inserts both new and changed dimensions into the target. Changes are tracked in the target table by versioning the primary key and creating a version number for each dimension in the table. In the Type 2 Dimension/Version Data target, the current version of a dimension has the highest version number and the highest incremented primary key of the dimension.

Use the Type 2 Dimension/Version Data mapping to update a slowly changing dimension table when you want to keep a full history of dimension data in the table. Version numbers and versioned primary keys track the order of changes to each dimension.

When you use this option, the Designer creates two additional fields in the target:

  • PM_PRIMARYKEY. The Integration Service generates a primary key for each row written to the target.
  • PM_VERSION_NUMBER. The Integration Service generates a version number for each row written to the target.

Handling Keys

In a Type 2 Dimension/Version Data mapping, the Integration Service generates a new primary key value for each new dimension it inserts into the target. An Expression transformation increments key values by 1,000 for new dimensions.
When updating a dimension, the Integration Service increments the existing primary key by 1.

For example, the Integration Service inserts the following new row with a key value of 65,000 since this is the sixtyfifth dimension in the table.

Handling Keys

The next time you run the workflow containing the session, the same item has a different number of styles. The Integration Service creates a new row with updated style information and increases the existing key by 1 to create a new key of 65,001. Both rows exist in the target, but the row with the higher key version contains current dimension data.

Handling Keys

When you run the workflow again, the Integration Service again increments the key. The highest key version contains current dimension data. The target keeps a full history of the item and the order in which the versions occurred.

Handling Keys

Numbering Versions

In addition to versioning the primary key, the Integration Service generates a matching version number for each row inserted into the target. Version numbers correspond to the final digit in the primary key. New dimensions have a version number of 0.

For example, in the data below, the versions are 0, 1, and 2. The highest version number contains the current dimension data.

Numbering Versions

Understanding the Mapping

The Type 2 Dimension/Version Data mapping performs the following tasks:

  • Selects all rows.
  • Caches the existing target as a lookup table.
  • Compares logical key columns in the source against corresponding columns in the target lookup table.
  • Compares source columns against corresponding target columns if key columns match.
  • Flags new rows and changed rows.
  • Creates two data flows: one for new rows, one for changed rows.
  • Generates a primary key and version number for new rows.
  • Inserts new rows to the target.
  • Increments the primary key and version number for changed rows.
  • Inserts changed rows in the target.

The following figure shows a mapping that the Slowly Changing Dimensions Wizard creates when you select the Type 2 Dimension/Version Data option:

Understanding the Mapping

The Type 2 Dimension/Version Data mapping uses a Lookup and an Expression transformation to compare source data against existing target data. When you step through the Slowly Changing Dimensions Wizard, you enter the lookup conditions (source key columns) and source columns that you want the Integration Service to compare against the existing target.

For each source row without a matching primary key in the target, the Expression transformation marks the row new. For each source row with a matching primary key in the target, the Expression compares user-defined source and target columns. If those columns do not match, the Expression marks the row changed. The mapping then splits into two data flows.

The first data flow uses the Filter transformation, FIL_InsertNewRecord, to filter out existing rows. The Filter transformation passes only new rows to the UPD_ForceInserts Update Strategy transformation. UPD_ForceInserts inserts new rows to the target. A Sequence Generator creates a primary key for each row. The Expression transformation, EXP_KeyProcessing_InsertNew, increases the increment between keys by 1,000 and creates a version number of 0 for each new row.

In the second data flow, the FIL_InsertChangedRecord Filter transformation allows only changed rows to pass to the Update Strategy transformation, UPD_ChangedInserts. UPD_ChangedInserts inserts changed rows to the target. The Expression transformation, EXP_KeyProcessing_InsertChanged, increments both the key and the version number by one.

Understanding the Transformations

The following table describes the function of each transformation in the Type 2 Dimension/Version Data mapping:

Understanding the Transformations

Data Flow for New Rows

The following table describes the data flow for new rows. For each new row in the source, this data flow creates a primary key and version number, and then inserts it into the target:

Data Flow for New Rows

Data Flow for Changed Rows

The following table describes the data flow for changed rows. For each changed row in the source, this data flow increments the existing primary key by one, creates a corresponding version number, and inserts the row into the target:

Data Flow for Changed Rows

Steps to Create a Type 2 Dimension/Version Data Mapping

To create a Type 2 Dimension/Version Data mapping:

  1. In the Mapping Designer, click Mappings > Wizards > Slowly Changing Dimensions.
  2. Enter a mapping name and select Type 2 Dimension. Click Next.
  3. Select a source definition to be used by the mapping.
  4. All available source definitions appear in the Select Source Table list. The list includes shortcuts, flat file, relational, and Application sources.

  5. Enter a name for the mapping target table. Click Next.
  6. The naming convention for target definitions is T_TARGET_NAME.
  7. Select the column or columns you want to use as a lookup condition from the Target Table Fields list and click Add.
  8. The wizard adds selected columns to the Logical Key Fields list.

    Tip:The columns you select should be a key column in the source.

    When you run the workflow containing the session, the Integration Service performs a lookup on existing target data. The Integration Service returns target data when Logical Key Fields columns match corresponding target columns.

    To remove a column from Logical Key Fields, select the column and click Remove.

  9. Select the column or columns you want the Integration Service to compare for changes, and click Add.
  10. The wizard adds selected columns to the Fields to Compare for Changes list.

    When you run the workflow containing the session, the Integration Service compares the columns in the Fields to Compare for Changes list between source rows and the corresponding target rows. If the Integration Service detects a change, it marks the row changed. To remove a column from the list, select the column and click Remove.

  11. Click Next.
  12. Select Keep the ‘Version’ Number in Separate Column. Click Finish.

Note:In the Type 2 Dimension/Version Data mapping, the Designer uses two instances of the same target definition to enable the two separate data flows to write to the same target table. Generate only one target table in the target database.

Customizing the Mapping

Depending on the number of versions you expect the dimension data to have, you might reduce or increase the increment the Integration Service creates between generated keys. By default, the Integration Service increments the sequence by 1,000. This allows for 1,000 versions of a single dimension.

If appropriate, you might reduce or increase the increment. To do this, you edit the Expression transformation,
EXP_KeyProcessing_InsertNew, after creating the mapping.

To change the primary key increment:

  1. Double-click the title bar of the Expression transformation, EXP_KeyProcessing_InsertNew.
  2. Click the Ports tab.
  3. Click in the far right corner of the Expression field of the PM_PRIMARYKEY port.
  4. The Expression Editor appears.
  5. Delete the existing value, 1000, and enter the value you want the Integration Service to use when incrementing the primary key. Click Validate.

Configuring a Type 2 Dimension/Version Data Session

The Type 2 Dimension/Version Data mapping inserts both new and updated rows with a unique primary key. When you configure a session for the mapping, complete the following steps:

  1. In the session properties, click the General Options settings on the Properties tab. Set Treat Source Rows As to Data Driven.
  2. In the session properties, click the Target Properties settings on the Mappings tab. To verify that the Integration Service loads rows to the target properly, select Insert for each relational target

All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd DMCA.com Protection Status

Informatica Topics