Dimensional Roles - Data Warehouse ETL Toolkit

The data warehouse architect will frequently specify a dimension to be attached multiple times to the same fact table. These are called dimensionalroles. Probably the most common role-playing dimension is the calendar date dimension. Many fact tables, especially the accumulating snapshot fact tables, have multiple date foreign keys. We discuss accumulating snapshot fact tables later. See Figure below. Another common example of a roleplaying dimension is the employee dimension, where different foreign keys in the fact table represent different types of employees being involved in a single transaction. See Figure below.

A typical accumulating snapshot fact table.

A typical accumulating snapshot fact table.

A typical accumulating snapshot fact table.
Two employee role playing dimensions.

In all role-playing dimension implementations, were commend first building a generic single dimension table and then implementing each of the roles with a view on this generic table. See Figure below. For instance, if we have an order-date, a shipment-date, a payment-date, and a return-date on an orders transaction accumulating snapshot fact table, we would first build a generic calendar date dimension and the create four views corresponding to the four dates needed. If the fields in each view are identically named, the application developer and possibly the end user will need to see the fully qualified names to distinguish similar fields from the different views in the same query. For that reason, we recommend creating distinguishable field names in the original view definitions so that every tool, even those not supported by metadata, will display the fields unambiguously.

Two employee role playing dimensions.

Multiple calendar role playing dimensions

The recommended design of dimensional roles described previously makes the impact of dimensional roles on the ETL team equal to zero. So why do we discuss it? Our objective is to make sure the ETL team doesn’t generate multiple physical tables in cases where view definitions (roles) accomplish the same purpose.

Don’t use the dimensional-role techniques as an excuse to build abstract, super-large dimensions. For instance, in a telco environment, nearly everything has a location. If every possible location of every entity is represented in a single location dimension, this dimension could have millions of rows. Using a view on a multimillion row dimension in every application with a location dimension is probably a performance killer. In this case, actual physical dimensions created as extracted subsets of the big location dimension are probably better.

All rights reserved © 2020 Wisdom IT Services India Pvt. Ltd DMCA.com Protection Status

Data Warehouse ETL Toolkit Topics