Small Dimensions - Data Warehouse ETL Toolkit

Many of the dimensions in a data warehouse are tiny lookup tables with only a few records and one or two columns. For example, many transaction grained fact tables have a transaction type dimension that provides labels for each kind of transaction. These tables are often built by typing into a spreadsheet and loading the data directly into the final physical dimension table. The original source spreadsheet should be kept because in many cases new records such as new transaction types could be introduced into the business.

Although a little dimension like transaction type may appear in many different data marts, this dimension cannot and should not be conformed across the various fact tables. Transaction types are unique to each production system.

In some cases, little dimension tables that serve to decode operational values can be combined into a single larger dimension. This is strictly a tactical maneuver to reduce the number of foreign keys in a fact table. Some data sources have a dozen or more operational codes attached to fact table records, many of which have very low cardinalities. Even if there is no obvious correlation between the values of the operational codes, a single junkdimension can be created to bring all these little codes into one dimension and tidy up the design. The ETL data flow for a typical junk dimension is shown in Figure below. The records in the junk dimension should probably be created as they are encountered in the data, rather than beforehand as the Cartesian product of all the separate codes. It is likely that the incrementally produced junk dimension is much smaller than the full Cartesian product of all the values of the codes. The next section extends this kind of junk-dimension reasoning to much larger examples, where the designer has to grapple with the problem of one dimension or two.

ETL data flow for a typical junk dimension.

ETL data flow for a typical junk dimension.

All rights reserved © 2020 Wisdom IT Services India Pvt. Ltd Protection Status

Data Warehouse ETL Toolkit Topics