When should I use aggregate tables? - Microstrategy

Not every attribute level or hierarchy intersection is suitable for pre aggregation. Build aggregate tables only if they will benefit users, since the creation and maintenance of aggregate tables require additional work by the database administrator. Also, do not waste the database space if the tables will not be used.

Some factors to consider when deciding whether to create aggregate tables:

  • the frequency of queries at that level
  • the relationship between the parent and child
  • the compression ratio

Frequency of queries at the level

Build aggregate tables only if they will be useful to your users. If aggregate tables are never accessed, they consume disk space and impose unnecessary burdens on the extraction, transaction, and loading process, as well as the database backup routines. This is in direct contrast to why you create aggregate tables.

However, usefulness is not always easy to quantify. For example, consider the following hierarchy:

Frequency of queries at the level

The summary of data to the department level seems to be a good candidate for an aggregate table. However, if users frequently want to exclude inactive items, the query must use item-level data and summarize the department datadynamically. Therefore, the department aggregate tables would not be used in this situation.

Once your warehouse is in production, trace the usage of the aggregate tables to determine how frequently they are used in real life. If any table is not used, eliminate it from the warehouse. MicroStrategy Enterprise Manager allows you to easily track table usage.

Relationship between the parent and child

When an aggregate table is created, the child records are usually summarized into the parent record, based on the key combinations in a relationship table. In any hierarchical relationship, when the parent-child relationship is altered, all tables that hold that relationship or data relevant to it must be updated. Whether these relationships are dynamic or static change how they are aggregated into tables.

Dynamic relationships

When the relationship between parent and child elements change, the relationship is called dynamic. These changes often occur because of organizational restructuring; geographical realignment; or the addition, reclassification, or discontinuation of items or services. For example, a store can decide to reclassify the department to which items belong.

Aggregate tables that contain dynamic relationships must be recalculated every time a change is made. If the tables are large, this process can take time, consume resources, and complicate the batch process. Frequent changes can mean aggregate tables are not optimal for this situation. Consider the frequency of the changes, the table size, and the impact on the batch process, and then balance the disadvantages against the advantages of having an aggregate table.

Static relationships

When elements rarely or never change relationships, they are termed static relationships. In these cases, maintenance of aggregate tables is very easy. For example, time hierarchies are seldom dynamic—days do not migrate into different weeks, and fiscal weeks do not move into different months. Also, rolling up an entire hierarchy can avoid many problems with relationship changes. For example, a table contains one value for the sum of all stores. It is not affected by a reorganization within the geography hierarchy.

Compression ratio

The process of data aggregation applies an aggregate function, such as sum or average, to a set of child records to produce a single parent record. The average number of child records combined to calculate one parent record is called the compression ratio. The effectiveness of an aggregate table can be estimated from this number, since it represents the decrease in records that must be read to respond to a query at that level.

Recall that some of the reasons to build aggregate tables are the reduction of disk I/O and the amount of records that must be dynamically sorted and aggregated. Therefore, pre-aggregating data is effective only if the compression ratio is significant. If the compression ratio is 3:2, the aggregate table requires 2/3 of the base table’s storage space but yields only a 1/3 reduction in the number of records. In contrast, if the compression ratio is 4:1, the aggregate table reduces the number of records by 3/4 and uses only 1/4 of the storage space.

When the number of elements differs significantly between two attributes in the same hierarchy, the compression ratio suggests that an aggregate table can provide more efficient queries. Also, for smaller base tables, the resource demands placed on the database server by dynamic aggregations decrease and therefore so does the effectiveness of pre-aggregation. To determine when pre-aggregation is worthwhile for your system, you must balance the importance of speed and the availability of disk space and resources to maintain the schema.


All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd DMCA.com Protection Status

Microstrategy Topics