Partitioning and Materialized Views - Data Warehousing

Because of the large volume of data held in a data warehouse, partitioning is an extremely useful option when designing a database. Partitioning the fact tables improves scalability, simplifies system administration, and makes it possible to define local indexes that can be efficiently rebuilt. Partitioning the fact tables also improves the opportunity of fast refreshing the materialized view because this mayenable Partition Change Tracking (PCT) refresh on the materialized view. Partitioning a materialized view also has benefits for refresh, because the refresh procedure can then use parallel DML in more scenarios and PCT-based refresh can use truncate partition to efficiently maintain the materialized view.

Partition Change Tracking
It is possible and advantageous to track freshness to a finer grain than the entire materialized view. The ability to identify which rows in a materialized view are affected by a certain detail table partition, is known as Partition Change Tracking (PCT). When one or more of the detail tables are partitioned, it may be possible to identify the specific rows in the materialized view that correspond to a modified detail partition(s); those rows become stale when a partition is modified while all other rows remain fresh.

You can use PCT to identify which materialized view rows correspond to a particular partition. PCT is also used to support fast refresh after partition maintenance operations on detail tables. For instance, if a detail table partition is truncated or dropped, the affected rows in the materialized view are identified and deleted.

Identifying which materialized view rows are fresh or stale, rather than considering the entire materialized view as stale, allows query rewrite to use those rows that are fresh while in QUERY_REWRITE_INTEGRITY=ENFORCED or TRUSTED modes. Oracle does not rewrite against partial stale materialized view if partition change tracking on the changed table is enabled by the presence of join dependent expression in the materialized view.

  • At least one of the detail tables referenced by the materialized view must be partitioned.
  • Partitioned tables must use either range, list or composite partitioning.
  • The top level partition key must consist of only a single column.
  • The materialized view must contain either the partition key column or a partition marker or ROWID or join dependent expression of the detail table.
  • If you use a GROUP BY clause, the partition key column or the partition marker or ROWID or join dependent expression must be present in the GROUP BY clause.
  • If you use an analytic window function or the MODEL clause, the partition key column or the partition marker or ROWID or join dependent expression must be present in their respective PARTITION BY subclauses.
  • Data modifications can only occur on the partitioned table. If PCT refresh is being done for a table which has join dependent expression in the materialized view, then data modifications should not have occurred in any of the join dependenttables.
  • The COMPATIBILITY initialization parameter must be a minimum of 9.0.0.0.0.
  • PCT is not supported for a materialized view that refers to views, remote tables, or outer joins.
  • PCT-based refresh is not supported for UNION ALL materialized views.

Partition Key
Partition change tracking requires sufficient information in the materialized view to be able to correlate a detail row in the source partitioned detail table to the corresponding materialized view row. This can be accomplished by including the detail table partition key columns in the SELECT list and, if GROUP BY is used, in the GROUP BY list.

Consider an example of a materialized view storing daily customer sales. The following example uses the sh sample schema and the three detail tables sales, products, and times to create the materialized view. sales table is partitioned by time_id column and products is partitioned by the prod_id column. timesis not a partitioned table.

Example Partition Key

The detail tables must have materialized view logs for FAST REFRESH. The following is an example:

For cust_dly_sales_mv, PCT is enabled on both the sales table and products table because their respective partitioning key columns time_id and prod_id are in the materialized view.

Join Dependent Expression
An expression consisting of columns from tables directly or indirectly joined through equijoins to the partitioned detail table on the partitioning key and which is either a dimensional attribute or a dimension hierarchical parent of the joining key is called a join dependent expression. The set of tables in the path to detail table are called join dependent tables.

In this query, times table is a join dependent table since it is joined to sales table on the partitioning key column time_id. Moreover, calendar_month_name is a dimension hierarchical attribute of times.time_id, because calendar_month_ name is an attribute of times.mon_id and times.mon_id is a dimension hierarchical parent of times.time_id. Hence,the expression calendar_month_name from times tables is a join dependent expression. Let's look at another example:

Here, times table is denormalized into times_d, times_m and times_y tables. The expression calendar_year_name from times_y table is a join dependent

expression and the tables times_d, times_m and times_y are join dependent tables. This is because times_y table is joined indirectly through times_m and times_d tables to sales table on its partitioning key column time_id.

This lets users create materialized views containing aggregates on some level higher than the partitioning key of the detail table. Consider the following example of materialized view storing monthly customer sales.

Example Join Dependent Expression

Assuming the presence of materialized view logs defined earlier, the materialized view can be created using the following DDL:

Here, you can correlate a detail table row to its corresponding materialized view row using the join dependent table times and the relationship that times.calendar_month_name is a dimensional attribute determined by times.time_id. This enables partition change tracking on sales table. In addition to this, PCT is enabled on products table because of presence of its partitioning key column prod_id in the materialized view.

Partition Marker
The DBMS_MVIEW.PMARKER function is designed to significantly reduce the cardinality of the materialized view. The function returns a partition identifier that uniquely identifies the partition for aspecified row within a specified partition table. Therefore, the DBMS_MVIEW.PMARKER function is used instead of the partition key column in the SELECT and GROUP BY clauses.

Unlike the general case of a PL/SQL function in a materialized view, use of the DBMS_MVIEW.PMARKER does not prevent rewrite with that materialized view even when the rewrite mode is QUERY_REWRITE_INTEGRITY=ENFORCED.

As an example of using the PMARKER function, consider calculating a typical number, such as revenue generated by a product category during a given year. If there were 1000 different products sold each month, it would result in 12,000 rows in the materialized view.

Example Partition Marker

Consider an example of a materialized view storing the yearly sales revenue for each product category. With approximately hundreds of different products in each product category, including the partitioning key column prod_id of products table in the materialized view would substantially increase the cardinality. Instead, this materialized view uses the DBMS_MVIEW.PMARKER function, which increasesthe cardinality of materialized view by a factor of the number of partitions in the products table.

prod_yr_sales_mv includes the DBMS_MVIEW.PMARKER function on the products table in its SELECT list. This enables partition change tracking on products table with significantly less cardinality impact than grouping by the partition key column prod_id. In this example, the desired level of aggregation forthe prod_yr_sales_mv is to group by products.prod_category. Using the DBMS_MVIEW.PMARKER function, the materialized view cardinality is increased only by a factor of the number of partitions in the products table. This would generally be significantly less than the cardinality impact of including the partition key columns.

Please note that partition change tracking is enabled on sales table because of presence of join dependent expression calendar_year in the SELECT list.

Partial Rewrite
A subsequent INSERT statement adds a new row to the sales_part3 partition of table sales. At this point, because cust_dly_sales_mv has PCT available on table sales using a partition key, Oracle can identify the stale rows in the materialized view cust_dly_sales_mv corresponding to sales_part3 partition (The other rows are unchanged in their freshness state).Query rewrite cannot Partitioning and identify the fresh portion of materialized views cust_mth_sales_mv and prod_yr_sales_mv because PCT is available on table sales using join dependent expressions. Query rewrite can determine the fresh portion of a materialized view on changes to a detail table only if PCT is available on the detail table using a partition key or partition marker.

Partitioning a Materialized View
Partitioning a materialized view involves defining the materialized view with the standard Oracle partitioning clauses, as illustrated in the following example. This statement creates a materialized view called part_sales_mv, which uses three partitions, can be fast refreshed, and is eligible for query rewrite.

Partitioning a Prebuilt Table
Alternatively, a materialized view can be registered to a partitioned prebuilt table as illustrated in the following example:

In this example, the table part_sales_tab has been partitioned over three months and then the materialized view was registered to use the prebuilt table. This materialized view is eligible for query rewrite because the ENABLE QUERY REWRITE clause has been included.

Benefits of Partitioning a Materialized View
When a materialized view is partitioned on the partitioning key column or join dependent expressions of the detail table, it is more efficient to use a TRUNCATE PARTITION statement to remove one or more partitions of the materialized view during refresh and then repopulate the partition with new data. Oracle Database uses this variant of fast refresh (called PCT refresh) with partition truncation if thefollowing conditions are satisfied in addition to other conditions described in "Partition Change Tracking".

  • The materialized view is partitioned on the partitioning key column or join dependent expressions of the detail table.
  • If PCT is enabled using either the partitioning key column or join expressions, both the materialized view should be range or list partitioned.
  • PCT refresh is non-atomic.

Rolling Materialized Views
When a data warehouse or data mart contains a time dimension, it is often desirable to archive the oldest information and then reuse the storage for new information. This is called the rolling window scenario. If the fact tables or materialized views include a time dimension and are horizontally partitioned by the time attribute, then management of rolling materialized views can be reduced to a few fast partition maintenance operations provided the unit of data that is rolled out equals, or is at least aligned with, the range partitions.

If you plan to have rolling materialized views in your data warehouse, you should determine how frequently you plan to perform partition maintenance operations, and you should plan to partition fact tables and materialized views to reduce the amount of system administration overhead required when old data is aged out. An additional consideration is that you might want to use data compression on yourinfrequently updated partitions.

You are not restricted to using range partitions. For example, a composite partition using both a time value and a key value could result in a good partition solution for your data.


All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd DMCA.com Protection Status

Data Warehousing Topics