Query Optimization in Centralized Systems - Distributed DBMS

What is Query optimization in centralized Systems?

The optimal access path is determined after the alternative access paths are derived for the relational algebra expression. This chapter focus on query optimization in centralized system.

Query processing for a centralized system is done to achieve:

  • The response time of a query is minimized.
  • The system throughput is maximized
  • The memory and storage used for processing is reduced.
  • Parallelism is increased.

What is Query Parsing and Translation?

After scanning the SQL query, the query is parsed for identifying the syntactical errors and the data types are corrected. Once this step is passed, the query is decomposed into several smaller blocks of query. Each of the block is translated into relational algebra expression.

Steps for Query Optimization

There are three steps for query optimization. They are -

Step 1 − Query Tree Generation

A relational algebra expression is represented by a tree data structure known as a query tree. Leaf nodes represent the tables of the query. The internal nodes represent the relational algebra operations and the complete query is represented by a root.

When the operand table is made available, the internal node is executed. The result table replaces the node and the process is continued until the result table replaces the root node.

For instance, consider the following example

EMPLOYEE

EmpID
EName
Salary
DeptNo
DateOfJoining

DEPARTMENT

DNo
DName
Location

Example 1

The query considered is as follows:

πEmpIDEName="ArunKumar"(EMPLOYEE))

The query tree appears as follows:

Corresponding Query Tree

Example 2

A query with a join is considered.

QOCS

For the above query, the query tree is as follows:

Query Tree

Step 2 − Query Plan Generation

The query plan is prepared once the query tree is generated. All the operations of the query tree are included with access paths which are known as query plan. The relational operations on the performance of the tree are specified by the access paths. For instance, the access path for a selection operation provides information on the use of B+ tree index.

The information about the intermediate tables that are required to be passed from one operator to another is provided by a query plan. The information about the usage of temporary tables, combining of the operations is provided by the query plan.

Step 3− Code Generation

The final step of query optimization is generation of the code. The type of the underlying operating system determines the form of the query. The results are produced by running the query code thus generated by the Execution Manager.

What are the different approaches to Query Optimization?

The different approaches to query optimization are as follows:

Exhaustive Search Optimization

Initially all the possible query plans are generated and then the best plan is selected. The large solution space is relates with the exponential time and space complexity for these techniques. For instance, dynamic programming technique.

Heuristic Based Optimization

Query optimization is done by using the rule-based optimization approaches by Heuristic based optimization. Polynomial time and space complexity are involved in these algorithms, but these algorithms do not produce the query plan.

The common heuristic rules are as follows:

  • Before the join operations are performed, select and project operations are performed by moving both the operations to the down of the query tree thus the number of tuples required for a join is reduced.
  • The select or project operations that are most restrictive are performed first.
  • Cross-product operations are avoided.

All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd DMCA.com Protection Status

Distributed DBMS Topics