Set Operators in Oracle - Oracle DBA

Set operators combine the results of two or more queries into a single query result. The set operators in Oracle are UNION, UNION ALL, INTERSECT, and MINUS.

All of the set operators have the same precedence. To override the default left-to-right evaluation, use parentheses to group SELECT statements that you want evaluated first.

UNION and UNION ALL

The UNION operator will combine two query result sets into a single result set, sorted by the first column of the SELECT clause for both queries. The syntax for using UNION is very straightforward: Two queries that can otherwise stand alone are combined with the keyword UNION. The first query does not need a semicolon; the entire SQL statement is terminated by a single semicolon, after the second query.

There are a few rules in force when writing a compound query using UNION. The number of columns in both queries must match, and the corresponding columns must also have the same data types. The names of the columns need not match, though; the query result will use the column names from the first query.

A compound query using UNION removes duplicates by using a sort operation before returning the results of the query. The values of all columns must be equal for one of the rows to be removed from the query result. This is one of the few cases where a NULL value in one of the queries is considered to be equal to a corresponding NULL value in the other query.

UNION ALL operates in much the same way as UNION, except that duplicates are not removed. A row that exists in both queries will show up twice in the results. Because a UNION ALL does not need to remove duplicates, a sort operation does not occur. Therefore, a UNION ALL will usually return results faster than a UNION with the same queries. If you know ahead of time that the two queries do not have duplicates, use UNION ALL.

At Scott's widget company, the database not only keeps track of an employee's current information in the EMPLOYEES table, but it also keeps track of what jobs the employees have held in the past in the JOB_HISTORY table. The boss, King, wants to get a report that includes both the current and previous positions held by employees in the company, along with the beginning and ending dates for when the employee held that position. Janice realizes that she'll need a UNION or UNION ALL operation, plus a sort operation. She is not sure how she will retrieve the employee names from the JOB_HISTORY table, since it has only the employee's ID number.

Her first attempt at a query tries to combine the job history information with the current employment information, as follows:

select employee_id, last_name, hire_date, job_id, department_id ERROR at line 1: ORA-01790: expression must have same datatype as corresponding expression

The two queries have the same number of columns, but the data types of the corresponding columns don't match. This is because the employee data doesn't have an ending date, and the JOB_HISTORY table doesn't have a column to store the employee name. To fix this problem, Janice changes the first query to include a NULL value for an ending date (since the EMPLOYEES file has only active employees):

She changes the second query to include a constant of an empty string to be a placeholder to match the name in the other query:

The resultant query using the UNION operator looks like this:

UNION operator

117 rows selected.

Since the UNION of the two queries will result in adjacent employee IDs due to the default sort behavior of the UNION operator, the report makes sense to King. From this report, he can see that Kochhar was employed as both an account representative and account manager, before becoming a vice president in her current position.

Also worth noting in this report is that the columns EMPLOYEE_ID and DEPARTMENT_ID were assigned column aliases in the first query, and so those aliases applied to the entire result.

But, of course, Janice is not satisfied with the results of the report. The HIRE_DATE column should really be a starting date for the employee in that department, but for the rows in the EMPLOYEE table, it is the employee's starting date at the company. To make the column more accurate, Janice changes the column alias for the first query to STRT_DATE and makes it a correlated subquery, so that the date is actually the date the employees started in their current department:

setoperators

117 rows selected.

There are two differences between this query and the previous one. A minor difference is that the query result is sorted by employee number in ascending order and by the starting date in descending order. King wants to see the employee's most recent job first.

The second difference is a bit more complex. Janice's goal was to find out if the employee had any previous jobs and, if so, return the ending date for the last job that employee had. Remember that you can have the SQL text (in parentheses) of a correlated subquery in the SELECT, FROM, or WHERE clause of the parent query. In this case, the correlated subquery is as follows:

For each row in the EMPLOYEE table, this subquery will find the last date that the employee worked in any department and adds one day, resulting in the first date that the employee started in their current position. But if the employee has never switched departments, there will be no rows in the JOB_HISTORY table, and therefore the subquery will return a NULL result. The solution is to wrap the COALESCE function around the query.

The COALESCE function will return the first non-NULL argument in the argument list. The HIRE_DATE column is specified as the second argument to COALESCE, so if the employee has never switched departments, the original hire date will be returned from this function:

To reiterate, the above section of SQL evaluates to either the first day employees started in their current department or their hiring date, if they have never switched departments. The column alias STRT_DATE is assigned to this derived column.

The next morning, Janice realizes that she could have used UNION ALL instead of UNION in this query. There will never be any duplicate records between the two queries in this compound query, mainly because the database does not store the employee's current job position and starting date in the JOB_HISTORY table.

Tip DBAs should be on the lookout for queries that use UNION when UNION ALL would produce the same desired results. Because UNION does a sort while removing duplicates, many UNION queries will have a much more noticeable performance impact on the system than the same queries that use UNION ALL. Oracle 10g's web-based Enterprise Manager Database Control can easily identify SQL statements or sessions with a high impact on the system using the Top SQL and Top Sessions functions.

INTERSECT

There are times when you need to know which rows two tables or queries have in common. The INTERSECT operator provides this functionality. As with the UNION operator, the number and types of the columns in the two queries to be compared must be the same, but the column names can be different. Rows are returned from an INTERSECT operation only if all columns in the two queries match.

In Scott's widget database, the current employment information is kept in the EMPLOYEES table, and the previous employment information (when employees have changed jobs) is kept in the JOB_HISTORY table. The boss wants to find out which employees have changed departments multiple times and have come back to work in the department they worked in previously, with the same job title. Janice knows that she needs to use the EMPLOYEES and JOB_HISTORY tables, and she decides to use the INTERSECT operator to see if there are current employees in a particular department and job title that are also in the JOB_HISTORY table. Janice realizes that a multicolumn join in a WHERE clause may produce similar results, but she thinks that the INTERSECT method is more straightforward and easier to use and maintain. Her first query looks like this:

INTERSECT

1 row selected.

King looks at this report and thinks that something is amiss. He is sure that there was another employee besides employee number 176 who has changed job titles and came back to work with her original job title. Janice realizes that she is comparing too many columns, and she rewrites her query as follows:

INTERSECT

2 rows selected.

As King suspected, employee number 200 is back working with her old job title, after previously switching departments. Because one of the three columns was different in the previous query, employee number 200 did not show up in the results.

Now that Janice has the result set that King was looking for, she decides that it would be more readable if the employee's last name and first name were in the report also. The problem is, she can't add it to the EMPLOYEES query with the INTERSECT operator, since the JOB_HISTORY table does not have the employee's last name, and as a result the compound INTERSECT query would not return any rows. Instead, she treats the last query as a subquery and joins it back to the EMPLOYEES table:

INTERSECT

2 rows selected.

Notice that Janice is using Oracle's INNER JOIN syntax, available since Oracle9i. The query in parentheses is treated just as if it were another table being joined in the new query.

MINUS

The MINUS compound-query operator returns rows from the first query only if they are not in a second query. In other words, the second query is subtracted from the first query. Any rows in the second query that are not in the first query are ignored and do not affect the results of the entire compound query. As with the UNION operator, the number and types of the columns in the two queries to be compared must be the same, but the column names can be different.

The boss wants to make sure that the company's expansion plans are going well, and he wants to know which countries don't yet have a department located in that country. Janice realizes that a MINUS operator might do the trick here. She can subtract the countries with departments from a query with the COUNTRIES table. The first part of her query is straightforward. It is a SELECT from the COUNTRIES table:

MINUS

25 rows selected.

The second part is a bit trickier. She needs to subtract the countries in which the departments reside. The DEPARTMENTS table does not have a COUNTRY_ID column, but it does have a LOCATION_ID column. The LOCATIONS table has a COUNTRY_ID column, so Janice will need to join the DEPARTMENTS and LOCATIONS tables to get the list of countries with departments:

CO ---- CA DE UK US 4 rows selected.

Janice realizes that she will also need the country name in the query for the INTERSECT operation to work, so this query needs to have the COUNTRIES table as part of the join:

MINUS

4 rows selected.

Janice can now bring it all together by using the MINUS operator to subtract this query from the first query against the COUNTRIES table:

COUNTRIES table

21 rows selected.

King now realizes that the company is a long way from having a significant presence in all of the countries where there are company employees.


All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd DMCA.com Protection Status

Oracle DBA Topics