Correlated subqueries and Query Transformations


Many times ask me my students when we have – at least – two solutions for a particular problem, which is the best queries ?
The correct answer is typically complex, and started “ It depends on ….” or “let’s see the environment…”.
Let’s see the following simple example:
We would like to select those employees whose salaries are greater than their department’s average salary!
1. First solution can be:

SELECT /* well-paid people */ e.last_name, e.department_id, e.salary
FROM employees e
WHERE e.salary > (SELECT AVG(salary) FROM EMPLOYEES
WHERE department_id=e.department_id);


This is correlated subquery and we can ask how many times the subquery will be executed?
We will see that the answer is not a number.
2.Second solution can be:

SELECT /* well-paid people */ e.last_name, e.department_id,
e.salary ,d.average
FROM employees e,
(SELECT department_id, AVG(salary) average FROM employees 
GROUP BY department_id ) d
WHERE e.department_id=d.department_id 
AND e.salary > d.average ;


Now we examine the execution plans (using my sqlid stroed procedure, see an other blog entry .
Let’s the execution plains:

EXEC sqlid('well-paid people','typical')


For the second solution, please consider the following figure:

We can observe – although- the plan hash values are different, actually the content of them are identical!
In other words Oracle executed a query transformation for the second SELECT statement and used the semantically same plan for both quires.
Let’s suppose that we prohibit the query transformation for the first SELECT statement:

SELECT /*+ NO_QUERY_TRANSFORMATION*/
e.last_name, e.department_id, e.salary
FROM employees e
WHERE e.salary > (SELECT AVG(salary) FROM EMPLOYEES
WHERE department_id=e.department_id);


We can realize that this plan is more expensive than previous one.
Be careful with hints!
Sometimes we can demolish the server’s good plan with our hints!

Advertisements

2 Responses to “Correlated subqueries and Query Transformations”

  1. Timo Raitalaakso Says:

    Nicely written about query transformations. The optimized might do even better. Reduce to only one employees table access. This is a classic excample how to manually rewrite the query to use a analytic function.

    SELECT last_name, department_id, salary
    FROM (
    SELECT /* well-paid people */ e.last_name, e.department_id, e.salary
    , AVG(salary) OVER (PARTITION BY e.department_id) average
    FROM employees e
    )
    WHERE salary>average;

    —————————————–
    | Id | Operation | Name |
    —————————————–
    | 0 | SELECT STATEMENT | |
    |* 1 | VIEW | |
    | 2 | WINDOW SORT | |
    | 3 | TABLE ACCESS FULL| EMPLOYEES |
    —————————————–

    Predicate Information (identified by operation id):
    —————————————————

    1 – filter(“SALARY”>”AVERAGE”)

  2. lczinkoc Says:

    Timo,
    thanks for your comment!
    I also know that with analytical function we can solve this problem more efficiently.
    However my goal was to show that sometimes two – virtually – different solutions provide the same result.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s


%d bloggers like this: