Archive for March, 2010

Efficient Analysis with correlated subquires

March 30, 2010

Many developers think that correlated subquires are not efficient.
The correct answer starts with "It depends on...".
Sometimes we can achieve the same efficient perfomance
what we can do with inline views.
Let's see the following example:
(Who earns more then his/her departments's average salary?)

SELECT  e.last_name, e.department_id, e.salary  FROM employees e WHERE e.salary> (SELECT AVG(salary) FROM employees where department_id=e.department_id );

SQL_ID  3jr7194pfhh3s, child number 0
-------------------------------------
select  e.last_name, e.department_id, e.salary  from employees e where
e.salary> (select avg(salary) from employees where
department_id=e.department_id )
Plan hash value: 2167610409
----------------------------------------------------------------------------------
| Id  | Operation            | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |           |       |       |     8 (100)|          |
|*  1 |  HASH JOIN           |           |     5 |   190 |     8  (25)| 00:00:01 |
|   2 |   VIEW               | VW_SQ_1   |    11 |   286 |     4  (25)| 00:00:01 |
|   3 |    HASH GROUP BY     |           |    11 |    77 |     4  (25)| 00:00:01 |
|   4 |     TABLE ACCESS FULL| EMPLOYEES |   107 |   749 |     3   (0)| 00:00:01 |
|   5 |   TABLE ACCESS FULL  | EMPLOYEES |   107 |  1284 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("ITEM_1"="E"."DEPARTMENT_ID")
filter("E"."SALARY">"AVG(SALARY)")
====================================================================================================
We can try with inline view:

SELECT e.last_name, e.department_id, e.salary ,d.average FROM employees e, (SELECT department_id,AVG(salary) average FROM employeesGROUP BY department_id ) d WHERE e.department_id=d.department_id AND e.salary>d.average;

SQL_ID  b77bmkd7rn84h, child number 0
-------------------------------------
select e.last_name, e.department_id, e.salary ,d.average from employees
e,(select department_id,avg(salary) average from employees group by
department_id ) d where e.department_id=d.department_id and
e.salary>d.average
Plan hash value: 2695105989
----------------------------------------------------------------------------------
| Id  | Operation            | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |           |       |       |     8 (100)|          |
|*  1 |  HASH JOIN           |           |     5 |   190 |     8  (25)| 00:00:01 |
|   2 |   VIEW               |           |    11 |   286 |     4  (25)| 00:00:01 |
|   3 |    HASH GROUP BY     |           |    11 |    55 |     4  (25)| 00:00:01 |
|   4 |     TABLE ACCESS FULL| EMPLOYEES |   107 |   535 |     3   (0)| 00:00:01 |
|   5 |   TABLE ACCESS FULL  | EMPLOYEES |   107 |  1284 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("E"."DEPARTMENT_ID"="D"."DEPARTMENT_ID")
filter("E"."SALARY">"D"."AVERAGE")
====================================================================================================
As we can see the result and execution plans are identical!
Just the name of the temporary views are different.
We can realize that Oracle - sometimes - rewrite our SELECT statement achieving the same result,
but with better performance

Efficient analysis with IN operator

March 29, 2010

SQL developers frequently hear and realize that the subqueries 
are not efficient, avoid using them.
Indeed, many times those SELECT statements, which contains subquires
in the WHERE or HAVING clause are not very efficient.

But this not a general rule!

I'd like to demonstrate that subquieres with IN operator and
correlated subquires can be very efficient.

Let see the next example:

SELECT last_name, department_id, job_id FROM employees
WHERE department_id IN
(SELECT department_id FROM departments WHERE location_id = 1700);
Here is the execution plan for the SELECT statement:

SQL_ID  gm32t8f65882q, child number 0
-------------------------------------
SELECT last_name, department_id, job_id FROM   employees WHERE
department_id IN (SELECT department_id FROM departments WHERE  location_id = 1700)
Plan hash value: 1021246405
--------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                   |       |       |     6 (100)|          |
|   1 |  NESTED LOOPS                |                   |       |       |            |          |
|   2 |   NESTED LOOPS               |                   |    37 |   814 |     6   (0)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL         | DEPARTMENTS       |     4 |    20 |     4   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN          | EMP_DEPARTMENT_IX |    10 |       |     0   (0)|          |
|   5 |   TABLE ACCESS BY INDEX ROWID| EMPLOYEES         |    10 |   170 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("LOCATION_ID"=1700)
4 - access("E"."DEPARTMENT_ID"="D"."DEPARTMENT_ID")
====================================================================================================
Let us see the second solution and the execution plan!

SELECT last_name, e.department_id, job_id
FROM employees e,departments d
WHERE e.department_id =d.department_id and location_id = 1700;

SQL_ID  gq10rm3ypv7v2, child number 0
-------------------------------------
SELECT last_name, e.department_id, job_id FROM   employees
e,departments d WHERE  e.department_id =d.department_id and
location_id = 1700
Plan hash value: 1021246405
--------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                   |       |       |     6 (100)|          |
|   1 |  NESTED LOOPS                |                   |       |       |            |          |
|   2 |   NESTED LOOPS               |                   |    37 |   814 |     6   (0)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL         | DEPARTMENTS       |     4 |    20 |     4   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN          | EMP_DEPARTMENT_IX |    10 |       |     0   (0)|          |
|   5 |   TABLE ACCESS BY INDEX ROWID| EMPLOYEES         |    10 |   170 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("LOCATION_ID"=1700)
4 - access("E"."DEPARTMENT_ID"="D"."DEPARTMENT_ID")
====================================================================================================

We can realize that both quires use the same execution plan (The plan Hash values are same!)
so they are same efficient!