Keep the latest modification’s timestamp for each row

December 19, 2012

It is well known that Oracle stores the latest modification’s timestamp
with SCN (System Change Number) on data block level.
Sometimes we want to keep this information on row level.
To achieve this feature we can use the ROWDEPENDENCIES clause in the CREATE TABLE statement.

DROP TABLE emp2 PURGE;
CREATE TABLE emp2
( employee_id NUMBER(6),
   last_name  VARCHAR2(25),
   dml        VARCHAR(20),
   salary     NUMBER(8,2))  ROWDEPENDENCIES;

With this option Oracle will allocate six bytes for storing the appropriate SCN for each row.
Note, that You can not add this clause later, with an ALTER TABLE statement. However you can use it in CTAS:

CREATE TABLE emp33 AS SELECT * FROM employees ROWDEPENDENCIES;

Let’s populate this table with the following INSERT SELECT statement:

INSERT INTO emp2
SELECT employee_id,  last_name , 'Insert First', salary
FROM employees WHERE salary<3000;
COMMIT;

 

Now check the result with ORA_ROWSCN pseudo column and convert it into timestamp:

SELECT  SCN_TO_TIMESTAMP(ORA_ROWSCN)  TIMESTAMP,
last_name , dml, salary,
TO_CHAR(ORA_ROWSCN,'999,999,999,999,999') ORA_ROWSCN
FROM emp2;

TIMESTAMP                         LAST_NAME   DML                  SALARY ORA_ROWSCN
--------------------------------- ----------- -------------------- ------ --------------------
19-DEC-12 12.11.42.000000000 PM   Oconnell    Insert First           2600           30,401,401
19-DEC-12 12.11.42.000000000 PM   Grant       Insert First           2600           30,401,401
19-DEC-12 12.11.42.000000000 PM   Baida       Insert First           2900           30,401,401
19-DEC-12 12.11.42.000000000 PM   Tobias      Insert First           2800           30,401,401
19-DEC-12 12.11.42.000000000 PM   Himuro      Insert First           2600           30,401,401
19-DEC-12 12.11.42.000000000 PM   Colmenares  Insert First           2500           30,401,401
19-DEC-12 12.11.42.000000000 PM   Mikkilineni Insert First           2700           30,401,401
19-DEC-12 12.11.42.000000000 PM   Landry      Insert First           2400           30,401,401
19-DEC-12 12.11.42.000000000 PM   Markle      Insert First           2200           30,401,401
19-DEC-12 12.11.42.000000000 PM   Atkinson    Insert First           2800           30,401,401
19-DEC-12 12.11.42.000000000 PM   Marlow      Insert First           2500           30,401,401
19-DEC-12 12.11.42.000000000 PM   Olson       Insert First           2100           30,401,401
19-DEC-12 12.11.42.000000000 PM   Rogers      Insert First           2900           30,401,401
19-DEC-12 12.11.42.000000000 PM   Gee         Insert First           2400           30,401,401

We can issue a MERGE statement (introduced in Oracle 9i, enhanced in Oracle 10g)
to modify the content of this table:
– Let’s increase the salary of those people whose salary less than or equal 2500$
– Delete those rows from the target table where the salary greater than 5000$
– Insert those rows from the source whose salary greater than 12000$
Consider the statement:

MERGE INTO emp2  c USING
(SELECT employee_id, last_name, 'x',salary FROM EMPLOYEES  ) e
ON (c.employee_id = e.employee_id)
WHEN MATCHED THEN
 UPDATE SET
c.last_name = UPPER(e.last_name),
 c.dml       = 'Update',
c.salary    = e.salary * 2 WHERE e.salary <= 2500      DELETE WHERE (c.salary>5000)
 WHEN NOT MATCHED THEN
  INSERT VALUES(e.employee_id, e.last_name,'Insert Second', e.salary)
  WHERE e.salary>12000;
COMMIT;

Check the result with the following SELECT statement:

SELECT  SCN_TO_TIMESTAMP(ORA_ROWSCN)  TIMESTAMP,
last_name , dml, salary,
TO_CHAR(ORA_ROWSCN,'999,999,999,999,999') ORA_ROWSCN
FROM emp2;

TIMESTAMP                         LAST_NAME   DML                  SALARY ORA_ROWSCN
--------------------------------- ----------- -------------------- ------ --------------------
19-DEC-12 11.47.55.000000000 AM   Oconnell    Insert First           2600           30,397,931
19-DEC-12 11.47.55.000000000 AM   Grant       Insert First           2600           30,397,931
19-DEC-12 11.47.55.000000000 AM   Baida       Insert First           2900           30,397,931
19-DEC-12 11.47.55.000000000 AM   Tobias      Insert First           2800           30,397,931
19-DEC-12 11.47.55.000000000 AM   Himuro      Insert First           2600           30,397,931
19-DEC-12 11.51.27.000000000 AM   COLMENARES  Update                 5000           30,398,452
19-DEC-12 11.47.55.000000000 AM   Mikkilineni Insert First           2700           30,397,931
19-DEC-12 11.51.27.000000000 AM   LANDRY      Update                 4800           30,398,452
19-DEC-12 11.51.27.000000000 AM   MARKLE      Update                 4400           30,398,452
19-DEC-12 11.47.55.000000000 AM   Atkinson    Insert First           2800           30,397,931
19-DEC-12 11.51.27.000000000 AM   MARLOW      Update                 5000           30,398,452
19-DEC-12 11.51.27.000000000 AM   OLSON       Update                 4200           30,398,452
19-DEC-12 11.47.55.000000000 AM   Rogers      Insert First           2900           30,397,931
19-DEC-12 11.51.27.000000000 AM   GEE         Update                 4800           30,398,452
19-DEC-12 11.51.27.000000000 AM   PHILTANKER  Update                 4400           30,398,452
19-DEC-12 11.47.55.000000000 AM   Seo         Insert First           2700           30,397,931

You can see two the different SCN numbers with two different timestamps accordingly.

Advertisements

Iterative UPDATE with SKIP LOCKED option

August 28, 2012

Many times occur that we can not execute an UPDATE or DELETE statement, because a couple of rows are locked by an other session(s).
We can generate – and handle – exceptions, but this solution sometimes not enough.
Using the SKIP LOCKED clause – officially introduced in Oracle 11g- we can try to solve the problem incrementally.
Let’s suppose that other sessions sooner or later will finish their transactions. During their transactions we can modify the “free” (not locked) rows with this option.
Let’s see a simple case study:
Our goal is to increase all employees’s salary with 20% ,
however other session can prevent us from doing this modification.
First we create a table that is copy of employees owned by HR.
We modify the structure of the new table adding a column which will indicate whether the modification is done or not.
Initially the value of this column should ‘N’:
To check the timestamp of the modification we change the data type of hire_date from DATE to TIMESTAMP:

DROP TABLE emp4;
CREATE TABLE EMP4 AS SELECT * FROM EMPLOYEES;
ALTER TABLE EMP4 ADD MODIFIED CHAR(1) DEFAULT 'N';
ALTER TABLE EMP4 MODIFY hire_date TIMESTAMP;


We have two sessions:
In the first session we issue periodically the following statement:

UPDATE emp4 
set last_name=UPPER(LAST_NAME) WHERE SALARY >&SAL;


where Sal substitution variable should be 10000, 14000
For 10000 the result:
15 rows updated
This session has not issued the COMMIT statement!

In the second session execute the following PL/SQL program:

DECLARE
CURSOR c_emp is
SELECT * FROM  EMP4
WHERE modified ='N' FOR UPDATE SKIP LOCKED ;
r  emp4%ROWTYPE;
BEGIN
OPEN c_emp;
LOOP
FETCH c_emp INTO r;
EXIT WHEN C_EMP%NOTFOUND;
r.hire_date:=SYSTIMESTAMP;
UPDATE EMP4 SET salary=salary*1.2, 
                 modified ='Y', hire_DATE=r.hire_date
WHERE CURRENT OF c_emp;
DBMS_OUTPUT.PUT_LINE
(r.employee_id||': '||r.last_name||' modified');
END LOOP;
DBMS_OUTPUT.PUT_LINE(CHR(10)||c_emp%ROWCOUNT||' modifications were');
CLOSE c_emp;
END;
/

The result :
92 modifications were
Issue a COMMIT in the first session and repeat the UPDATE with 14000
The result:
3 rows updated
Run again the PL/SQL program:
The result:
12 modifications were.
When the first session issues a COMMIT, after the second session finish it’s modification
The result:
1 row updated
It is not guaranteed that this method always successful.
We can use it, if the likelihood of the fact that other transactions will be finished on time, great enough.

Dynamic SQL with FOR UPDATE clause

August 28, 2012

Many times we can hear that the Native Dynamic SQL can not be implemented with FOR UPDATE clause in a PL/SQL program.

In this article I will proof that we can associate a dynamic SQL statement with a weak ref cursor variable using FOR UPDATE clause.

Let us consider the following – simple- case study:
We would like to modify the salary column of an employees like table
(This means that the structure of the table is identical with HR’s employees table):
We increase the salary for those employees whose salary is less than a particular value.

Let’s see the code:

CREATE OR REPLACE PROCEDURE din_sql_upd
(p_tname VARCHAR2,p_sal NUMBER)
IS
TYPE c_emp IS REF CURSOR;
c  c_emp;
r EMPLOYEES%ROWTYPE;
v_updated_rows INTEGER:=0;
BEGIN
OPEN c FOR 'SELECT * FROM '||p_tname||' WHERE salary < '
|| p_sal ||' FOR UPDATE';
LOOP
FETCH c INTO r;
EXIT WHEN C%NOTFOUND;  
DBMS_OUTPUT.PUT_LINE(RPAD(R.LAST_NAME,20)||' Salary: '||r.salary);
IF r.salary < p_sal THEN
EXECUTE IMMEDIATE
'UPDATE '||p_tname ||' SET salary =salary*2 WHERE employee_ID=:r '
USING r.employee_id;
v_updated_rows:=v_updated_rows+1;
END IF;
END LOOP;
CLOSE c;
DBMS_OUTPUT.PUT_LINE(CHR(10)||v_updated_rows||' modifications made.');
END din_sql_upd;
/
EXECUTE din_sql_upd('employees',2500)


You can observe that the UPDATE statement does not contain
the WHERE CURRENT OF clause, because it is PROHIBITED!
This program demonstrates one of a kind workaround for solving this problem:
We used simply the primary key instead of WHERE CURRENT OF.

Correlated subqueries and Query Transformations

October 27, 2011

Many times ask me my students when we have – at least – two solutions for a particular problem, which is the best queries ?
The correct answer is typically complex, and started “ It depends on ….” or “let’s see the environment…”.
Let’s see the following simple example:
We would like to select those employees whose salaries are greater than their department’s average salary!
1. First solution can be:

SELECT /* well-paid people */ e.last_name, e.department_id, e.salary
FROM employees e
WHERE e.salary > (SELECT AVG(salary) FROM EMPLOYEES
WHERE department_id=e.department_id);


This is correlated subquery and we can ask how many times the subquery will be executed?
We will see that the answer is not a number.
2.Second solution can be:

SELECT /* well-paid people */ e.last_name, e.department_id,
e.salary ,d.average
FROM employees e,
(SELECT department_id, AVG(salary) average FROM employees 
GROUP BY department_id ) d
WHERE e.department_id=d.department_id 
AND e.salary > d.average ;


Now we examine the execution plans (using my sqlid stroed procedure, see an other blog entry .
Let’s the execution plains:

EXEC sqlid('well-paid people','typical')


For the second solution, please consider the following figure:

We can observe – although- the plan hash values are different, actually the content of them are identical!
In other words Oracle executed a query transformation for the second SELECT statement and used the semantically same plan for both quires.
Let’s suppose that we prohibit the query transformation for the first SELECT statement:

SELECT /*+ NO_QUERY_TRANSFORMATION*/
e.last_name, e.department_id, e.salary
FROM employees e
WHERE e.salary > (SELECT AVG(salary) FROM EMPLOYEES
WHERE department_id=e.department_id);


We can realize that this plan is more expensive than previous one.
Be careful with hints!
Sometimes we can demolish the server’s good plan with our hints!

Flashback queries for Data Dictionary

August 17, 2011

Oracle developers sometimes delete their subprograms, views, tables etc accidentally.
From Oracle 10g we have opportunity to salvage them from the undo segment.
Note that – by default – the Oracle does not guarantee the successful salvage in 100%.
(See later in this article or the documentation fore more detail)

The FLASHBACK statement – with wide range of  functionalities – ensures the possibility to restore earlier state of objects or completely recreate them.
Flashback queries appeared in Oracle 9i first.
This is well known by the DBA’s and developers.
Maybe not so popular that we can execute flashback queries against the Data Dictionary.
Average users (ie. Not SYS) can not do it without necessary object privileges.
Note the FLASHBACK ANY TABLE  system privilege is not enough for these queries!

Let’s see an example:
1. HR user created earlier a function, this was the content of the tax.sql:
CREATE OR REPLACE FUNCTION TAX(P_AMOUNT IN NUMBER)
RETURN NUMBER
IS
M  NUMBER;
BEGIN
IF P_AMOUNT <8000 THEN M:=0.08;
ELSIF P_AMOUNT <18000 THEN M:=0.25;
ELSE  M:=0.3;
END IF;
RETURN P_AMOUNT*M;
END;
/

Although HR has the original version of the function, but he modified his function in the database many times  and he did not track these modifications in the text file (in our example tax.sql).
HR user deleted his function accidentally:

DROP FUNCTION TAX;

A couple minutes later he realized that it was a mistake and his version of the function in tax.sql is too old! He asked the DBA to help him.
The DBA connected to the database instance as SYS and issued the following GRANT statement:

GRANT FLASHBACK ON USER_SOURCE TO HR;

The DBA wanted to be proactive and  helpful, so he or she  issued the second  SQL statement:

GRANT FLASHBACK ON USER_VIEWS TO HR;

Now HR will be able to issue a flashback query against the subset (only for USER_SOURCE and USER_VIEWS)  of the Data Dictionary:

SELECT TEXT FROM USER_SOURCE AS OF TIMESTAMP SYSTIMESTAMP- INTERVAL ‘&MIN’ MINUTE
WHERE NAME=’TAX’;

TEXT
————————————
FUNCTION TAX(P_AMOUNT IN NUMBER)
RETURN NUMBER
IS
M  NUMBER;
BEGIN
IF P_AMOUNT <8000 THEN M:=0.08;
ELSIF P_AMOUNT <18000 THEN M:=0.25;
ELSE  M:=0.3;
END IF;
RETURN P_AMOUNT*M;
END;

11 rows selected.

If HR redirects the output of the above query in a spool file, he can recreate  the latest version of his function.
Of course HR can issue the same select against the USER_VIEWS.

This method useful If:

  1. The Oracle object is a text type object (subprograms, views, directories, context objects, …)
  2. The DROP statement was not “too old”. This means, if there was a big traffic in the database, many transactions used the undo segment in latest couple of hours or even minutes)
  3. The DBA issued similar statements to this:
    ALTER SYSTEM SET UNDO_RETENTION = 3600;
    ALTER TABLESPACE UNDOTBS1 RETENTION GUARANTEE ;

    The UKOUG annual conference is regularly rated as ‘top of the list’ for events to attend.  The International representation is very strong and with the proliferation of a global economy, this is one of those conferences where you can actually get a flavour and appreciation for the issues and concerns on a global scale.   



Analyzing Oracle Data Warehouse with Analytic Functions II.

April 29, 2011

Window and Report Aggregate Family

Window aggregate functions operate on an ordered set of rows and for each row they return an aggregate value for a window rooted at that row.
The window can be unbounded, partially unbounded or bounded. The size of the window can be specified by physically (eg. With number of rows, ROWS) or logically (typically with date/time values, RANGE).

There are several keywords available:

CURRENT ROW – specifies that window starts or ends at the current row

UNBOUNDED PRECEDING – specifies that the window starts at the first row of the partition.

UNBOUNDED FOLLOWING – specifies that the window ends at the last row of the partition.
If we don’t specify explicitly the one of the boundaries the CURRENT ROW is the default.
Example 1:
We want to create a cummulated aggregation (sometimes called running aggregates) for the salaries for the whole table and for each department:

SELECT department_id,employee_id, last_name, salary, 
SUM(salary) 
OVER (ORDER BY salary DESC RANGE UNBOUNDED PRECEDING) cum_range,
SUM(salary) 
OVER (PARTITION BY department_id 
ORDER BY salary DESC RANGE UNBOUNDED PRECEDING) 
AS cum_range_for_depts,
SUM(salary) 
OVER (ORDER BY salary desc ROWS UNBOUNDED PRECEDING)  cum_rows,
SUM(salary) 
OVER (PARTITION BY department_id 
ORDER BY salary DESC ROWS UNBOUNDED PRECEDING)  cum_rows_for_depts
FROM employees
 --ORDER BY department_id ;


Example 2:
Now we define an increasing, decreasing and fix sized window for each department:

SELECT department_id, last_name, salary,
SUM(salary) 
OVER (PARTITION BY department_id ORDER BY salary DESC
ROWS UNBOUNDED PRECEDING)  AS cum_sum,
SUM(salary) 
OVER (PARTITION BY department_id ORDER BY salary DESC
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)  till_current,
AVG(salary) OVER (PARTITION BY department_id ORDER BY salary DESC
RANGE UNBOUNDED PRECEDING) cum_avg , AVG(salary)
OVER ( PARTITION BY department_id 
ORDER BY salary DESC 
RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) avg_sal_rest,
AVG(salary)   OVER (PARTITION BY department_id) AS dept_avg
FROM employees


Example 3:
We want to select to 5 most popular products and running total for each quarter in each year.

SELECT CASE WHEN t.rank=1 THEN t.year END year, t.quarter, t.product, 
t.amount, t.quantity, t.rsum_for_quarter, t.rank rank_for_amount
FROM
(SELECT t.calendar_year year,t.calendar_quarter_desc quarter, 
p.prod_name product, SUM(s.amount_sold) amount, 
SUM(SUM(s.amount_sold))
OVER (PARTITION BY t.calendar_year,t.calendar_quarter_desc
ORDER BY SUM(s.amount_sold) DESC RANGE UNBOUNDED PRECEDING) 
rsum_for_quarter, 
SUM(s.quantity_sold) quantity,
RANK()  OVER (PARTITION BY t.calendar_year , t.calendar_quarter_desc
ORDER BY SUM(s.amount_sold) DESC ) rank
FROM sh.sales s, sh.times t, sh.products p
WHERE s.time_id=t.time_id AND s.prod_id=p.prod_id
GROUP BY t.calendar_year,t.calendar_quarter_desc,p.prod_name) t
WHERE t.rank < 5;


In the SUM(SUM(s.amount_sold)) expression the inner SUM is standard group function, which is the argument of the outer SUM that is an analytical function.
Example 4:
Many times we want to see each element’s contribution (rate) for the total value.
Also called percentages or sharing value.
In the following example we compute the percentages of each  salary of subtotal (department level) of grand total:

SELECT department_id, last_name, salary,
RATIO_TO_REPORT(salary)
OVER (PARTITION BY department_id)  AS rate_for_dept,
RATIO_TO_REPORT(salary) OVER ()   AS rate_for_total
FROM employees;


Example 5:
Now we want to compare each employee’s salary to the lowest anf highest salaries of his/her department using the FIRST_VALUE and LAST_VALUE functions.

SELECT t.*,t.salary/t.lowest_sal  rate_for_lowest_sal,
t.salary/t.highest_sal AS rate_for_highest_sal
FROM ( SELECT department_id , last_name, salary,
FIRST_VALUE(salary)
OVER ( PARTITION BY department_id ORDER BY salary
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) lowest_sal,
LAST_VALUE(salary) OVER ( PARTITION BY department_id ORDER BY salary
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) highest_sal
FROM employees 
ORDER BY department_id) t


Example 6:
Many times we need to compute moving values(MAX,SUM,AVG). In this example we calculate moving sum for better understanding:

SELECT last_name, hire_date, salary,
SUM(salary)  
OVER (ORDER BY hire_date RANGE BETWEEN NUMTOYMINTERVAL(3,'MONTH') 
PRECEDING AND NUMTOYMINTERVAL(3,'MONTH') FOLLOWING) AS mov_sum_3_months,
SUM(salary)  
OVER (ORDER BY hire_date ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING) 
AS mov_sum_3_rows
FROM employees;


Observe the difference between RANGE and ROWS.
With RANGE we specified a six months window, while with ROWS we created a 7 row width window.


To be continued…

Efficient analysis with IN operator

April 28, 2011

SQL developers frequently hear and realize that the subqueries are not efficient,
avoid using them.
Indeed, many times those SELECT statements, which contains subqueries
in the WHERE or HAVING clause are not very efficient.

But this not a general rule!

I'd like to demonstrate that subqueries with IN operator and
correlated subqueries can be very efficient.

Let see the next example:
 
SELECT last_name, department_id, job_id 
FROM employees
WHERE department_id IN 
(SELECT department_id FROM departments WHERE location_id = 1700);

Here is the execution plan for the SELECT statement:

 
SQL_ID  gm32t8f65882q, child number 0
-------------------------------------
SELECT last_name, department_id, job_id FROM   employees WHERE
department_id IN (SELECT department_id FROM departments WHERE  location_id = 1700)
Plan hash value: 1021246405
--------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                   |       |       |     6 (100)|          |
|   1 |  NESTED LOOPS                |                   |       |       |            |          |
|   2 |   NESTED LOOPS               |                   |    37 |   814 |     6   (0)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL         | DEPARTMENTS       |     4 |    20 |     4   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN          | EMP_DEPARTMENT_IX |    10 |       |     0   (0)|          |
|   5 |   TABLE ACCESS BY INDEX ROWID| EMPLOYEES         |    10 |   170 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("LOCATION_ID"=1700)
4 - access("E"."DEPARTMENT_ID"="D"."DEPARTMENT_ID")
====================================================================================================

Let us see the second solution and the execution plan!

 
SELECT last_name, e.department_id, job_id
FROM employees e,departments d
WHERE e.department_id =d.department_id and location_id = 1700;
 
SQL_ID  gq10rm3ypv7v2, child number 0
-------------------------------------
SELECT last_name, e.department_id, job_id FROM   employees
e,departments d WHERE  e.department_id =d.department_id and
location_id = 1700
Plan hash value: 1021246405
--------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                   |       |       |     6 (100)|          |
|   1 |  NESTED LOOPS                |                   |       |       |            |          |
|   2 |   NESTED LOOPS               |                   |    37 |   814 |     6   (0)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL         | DEPARTMENTS       |     4 |    20 |     4   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN          | EMP_DEPARTMENT_IX |    10 |       |     0   (0)|          |
|   5 |   TABLE ACCESS BY INDEX ROWID| EMPLOYEES         |    10 |   170 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("LOCATION_ID"=1700)
4 - access("E"."DEPARTMENT_ID"="D"."DEPARTMENT_ID")
====================================================================================================

We can realize that both quires use the same execution plan (The plan Hash values are same!)
so they are same efficient!

Analyzing Oracle Data Warehouse with Analytic Functions I.

April 28, 2011

Introduction. The Rank family

 

The traditional SQL statements and functions do not provide strong support for analytic tasks. Basic business intelligence calculations such as moving averages, rankings and lead/lag comparisons have required extensive programming outside of standard SQL , often with performance challenges.

From Oracle 8.1.6 Oracle provides powerful new family of functions which are useful for analyzing data warehouses or simple big data sets.

Oracle has created four families of analytic functions

  • Ranking family: RANK, DENSE_RANK, PERCENT_RANK, CUME_DIST
  • Window Aggregate family: AVG, SUM, MIN, MAX, COUNT, VARIANCE and STDDEV (typically with fix sized window)
  • Reporting Aggregate family: AVG, SUM, MIN, MAX, COUNT, VARIANCE and STDDEV , RATIO_TO_REPORT (typically with varying sized window)
  • LAG/LEAD family: LAG  and LEAD

It is important to understand the processing order of the functionalyties of the analytic SQL statements

  • The joins will be executed
  • Evaluating the WHERE, GROUP BY, HAVING clauses
  • Partitions created
  • Analytic functions applied to each row in each partition
  • ORDER BY is processed to allow for precise output ordering.

In the following examples we will use the the sample schemas (HR,SH) which are available in all Oracle database.

Ranking family:

Example 1:

Let’s suppose we want rank the salaries in the employees table.

Issuing the following SELECT

SELECT ROWNUM,last_name,salary from
(SELECT * FROM EMPLOYEES ORDER BY salary DESC) t
WHERE ROWNUM<=&counter;

for counter=2  :

1 KING               24000

2 KOCHHAR      17000

for counter=3  :

 1 KING               24000

2 KOCHHAR      17000

3 DE HAAN        17000

Which are the ranks for Kochhar and De Haan?
Solution:

SELECT last_name,salary,
RANK() OVER( ORDER BY salary DESC) normal_rank,
DENSE_RANK() OVER( ORDER BY salary desc) dense_rank
FROM employees;

Example 2:
Let’s see the top 5 best products regarding the revenue for each year:

SELECT t.*
FROM (select t.calendar_year year , p.prod_name product, SUM(s.amount_sold), SUM(s.quantity_sold),
RANK()  OVER (PARTITION BY t.calendar_year ORDER BY SUM(s.amount_sold) DESC ) rank
FROM sh.sales s, sh.times t ,sh.products p
WHERE s.time_id=t.time_id AND s.prod_id=p.prod_id
GROUP BY t.calendar_year,p.prod_name) t
WHERE t.rank<=5;

Example 3:

We can retrieve the 4 best products regarding the revenue for each product category:

SELECT t.*
FROM (select p.prod_category,t.calendar_quarter_desc quarter  , ROUND(SUM(s.amount_sold)) amount ,SUM(s.quantity_sold) quantity,
RANK()  OVER (PARTITION BY P.prod_category
ORDER BY SUM(s.amount_sold) DESC ) rank
FROM sh.sales s, sh.times t ,sh.products p
WHERE s.time_id=t.time_id AND s.prod_id=P.prod_id
GROUP BY p.prod_category,calendar_quarter_desc ) t
WHERE t.rank<=4
/

Example 4:
Now we consider the cumulative distribution of the salaries for each departments:

SELECT department_id, last_name, last_name, salary,
CUME_DIST() OVER(PARTITION BY department_id ORDER BY  Salary)
FROM employees;

Example 5:
We create ranks and subtotals  for amounts in  each year:

SELECT t.*
FROM (select t.calendar_year year , p.prod_name  product , SUM(s.amount_sold) amount,SUM(s.quantity_sold),
RANK()  OVER (PARTITION BY t.calendar_year ORDER BY SUM(s.amount_sold) DESC ) -1 rank,GROUPING(p.prod_name) grp_id
FROM sh.sales s, sh.times t ,sh.products p
WHERE s.time_id=t.time_id AND s.prod_id=p.prod_id
GROUP BY ROLLUP ( t.calendar_year,p.prod_name)) t
WHERE t.rank<=5
ORDER BY year,grp_id,amount DESC
/

Example 6:
The ROW_NUMBER function – which is not equivalent with neither ROWNUM nor RANK – assigns a unique number, in the sequence defined by the ORDER BY clause, to each row within the partition.

SELECT  ROW_NUMBER() OVER(ORDER BY department_id) row_number , ROWNUM ,
ROW_NUMBER() OVER( PARTITION BY department_id ORDER BY salary DESC) dept_row_number,
department_id,last_name,salary,
RANK() OVER( PARTITION BY department_id ORDER BY salary DESC) normal_rank,
DENSE_RANK() OVER( PARTITION BY department_id ORDER BY salary DESC) dense_rank
FROM employees
/
To be continued…

Trace an Oracle application client user in 3-tier environment

February 10, 2011

Let us suppose that you want to trace client user which was established in an application environment( for example  Oracle  E-Business-Suite application  in 3-tier  architecture)

Tracing  a  client

Let’s suppose there are two client or application user (JOEL and LASZLO).
They use the same application, but different modules and doing different activities.
Via connection pool they use –implicitly – the apps Oracle user account.
LASZLO is responsible for REPORTING using the EDUCATION module.
JOEL is manager working with SME REPORTING in MANAGEMENT MODULE.
LASZLO is working in EDUCATION_DEV modul  using  DEVELOPING activity
JOEL wants to observe LASZLO’s activity tracing LASZLO  as a client user:
We can imitate with DBMS_SESSION and DBMS_APPLICATION_INFO packages:

In SQL*Plus APPS executed the following PL/SQL program:
EXEC DBMS_SESSION.SET_IDENTIFIER(‘LASZLO’)
EXEC DBMS_APPLICATION_INFO.SET_MODULE(‘EDUCATION_DEV’,’DEVELOPING’)

In an other SQL*Plus instance APPS executed the following PL/SQL program:
EXEC DBMS_SESSION.SET_IDENTIFIER(‘LASZLO’)
EXEC DBMS_APPLICATION_INFO.SET_MODULE(‘EDUCATION’,’REPORTING’)

In SQL Devepoler SYS executed the following PL/SQL program:
EXEC DBMS_SESSION.SET_IDENTIFIER(‘JOEL’)
EXEC DBMS_APPLICATION_INFO.SET_MODULE(‘MANAGEMENT’,’SME REPORTING’)

We want to trace LASZLO’S activity
The trace files will be created under this directory:

Now  you can execute (as SYS) the following PL/SQL program:
EXEC DBMS_MONITOR.CLIENT_ID_TRACE_ENABLE(‘LASZLO’)
If you want to set the statistics level for that session
create AFTER LOGON trigger, like this:
CREATE OR REPLACE TRIGGER logon_trig
AFTER LOGON  ON  hr.SCHEMA
BEGIN
EXECUTE IMMEDIATE ‘ALTER SESSION SET STATISTICS_LEVEL=ALL’;
END;
/

LASZLO executed  the following query  as DEVELOPING using  apps Oracle user :

SELECT /* LASZLO_DEVELOPING */ last_name,e.salary
FROM hr.employees e
WHERE
E.SALARY>(select avg(d.salary) FROM hr.employees d
WHERE d .department_id=e.department_id);

LASZLO executed  the following query  as REPORTING  using  apps Oracle user :

SELECT /* LASZLO_REPORT */ last_name,e.salary
FROM hr.employees e
WHERE E.SALARY>(select avg(d.salary)
FROM hr.employees d
WHERE  d.department_id=e.department_id);

Now  you can execute (as SYS) the following PL/SQL program:
EXEC DBMS_MONITOR.CLIENT_ID_TRACE_DISABLE(‘LASZLO’)
Under the trace directory you will find the following files: orcl_ora_9508.trc and orcl_ora_1752.trc.

But you don’t have to know them!

It is enough to run the TRCSESS utility which collects the necessary information from the trace files and produce  a new one:

trcsess output=laszlo.trc clientid=LASZLO

This file contains very detailed information. If You need a more readable text file, please run the tkprof utility in  command  line
tkprof LASZLO.trc  LASZLO.txt

Part from laszlo.txt (LASZLO_DEVELOPING)

Part from laszlo.txt (LASZLO_REPORT)


Trace an arbitrary Oracle application session in client/server architecture

February 10, 2011

Let us suppose that you want to trace a session which was established in client/server architecture (Dedicated Server architecture)

Tracing  a  session

This method is applicable, if the session was created by an c/s application.
Very easy to identify which is the target  session:

Let us suppose that  the session is identified by
SID=420 and SERIAL#=2103.
The HR user’s  Operating system process identifier (SPID) is 6684.
You can find the directory of the trace file which will be created later:

Now  you can execute (as SYS) the following PL/SQL program:
EXEC DBMS_MONITOR.session_trace_enable(420,2103,TRUE,TRUE,’ALL_EXECUTIONS’)

3. argument (waits)  TRUE, wait information is present in the trace
4. argument (binds)  TRUE, bind information is present in the trace
If you want to set the statistics level for that session
create an AFTER LOGON trigger, like this:
CREATE OR REPLACE TRIGGER logon_trig
AFTER LOGON  ON  hr.SCHEMA
BEGIN
EXECUTE IMMEDIATE
‘ALTER SESSION SET STATISTICS_LEVEL=ALL’;

END;
/

Connect and execute the following query as HR :

You can turn off the tracing for the same session:

EXEC DBMS_MONITOR.session_trace_disable(420,2103)

Under the trace directory you will find the following file: orcl_ora_6684.trc where 6684 is HR’s SPID.
This file contains very detailed information.
If You need a more readable text file, please run the tkprof utility in  command  line
tkprof orcl_ora_6684.trc hr_sqlplus2.txt
Detail from the file:

Execute the sqlid stored procedure
(which can be found in “Trace your Oracle session” blog)
as SYS:

EXEC hr.sqlid(‘MAX(SUM(salary))’)