LISTAGG function versus Tom Kyte’s STRAGG (UDF)


LISTAGG  (Oracle defined ) versus
Tom Kyte’s STRAGG (User Defined Aggregate Function)

As we know Oracle implemented the LISTAGG aggregate function in 11gR2.
With this function you can concatenate member row’s value in one grouped row.
The structure of the LISTAGG function is similar to other group functions,
but we have an extra WITHIN GROUP clause in the function definition.
Of course we can use this group function with other group functions:

SELECT department_id "Dept",SUM(salary),COUNT(*),
LISTAGG(last_name||':'||salary, ' | ')
WITHIN GROUP (ORDER By salary, last_name) list
FROM employees
WHERE department_id NOT IN (50,80,100,110)
GROUP BY  department_id;

Dept SUM(SALARY)   COUNT(*) LIST
---------- ----------- ---------- --------------------------------------------------------------------------------------
10        4400          1 Whalen:4400
20       19000          2 Fay:6000 | Hartstein:13000
30       24900          6 Colmenares:2500 | Himuro:2600 | Tobias:2800 | Baida:2900 | Khoo:3100 | Raphaely:11000
40        6500          1 Mavris:6500
60       28800          5 Lorentz:4200 | Austin:4800 | Pataballa:4800 | Ernst:6000 | Hunold:9000
70       10000          1 Baer:10000
90       58000          3 De Haan:17000 | Kochhar:17000 | King:24000

7 rows selected.

Oracle used a simple INDEX FULL SCAN operator with SORT GROUP BY.
Other example: we want to see all direct worker’s name in one row for each manager:

SELECT m.last_name mgr_name,
LISTAGG(e.last_name, ' | ')
WITHIN GROUP (ORDER By e.salary, e.last_name) workers
FROM employees e JOIN employees m
ON e.manager_id=m.employee_id
WHERE e.manager_id IS NOT NULL
GROUP BY m.last_name;
MGR_NAME                  WORKERS
------------------------- -------------------------------------------------------------------------------------------------------------------------------------------------
Cambrault                 Kumar | Bates | Smith | Fox | Bloom | Ozer
De Haan                   Hunold
Errazuriz                 Banda | Ande | Lee | Marvins | Greene | Vishney
Fripp                     Olson | Marlow | Atkinson | Cabrio | Bissot | Dellinger | Bull | Sarchand
Greenberg                 Popp | Sciarra | Urman | Chen | Faviet
Hartstein                 Fay
Higgins                   Gietz
Hunold                    Lorentz | Austin | Pataballa | Ernst
Kaufling                  Philtanker | Gee | Perkins | Gates | Rogers | Mallin | Dilly | Chung
King                      Mourgos | Vollman | Kaufling | Weiss | Fripp | Zlotkey | Cambrault | Raphaely | Errazuriz | Hartstein | Partners | Russell | De Haan | Kochhar
Kochhar                   Whalen | Mavris | Baer | Greenberg | Higgins
Mourgos                   Vargas | Grant | Matos | Oconnell | Feeney | Davies | Walsh | Rajs
Partners                  Sewall | Doran | Smith | Mcewen | Sully | King
Raphaely                  Colmenares | Himuro | Tobias | Baida | Khoo
Russell                   Tuvault | Cambrault | Olsen | Hall | Bernstein | Tucker
Vollman                   Patel | Seo | Jones | Mccain | Stiles | Ladwig | Everett | Bell
Weiss                     Markle | Landry | Sullivan | Mikkilineni | Geoni | Fleaur | Nayer | Taylor
Zlotkey                   Johnson | Grant | Livingston | Taylor | Hutton | Abel

18 rows selected.

We can extend the LISTAGG function with analytic clause:

 SELECT department_id "Dept", last_name "Name",
LISTAGG(last_name, '; ')
WITHIN GROUP (ORDER By department_id, last_name)
OVER (PARTITION BY department_id ) emp
FROM employees
WHERE department_id NOT IN (50,80,100,110);
Dept Name                   EMP
---- ---------------------- ------------------------------------------------------
10   Whalen                 Whalen
20   Fay                    Fay; Hartstein
20   Hartstein              Fay; Hartstein
30   Baida                  Baida; Colmenares; Himuro; Khoo; Raphaely; Tobias
30   Colmenares             Baida; Colmenares; Himuro; Khoo; Raphaely; Tobias
30   Himuro                 Baida; Colmenares; Himuro; Khoo; Raphaely; Tobias
30   Khoo                   Baida; Colmenares; Himuro; Khoo; Raphaely; Tobias
30   Raphaely               Baida; Colmenares; Himuro; Khoo; Raphaely; Tobias
30   Tobias                 Baida; Colmenares; Himuro; Khoo; Raphaely; Tobias
40   Mavris                 Mavris
60   Austin                 Austin; Ernst; Hunold; Lorentz; Pataballa
60   Ernst                  Austin; Ernst; Hunold; Lorentz; Pataballa
60   Hunold                 Austin; Ernst; Hunold; Lorentz; Pataballa
60   Lorentz                Austin; Ernst; Hunold; Lorentz; Pataballa
60   Pataballa              Austin; Ernst; Hunold; Lorentz; Pataballa
70   Baer                   Baer
90   De Haan                De Haan; King; Kochhar
90   King                   De Haan; King; Kochhar
90   Kochhar                De Haan; King; Kochhar

19 rows selected.
 

The result is fine but we have to face some restrictions:

1. The maximum length is 4000 characters.
2. The ORDER BY clause and the window clause  can not be used in the OVER clause. We can specify the query partition clause only!
From the second restriction follows that we can not implement a
cumulative aggregation” for the argument(s) of the LISTAGG!
On of the nicest solution might be  Tom Kyte’s STRAGG implementation.
(See http://www.sqlsnippets.com/en/topic-11591.html)
As we know we can add an analytic clause to a user defined aggregation function!
Now we want to see the all previous names –according to the salary descending order – for each employee in each selected groups:

 SELECT d.department_id deptno ,d.department_name,
 stragg(e.last_name)
 OVER(PARTITION BY d.department_id order by e.last_name ROWS  UNBOUNDED PRECEDING ) names
 FROM departments d, employees e
 WHERE e.department_id=d.department_id
 and e.department_id NOT IN (50,80,100,110);
DEPTNO     DEPARTMENT_NAME      NAMES
---------- -------------------- -----------------------------------------------
10         Administration       Whalen
20         Marketing            Fay
20         Marketing            Fay,Hartstein
30         Purchasing           Baida
30         Purchasing           Baida,Colmenares
30         Purchasing           Baida,Colmenares,Himuro
30         Purchasing           Baida,Colmenares,Himuro,Khoo
30         Purchasing           Baida,Colmenares,Himuro,Khoo,Raphaely
30         Purchasing           Baida,Colmenares,Himuro,Khoo,Raphaely,Tobias
40         Human Resources      Mavris
60         IT                   Austin
60         IT                   Austin,Ernst
60         IT                   Austin,Ernst,Hunold
60         IT                   Austin,Ernst,Hunold,Lorentz
60         IT                   Austin,Ernst,Hunold,Lorentz,Pataballa
70         Public Relations     Baer
90         Executive            De Haan
90         Executive            De Haan,King
90         Executive            De Haan,King,Kochhar

19 rows selected.

You combine the STRAGG function with other analytic functions:

 SELECT d.department_id deptno,last_name,salary,
 ROW_NUMBER() OVER(PARTITION BY d.department_id order by e.SALARY DESC ) COUNTER,
 SUM(salary) OVER (PARTITION BY e.department_id ORDER BY salary DESC
 ROWS UNBOUNDED PRECEDING) cum_sum,
 stragg(e.last_name||': '||SALARY)
 OVER(PARTITION BY d.department_id ORDER BY e.salary DESC ROWS  UNBOUNDED PRECEDING ) EMP_DATA
 FROM departments d, employees e
 WHERE e.department_id=d.department_id
 AND  e.department_id NOT IN (50,80,100,110)
 ORDER BY d.department_id;

DEPTNO LAST_NAME   SALARY    COUNTER    CUM_SUM   EMP_DATA
------- ----------- ------ ---------- ----------  ---------------------------------------------------------------------------------------
10      Whalen        4400          1       4400   Whalen: 4400
20      Hartstein    13000          1      13000   Hartstein: 13000
20      Fay           6000          2      19000   Hartstein: 13000,Fay: 6000
30      Raphaely     11000          1      11000   Raphaely: 11000
30      Khoo          3100          2      14100   Raphaely: 11000,Khoo: 3100
30      Baida         2900          3      17000   Raphaely: 11000,Khoo: 3100,Baida: 2900
30      Tobias        2800          4      19800   Raphaely: 11000,Khoo: 3100,Baida: 2900,Tobias: 2800
30      Himuro        2600          5      22400   Raphaely: 11000,Khoo: 3100,Baida: 2900,Tobias: 2800,Himuro: 2600
30      Colmenares    2500          6      24900   Raphaely: 11000,Khoo: 3100,Baida: 2900,Tobias: 2800,Himuro: 2600,Colmenares: 2500
40      Mavris        6500          1       6500   Mavris: 6500
60      Hunold        9000          1       9000   Hunold: 9000
60      Ernst         6000          2      15000   Hunold: 9000,Ernst: 6000
60      Austin        4800          3      19800   Hunold: 9000,Ernst: 6000,Austin: 4800
60      Pataballa     4800          4      24600   Hunold: 9000,Ernst: 6000,Austin: 4800,Pataballa: 4800
60      Lorentz       4200          5      28800   Hunold: 9000,Ernst: 6000,Austin: 4800,Pataballa: 4800,Lorentz: 4200
70      Baer         10000          1      10000   Baer: 10000
90      King         24000          1      24000   King: 24000
90      De Haan      17000          2      41000   King: 24000,De Haan: 17000
90      Kochhar      17000          3      58000   King: 24000,De Haan: 17000,Kochhar: 17000

19 rows selected.

Before 11gR2 -based on Tom Kyte’s solution- we can solve this problem with an hierarchical query.
(http://tkyte.blogspot.co.uk/2007/03/stringing-them-up.html,
The original query caused an Oracle error for me).

The modified query is the following:

SELECT department_id,
COUNT(SUBSTR(sys_connect_by_path(last_name, ',' ),2)) counter,
MAX(SUBSTR(sys_connect_by_path(last_name, ',' ),2)) emp_name
FROM
(SELECT department_id, last_name,
ROW_NUMBER() OVER(PARTITION BY department_id ORDER BY last_name) rn
FROM employees)
WHERE department_id not in (50,80,100,110)
START WITH rn = 1
connect by
PRIOR rn = rn-1 AND PRIOR department_id = department_id
GROUP BY department_id
ORDER BY department_id;
DEPARTMENT_ID    COUNTER EMP_NAME
------------- ---------- -----------------------------------------------
10                     1 Whalen
20                     2 Fay,Hartstein
30                     6 Baida,Colmenares,Himuro,Khoo,Raphaely,Tobias
40                     1  Mavris
60                     5 Austin,Ernst,Hunold,Lorentz,Pataballa
70                     1 Baer
90                     3 De Haan,King,Kochhar

7 rows selected.

 

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s


%d bloggers like this: