Title GROUPING_ID
Author: Markus Jägle (markus.jaegle@trivadis.com)
Info Technical background info (April 2002)
Source from the Oracle9i Data Warehousing Guide and the Trivadis courses New Features Oracle9i and Advanced SQL.
Introduction
Oracle8i introduced the GROUP BY functions ROLLUP and CUBE. With the same version, Oracle also offered the GROUPING function to analyze the aggregates created by the ROLLUP and CUBE functions. But since this function sometimes hogs large amounts of memory, a new function was developed with the version Oracle9i to solve this problem: the GROUPING_ID function.
ROLLUP, CUBE and GROUPING
The ROLLUP function introduced with Oracle8i allows the creation of sorted subtotals at every level of an aggregation up to the sum total. ROLLUP first calculates the aggregates specified in the GROUP BY clause. Subtotals of the higher levels are then created progressively, navigating from right to left through the grouping attributes. Finally, the sum total is output (see following example).
SQL> SELECT deptno, job, COUNT(*), SUM(sal)
2 FROM emp
3 GROUP BY ROLLUP (deptno, job);
DEPTNO JOB COUNT(*) SUM(SAL)
---------- --------- ---------- ----------
10 CLERK 1 1300
10 MANAGER 1 2450
10 PRESIDENT 1 5000
10 3 8750
20 ANALYST 2 6000
20 CLERK 2 1900
20 MANAGER 1 2975
20 5 10875
30 CLERK 1 950
30 MANAGER 1 2850
30 SALESMAN 4 5600
30 6 9400
14 29025
13 rows selected.
Example 1: ROLLUP
The CUBE function, also introduced with Oracle8i, forms all possible combinations of subtotals, including the sum total (asymmetrical aggregation). This allows very simple cross-classified table reports, such as are necessary in drill-down functionalities of client tools. With multi-dimensional analyses, the CUBE function generates all the subtotals that can be formed for a cube with the specified dimensions (see following example).
SQL> SELECT deptno, job, COUNT(*), SUM(sal)
2 FROM emp
3 GROUP BY CUBE(deptno, job);
DEPTNO JOB COUNT(*) SUM(SAL)
---------- --------- ---------- ----------
10 CLERK 1 1300
10 MANAGER 1 2450
10 PRESIDENT 1 5000
10 3 8750
20 ANALYST 2 6000
20 CLERK 2 1900
20 MANAGER 1 2975
20 5 10875
30 CLERK 1 950
30 MANAGER 1 2850
30 SALESMAN 4 5600
30 6 9400
ANALYST 2 6000
CLERK 4 4150
MANAGER 3 8275
PRESIDENT 1 5000
SALESMAN 4 5600
14 29025
18 rows selected.
Example 2: CUBE
With the help of the GROUPING function, it is possible to identify the super aggregate rows within a GROUP BY expression (see ROLLUP and CUBE). The following example shows how the GROUPING function is used. GROUPING returns the value 1 if the row is an aggregate/subtotal; otherwise, the value 0 is returned.
SQL> SELECT deptno
2 ,job
3 ,COUNT (*)
4 ,SUM (sal)
5 ,GROUPING(deptno) AS group_dept
6 ,GROUPING(job) AS group_job
7 FROM emp
8 GROUP BY CUBE (deptno, job);
DEPTNO JOB COUNT(*) SUM(SAL) GROUP_DEPT GROUP_JOB
---------- --------- ---------- ---------- ---------- ----------
10 CLERK 1 1300 0 0
10 MANAGER 1 2450 0 0
10 PRESIDENT 1 5000 0 0
10 3 8750 0 1
20 ANALYST 2 6000 0 0
20 CLERK 2 1900 0 0
20 MANAGER 1 2975 0 0
20 5 10875 0 1
30 CLERK 1 950 0 0
30 MANAGER 1 2850 0 0
30 SALESMAN 4 5600 0 0
30 6 9400 0 1
ANALYST 2 6000 1 0
CLERK 4 4150 1 0
MANAGER 3 8275 1 0
PRESIDENT 1 5000 1 0
SALESMAN 4 5600 1 0
14 29025 1 1
18 rows selected.
Example 3: CUBE with GROUPING
The value 1 in the GROUP_DEPT and GROUP_JOB columns of this statement's result shows clearly whether the respective row is a total. If, for example, there is a 1 in the GROUP_JOB column, then we are looking at the subtotal of all JOBs of a DEPTNO (as, for instance, in the 4th line of the example). The sum total row can be recognized in that 1 is returned in all GROUPING functions for all GROUP BY columns (see last row in the
example).
GROUPING_ID
Since the information of the GROUPING function must be output for the identification of the super aggregate rows for each GROUP BY column, this can hog a lot of memory if the query result is to be stored in tables, e.g. with materialized views. A select statement, for example, that uses a GROUP BY across four columns requires four GROUPING functions in order to be analyzed.
The GROUPING_ID function offers a possibility for saving memory in identifying the super aggregate rows. Similar to the GROUPING function, this function identifies these rows and returns the GROUP BY level as a bit vector.
GROUPING_ID takes all the 1's and 0's that were generated using the GROUPING function and combines them to form a bit vector. The bit vector is treated as a binary number, and the value is returned by the GROUPING_ID function. If, for example, a grouping is carried out with the expression CUBE (a,b), the following values are possible.
Aggregation Level Bit Vector GROUPING_ID
a, b 0 0 0
a 0 1 1
b 1 0 2
Sum total 1 1 3
The following SQL statements should illustrate this functionality with a simple example. In
these statements, only the CUBE function is used, but GROUPING_ID can also be used in
the same way in combination with the ROLLUP function.
SQL> SELECT deptno,job
2 ,COUNT (*)
3 ,SUM (sal)
4 ,GROUPING(deptno) AS grp_dept
5 ,GROUPING(job) AS grp_job
6 ,GROUPING_ID(deptno, job) AS grp_id
7 FROM emp
8 GROUP BY CUBE (deptno, job);
DEPTNO JOB COUNT(*) SUM(SAL) GRP_DEPT GRP_JOB GRP_ID
------ --------- -------- -------- -------- ------- ------
10 CLERK 1 1300 0 0 0
10 MANAGER 1 2450 0 0 0
10 PRESIDENT 1 5000 0 0 0
10 3 8750 0 1 1
20 ANALYST 2 6000 0 0 0
20 CLERK 2 1900 0 0 0
20 MANAGER 1 2975 0 0 0
20 5 10875 0 1 1
30 CLERK 1 950 0 0 0
30 MANAGER 1 2850 0 0 0
30 SALESMAN 4 5600 0 0 0
30 6 9400 0 1 1
ANALYST 2 6000 1 0 2
CLERK 4 4150 1 0 2
MANAGER 3 8275 1 0 2
PRESIDENT 1 5000 1 0 2
SALESMAN 4 5600 1 0 2
14 29025 1 1 3
18 rows selected.
Example 4: CUBE with GROUPING and GROUPING_ID
In the result of Example 4, the value of the bit vector (cf. Example 3) is also displayed in the last column. The bit vector of the last row in the above example, for instance—which shows the sum totals—is "1 1" and thus has the value of 3, returned by the GROUPING ID function. The bit vector of the subtotals for the jobs is "1 0" (see rows 13 to 17 in the above example) and has the value 2. The bit vector for the subtotals of the departments (rows 4, 8
and 12 in the above example) is "0 1," so the GROUPING_ID function returns the value 1.
The subtotals for departments and jobs have the bit vector "0 0," and GROUPING_ID returns the value 0. The two columns with the information of the GROUPING function are now no longer necessary for evaluation of the superaggregate rows and can be omitted.
SQL> SELECT deptno,job
2 ,COUNT (*)
3 ,SUM (sal)
4 ,GROUPING_ID(deptno, job) AS grp_id
5 FROM emp
6 GROUP BY CUBE (deptno, job);
DEPTNO JOB COUNT(*) SUM(SAL) GRP_ID
---------- --------- ---------- ---------- ----------
10 CLERK 1 1300 0
10 MANAGER 1 2450 0
10 PRESIDENT 1 5000 0
10 3 8750 1
20 ANALYST 2 6000 0
20 CLERK 2 1900 0
20 MANAGER 1 2975 0
20 5 10875 1
30 CLERK 1 950 0
30 MANAGER 1 2850 0
30 SALESMAN 4 5600 0
30 6 9400 1
ANALYST 2 6000 2
CLERK 4 4150 2
MANAGER 3 8275 2
PRESIDENT 1 5000 2
SALESMAN 4 5600 2
14 29025 3
18 rows selected.
Example 5: CUBE with GROUPING_ID
One possible use of this function would be the following example.
SQL> SELECT DECODE(GROUPING_ID(deptno, job),
2 1, 'Sub-total Department',
3 2, 'Sub-total Job',
4 3, 'Total',
5 null) AS totals
6 ,deptno
7 ,job
8 ,COUNT (*)
9 ,SUM (sal)
10 FROM emp
11 GROUP BY CUBE (deptno, job);
TOTALS DEPTNO JOB COUNT(*) SUM(SAL)
------------------------ ------ --------- -------- --------
10 CLERK 1 1300
10 MANAGER 1 2450
10 PRESIDENT 1 5000
Sub-total Department 10 3 8750
20 ANALYST 2 6000
20 CLERK 2 1900
20 MANAGER 1 2975
Sub-total Department 20 5 10875
30 CLERK 1 950
30 MANAGER 1 2850
30 SALESMAN 4 5600
Sub-total Department 30 6 9400
Sub-total Job ANALYST 2 6000
Sub-total Job CLERK 4 4150
Sub-total Job MANAGER 3 8275
Sub-total Job PRESIDENT 1 5000
Sub-total Job SALESMAN 4 5600
Total 14 29025
18 rows selected.
Example 6: CUBE with DECODE of GROUPING_ID
Conclusion
The GROUPING_ID function allows the user to quickly and easily identify and analyze the super aggregate rows for GROUP BY expressions and thus to display and evaluate the level of aggregation in one column, saving space and making it easier to read.
Markus Jägle e-mail: markus.jaegle@trivadis.com
Trivadis Projektentwicklung GmbH Tel: +49 761 45571 54
Sasbacher Str. 2 Fax: +49 761 45571 30
D-79111
No comments :
Post a Comment