Rollup, Cube, Grouping

Title: GROUPING_ID

Author: Markus Jägle (markus.jaegle@trivadis.com)

Info: Technical background info (April 2002)

Source: the Oracle9i Data Warehousing Guide and the Trivadis courses New Features Oracle9i and Advanced SQL.

 

Introduction

Oracle8i introduced the GROUP BY functions ROLLUP and CUBE. With the same version, Oracle also offered the GROUPING function to analyze the aggregates created by the ROLLUP and CUBE functions. But since this function sometimes hogs large amounts of memory, a new function was developed in Oracle9i to solve this problem: the GROUPING_ID function.

 

ROLLUP, CUBE and GROUPING

The ROLLUP function introduced with Oracle8i allows the creation of sorted subtotals at every level of an aggregation up to the sum total. ROLLUP first calculates the aggregates specified in the GROUP BY clause. Subtotals of the higher levels are then created progressively, navigating from right to left through the grouping attributes. Finally, the sum total is output (see following example).

 

SQL> SELECT deptno, job, COUNT(*), SUM(sal)
  2  FROM emp
  3  GROUP BY ROLLUP (deptno, job);

    DEPTNO JOB         COUNT(*)   SUM(SAL)
---------- --------- ---------- ----------
        10 CLERK              1       1300
        10 MANAGER            1       2450
        10 PRESIDENT          1       5000
        10                    3       8750
        20 ANALYST            2       6000
        20 CLERK              2       1900
        20 MANAGER            1       2975
        20                    5      10875
        30 CLERK              1        950
        30 MANAGER            1       2850
        30 SALESMAN           4       5600
        30                    6       9400
                             14      29025

13 rows selected.

Example 1: ROLLUP

 

The CUBE function, also introduced with Oracle8i, forms all possible combinations of subtotals, including the sum total (symmetric aggregation). This makes very simple cross-tabular reports possible, such as those needed in the drill-down functionality of client tools. For multi-dimensional analyses, the CUBE function generates all the subtotals that can be formed for a cube with the specified dimensions (see following example).

SQL> SELECT deptno, job, COUNT(*), SUM(sal)
  2  FROM emp
  3  GROUP BY CUBE(deptno, job);

    DEPTNO JOB         COUNT(*)   SUM(SAL)
---------- --------- ---------- ----------
        10 CLERK              1       1300
        10 MANAGER            1       2450
        10 PRESIDENT          1       5000
        10                    3       8750
        20 ANALYST            2       6000
        20 CLERK              2       1900
        20 MANAGER            1       2975
        20                    5      10875
        30 CLERK              1        950
        30 MANAGER            1       2850
        30 SALESMAN           4       5600
        30                    6       9400
           ANALYST            2       6000
           CLERK              4       4150
           MANAGER            3       8275
           PRESIDENT          1       5000
           SALESMAN           4       5600
                             14      29025

18 rows selected.

Example 2: CUBE

 

With the help of the GROUPING function, it is possible to identify the super aggregate rows within a GROUP BY expression (see ROLLUP and CUBE). The following example shows how the GROUPING function is used. GROUPING(col) returns the value 1 if the row is a subtotal in which col has been aggregated away; otherwise, the value 0 is returned.

SQL> SELECT deptno
  2        ,job
  3        ,COUNT (*)
  4        ,SUM (sal)
  5        ,GROUPING(deptno) AS group_dept
  6        ,GROUPING(job) AS group_job
  7  FROM emp
  8  GROUP BY CUBE (deptno, job);

    DEPTNO JOB         COUNT(*)   SUM(SAL) GROUP_DEPT  GROUP_JOB
---------- --------- ---------- ---------- ---------- ----------
        10 CLERK              1       1300          0          0
        10 MANAGER            1       2450          0          0
        10 PRESIDENT          1       5000          0          0
        10                    3       8750          0          1
        20 ANALYST            2       6000          0          0
        20 CLERK              2       1900          0          0
        20 MANAGER            1       2975          0          0
        20                    5      10875          0          1
        30 CLERK              1        950          0          0
        30 MANAGER            1       2850          0          0
        30 SALESMAN           4       5600          0          0
        30                    6       9400          0          1
           ANALYST            2       6000          1          0
           CLERK              4       4150          1          0
           MANAGER            3       8275          1          0
           PRESIDENT          1       5000          1          0
           SALESMAN           4       5600          1          0
                             14      29025          1          1

18 rows selected.

Example 3: CUBE with GROUPING

 

The value 1 in the GROUP_DEPT and GROUP_JOB columns of this statement's result shows clearly whether the respective row is a total. If, for example, there is a 1 in the GROUP_JOB column, then we are looking at the subtotal of all JOBs of a DEPTNO (as, for instance, in the fourth row of the example). The sum total row can be recognized by the fact that 1 is returned by the GROUPING functions of all GROUP BY columns (see the last row in the example).

 

GROUPING_ID

Since the information of the GROUPING function must be output for each GROUP BY column in order to identify the super aggregate rows, this can consume a lot of memory if the query result is to be stored in tables, e.g. with materialized views. A SELECT statement that uses a GROUP BY across four columns, for example, requires four GROUPING functions in order to be analyzed.

 

The GROUPING_ID function offers a possibility for saving memory in identifying the super aggregate rows. Similar to the GROUPING function, this function identifies these rows and returns the GROUP BY level as a bit vector.

 

GROUPING_ID takes all the 1's and 0's that were generated using the GROUPING function and combines them to form a bit vector. The bit vector is treated as a binary number, and the value is returned by the GROUPING_ID function. If, for example, a grouping is carried out with the expression CUBE (a,b), the following values are possible.

 

Aggregation Level   Bit Vector   GROUPING_ID
a, b                0 0          0
a                   0 1          1
b                   1 0          2
Sum total           1 1          3
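Not part of the original article, but as a cross-check of this bit-vector arithmetic: Oracle's BIN_TO_NUM function can rebuild the same value from the individual GROUPING bits. The sketch below assumes the SCOTT.EMP demo table used throughout this article:

```sql
-- Illustrative only: GROUPING_ID(a, b) treats the GROUPING bits as a binary
-- number, which BIN_TO_NUM makes explicit. The two computed columns below
-- should always be equal.
SELECT deptno
      ,job
      ,GROUPING(deptno)                            AS bit_dept
      ,GROUPING(job)                               AS bit_job
      ,BIN_TO_NUM(GROUPING(deptno), GROUPING(job)) AS manual_id
      ,GROUPING_ID(deptno, job)                    AS grp_id
FROM   emp
GROUP  BY CUBE (deptno, job);
```

This is only a way to see the equivalence; in practice GROUPING_ID is both shorter and cheaper than combining the bits yourself.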

The following SQL statements illustrate this functionality with a simple example. In these statements, only the CUBE function is used, but GROUPING_ID can also be used in the same way in combination with the ROLLUP function.

SQL> SELECT deptno,job
  2        ,COUNT (*)
  3        ,SUM (sal)
  4        ,GROUPING(deptno) AS grp_dept
  5        ,GROUPING(job) AS grp_job
  6        ,GROUPING_ID(deptno, job) AS grp_id
  7  FROM emp
  8  GROUP BY CUBE (deptno, job);

DEPTNO JOB       COUNT(*) SUM(SAL) GRP_DEPT GRP_JOB GRP_ID
------ --------- -------- -------- -------- ------- ------
    10 CLERK            1     1300        0       0      0
    10 MANAGER          1     2450        0       0      0
    10 PRESIDENT        1     5000        0       0      0
    10                  3     8750        0       1      1
    20 ANALYST          2     6000        0       0      0
    20 CLERK            2     1900        0       0      0
    20 MANAGER          1     2975        0       0      0
    20                  5    10875        0       1      1
    30 CLERK            1      950        0       0      0
    30 MANAGER          1     2850        0       0      0
    30 SALESMAN         4     5600        0       0      0
    30                  6     9400        0       1      1
       ANALYST          2     6000        1       0      2
       CLERK            4     4150        1       0      2
       MANAGER          3     8275        1       0      2
       PRESIDENT        1     5000        1       0      2
       SALESMAN         4     5600        1       0      2
                       14    29025        1       1      3

18 rows selected.

Example 4: CUBE with GROUPING and GROUPING_ID

 

In the result of Example 4, the value of the bit vector (cf. Example 3) is also displayed in the last column. The bit vector of the last row in the above example, for instance, which shows the sum total, is "1 1" and thus has the value 3, returned by the GROUPING_ID function. The bit vector of the subtotals for the jobs is "1 0" (see rows 13 to 17 in the above example) and has the value 2. The bit vector for the subtotals of the departments (rows 4, 8 and 12 in the above example) is "0 1," so the GROUPING_ID function returns the value 1.

 

The regular aggregate rows, grouped by both department and job, have the bit vector "0 0," and GROUPING_ID returns the value 0. The two columns with the information of the GROUPING function are now no longer necessary for identifying the super aggregate rows and can be omitted.

 

SQL> SELECT deptno,job
  2        ,COUNT (*)
  3        ,SUM (sal)
  4        ,GROUPING_ID(deptno, job) AS grp_id
  5  FROM emp
  6  GROUP BY CUBE (deptno, job);

    DEPTNO JOB         COUNT(*)   SUM(SAL)     GRP_ID
---------- --------- ---------- ---------- ----------
        10 CLERK              1       1300          0
        10 MANAGER            1       2450          0
        10 PRESIDENT          1       5000          0
        10                    3       8750          1
        20 ANALYST            2       6000          0
        20 CLERK              2       1900          0
        20 MANAGER            1       2975          0
        20                    5      10875          1
        30 CLERK              1        950          0
        30 MANAGER            1       2850          0
        30 SALESMAN           4       5600          0
        30                    6       9400          1
           ANALYST            2       6000          2
           CLERK              4       4150          2
           MANAGER            3       8275          2
           PRESIDENT          1       5000          2
           SALESMAN           4       5600          2
                             14      29025          3

18 rows selected.

Example 5: CUBE with GROUPING_ID

 

One possible use of this function is shown in the following example.

SQL> SELECT DECODE(GROUPING_ID(deptno, job),
  2                1, 'Sub-total Department',
  3                2, 'Sub-total Job',
  4                3, 'Total',
  5                null) AS totals
  6        ,deptno
  7        ,job
  8        ,COUNT (*)
  9        ,SUM (sal)
 10  FROM emp
 11  GROUP BY CUBE (deptno, job);

TOTALS                   DEPTNO JOB       COUNT(*) SUM(SAL)
------------------------ ------ --------- -------- --------
                             10 CLERK            1     1300
                             10 MANAGER          1     2450
                             10 PRESIDENT        1     5000
Sub-total Department         10                  3     8750
                             20 ANALYST          2     6000
                             20 CLERK            2     1900
                             20 MANAGER          1     2975
Sub-total Department         20                  5    10875
                             30 CLERK            1      950
                             30 MANAGER          1     2850
                             30 SALESMAN         4     5600
Sub-total Department         30                  6     9400
Sub-total Job                   ANALYST          2     6000
Sub-total Job                   CLERK            4     4150
Sub-total Job                   MANAGER          3     8275
Sub-total Job                   PRESIDENT        1     5000
Sub-total Job                   SALESMAN         4     5600
Total                                           14    29025

18 rows selected.

Example 6: CUBE with DECODE of GROUPING_ID

 

Conclusion

The GROUPING_ID function allows the user to quickly and easily identify and analyze the super aggregate rows of GROUP BY expressions, and thus to display and evaluate the level of aggregation in a single column, saving space and making the result easier to read.

 

Markus Jägle e-mail: markus.jaegle@trivadis.com

Trivadis Projektentwicklung GmbH Tel: +49 761 45571 54

Sasbacher Str. 2 Fax: +49 761 45571 30

D-79111 Freiburg Internet: www.trivadis.com

Conditional compilation of package

This tip was found in the Oracle forums. With due respect to the author, I am replicating the idea of the message below:

 

Oracle introduced conditional compilation of packages and procedures with the 10g release.

 

Conditional compilation allows PL/SQL code to be tailored to specific environments by selectively altering the source code based on compiler directives. It is considered a new feature of Oracle 10g Release 2, but is available in Oracle 10g Release 1 (10.1.0.4.0).

 

Compiler flags are identified by the "$$" prefix, while conditional control is provided by the $IF-$THEN-$ELSE syntax.

 

$IF boolean_static_expression $THEN text
  [ $ELSIF boolean_static_expression $THEN text ]
  [ $ELSE text ]
$END

 

Below is a simple example of a package that uses conditional compilation. The objective is to invalidate the package without modifying its source or the underlying objects.

 

SQL> create or replace package p is $IF ($$x) $then INVALID $else $end end;
 
Package created.
 
SQL> select status from user_objects where object_name='P';
 
STATUS
-------
VALID
 
SQL> alter package p compile PLSQL_CCFLAGS='x:true';
 
Warning: Package altered with compilation errors.
 
SQL> select status from user_objects where object_name='P';
 
STATUS
-------
INVALID

 

Thus, without modifying the actual contents of the package, we have succeeded in invalidating it.
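A more typical use of the same mechanism (a sketch of my own, not part of the original tip) is switching debug output on and off at compile time. The flag name debug and the procedure name are arbitrary choices here:

```sql
CREATE OR REPLACE PROCEDURE log_msg (p_msg IN VARCHAR2) IS
BEGIN
  $IF $$debug $THEN
    DBMS_OUTPUT.PUT_LINE('DEBUG: ' || p_msg);  -- compiled in only when debug:true
  $ELSE
    NULL;                                      -- debug call compiled away entirely
  $END
END log_msg;
/

-- Recompile with the flag set to change behaviour without editing the source:
ALTER PROCEDURE log_msg COMPILE PLSQL_CCFLAGS = 'debug:true';
```

An undefined flag evaluates to NULL, which is treated as false, so the procedure also compiles cleanly when PLSQL_CCFLAGS is not set at all.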

 

Further reading:

Oracle Forum Topic

Oracle-Base

Oracle collections - Introduction

Collections

The online Oracle 9i PL/SQL User's Guide introduces collections as: "A collection is an ordered group of elements, all of the same type. It is a general concept that encompasses lists, arrays, and other familiar datatypes. Each element has a unique subscript that determines its position in the collection."

 

Retrieving Values

When you first start using PL/SQL, you generally end up declaring variables, then retrieving a value from a table to go into the variable. If you're going to retrieve a range of values from a table, you declare a cursor, retrieve the values one at a time from the database, and process them sequentially. Sometimes though, you want to retrieve a load of values all in one go, and load them into an array, so that you can carry out some sort of operation on the group of values as a whole.

 

PL/SQL Table

With Oracle 7, you could create an index-by table, or 'PL/SQL table', that consists of a series of value pairs: an index value and a scalar datatype (such as VARCHAR2 or NUMBER). You referred to an individual PL/SQL table entry by using the index, i.e. CUSTOMER_NAME(10). What made it interesting was that, as well as using scalar datatypes, you could also create PL/SQL tables of PL/SQL records, which could consist of a number of individual columns. By creating a PL/SQL record type based on an existing table (for example, by using SCOTT.EMP%ROWTYPE), you could load a table row, or an entire table, into a variable and process it within your PL/SQL package.

 

For example:

 

DECLARE
    TYPE EmpTabTyp IS TABLE OF emp%ROWTYPE
        INDEX BY BINARY_INTEGER;
    emp_tab EmpTabTyp;
BEGIN
    /* Retrieve employee record. */
    SELECT * INTO emp_tab(7468) FROM emp WHERE empno = 7468;
    ...
END;

 

The above example sets up a PL/SQL table type that consists of PL/SQL records with the same definition as the columns in SCOTT.EMP, and then creates a variable based on this type. It then retrieves a single row from the EMP table and places it into the EMP_TAB variable, using the index value of 7468. Note that you don't need to initialize the variable before it's used, and you can randomly put values into the variable using any valid BINARY_INTEGER value.

 

Collections and its composite types

As of Oracle 8, PL/SQL tables are renamed 'collections' and supplemented by two new composite types: nested tables and VARRAYs.

 

Nested tables

Nested tables extend the functionality of index-by tables by adding extra collection methods (known as table attributes for index-by tables), and, in a new development, nested tables can also be stored in database tables and directly manipulated using SQL. Collectively, both types are known as PL/SQL tables.

To declare a nested table, you use the following syntax (note the lack of 'INDEX BY BINARY_INTEGER'):

TYPE type_name IS TABLE OF element_type [NOT NULL];

 

Nested Tables and Index-By Tables

In practical terms, one major difference between nested tables and index-by tables is that you have to initialize nested tables using a constructor, defining how many elements are initially stored in them (although you can later EXTEND the nested table); however, as mentioned above, you can store nested tables within the database (embedded in database columns), which is a distinct advantage over index-by tables. So, if you want to put together a database that is object-oriented, and you need the data to be persistent, nested tables are the way to go.
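The constructor-and-EXTEND lifecycle described above can be sketched as follows; the type and variable names are illustrative, not from the original text:

```sql
-- Sketch: a nested table must be initialized with its constructor before use,
-- and grown with EXTEND before new slots can be assigned.
DECLARE
    TYPE NumTab IS TABLE OF NUMBER;     -- no INDEX BY clause: a nested table
    n NumTab := NumTab(10, 20, 30);     -- constructor creates three elements
BEGIN
    n.EXTEND;                           -- make room for a fourth element
    n(4) := 40;
    DBMS_OUTPUT.PUT_LINE(n.COUNT);      -- the collection now holds 4 elements
END;
/
```

Referencing n(4) before the EXTEND call would raise a SUBSCRIPT_BEYOND_COUNT exception, which is exactly the initialization overhead index-by tables do not have.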

 

VARRAYs

A VARRAY (variable-length array) is a data type that would be familiar to Java or C programmers. A varray has a maximum size, which you must specify in its type definition. Its index has a fixed lower bound of 1 and an extensible upper bound. Thus, a varray can contain a varying number of elements, from zero (when empty) to the maximum specified in its type definition.
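A minimal varray sketch (names are illustrative) shows the fixed maximum in action:

```sql
-- Sketch: a VARRAY declares its maximum size up front.
DECLARE
    TYPE Triple IS VARRAY(3) OF VARCHAR2(10);  -- at most three elements
    t Triple := Triple('a', 'b');              -- starts with two elements
BEGIN
    t.EXTEND;                                  -- grow to the third and last slot
    t(3) := 'c';
    -- a further t.EXTEND would fail here: the varray is already at its limit
END;
/
```

Unlike a nested table, a varray also keeps its elements densely packed in order, which is why it maps so naturally to arrays in other languages.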

 

Index-By Tables - Nested Tables - VARRAYS

So, how do you choose which of these three types (index-by tables, nested tables and varrays) to use? Arrays in other languages become VARRAYs in PL/SQL. Sets and bags in other languages become nested tables in PL/SQL. Hash tables and other kinds of unordered lookup tables in other languages become associative arrays (index-by tables) in PL/SQL.
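To round off the comparison, here is a hash-style lookup sketch using an associative array; the names are illustrative, and string keys require the VARCHAR2 INDEX BY form that arrived with Oracle 9i Release 2:

```sql
-- Sketch: an associative array used as an unordered lookup table.
DECLARE
    TYPE SalTab IS TABLE OF NUMBER INDEX BY VARCHAR2(30);
    sal SalTab;
BEGIN
    sal('KING')  := 5000;               -- no constructor, no EXTEND needed
    sal('SMITH') := 800;
    IF sal.EXISTS('KING') THEN
        DBMS_OUTPUT.PUT_LINE(sal('KING'));
    END IF;
END;
/
```

Note how sparse, out-of-order assignment just works: that convenience is the defining trait of the index-by family.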

 

Loading Table

The original method for loading the table as put forward in the question was:

 

CREATE TABLE t1 AS
SELECT * FROM all_objects WHERE 1=0;

CREATE OR REPLACE PROCEDURE test_proc IS
BEGIN
    FOR x IN (SELECT * FROM all_objects)
    LOOP
        INSERT INTO t1
        (owner, object_name, subobject_name, object_id,
         data_object_id, object_type, created, last_ddl_time,
         timestamp, status, temporary, generated, secondary)
        VALUES
        (x.owner, x.object_name, x.subobject_name, x.object_id,
         x.data_object_id, x.object_type, x.created,
         x.last_ddl_time, x.timestamp, x.status, x.temporary,
         x.generated, x.secondary);
    END LOOP;
    COMMIT;
END test_proc;

 

test_proc procedure

The test_proc procedure declares a cursor that points to the result set of SELECT * FROM ALL_OBJECTS. It starts at record one and inserts the columns of the first cursor row into the t1 table. It then loops back and gets the next row of data, until all rows from the cursor have been retrieved. The data is then committed, and the procedure ends.

 

Nested Table to hold data - bulk collect to load Data

The first solution put forward uses a nested table to hold the data from the ALL_OBJECTS table, and uses BULK COLLECT to load all of the source table's data into the nested table.

 

CREATE OR REPLACE PROCEDURE fast_proc (p_array_size IN PLS_INTEGER DEFAULT 100)
IS
    TYPE ARRAY IS TABLE OF all_objects%ROWTYPE;
    l_data ARRAY;

    CURSOR c IS SELECT * FROM all_objects;
BEGIN
    OPEN c;
    LOOP
        FETCH c BULK COLLECT INTO l_data LIMIT p_array_size;

        FORALL i IN 1..l_data.COUNT
            INSERT INTO t2 VALUES l_data(i);

        EXIT WHEN c%NOTFOUND;
    END LOOP;
    CLOSE c;
END fast_proc;
/

Ignore the fact that the type is called ARRAY - it declares a nested table, not a varray.

 

Bulk Loading Data into nested table

The second example answer is a variation on this that does much the same thing with slightly more compact code:

 

SQL> create or replace procedure fast_proc is
         type TObjectTable is table of ALL_OBJECTS%ROWTYPE;
         ObjectTable$ TObjectTable;
     begin
         select * BULK COLLECT INTO ObjectTable$
         from ALL_OBJECTS;

         forall x in ObjectTable$.First..ObjectTable$.Last
             insert into t1 values ObjectTable$(x);
     end;
/

 

Again, a nested table is declared, but this time the explicit cursor is dispensed with, and the data is just bulk loaded directly into the nested table. Again, the FORALL statement is used afterwards to run through the nested table. If you need to process lots of rows in one go, loading the data into memory first, use a collection and BULK COLLECT the data into it.

 

How to create view without underlying table

This tip will enable you to create a view even if the underlying table is not yet present in your database.

 

In this article you will learn to

  • Create a view without a table
  • Create a table for that view
  • Make the view work

 

Consider the following example:

 

CREATE OR REPLACE FORCE VIEW force_view AS SELECT * FROM force_table;

 

Now check whether the view is created or not:

 

SELECT object_name, object_type, status, temporary, generated, secondary FROM user_objects WHERE object_name='FORCE_VIEW';

 

OBJECT_NAME OBJECT_TYPE STATUS  TEMPORARY GENERATED SECONDARY
----------- ----------- ------- --------- --------- ---------
FORCE_VIEW  VIEW        INVALID N         N         N

 

So this confirms that the view has been created, but what about the status? It says the view is INVALID.

 

Now we will try to query the view:

 

SELECT * FROM force_view;

 

Error starting at line 1 in command:

select * from force_view

Error at Command Line:1 Column:14

Error report:

SQL Error: ORA-04063: view "RND.FORCE_VIEW" has errors

04063. 00000 -  "%s has errors"

*Cause:    Attempt to execute a stored procedure or use a view that has

           errors.  For stored procedures, the problem could be syntax errors

           or references to other, non-existent procedures.  For views,

           the problem could be a reference in the view's defining query to

           a non-existent table.

           Can also be a table which has references to non-existent or

           inaccessible types.

*Action:   Fix the errors and/or create referenced objects as necessary

 

This is the error I get when running the view. So now you can see that we are able to create the view, but it is useless until an underlying table is attached to it.

 

Now we will create the table for this view:

 

CREATE TABLE force_table (a NUMBER, b VARCHAR2(10));

 

Now we will check the status of the view again.

 

SELECT object_name, object_type, status, temporary, generated, secondary FROM user_objects WHERE object_name='FORCE_VIEW';

 

OBJECT_NAME OBJECT_TYPE STATUS  TEMPORARY GENERATED SECONDARY
----------- ----------- ------- --------- --------- ---------
FORCE_VIEW  VIEW        INVALID N         N         N

 

The status is still INVALID. Now we need to re-compile the view to make it VALID.

 

ALTER VIEW force_view COMPILE;

 

OBJECT_NAME OBJECT_TYPE STATUS  TEMPORARY GENERATED SECONDARY
----------- ----------- ------- --------- --------- ---------
FORCE_VIEW  VIEW        VALID   N         N         N