Tuesday, August 6, 2013

T-SQL ROLLUP and CUBE

T-SQL ROLLUP and CUBE

Introduction

First of all I salute all the new and reward MVP in the month of April 2015. Congratulating them for their achievement. Hope the community can learn a lot of new thing from them.
After my blog related to “Group By Grouping Set” request came from my friends circle to completing the article by providing something related to CUBE, ROLLUP and COMPUTE. So in this article we are trying to learn something related to it. Hope it will be informative.

CUBE and ROLLUP
Generally CUBE and ROLLUP is used in reporting purpose and they do the Subtotal and Grand total. CUBE generates a result set that shows aggregates for all combinations of values in the selected columns and ROLLUP generates a result set that shows aggregates for a hierarchy of values in the selected columns.

Let’s take an Example to understand it
We have a table like this.

Class
Section
Roll
Marks
1
A
1
40
1
A
2
30
1
A
3
20
1
B
1
40
1
B
2
30
1
B
3
30
2
A
1
20
2
A
2
60
2
A
3
40
2
B
1
20
2
B
2
30
2
B
3
20

If we make the WITH ROLLUP we can get the Output like this

Class
Section
Marks

1
A
90

1
B
100

1
NULL
190
90 + 100
2
A
120

2
B
70

2
NULL
190
120 + 70
NULL
NULL
380
190 + 190



If we make the WITH CUBE we can get the Output like this

Class
Section
Marks

1
A
90

1
B
100

1
NULL
190
90 + 100
2
A
120

2
B
70

2
NULL
190
120 + 70
NULL
NULL
380
190 + 190
NULL
A
210
90   +  120
NULL
B
170
100 +  70

Let’s take a practical example

Step- 1 [ Create Base Table ]

IF OBJECT_ID(N'[dbo].[tbl_EXAMPLETABLE]', N'U')IS NOT NULL
   BEGIN
      DROP TABLE [dbo].[tbl_EXAMPLETABLE];
   END
GO

CREATE TABLE [dbo].[tbl_EXAMPLETABLE]
    (
        CLASS        INT       NOT NULL,
        SECTION      CHAR(1)   NOT NULL,
        ROLL         INT       NOT NULL,
        MARKS        INT       NOT NULL
   );   

Step-2 [ Insert Records in Base Table ]

INSERT INTO [dbo].[tbl_EXAMPLETABLE]
    (CLASS, SECTION, ROLL, MARKS)
VALUES(1,  'A', 1, 40),
      (1,  'A',   2, 30),
      (1,  'A',   3, 20),
      (1,  'B',   1, 40),
      (1,  'B',   2, 30),
      (1,  'B',   3, 30),
      (2,  'A',   1, 20),
      (2,  'A',   2, 60),
      (2,  'A',   3, 40),
      (2,  'B',   1, 20),
      (2,  'B',   2, 30),
      (2,  'B',   3, 20);


Step – 4 [ Make the WITH ROLLUP ]

SELECT   CLASS, SECTION, SUM(MARKS) AS MARKS
FROM     [dbo].[tbl_EXAMPLETABLE]
GROUP BY CLASS, SECTION WITH ROLLUP;

Output



Step-5 [ Make the WITH CUBE  ]

SELECT   CLASS, SECTION, SUM(MARKS) AS MARKS
FROM     [dbo].[tbl_EXAMPLETABLE]
GROUP BY CLASS, SECTION WITH CUBE;

Output





Some Useful Aggregation (WITH ROLLUP/WITH CUBE/GROUPING SETS)


Introduction

In this article I am trying to provide some example related to aggregation function.

WITH ROLLUP

WITH ROLLUP allows us to do the multiple level of aggregation within a single SQL statement. To understand it, I directly jumped to an example.
Create and Insert Records in a Table Objects

IF OBJECT_ID(N'dbo.tbl_EMPLOYEE',N'U') IS NOT NULL
   BEGIN
     DROP TABLE [dbo].[tbl_EMPLOYEE];
   END
GO

CREATE TABLE [dbo].[tbl_EMPLOYEE]
       (EMPID        INT   NOT NULL,
        EMPSALYEAR   INT   NOT NULL,
        EMPSAL       INT   NOT NULL);
GO

INSERT INTO  [dbo].[tbl_EMPLOYEE]
       (EMPID, EMPSALYEAR, EMPSAL)
VALUES (101, 2012, 2000),
       (101, 2013, 3000),
       (101, 2014, 4000),
       (102, 2012, 5000),
       (102, 2013, 5500),
       (102, 2014, 6000),
       (103, 2012, 1000),
       (103, 2013, 3000),
       (103, 2014, 7000);                  

SELECT * FROM [dbo].[tbl_EMPLOYEE]
GO

EMPID       EMPSALYEAR  EMPSAL
----------- ----------- -----------
101         2012        2000
101         2013        3000
101         2014        4000
102         2012        5000
102         2013        5500
102         2014        6000
103         2012        1000
103         2013        3000
103         2014        7000

Now we use Group By in it

SELECT EMPSALYEAR, SUM(EMPSAL) AS EMPSAL
FROM   [dbo].[tbl_EMPLOYEE]
GROUP BY EMPSALYEAR;

GO

EMPSALYEAR  EMPSAL
----------- -----------
2012        8000
2013        11500
2014        17000

Now we want a GRAND TOTAL at the end.
We can do it like this

SELECT CONVERT(VARCHAR, EMPSALYEAR) EMPSALYEAR,
       SUM(EMPSAL) AS EMPSAL
FROM   [dbo].[tbl_EMPLOYEE]
GROUP BY EMPSALYEAR
UNION ALL
SELECT 'GRAND TOTAL' EMPSALYEAR,
       SUM(EMPSAL) AS EMPSAL
FROM   [dbo].[tbl_EMPLOYEE]
GO

EMPSALYEAR                     EMPSAL
------------------------------ -----------
2012                           8000
2013                           11500
2014                           17000
GRAND TOTAL                    36500

But it is performance overhead for SQL Server.
We can do it easily by WITH ROLLUP

SELECT EMPSALYEAR, SUM(EMPSAL) AS EMPSAL
FROM   [dbo].[tbl_EMPLOYEE]
GROUP BY EMPSALYEAR
WITH ROLLUP
GO

EMPSALYEAR                     EMPSAL
------------------------------ -----------
2012                           8000
2013                           11500
2014                           17000
NULL                           36500

By using little bit formatting

SELECT CASE WHEN GROUPING(EMPSALYEAR) = 0
            THEN CONVERT(VARCHAR, EMPSALYEAR)
            ELSE 'GRAND TOTAL'
       END AS EMPSALYEAR, SUM(EMPSAL) AS Sales
FROM [dbo].[tbl_EMPLOYEE]
GROUP BY EMPSALYEAR WITH ROLLUP;

EMPSALYEAR                     Sales
------------------------------ -----------
2012                           8000
2013                           11500
2014                           17000
GRAND TOTAL                    36500

WITH CUBE

The WITH CUBE clause gives us to compute multiple levels of aggregation in a single statement.

SELECT EMPID, EMPSALYEAR, SUM(EMPSAL) AS EMPSAL
FROM [dbo].[tbl_EMPLOYEE]
GROUP BY EMPID, EMPSALYEAR WITH CUBE
GO

EMPID       EMPSALYEAR  EMPSAL
----------- ----------- -----------
101         2012        2000
102         2012        5000
103         2012        1000
NULL        2012        8000
101         2013        3000
102         2013        5500
103         2013        3000
NULL        2013        11500
101         2014        4000
102         2014        6000
103         2014        7000
NULL        2014        17000
NULL        NULL        36500
101         NULL        9000
102         NULL        16500
103         NULL        11000

New Syntax of WITH ROLLUP and WITH CUBE

SELECT EMPID, EMPSALYEAR, SUM(EMPSAL) AS Sales
FROM   [dbo].[tbl_EMPLOYEE]
GROUP BY ROLLUP(EMPID, EMPSALYEAR)
GO

EMPID       EMPSALYEAR  Sales
----------- ----------- -----------
101         2012        2000
101         2013        3000
101         2014        4000
101         NULL        9000
102         2012        5000
102         2013        5500
102         2014        6000
102         NULL        16500
103         2012        1000
103         2013        3000
103         2014        7000
103         NULL        11000
NULL        NULL        36500



SELECT EMPID, EMPSALYEAR, SUM(EMPSAL) AS Sales
FROM  [dbo].[tbl_EMPLOYEE]
GROUP BY CUBE(EMPID, EMPSALYEAR)
GO

EMPID       EMPSALYEAR  Sales
----------- ----------- -----------
101         2012        2000
102         2012        5000
103         2012        1000
NULL        2012        8000
101         2013        3000
102         2013        5500
103         2013        3000
NULL        2013        11500
101         2014        4000
102         2014        6000
103         2014        7000
NULL        2014        17000
NULL        NULL        36500
101         NULL        9000
102         NULL        16500
103         NULL        11000

GROUPING SETS

The GROUPING SETS syntax is more powerful.  It allows us to specify precisely which aggregations we want to compute.

SELECT EMPID, EMPSALYEAR, SUM(EMPSAL) AS EMPSAL
FROM   [dbo].[tbl_EMPLOYEE]
GROUP BY GROUPING SETS((EMPID, EMPSALYEAR), (EMPID),())
GO

EMPID       EMPSALYEAR  EMPSAL
----------- ----------- -----------
101         2012        2000
101         2013        3000
101         2014        4000
101         NULL        9000
102         2012        5000
102         2013        5500
102         2014        6000
102         NULL        16500
103         2012        1000
103         2013        3000
103         2014        7000
103         NULL        11000
NULL        NULL        36500

SELECT EMPID, EMPSALYEAR, SUM(EMPSAL) AS EMPSAL
FROM   [dbo].[tbl_EMPLOYEE]
GROUP BY GROUPING SETS((EMPID, EMPSALYEAR), (EMPID),(EMPSALYEAR), ())
GO

EMPID       EMPSALYEAR  EMPSAL
----------- ----------- -----------
101         2012        2000
102         2012        5000
103         2012        1000
NULL        2012        8000
101         2013        3000
102         2013        5500
103         2013        3000
NULL        2013        11500
101         2014        4000
102         2014        6000
103         2014        7000
NULL        2014        17000
NULL        NULL        36500
101         NULL        9000
102         NULL        16500
103         NULL        11000



Hope you like it.




WITH ROLLUP


ROLLUP operator is used to generate subtotals and totals. The ROLLUP generates a result set that shows aggregate for a hierarchy of values in the selected columns.
Without going in the theoretical details we are going to an example to understand the power of ROLLUP operators.

Step-1 [ Create a Table ]
CREATE TABLE #my_tempTab
           (Level   VARCHAR(50) NOT NULL,
            sName   VARCHAR(50) NULL,
            Marks   INT         NULL)

Step-2 [ Now Insert data in it ]
INSERT INTO  #my_tempTab
             (Level,  sName,  Marks)
VALUES       ('A', 'Joydeep', 30),
             ('B', 'Tufan', 30), 
             ('C', 'Sangram', 30), 
             ('A', 'Sudip', 30), 
             ('B', 'Mayank', 30), 
             ('C', 'Tanay', 30), 
             ('A', 'Subrata', 30), 
             ('B', 'Saikat', 30), 
             ('C', 'Partha', 30), 
             ('A', 'Sukamal', 30)

Step-3 [ Now Use GROUP BY Clause ]
SELECT Level,  SUM(Marks)  Marks
FROM   #my_tempTab                              
GROUP BY Level
ORDER BY 1
Output:
 Level      Marks
 A          120
 B          90
 C          90

Step-4 [ Now we create a report with Individual Level SUM and with this we need Grand Total ]
SELECT ISNULL(Level, 'GRAND TOTAL') Level,SUM(Marks)  Marks
FROM   #my_tempTab                              
GROUP BY Level WITH ROLLUP
ORDER BY 1
Output:
Level                                     Marks
A                                             120
B                                             90
C                                             90
GRAND TOTAL                       300

So we understand that how the ROLLUP works.

Hope you like it.