Tuesday, August 18, 2015

Aggregate Transformation & Multicast Transformation

Aggregate Transformation & Multicast Transformation


Aggregate Transformation

- It is same as GROUP BY in SQL
- Using this Transformation, we can get COUNT(*), COUNT DISTINCT, MAX, MIN, AVE, SUM
- Column can be part of GROUP BY / AGGREGATION Function

Multicast Transformation

- It distributes input into more than one output
- Unlike Conditional Split, it is not splitting the output, instead it is sending the all the records to all output branches

EXAMPLE

Pre-requisite

Following script has to be executed

CREATE TABLE AggregateDemo (ADID INT IDENTITY(1,1), StudentID INT,Subject varchar(10), Mark int)
GO
insert into AggregateDemo (StudentID,Subject,Mark)
values (1,'Maths',100),
(1,'Science',100),
(1,'English',90),
(2,'Maths',99),
(2,'Science',99),
(2,'English',95),
(3,'Maths',95),
(3,'Science',100),
(3,'English',98)
GO
CREATE TABLE AggregateDemo_TotalMarkPerStudent (StudentID INT,TotalMark int)
GO
CREATE TABLE AggregateDemo_MaxMarkPerSubject (Subject varchar(10),MaxMark int)
GO


Steps


1. Add Data Flow Task in Control Flow
2. Drag source in DFT and it should point to AggregateDemo
3. Add Multicast Transformation and connect it with source
4. Add 2 Aggregate Transformations and connect them with Multicast Transformation
5. In 1st Aggregate, do following settings.


6. In 2nd Aggregate, do following settings.


7. Add 2 destinations and connect them with 2 aggregate transformations. These 2 dimensions should point to  AggregateDemo_TotalMarkPerStudent & AggregateDemo_MaximumMarkPerSubject respectively.

8. Overall package should look like this.

9. Execute the package. It will look like this.
10. Here you can see that Multicast is transforming all 9 rows in 2 output branches





SSIS Aggregate Data Flow Transformation

Introduction
As all SQL Developers knows that the Aggregate function performs very important roles in SQL statement. The aggregate function works on Attributes or Columns on a Table Objects. Here in this article we are trying to discuss about the Aggregate Data Flow Transformation control. It works on activity like SUM, AVERAGE, and GROUP BY etc.

Case Study
We have MS SQL Table objects namedtbl_STUDTMARKSDTLS it contains the student details and there subject marks details. It also contains the Total marks columns or attributes. That contains the total of each subject marks.

 By using Aggregate Data Flow Transformation control we want to find the Count, Sum, Min, Max of the Total marks and want to store it on a table object named tbl_STUDTMARKS_STATS.

Aggregate Function Details

Aggregate Function
Details
AVERAGE
Give the Average value of the Columns or Attributes value
GROUP BY
Derived the dataset into Group
SUM
Give the SUM value of the Columns or Attributes value
COUNT
Give the number of item in Groups
COUNT DISTINCT
Give the Unique NON NULL Item
MINIMUM
Give the Minimum Number in a Group
MAXIMUM
Give the Maximum Number in a Group


How we configure it

Step-1 [ The Source and Destination Table Objects ]

-- Creating Base table
IF OBJECT_ID(N'tbl_STUDTMARKSDTLS', N'U') IS NOT NULL
   BEGIN
       DROP TABLE tbl_STUDTMARKSDTLS;
   END
GO
CREATE TABLE tbl_STUDTMARKSDTLS
       (STUDID          INT          NOT NULL IDENTITY PRIMARY KEY,
        STUDNAME        VARCHAR(50)  NOT NULL,
        ENGMARKS        DECIMAL(20,2)NOT NULL,
        MATHMARKS       DECIMAL(20,2)NOT NULL,
        BIOLOGYMARKS    DECIMAL(20,2)NOT NULL,
        TOTALMARKS      DECIMAL(20,2)NOT NULL);
GO
-- Inserting Records
INSERT INTO tbl_STUDTMARKSDTLS 
       (STUDNAME, ENGMARKS, MATHMARKS, BIOLOGYMARKS, TOTALMARKS)
VALUES ('Student-A', 70.00, 90.00, 67.00, 227.00),
       ('Student-B', 75.00, 95.00, 62.00, 232.00),               
       ('Student-C', 35.00, 55.00, 22.00, 112.00);
GO      
-- The Student details records
SELECT  STUDNAME, ENGMARKS, MATHMARKS, BIOLOGYMARKS, TOTALMARKS AS TOTAL
FROM    tbl_STUDTMARKSDTLS;

STUDID STUDNAME    ENGMARKS    MATHMARKS    BIOLOGYMARKS              TOTAL
1              Student-A           70.00                     90.00                     67.00                                     227.00
2              Student-B           75.00                     95.00                     62.00                                     232.00
3              Student-C           35.00                     55.00                     22.00                                     112.00

-- Destinaton Table Object
IF OBJECT_ID(N'tbl_STUDTMARKS_STATS', N'U') IS NOT NULL
   BEGIN
     DROP TABLE tbl_STUDTMARKS_STATS;
   END
GO  
CREATE TABLE tbl_STUDTMARKS_STATS
      ( COUNT_TOTALMARKS  DECIMAL(20,2)NOT NULL,
        SUM_TOTALMARKS    DECIMAL(20,2)NOT NULL,
        MIN_TOTALMARKS    DECIMAL(20,2)NOT NULL,
        MAX_TOTALMARKS    DECIMAL(20,2)NOT NULL);
GO         

Step-2 [ SSIS Data Flow Task Details ]


Step-3 [ OLE DB Source Editor Configuration ]


Step-4 [ Aggregate Transformation Editor Configuration ]


Step-5 [ OLE DB Destination Editor Configuration ]



Step-6 [ Run the Package ]


SELECT * FROM tbl_STUDTMARKS_STATS;

COUNT_TOTALMARKS  SUM_TOTALMARKS       MIN_TOTALMARKS        MAX_TOTALMARKS
3.00                                        571.00                                   112.00                                   232.00



Hope you like it.