Aggregate Transformation & Multicast Transformation
Aggregate Transformation
- It is same as GROUP BY in SQL
- Using this Transformation, we can get COUNT(*), COUNT DISTINCT, MAX, MIN, AVE, SUM
- Column can be part of GROUP BY / AGGREGATION Function
Multicast Transformation
- It distributes input into more than one output
- Unlike Conditional Split, it is not splitting the output, instead it is sending the all the records to all output branches
EXAMPLE
Pre-requisite
Following script has to be executed
CREATE TABLE AggregateDemo (ADID INT IDENTITY(1,1), StudentID INT,Subject varchar(10), Mark int)
GO
insert into AggregateDemo (StudentID,Subject,Mark)
values (1,'Maths',100),
(1,'Science',100),
(1,'English',90),
(2,'Maths',99),
(2,'Science',99),
(2,'English',95),
(3,'Maths',95),
(3,'Science',100),
(3,'English',98)
GO
CREATE TABLE AggregateDemo_TotalMarkPerStudent (StudentID INT,TotalMark int)
GO
CREATE TABLE AggregateDemo_MaxMarkPerSubject (Subject varchar(10),MaxMark int)
GO
Steps
1. Add Data Flow Task in Control Flow
2. Drag source in DFT and it should point to AggregateDemo
3. Add Multicast Transformation and connect it with source
4. Add 2 Aggregate Transformations and connect them with Multicast Transformation
5. In 1st Aggregate, do following settings.
6. In 2nd Aggregate, do following settings.
7. Add 2 destinations and connect them with 2 aggregate transformations. These 2 dimensions should point to AggregateDemo_TotalMarkPerStudent & AggregateDemo_MaximumMarkPerSubject respectively.
8. Overall package should look like this.
9. Execute the package. It will look like this.
10. Here you can see that Multicast is transforming all 9 rows in 2 output branches
SSIS Aggregate Data Flow Transformation
As all SQL Developers knows that the Aggregate function performs very important roles in SQL statement. The aggregate function works on Attributes or Columns on a Table Objects. Here in this article we are trying to discuss about the Aggregate Data Flow Transformation control. It works on activity like SUM, AVERAGE, and GROUP BY etc.
Case Study
We have MS SQL Table objects namedtbl_STUDTMARKSDTLS it contains the student details and there subject marks details. It also contains the Total marks columns or attributes. That contains the total of each subject marks.
By using Aggregate Data Flow Transformation control we want to find the Count, Sum, Min, Max of the Total marks and want to store it on a table object named tbl_STUDTMARKS_STATS.
Aggregate Function Details
Aggregate Function
|
Details
|
AVERAGE
|
Give the Average value of the Columns or Attributes value
|
GROUP BY
|
Derived the dataset into Group
|
SUM
|
Give the SUM value of the Columns or Attributes value
|
COUNT
|
Give the number of item in Groups
|
COUNT DISTINCT
|
Give the Unique NON NULL Item
|
MINIMUM
|
Give the Minimum Number in a Group
|
MAXIMUM
|
Give the Maximum Number in a Group
|
How we configure it
Step-1 [ The Source and Destination Table Objects ]
-- Creating Base table
IF OBJECT_ID(N'tbl_STUDTMARKSDTLS', N'U') IS NOT NULL
BEGIN
DROP TABLE tbl_STUDTMARKSDTLS;
END
GO
CREATE TABLE tbl_STUDTMARKSDTLS
(STUDID INT NOT NULL IDENTITY PRIMARY KEY,
STUDNAME VARCHAR(50) NOT NULL,
ENGMARKS DECIMAL(20,2)NOT NULL,
MATHMARKS DECIMAL(20,2)NOT NULL,
BIOLOGYMARKS DECIMAL(20,2)NOT NULL,
TOTALMARKS DECIMAL(20,2)NOT NULL);
GO
-- Inserting Records
INSERT INTO tbl_STUDTMARKSDTLS
(STUDNAME, ENGMARKS, MATHMARKS, BIOLOGYMARKS, TOTALMARKS)
VALUES ('Student-A', 70.00, 90.00, 67.00, 227.00),
('Student-B', 75.00, 95.00, 62.00, 232.00),
('Student-C', 35.00, 55.00, 22.00, 112.00);
GO
-- The Student details records
SELECT STUDNAME, ENGMARKS, MATHMARKS, BIOLOGYMARKS, TOTALMARKS AS TOTAL
FROM tbl_STUDTMARKSDTLS;
STUDID STUDNAME ENGMARKS MATHMARKS BIOLOGYMARKS TOTAL
1 Student-A 70.00 90.00 67.00 227.00
2 Student-B 75.00 95.00 62.00 232.00
3 Student-C 35.00 55.00 22.00 112.00
-- Destinaton Table Object
IF OBJECT_ID(N'tbl_STUDTMARKS_STATS', N'U') IS NOT NULL
BEGIN
DROP TABLE tbl_STUDTMARKS_STATS;
END
GO
CREATE TABLE tbl_STUDTMARKS_STATS
( COUNT_TOTALMARKS DECIMAL(20,2)NOT NULL,
SUM_TOTALMARKS DECIMAL(20,2)NOT NULL,
MIN_TOTALMARKS DECIMAL(20,2)NOT NULL,
MAX_TOTALMARKS DECIMAL(20,2)NOT NULL);
GO
Step-2 [ SSIS Data Flow Task Details ]
Step-3 [ OLE DB Source Editor Configuration ]
Step-4 [ Aggregate Transformation Editor Configuration ]
Step-5 [ OLE DB Destination Editor Configuration ]
Step-6 [ Run the Package ]
SELECT * FROM tbl_STUDTMARKS_STATS;
COUNT_TOTALMARKS SUM_TOTALMARKS MIN_TOTALMARKS MAX_TOTALMARKS
3.00 571.00 112.00 232.00
Hope you like it.