Control Flow vs Data Flow
Control Flow
|
Data Flow
| |
Step
|
1st step for creating package
|
2nd step for creating package, it will be part of control flow
|
Made up of
|
Containers and tasks connected with precedence constraints
|
Source, transformation and destinations
|
About
|
It is about data preparation and managing the process
|
It is about ETL activities
|
Smallest unit
|
Task
|
Component
|
Data movement
|
It does not move the data from one task to another task
|
It moves the data from source to destination
|
Processing
|
Task1 has to be completed before Task2 begins
|
Multiple components can be processed at the same time
|
Count
|
Only 1 control flow is available
|
There can be multiple data flow tasks inside control flow
|
Error output
|
Explicit Error output is not there but it can be achieved by editing precedence constraint
|
Error output is available in source/transformation/destination
|
Introduction to Control Flow and Data Flow
When someone learning the SSIS, he/she must face a common question like What is the difference between Control Flow and Data Flow. As the Control flow and the Data flow are two basic elements of SSIS.
We already get a glance about Control Flow and Data Flow in my previous article Understanding the BIDS and in this article now I am trying to differentiate between Control Flow and Data flow. Most of the SSIS developer spent a lot of time to dealing with these two elements.
So let's Start
When we open the Business Intelligence Development Studio or BIDS (in MS SQL Server 2005/2008 2008R2/SQL Server Denali (CTP3)) we find that there are two tabs in the name of Control Flow and Data Flow.
Control Flow
Control flow controls the package flow based on completions, success or tasks failure. The smallest units of the control flow are a task. It does not move data from one task to another. It just maintains the control of data flow. Tasks are run in series if connected with precedence or in parallel. Package control flow is made up of containers and tasks connected with precedence constraints to control package flow.
Data Flow
Data Flow deals with the Actual data movement. Here the multiple components can be process data at the same time. The smallest unit of the data flow is called components. The Data flow is made up of source (From where the data is collected, may be Excel, Flat file, FTP, SQL Server etc), Transform (Modification or manipulation of data, such as data type conversion, convert to smaller to upper etc) and the destination(where the data stores, may be SQL server etc).
So we can tell that the data flow is the child of the control flow. A SSIS package at least contains a control flow. Control Flow may or may not required a data flow.
Real life Example
Don't be afraid the bookish things. To understand it properly, here I am mention a simple scenario or story.
A post man named xyz deliver the postage door to door. Let's assume, he has 3 letter for person –A , B and C. Here he person B address is far from post office. The Person B address is between A and C and Person C address is near the post office. So the postman decide to go to the Person B address first and provide him the letter, then the person B and at the end person C.
Here the decision taken by the postmen how to distribute the postage material is the Control Flow and actual handover the postage material is the Data Flow.