Friday, July 17, 2015

Container in SSIS

SSIS Control Flow Containers


Introduction


To continuing my journey on SSIS, this article contains related to Control Flow Containers. It is important to understand the containers of control flow. As per me very SSIS developer must have a solid concept related to the Containers of Control flow.


Type of Control Flow

There are three primary types of control flow objects.

1.    Control flow Tasks: Workflow objects that perform operational-level jobs.
2.    Control flow Containers: It’s grouping mechanisms for tasks and other containers.
3.    Constraints: Allow to connects tasks and define execution ordering and precedence.       

Control flow containers

When we open the SSIS BIDS for control flow tables and the Control Flow Items in the tools box we can see the three containers placed at the top of the others.



Microsoft places it as the top positions they are frequently used by the SSIS Developers.

Why it’s needed

A SSIS package must contains a single control flow tasks. However most of the time a package contains several control flow tasks and they coordinate each other. So in this condition we must organize the tasks and for that we use the control flow containers. So a control flow contains used to group the tasks together. The control flow containers group the tasks together for parallelization, ordering, logging and transactions.

We can think the control flow containers as a folder and contains the other sub folder and file. Here the control flow containers contain different tasks like file in the folder and others containers like subfolder within folder.

How to add containers

To add a control flow containers just drag it from control flow item tool bars to control flow work space of BIDS.


Type of Control Flow Containers
There are three primary types of control flow containers

1    .    Sequence Containers
2    .    For Loop Containers
3    .    ForEach Loop Containers


      Sequence Containers

      Most commonly used containers by SSIS developer. It’s used to organize subordinate tasks by grouping them together and let us applies transactions or assigns logging to the containers.

       For Loop Container

     Provide the same functionality as sequence container except that they can let us run the tasks within them multiple times, based on any condition. Think it as a loop. For example for loop that runs from 1 to 100 times.

      ForEach Loop Conatiners

      Allow us looping but instead of providing a condition expression we loop over a set of objects such as files in a folder.

     If we takes a simple example think we have 10 clients and they upload there data to an FTP server by 10 different txt file. The SSIS program must check the format of each txt file before read and store the data in Table objects of MS SQL Server.

Common Properties of Containers

There are some common properties of all the containers are mentioned bellow. We can find the details in MSDN http://msdn.microsoft.com/en-us/library/ms137728(v=sql.100).aspx

Let’s take a simple example to understand the Control Flow containers.

Example of Sequence Containers
1    .    Drag the Sequence Containers from Control Flow Items tools bar to Control Flow work 
         space. Drag 3 Execute SQL Task to the Sequence Containers.
2    .    Edit the Execute SQL Task by right click each tasks and select Edit from popup menu.
3    .    In SQL StatementàConnection Select the OLEDB Connection and in SQL Statement  
         property just Write SELCT 1.








4   .    In the 2nd Execute SQL Tasks for the property SQL statement write something which in not a 
        sql statement to create Error.
5   .    Now go to the property of the Sequence containers and set the transaction option property to 
        Required.
6   .    Now run the package.



Related Tropics


1    .    Understand the BIDS


2   .    SSIS Connection Manager
        http://www.sqlknowledgebank.blogspot.in/2013/02/ssis-connection-manager.html

3   .    Introduction to Control Flow and Data Flow
        http://www.sqlknowledgebank.blogspot.in/2013/02/introduction-to-control-flow-and-data.html




Description – Moving ahead from Import and Export Wizard, we will now discuss on the topicContainers in SSIS. In my previous post on SSIS and It’s Architecture, I discussed about Packages, Control and Data flow, Connection Managers, Containers, etc. Let’s recall some of the concepts and wrap it shortly:-
  1. Package: – A basic unit of Design, Deployment and Execution.
  2. Connection Managers: – A logical representation of connections stored in the package. These cannot be shared between the packages.
  3. Data Flow: – It is defined as the complete process of Extracting data into server’s memory, transforming it and Load the transformed data onto Destination.
  4. Control Flow: – It acts as the brain of a package. A package consists of a single control flow. It contains Containers, Tasks, Variables, Precedence constraints and Event handlers.
So, we are clear with the basics now. For further details, you can jump to my article on SSIS and its Architecture .  Now, let’s unite our concentration on the main point- Containers. They are classified as follows:-
  1. Task Host Containers: – This is an abstract concept like an interface. They are not visible in our toolbox and provides services to a single task.
  2. Sequence Container: – It allows you to group tasks into logical subject areas. Within the development environment, you can then collapse or expand this container for usability.
  3. For Loop Container: – The basic function of this is to loop through a series of tasks contained in a container until a condition is met or for predetermined number of times.
  4. For Each Loop Container: – It Loops through a series of files or records in a data set, and then execute the tasks in the container for each record in the collection.
Each container has some common properties that affect the usage of these features. Understanding these properties and what they do helps a lot in the developing SSIS packages.
PropertyDescription
DelayValidationA Boolean value that indicates whether validation of the container is delayed until run time
DisableA Boolean value that indicates whether the container runs
DisableEventHandlersA Boolean value that indicates whether the event handlers associated with the container run
FailPackageOnFailureA Boolean value that specifies whether the package fails if an error occurs in the container.
FailParentOnErrorA Boolean value that specifies whether the parent container fails if an error occurs in the container.
IsolationLevelThe isolation level of the container transaction. The values are Unspecified, Chaos, ReadUncommitted, ReadCommitted, RepeatableRead, Serializable, and Snapshot.
MaximumErrorCountThe maximum number of errors that can occur before a container stops running.
TransactionOptionThe transactional participation of the container. The values are NotSupported, Supported, Required.
We will cover all the above mentioned Containers by implementing them separately with reference to simple examples. I am mentioning the links below, Click on the respective link who’s Implementation you want to see :-
Click Here for the   ==>  Implementation of SEQUENCE Container in SSIS.
With this we complete our post on Container in SSIS. I hope you liked it. Please provide your feedback as comments below.




Foreach Loop Container



Description – Earlier in SQL Server 2000, Data Transformation Service (DTS) was used to loop overdata files of a given type present in a directory and to import them into the destination (In simple words, moving or copying files based on the file name from one directory to another) was very hard to perform. But with the release of SQL Server 2005, this has been solved by adding Foreach Loop Container. So, now you would be having an idea what this container performs and we will learn how it performs that.
Example Scenario :- We will create one Source folder containing some data files (say text files) in it. Then we will create a package and with the help of Foreach Loop Container, these files will be copiedto another folder which will act as our Destination.
STEPS TO FOLLOW :-
Step 1. Create a Source folder (I have created a folder named Chander) containing some text files and Create an empty Destination Folder where you want to copy these files.
Step 2. Create a package using Business Intelligence Development Studio (BIDS).
Step 3. Create a variable with the name Filename with Scope-Package, Data type-String and Value-Default. This variable will hold the name of the file that SSIS is working on during each iteration of the loop.Variable for Foreach loop container
Step 4. Next, drag a Foreach Loop Container onto the Control Flow and double click on the container to configure it. Go to Collection tab, and specify the following fields as:-
  • Select the option Foreach File Enumerator under Foreach Loop Editor.
  • In the Enumerator configuration group, Go to folder field and click Browse and set the folder property to the source folder that has data files in it.
  • Set the Files property to default i.e. < *.*>.
Collection Tab for Foreach loop container
Step 5. Go to Variable mapping tab, select the variable (in this example it’s Filename) you created earlier from the Variable dropdown box, and then accept the default of 0 for the index. Click OK to save the settings and return to the Control Flow tab in the Package Designer.
Variable tab for Foreach loop container
Step 7. Drag a File System Task into the container’s box. Double-click the new task to configure it in the editor that pops up as:-
  • Set the operation field to Copy file.
  • Select <New Connection> for the Destination Connection property. When the Connection Manager dialog opens, select Existing Folder (as we have created Destination folder already) and Browse the folder
  • Set the IsSourcePathVariable property field to True.
  • Set the Source Variable field to User::FileName.
Destination Connection for Foreach loop container
Step 8. We are now ready to strike the execute button to see how the files are transferred from one directory to another directory. During the execution of our package, we will see each file beingpicked up from the Source folder and getting copied to the Destination folder.
NOTE :-  If you had set the Overwrite Destination property to True in the File System Task, the file would be overwritten if there was a conflict of duplicate filenames.
Execute for Foreach loop container
With these we finish our post on Implementing Foreach Loop Container in SSIS. I hope you liked my article. Please share your feedback below as Comments.

Sequence Container:


Description – Sequence container group related tasks in a package to show what the complex package is doing in a clear and simple way. The task of Sequence container is to have multiple separate control flows group together in a SSIS package. Each container will contain one or more tasks and will run within the control flow of overall package. Its not at all compulsory that every package should use Sequence container but there are some benefits of using this container. Some of them are described below:-
Benefits:-
  1. They can be huge helpful when developing and debugging SSIS packages.
  2. If a package has many tasks then it is easier to group the tasks in Sequence Containers and you can collapse and expand this container for usability.
  3. Instead of setting property for each individual task, we group tasks together that require similar property settings.
  4. Providing scope for variables that a group of related tasks and containers use.
  5. If one task fails to succeed inside the container then the process is aborted and all the tasks that were completed successfully get rolled back for that container. This depends on Transaction option property which can be Required, Supported and Not Supported according to your configuration ( By default it’s Supported). It means it creates a Transaction around the components inside Container.
So, now let’s learn how to Implement Sequence Container. Example Scenario:– We will create one table named Friends in SSMS and one package in BIDS. Then we will insert some rows into our table using Multiple Execute SQL Tasks contained in a Sequence container. STEPS TO FOLLOW :- STEP 1.  Let’s create a table, say Friends with the fields name and location. Open SQL Server Management Studio (SSMS) and Click on New Query. Select your present working database (I am using Chander as my database) and write the following script in it :-
database
STEP 2. Drag 4 different Execute SQL Tasks and Rename them to Insert_1, Insert_2, Insert_3 and Insert_4 respectively by double clicking the Execute SQL Task. Configure the SQL Task Editor as:-
  • Set Connection field to your working database (I am using Chander as my database).
  • Click on SQL statement field and write the following queries inside the respective Execute SQL Tasks. I am showing for Insert_1 Execute SQL Task. You can perform the same for other remaining Execute SQL Tasks.
 For Insert_1 Execute SQL Task :-
 Insert
For Insert_2 Execute SQL Task :-
For Insert_3 Execute SQL Task :-
For Insert_4 Execute SQL Task :-
NOTE :- We are inserting null in the location field for Insert_4 Execute SQL Task. The purpose is that this task will fail as we have mentioned not null for the Location field in our Friends Table (STEP 1). This will halt the execution of our package which we will see when we will execute it. STEP 3. Now, click on Sequence container and Drag it onto Control flow pane. Rename this container to “Create table and Insert all rows if succeed else Rollback”.
Transaction_property
STEP 4. Now, put all the 4 Execute SQL Tasks into the Sequence container. By Defualt Transaction option property for the Container is Supported. Set this property to Required and execute the package.NOTE:–  Before I could even get the package to fail for the reason I wanted to I got this error. Error:-The SSIS Runtime has failed to start the distributed transaction due to error 0x8004D01B “The Transaction Manager is not available.”. The DTC transaction failed to start. This could occur because the MSDTC Service is not running. Sol:- So to solve this error and to enable the service on my Machine, Try these steps :- Control Panel ->Administrative Tools ->Component Services ->Computers ->My Computer ->Properties and start the Distributed Transaction Coordinator and if it is already running then stop it and start again.
error
STEP 5. As expected, 4th Insert fails and the container fails too. So, it will not create any table in our database. If we go to our Insert_4 Execute SQL Task and specify location for that field  then it willexecute our container. So, let’s make it work.
executed
STEP 6. It turns to Green, means it really does work. As long as the tasks is set to supported, and the items inside the child package are set to supported as well they will inherit the transaction created by the parent container. Let’s check our database to see the table created by the package.
inserted_data
With this we are finished with Implementing Sequence container in SSIS. There are many ways you can try, to play with this depending upon your requirement as well as your imagination. I hope this was not difficult to implement and made you realized how Sequence container can be used to perform number of things. For more information, I would like you to go through this helpful article – Different ways of using Sequence Container.


For Loop Container:



Description – Before implementing the For Loop Container, I would like to tell you that i like this container the most out of all the mentioned Containers. Let’s not waste much of our time and get started with the example on For Loop Container.
Example Scenario :-We will create a table with the name Data having fields Sr_No and Date. Then, with the help of For Loop Container, we will increment the values of  Sr_No from 1 to 5 (iterative process) and insert them into our Data table.
STEPS TO FOLLOW :-
STEP 1. Open SQL Server Management Studio (SSMS) and Create a table named Data with fields as Sr_No and Date, in your working database (in my case it’s chander)  using the simple script :-
Database
STEP 2. Open Business Intelligence Development Studio (BIDS). Now, Create a package and edit its name to cK_For_Loop_Container.dtsx (You can have any desired name).
start
STEP 3. Add a New OLEDB Connection to the Connection Manager. Click New button and configure OLEDB Connection Mangaer  pointing to your working database (as in this example, I am usingchander as my database) and Click OK.
oledb
STEP 4. Open the cK_For_Loop_Container.dtsx package, Create a new Variable, and name it (say Sr_No).
  • To open the Variables window, Right click in the design pane and select Variables or click on the Variables icon on the top right of your Package designer screen. Once the window is open, click the Add Variable button. Accept defaults for the Variable i.e Data Type – int32, Scope – Package level and a value  to be 1.
variable
STEP 5. Go to For Loop Container in the toolbox and Drag it to the Control Flow and Double click it to open the editor. It will have the following fields:-

(1). InitExpression :- This expression is Optional and evaluated once at the beginning only. It is used toinitialize a variable that will be used in the For Loop Container.
  • Set the InitExpression option to – @Sr_No= 1. This will initialize the loop by setting the Counter variable to 1.
(2). EvalExpression  :-  This expression is Required and also evaluated before any work is performed inside the container. This is the expression that determines if the loop continues or terminates. If the expression entered evaluates to TRUE, the loop executes again. If it evaluates to FALSE, the loop ends.
  • Set the EvalExpression option to – @Sr_No <=5.
(3). AssignExpression :– This is the last expression used in For Loop and is optional. It changes the value of the variable used in the EvalExpression.
  • Set it to – @Sr_No = @Sr_No + 1 for the AssignExpression and Click OK.
for loop_1
STEP 6. Drag a Script Task into the For Loop Container and double-click the script task to edit it. In the General tab, name the task Pop up the message.
genera_tab
STEP 7. In the Script tab, set the Read Only Variables to Sr_No. Finally, click Edit Script to open the Visual Studio designer. By typing Sr_No for that option, you’re going to pass in the variable parameter to be used by the Script Task.
script_tab
STEP 8. When you click Edit Script, the Visual Studio design environment will open. Replace the Main () subroutine with the following code. This code will read the variable and pop up a message box that displays the value of the variable:-
  • For C#
  • For VB
Save and exit the Visual Studio design environment, then Click OK to exit the Script Task.
STEP 9. Drag an ExecuteSQLtask into the For Loop container and double click the task to edit it.
  • Set the Name as Insert Sr_No values into DB.
  • Set the Description as Execute SQL Task.
  • Set Connection Type as OLEDB and set Connection to the connection created in STEP 3.
  • Set SQLSourceType as Direct Input and in the SQLStatement, put the following insert statement :-     Insert Into dbo.Data Values (?,?)
execute_sql
STEP 10. Inside the Execute SQL Task editor, Go to Parameter mapping tab and create two parameters as shown in the figure.
parameter
STEP 11. Connect the success precedence constraint from Script task to the ExecuteSQLtask.
final
STEP 12. Execute the package, you should see message box popup with each new value of Sr_No variable. That is, you should see five pop-up boxes one at a time, starting at iteration 1 and proceeding through iteration 4. Only one pop-up will appear at any given point. After the loop is complete, the For Loop Container, the ScriptTask and the ExecuteSQL task will all be green. Check the rows in the Data table.
Select * from dbo.Data
Data


This completes the implementation of For Loop Container. I hope you find this post interesting. Your comments are invited and I will be happy to solve your queries.


Foreach Loop Container in SSIS


- ForEach Loop Conainter works same as For Loop Container, but only difference is - it will loop the flow
  for each item within a collection
- Here collection can be set of files, ADO.Net resultset or anything

EXAMPLE

Pre-requisite
- Have 3-4 .txt files with simple ID,Name column and 2-3 rows in each file.
- You can put those files in different folder also
- In database, execute the following code


CREATE TABLE ForEachLoopDemoFiles (ID INT, FilePath VARCHAR(255),FileName VARCHAR(50))
Go
INSERT INTO ForEachLoopDemoFiles  (ID, FilePath, FileName)
VALUES (1,'C:\ForEachLoopContainerDemo\Folder1','a.txt'),
(2,'C:\ForEachLoopContainerDemo\Folder2','b.txt'),
(3,'C:\ForEachLoopContainerDemo\Folder3','c.txt')
Go

create TABLE ForEachLoopDemoLoad (ID int, Name varchar(10),FileInfo nvarchar(255))


How to deal with it?

- Create 3 variables.


 - In control flow task, create 1 Execute SQL Task which goes further to For Each Loop Container which has internally Data Flow Task

- In Execute SQL Task, set the following property. Resultset should be Full result set


 - In Execute SQL Task, set the result set like this. So whole result set will be stored in FileInfo object variable

 - In Foreach loop editor, set the following properties

 - In Foreach loop editor, few more settings in Variable Mappings


 - Data Flow Task should look like this
- In Flatfile connection manager, go to properties --> Expression --> ConnectionString and set following value.

@[User::FilePath] + "\\" +  @[User::FileName]

We can use Expression builder also for this.

- Derived column we can set like this.


- In Destination, we can do normal mapping
- Run the package and check the result in "ForEachLoopDemoLoad" table




For Loop Container of SSIS



- For Loop Conainter is a set of multiple task or other container
- It will loop the flow based on the condition given
- It works exactly as For Loop of .Net
- 3 properties need to be set
--> InitExpression  : @couter = 1
--> EvalExpression : @counter <= 10
--> AssignExpression : @counter = @counter + 1




Foreach Loop with Dynamic Flat File Connection


Introduction

One of my friends working with SSIS told me that he is using a Foreach Loop Containers to connect with a flat file from a specified directory. The directory contains more than one flat file with the name with date. That means the flat file name comes with date extension like this File2652013.txt and each day one flat file is stored within this directory. He needs to read the current dated flat file only.

To do this he use one of my preciously posted article named "SSIS Dynamic Flat file Connection" (Link: http://www.sqlknowledgebank.blogspot.in/2013/05/ssis-dynamic-flat-file-connection.html)Posted at date: 1st May 2013.

He is able to read current date file but his problem is due to Foreach Loop containers. As the folder contains multiple flat file the loop moving according to that and read the current dated flat file multiple time and hence records are duplicated in the destination table. As he is not using any primary key in the destination table, he gets the duplicate records in the table and if he use the primary key constraint in the destination he get a duplicate key error.

How to solve this problem
To solve this problem he needs a break like statement to break the loop (in case any other programming language an IF condition and Break statement). Here in SSIS it is not possible. To solve this problem I am using a Script Task and a Package level variable. The initial value of the package level variable in "0". When it passes to the script task it check the value of the variable. If  it is "0" then the script result is Success others it failure. In this way I control the looping of the For each Loop.

So let's Start

Step-1 [ The source flat file details and Destination table objects  ]
We have 2 source flat file in the folder path "F:\Practice_SQL\SSIS\SSIS Examples\SourceFlatFile"



IF OBJECT_ID(N'tbl_PRODUCTDETAILS', N'U') IS NOT NULL
   BEGIN
     DROP TABLE tbl_PRODUCTDETAILS;
   END
GO
CREATE TABLE tbl_PRODUCTDETAILS
       (PRODID      VARCHAR(50)   NOT NULL,
        PRODNAME    VARCHAR(100)  NOT NULL,
        PRODQTY     DECIMAL(20,0) NOT NULL,
        PRODUOM     VARCHAR(7)    NOT NULL);
G0

Step-2 [ Create the Package level variable ]



Step-3 [ SSIS Control Flow and Data Flow ]



Step-4 [ Foreach Loop Containers Editors ]



Step-5 [ Script Tasks Editor ]



        public void Main()
        {
            Int32 intCount = 0;
            intCount =Convert.ToInt32(Dts.Variables["v_CountLoop"].Value);
            if (intCount == 0)
            {
                Dts.Variables["v_CountLoop"].Value = 1;
                Dts.TaskResult = (int)ScriptResults.Success;
            }
            else
            {
                Dts.TaskResult = (int)ScriptResults.Failure;
            }
        }

Step-6 [ Flat File connection String Expression ]



@[User::v_FilePath]  + "File"+(DT_STR,4,1252)DAY(getdate())+(DT_STR,4,1252)MONTH(getdate())+(DT_STR,4,1252)YEAR(getdate())+".txt"          

Step-7 [ Running the Package ]



Step-8 [ Destination Output ]

SELECT * FROM tbl_PRODUCTDETAILS;

PRODID      PRODNAME                                       PRODQTY            PRODUOM
PROD1         PWB PLYWOOD 1x4x19MM              200                         PCS
PROD2         PWB PLYWOOD 2x4x19MM              300                         PCS
PROD3         PWB PLYWOOD 3x4x19MM              300                         PCS
   



Hope you like it.