Monday, August 17, 2015

Transactions in SSIS

Transactions in SSIS

What is the transaction support feature in SSIS?

When you execute SSIS package, every task executes in its own Transaction. We can group tasks together to execute as a single Transaction.

This can be achieved by setting Transaction property. In this way we can ensure either all of the tasks complete successfully and can roll back if any of the task fails.

What properties do you need to configure in order to use the transaction feature in SSIS?

  • Suppose you want to execute 5 tasks in a single transaction, in this case you can place all 5 tasks in a Sequence Container and set the TransactionOption and IsolationLevel properties appropriately.
    • The TransactionOption property expects one of these three values:
      • Supported - The container/task does not create a separate transaction, but if the parent object has already initiated a transaction then participate in it
      • Required - The container/task creates a new transaction irrespective of any transaction initiated by the parent object
      • NotSupported - The container/task neither creates a transaction nor participates in any transaction initiated by the parent object
  • Isolation level dictates how two more transaction maintains consistency and concurrency when they are running in parallel. To learn more about Transaction and Isolation Level.
Transaction Isolation levels:
The transaction isolation level determines the duration that locks are held. We'll use SQL Server as an example. The following transaction isolation levels are available in SQL Server:
  • READ UNCOMMITTED - reads do not acquire share locks and they don't wait on locks. This is often referred to as a dirty read because you can read modified data that hasn't been committed yet and it could get rolled back after you read it.
  • READ COMMITTED - reads acquire share locks and wait on any data modified by a transaction in process. This is the SQL Server default.
  • REPEATABLE READ - same as READ COMMITTED but in addition share locks are retained on rows read for the duration of the transaction. In other words any row that is read cannot be modified by another connection until the transaction commits or rolls back.
  • SERIALIZABLE - same as REPEATABLE READ but in addition no other connection can insert rows if the new rows would appear in a SELECT statement already issued. In other words if you issue a select statement in a transaction using the SERIALIZABLE isolation level you will get the same exact result set if you issue the select statement again within the same transaction.
The built-in transaction support in SSIS makes use of the Distributed Transaction Coordinator (MSDTC) service which must be running. MSDTC also allows you to perform distributed transactions; e.g. updating a SQL Server database and an Oracle database in the same transaction. If you execute an SSIS package that utilizes the built-in transaction support and MSDTC is not running, you will get an error message like the following:
Error: 0xC001401A at Transaction: The SSIS Runtime has failed  to start the distributed transaction due to error 0x8004D01B  "The Transaction Manager is not available.". The DTC transaction  failed to start. This could occur because the MSDTC Service is not running.
Note also that the SSIS package elements also have an IsolationLevel property with a default of Serializable. As discussed above in the section on Transaction Isolation Levels, this setting impacts the duration of locks as well as whether shared locks are acquired.

When I enabled transactions in an SSIS package, it failed with this exception: "The Transaction Manager is not available. The DTC transaction failed to start." What caused this exception and how can it be fixed?

  • SSIS uses the MS DTC (Microsoft Distributed Transaction Coordinator) Windows Service for transaction support. As such, you need to ensure this service is running on the machine where you are actually executing the SSIS packages or the package execution will fail with the exception message as indicated in this question.

What is event handling in SSIS?

  • Like many other programming languages, SSIS and its components raise different events during the execution of the code. You can write an even handler to capture the event and handle it in a few different ways. For example consider you have a data flow task and before execution of this data flow task you want to make some environmental changes such as creating a table to write data into, deleting/truncating a table you want to write, etc. Along the same lines, after execution of the data flow task you want to cleanup some staging tables. In this circumstance you can write an event handler for the OnPreExcute event of the data flow task which gets executed before the actual execution of the data flow. Similar to that you can also write an event handler for OnPostExecute event of the data flow task which gets executed after the execution of the actual data flow task. Please note, not all the tasks raise the same events as others. There might be some specific events related to a specific task that you can use with one object and not with others.

Transaction in SSIS

- Transaction in SSIS is supported and it is in-built integrated with SSIS
- We can have individual transaction at package, container or task level.
- There is a property called "TransactionOption" at all these levels.
- 3 values are there for this property
        (A) Supported
                - This is default
                - It means if transaction is already there before running the package, then it will join into that tran.
                - But it will not create any new transaction
        (B) Not supported
                - It will neither create any new transaction nor join into existing transaction
        (C) Required
                - If transaction exists before running a package, then it will join into that transaction
                - otherwise it will create new transaction
- Design
        - We can set "Required" value at package level and "Supported" value at all the containers/task level


Transaction and SAVEPOINT


Transaction is a very important part of SQL developer. In this article I am taking about the SAVEPOINT of MS SQL Server and how we use the power of save point. Hope it will be interesting.
When we are working with T-SQL statement we are using the TRANSACTION that protects all my T-SQL statements within it.
Generally it contains BEGIN TRAN, COMMIT TRAN and ROLLBACK TRAN.
The block of transaction is mentioned bellow.
BEGIN TRY
   BEGIN TRAN
     <... T-SQL to Execute...>
   COMMIT TRAN
END TRY
BEGIN CATCH
   ROLLBACK TRAN 
END CATCH
So anything between BEGIN TRAN and COMMIT TRAN goes wrong (means error occurs) it directly go to the CATCH portion and ROLL BACK the transaction. So all the T-SQL statements must be executed properly,  to get the COMMIT TRAN.
Think about the situation where error is not define and came frequently and we are executing some long T-SQL statement. In this situation we don't want complete rollback of all the transaction but want to rollback only the error portion T-SQL executions.
In this type of situation we must use the SAVE POINT options.
We can use savepoints in rolling back portions of transactions to predefined locations. A T-SQL savepoint defines a location to which a transaction can return if part of the transaction is conditionally canceled.

Keep in mind that SQL updates and rollbacks generally are expensive operations. So savepoints are useful only in situations where errors are unlikely and checking the validity of an update beforehand is relatively costly.
An example of savepoint is mentioned bellow.
SET NOCOUNT OFF;
GO
USE AdventureWorks2008R2;
GO
CREATE TABLE InvCtrl
    (WhrhousID      int,
    PartNmbr      int,
    QtyInStk      int,
    ReordrPt      int,
    CONSTRAINT InvPK PRIMARY KEY
    (WhrhousID, PartNmbr),
    CONSTRAINT QtyStkCheck CHECK (QtyInStk > 0) );
GO
CREATE PROCEDURE OrderStock
    @WhrhousID int,
    @PartNmbr int,
    @OrderQty int
AS
    DECLARE @ErrorVar int;
    SAVE TRANSACTION StkOrdTrn;  ß Point 1 [ Prepare the Savepoint]
    UPDATE InvCtrl SET QtyInStk = QtyInStk - @OrderQty
        WHERE WhrhousID = @WhrhousID
        AND PartNmbr = @PartNmbr;
    SELECT @ErrorVar = @@error;
    IF (@ErrorVar = 547)
    BEGIN
        ROLLBACK TRANSACTION StkOrdTrnß Point 2 [Rollbacking the Savepoint]
        RETURN (SELECT QtyInStk
                FROM InvCtrl
                WHERE WhrhousID = @WhrhousID
                AND PartNmbr = @PartNmbr);
    END
    ELSE
        RETURN 0;
GO
Whenever we roll back a transaction to a savepoint, it must proceed to completion or be canceled altogether. Therefore a COMMIT or a complete ROLLBACK should always follow a rollback to savepoint, because the resources used during the transaction (namely the SQL locks) are held until the completion of the transaction. When part of a transaction rolls back to a savepoint, resources continue to be held until either the completion of the transaction or a rollback of the complete transaction. In other words, even after rollback to a midpoint, the transaction is considered open and must be closed by either committing work or rolling back the entire transaction.

Hope that the article is quite informative and thanking you to provide your valuable time on it.