Tuesday, July 28, 2015

Performance Considerations for SSIS

1. Set the network packet size in the connection manager to 32767 (the maximum).
2. Consider using NOLOCK on the source table (accepting dirty reads).
3. Select only the columns you need in the source query (see the source-query sketch after this list).
4. Use a shared lookup cache in the Lookup transform.
5. Consider whether each transformation is synchronous or asynchronous.
6. Keep data types as narrow as possible for less memory usage.
7. Do not perform excessive casting.
8. Sort: push sorting into source queries when possible; use the Sort transform for sorting cross-database joins.
9. Use MERGE instead of the Slowly Changing Dimension transform (see the MERGE sketch after this list).
10. Use GROUP BY in the source query instead of the Aggregate transform.
11. Use INSERT INTO ... SELECT instead of a data flow when working on a single SQL Server instance (see the set-based sketch after this list).
12. Weigh unnecessary delta detection against a simple full reload.
13. Use the SQL Server destination.
14. A commit size of 0 is fastest (a single commit).
15. Drop some indexes based on the load growth percentage:
 clustered indexes: +inf% (don't drop)
 single nonclustered index: >= ~100%
 multiple nonclustered indexes: >= ~10% (varies)
16. Use TRUNCATE, not DELETE.
17. In “Table or view” access mode, the OLE DB source adapter calls OpenRowset to get column metadata at the Validate phase. The data returned from OpenRowset includes more than just column metadata, so the adapter issues a “SET ROWCOUNT 1” statement to retrieve the column metadata. “SET ROWCOUNT 1” causes an inefficient execution plan (i.e., Nested Loops) to be cached and later reused in the subsequent Execute phase.
In “SQL command” access mode, the OLE DB source adapter calls sp_prepare to get column metadata at the Validate phase and sp_execute at the Execute phase. The execution plan used at the Execute phase is a Hash Match, which is more efficient than Nested Loops.
18. Avoid many small buffers. Tweak the values for DefaultMaxBufferRows and DefaultMaxBufferSize to get as many records into a buffer as possible. To optimize the pipeline, the goal is to pass as many records as possible through the fewest number of buffers, especially when dealing with large data volumes.
19. DefaultMaxBufferRows and DefaultMaxBufferSize are configured separately for each Data Flow task. When integrating data from multiple data sources using a single Data Flow task, these two settings only affect the data source components and transformations in that task.
20. Do not increase buffer size to the point where paging to disk starts to occur.
21. Design the package so that it does a full pull of data only at the beginning or on demand; from then on it should do incremental pulls. This greatly reduces the volume of data load operations, especially when volumes are likely to increase over the lifecycle of an application. For this purpose, use the Change Data Capture (CDC) feature of SQL Server 2008; for earlier versions of SQL Server, implement your own incremental pull logic.
22. Set the two Data Flow task properties BufferTempStoragePath and BLOBTempStoragePath instead of using their default values. The locations provided should be on fast drives, and the two paths should point to drives on separate spindles in order to maximize I/O throughput.
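
To make tips 2, 3, 6, and 8 concrete, here is a minimal source-query sketch. The table and column names (dbo.Sales, SaleID, Amount, SaleDate) are hypothetical, and NOLOCK implies dirty reads, so verify that is acceptable for your load:

-- Hypothetical source query: only the needed columns, narrow types,
-- NOLOCK on the source table, and the sort pushed into the query.
SELECT
    SaleID,
    CAST(Amount AS decimal(9, 2)) AS Amount,   -- narrow the type once, early
    CAST(SaleDate AS date)        AS SaleDate  -- date instead of datetime
FROM dbo.Sales WITH (NOLOCK)                   -- dirty reads; use with care
WHERE SaleDate >= '20150101'                   -- filter rows at the source
ORDER BY SaleID;                               -- sort pushed to the engine

When you push the sort into the source query, remember to mark the source output as sorted (the IsSorted and SortKeyPosition properties) so downstream components such as Merge Join can take advantage of it.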
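
For tip 9, a minimal MERGE sketch as a set-based alternative to the Slowly Changing Dimension transform. The dimension and staging tables are hypothetical, and this handles only a Type 1 (overwrite) change:

-- Hypothetical Type 1 dimension update with a single set-based MERGE
MERGE dbo.DimCustomer AS tgt
USING dbo.StagingCustomer AS src
    ON tgt.CustomerKey = src.CustomerKey
WHEN MATCHED AND tgt.City <> src.City THEN
    UPDATE SET tgt.City = src.City             -- overwrite the changed attribute
WHEN NOT MATCHED BY TARGET THEN
    INSERT (CustomerKey, City)
    VALUES (src.CustomerKey, src.City);        -- add new members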
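
And for tips 11 and 16, when source and destination live on the same SQL Server instance, a set-based statement avoids moving rows through the SSIS pipeline at all. The staging table names are hypothetical:

-- Hypothetical same-instance load: TRUNCATE (minimally logged) plus
-- INSERT ... SELECT instead of a data flow.
TRUNCATE TABLE dbo.SalesStaging;               -- much faster than DELETE

INSERT INTO dbo.SalesStaging (SaleID, Amount, SaleDate)
SELECT SaleID, Amount, SaleDate
FROM dbo.Sales
WHERE SaleDate >= '20150101';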


SSIS Performance Counters


SQL Server Integration Services provides a set of performance counters. Among them, the following are helpful when you tune or debug your package:

·    Buffers in use

·    Flat buffers in use

·    Private buffers in use

·    Buffers spooled

·    Rows read

·    Rows written



“Buffers in use”, “Flat buffers in use” and “Private buffers in use” are useful for discovering leaks. During package execution, you will see these counters fluctuate, but once the package finishes executing, their values should return to what they were before the execution. Otherwise, buffers have been leaked.



“Buffers spooled” has an initial value of 0. When it goes above 0, the engine has started swapping buffer data to disk. If you see this, set the Data Flow Task properties BLOBTempStoragePath and BufferTempStoragePath to fast drives for maximal I/O bandwidth.

BLOBTempStoragePath and BufferTempStoragePath are two properties on the Data Flow Task. They define where on the hard disk(s) to page buffer data. BLOBs (text, ntext, image) are written to BLOBTempStoragePath; all other buffer data is swapped to BufferTempStoragePath. Both properties are exposed on the Data Flow task in the designer.



Now what could cause a buffer to swap? There are two possible causes. The first one is when a memory allocation fails. The second one is when Windows signals the low memory resource notification event. Both will trigger SSIS to reduce its working set. SSIS does so by moving buffer data to disk. 

When BLOBTempStoragePath is not set, the paths as defined by the system variables TEMP and TMP will be used. The same rule applies to BufferTempStoragePath. 

For better performance, we recommend both BLOBTempStoragePath and BufferTempStoragePath point to fast drives. We also suggest that the drives for BufferTempStoragePath and BlobTempStoragePath be on separate spindles in order to maximize I/O throughput.

“Rows read” and “Rows written” show how many rows the entire Data Flow has processed. They give you an overall idea about the execution progress.


Performance tuning in SSIS


The performance strategy’s four key performance factors are defined as follows:

  • Operations – To identify potential tuning opportunities, first break down your integration solution into specific tasks or units of work and then fully define all data operations that are part of each task. Some of these operations may be easily identifiable, but some of the best performance tuning opportunities may be less obvious.

1.         To reduce the size of the data set and improve the performance of the overall data load, configure data types early in the sequence of operations.

2.         To further tune file parsing, you might also take advantage of the SSIS fast parsing mode. This mode uses a fast parsing routine for simple data loads in which the data contains common date formats, does not require locale-specific parsing, does not have special characters in currency columns, and does not have hexadecimal characters.

  • Volume – With the operations fully defined, identify the anticipated data volumes at each stage of the process and think about how the volumes impact the performance of data operations. This is also a great opportunity to consider how data volumes are expected to change over time.

1.         When you assess data volumes, you should first evaluate all the columns that are part of your data integration operation. It is often tempting to load an entire source file even though you may only need half of the columns. Unnecessary columns take up machine resources and should be eliminated where possible.

2.         Another factor to consider is whether your solution requires incremental or full data loads. From a volume perspective, full data loads tend to be costly. If you can replace full data loads with incremental data loads (as sketched below), you will greatly reduce the volume of your data load operations. This is especially so when volumes are likely to increase over the lifecycle of an application.
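
As a sketch of the incremental approach (see also tip 21 in the opening list), SQL Server 2008's Change Data Capture can drive the incremental pull. The table name dbo.Sales is hypothetical and is assumed to have a primary key, which the net-changes function requires:

-- Hypothetical CDC setup for incremental pulls (SQL Server 2008 and later)
EXEC sys.sp_cdc_enable_db;                     -- enable CDC on the current database

EXEC sys.sp_cdc_enable_table
    @source_schema        = N'dbo',
    @source_name          = N'Sales',          -- hypothetical source table
    @role_name            = NULL,              -- no gating role
    @supports_net_changes = 1;                 -- generate the net-changes function

-- The SSIS source query can then pull only the rows changed between two LSNs:
-- SELECT * FROM cdc.fn_cdc_get_net_changes_dbo_Sales(@from_lsn, @to_lsn, N'all');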



Shrinking Data Set Size: You reduce the size of the data set when you define the columns more accurately in the schema (data type and size).

Maximizing Throughput: Maximize throughput by configuring SSIS buffer settings such as DefaultMaxBufferSize and DefaultMaxBufferRows.

Defining Parallel Operations: SSIS supports parallel processing of packages, tasks, and transformations. When you design parallel processes, configure them within the constraints of your system resources.

  • Application – Given the operations and volume specifications, select the SQL Server application and/or technology that best fits the job. There is always more than one way to accomplish a task; the key is to identify which application provides you the right amount of functionality with the least amount of overhead.

BULK INSERT / bcp Usage Guidelines - Use BULK INSERT / bcp when your scenario has the following characteristics:

  • A single data source that is a file.
  • A single destination that is SQL Server.
  • No data transformation requirements; the data loads directly from source to destination.
  • No workflow management. Note that workflow management is not really applicable to BULK INSERT functionality, because its sole job is to load data into SQL Server.
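
A minimal BULK INSERT sketch for that scenario follows. The table name, file path, and delimiters are hypothetical and must match your file's actual layout:

-- Hypothetical bulk load of a delimited file straight into SQL Server
BULK INSERT dbo.SalesStaging
FROM 'D:\loads\sales.csv'                      -- hypothetical file path
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR   = '\n',
    TABLOCK                                    -- table lock, which helps enable minimal logging
);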

SSIS Usage Guidelines - Use SSIS when your scenario has the following characteristics:

  • Multiple heterogeneous data sources and destinations.
  • Data transformation requirements: aggregations, lookups, and new columns.
  • Incorporation of other management tasks such as e-mail and File Transfer Protocol (FTP).
  • Workflow management to control the order of many tasks and transformations.



  • Location – The final factor is location, which refers to the run-time environment for the data integration operations. Location is last because the parameters for the other performance factors will influence how you optimally configure the run-time environment.

Data Destination Server – When your destination is SQL Server, executing SSIS operations on the destination data server provides significant performance advantages. The biggest advantage in this situation is the ability to use the SQL Server Destination component in an SSIS package. This component optimizes in-memory data loading and results in an 8 to 15 percent performance improvement over the standard OLE DB destination component. Keep in mind that the SQL Server Destination component does not support data type conversions. As a result, you will have to complete conversion operations in an SSIS step prior to loading data into the destination, so that the data types in memory line up directly with the SQL Server data types.

Data Source Server – When you execute SSIS operations on the data source server, you can gain performance benefits by reducing the source data set size before transferring data across the network to load into the destination. For example, if you need to aggregate the data, the number of output records will likely be smaller than the number of input records. In addition, you can reduce the size of a data set by properly configuring data types early in the SSIS process.
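
As a hedged sketch of that idea (and of tip 10 in the opening list), pushing an aggregation into the source query means far fewer rows cross the network; the table and columns are hypothetical:

-- Hypothetical aggregation pushed to the source server:
-- one output row per customer instead of one row per sale.
SELECT CustomerID, SUM(Amount) AS TotalAmount
FROM dbo.Sales
GROUP BY CustomerID;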

Dedicated SSIS Server – Using a dedicated SSIS server is the best solution when you have a variety of heterogeneous data sources and destinations with large volumes and complex transformations. On this dedicated server, SSIS can take full advantage of all machine resources and will not suffer from contention issues with other applications.

For the tradeoffs, once again you will not be able to use the SSIS SQL Server Destination component. In addition, from a network perspective, you will have to transfer data twice: from the source onto the SSIS server and then from the SSIS server into the destination. If your network bandwidth can easily support this, then this option provides the most SSIS flexibility. It also allows for performing complex transformations without adversely affecting other applications or SQL Server.

SSIS Engine Overview

The SSIS architecture consists of two major components: the run-time engine and the data flow engine.

The run-time engine


The run-time engine is a highly parallel control flow engine that coordinates the execution of tasks or units of work within SSIS and manages the engine threads that carry out those tasks. For the most part, the performance of the run-time engine is most heavily influenced by conditions external to SSIS, such as network bandwidth, and interaction with external systems such as database servers, FTP servers, or email servers. When SSIS runs an Execute SQL Task, for example, it sends a call to the target database and then waits for a response from the database server before it continues. In this scenario, the performance of the Execute SQL Task is more dependent on the performance of the query execution than on the SSIS run-time engine.

The data flow engine


When you use SSIS for data integration, in addition to the run-time engine, you use the data flow engine that manages the data pipeline. The data flow engine is invoked by a special task in SSIS called the Data Flow task. When the Data Flow task executes, the SSIS data flow engine extracts data from one or more data sources, performs any necessary transformations on the extracted data, and then delivers that data to one or more destinations.

With data integration solutions, you will likely spend a large part of your performance tuning time optimizing the data flow engine. Like the run-time engine, the data flow engine is influenced by external conditions; however, within SSIS, there are a variety of settings that you can manipulate to tune the data flow engine’s performance based on the requirements of the data integration operations.

Data flow

1.         Set the source connection manager's network packet size from the default of 0 to 32767 (see the T-SQL checks after this list).

2.         Use network affinity at the operating system level. At high throughputs, you can sometimes improve performance this way.

For the network itself, you may want to work with your network specialists to enable jumbo frames, which increase the default payload of 1,500 bytes to 9,000 bytes. By enabling jumbo frames, you will further decrease the amount of network operations required to move large data sets.

3.         Memory settings are very important. A rough starting allocation:

OS: about 10%

SQL Server: 65-75%, depending on your needs

SSIS: the rest

Some other factors will change the above numbers, such as memory needed for SSAS, the presence of SSDs, and so on.

CHECK your memory settings (see the T-SQL checks below).
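
Two hedged T-SQL checks related to items 1 and 3 above. The connection-string fragment in the first comment is a hypothetical example of where the packet size is set:

-- Item 1: the packet size is set on the connection manager's connection
-- string, e.g. Data Source=MyServer;...;Packet Size=32767 (hypothetical).
-- Verify what connected sessions actually negotiated:
SELECT session_id, net_packet_size
FROM sys.dm_exec_connections;

-- Item 3: review the memory ceiling configured for SQL Server so the OS
-- and SSIS keep their share (the value is in MB):
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)';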

Buffer Usage

Row transformations (such as Derived Column or Data Conversion) process each row in place and reuse existing buffers.

Partially blocking transformations (such as Merge, Merge Join, and Union All) are asynchronous and copy data into new buffers.

Blocking transformations (such as Sort and Aggregate) must read all input rows before producing any output, and also require new buffers.

Execution Trees

Execution trees demonstrate how your package uses buffers and threads. At run time, the data flow engine breaks down Data Flow task operations into execution trees. These execution trees specify how buffers and threads are allocated in the package. Each tree creates a new buffer and may execute on a different thread. When a new buffer is created such as when a partially blocking or blocking transformation is added to the pipeline, additional memory is required to handle the data transformation; however, it is important to note that each new tree may also give you an additional worker thread.

Execution trees are enormously valuable in understanding buffer usage. You can display execution trees for your own packages by turning on package logging, enabling logging for the Data Flow task, and then selecting the Pipeline Execution Tree event. Note that you will not see the execution trees until you execute the package. When you do, the execution trees appear in the Log Events window in Business Intelligence (BI) Development Studio.
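
If you log to the SQL Server provider, the entries land in a table (dbo.sysssislog in SQL Server 2008; sysdtslog90 in 2005), so a query like the following sketch can pull the execution trees back out. It assumes the event is recorded under the name PipelineExecutionTrees:

-- Hedged sketch: read Pipeline Execution Tree entries back from the
-- SQL Server log provider table (dbo.sysssislog in SQL Server 2008).
SELECT starttime, sourceid, message
FROM dbo.sysssislog
WHERE event = 'PipelineExecutionTrees'         -- assumed event name
ORDER BY starttime;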

Evaluating Design Alternatives

Once you master how different transformation types influence package execution, you can make better performance design choices.

Buffer Sizing


In addition to using row transformations where possible to limit the number of buffers that are created and used, within SSIS you have the ability to influence buffer sizing; that is, the number of records that are read into a buffer. Your overall goal is to pass as many records as possible through a single buffer while efficiently utilizing memory.

Influencing buffer sizing


At execution time before the data is read from the sources, SSIS automatically tunes buffer sizes to achieve maximum memory utilization based on a series of input parameters. To help SSIS do the best job it can when sizing buffers, you need to be aware of the following input parameters.

·         Estimated Row Size – Estimated Row Size is not a specific SSIS setting. Rather, it is something that SSIS calculates based on the metadata that it collects about your source data at design time. You can shrink the row size by identifying the smallest possible data types for all of your columns as early in the data flow as possible. This is especially important for flat file data sources because each column is automatically read into SSIS as a string data type unless you configure the column’s data type explicitly.

·         DefaultMaxBufferRows – DefaultMaxBufferRows is a configurable setting of the SSIS Data Flow task that is automatically set at 10,000 records. SSIS multiplies the Estimated Row Size by the DefaultMaxBufferRows to get a rough sense of your dataset size per 10,000 records. You should not configure this setting without understanding how it relates to DefaultMaxBufferSize.

·         DefaultMaxBufferSize – DefaultMaxBufferSize is another configurable setting of the SSIS Data Flow task. The DefaultMaxBufferSize is automatically set to 10 MB by default. As you configure this setting, keep in mind that its upper bound is constrained by an internal SSIS parameter called MaxBufferSize, which is set to 100 MB and cannot be changed (see the worked example after this list).

·         MinBufferSize – While MinBufferSize is not configurable, it is still important to know about it because SSIS uses this internal parameter to gauge whether you have set the DefaultMaxBufferSize too low. MinBufferSize is defined by the granularity of your operating system’s virtual memory allocation functionality. Typically, this is set to 65,536 bytes, but it differs from machine to machine.
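
A small worked example of how these parameters interact: assuming an Estimated Row Size of 1,200 bytes and the default settings, SSIS caps the rows per buffer at whichever limit is reached first, DefaultMaxBufferRows or DefaultMaxBufferSize divided by the row size:

-- Hypothetical buffer-size arithmetic with the default settings
DECLARE @EstimatedRowSize     int = 1200;              -- bytes, assumed
DECLARE @DefaultMaxBufferRows int = 10000;             -- SSIS default
DECLARE @DefaultMaxBufferSize int = 10 * 1024 * 1024;  -- 10 MB default

SELECT CASE
         WHEN @EstimatedRowSize * @DefaultMaxBufferRows > @DefaultMaxBufferSize
         THEN @DefaultMaxBufferSize / @EstimatedRowSize -- size-bound: ~8,738 rows
         ELSE @DefaultMaxBufferRows                     -- row-bound: 10,000 rows
       END AS RowsPerBuffer;

With these numbers, 1,200 bytes × 10,000 rows would be about 11.4 MB, which exceeds the 10 MB DefaultMaxBufferSize, so SSIS scales the buffer down to roughly 8,738 rows.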

Buffer guidelines


In practice, you must test these settings based on your own environment, but you can start with the following general guidelines.

·         Reduce your Estimated Row Size as much as possible by removing any unnecessary columns and configuring data types correctly. Any opportunity that you have to reduce the size of the source data set before operations begin saves memory resources.

·         Start with the SSIS default settings for DefaultMaxBufferRows and DefaultMaxBufferSize. Turn on package logging with the BufferSizeTuning property enabled. This property adds information to the log that shows you where SSIS has adjusted the buffer size.

·         Tweak the values for DefaultMaxBufferRows and DefaultMaxBufferSize to get as many records into a buffer as possible. Setting these values too low causes SSIS to create many small buffers instead of fewer, larger buffers; fewer, larger buffers are the better scenario, provided you have enough memory.

·         As you tweak DefaultMaxBufferRows and DefaultMaxBufferSize, realize that once MaxBufferSize would be exceeded, the DefaultMaxBufferRows setting no longer matters, because SSIS always scales down the number of records per buffer to maximize memory utilization.

·         Note that the DefaultMaxBufferRows and DefaultMaxBufferSize are configured separately for each Data Flow task. When you integrate data from multiple data sources using a single Data Flow task, these two settings will only impact the data source components and transformations in that task. Also note that the determination of how many rows per buffer is done per buffer type.

Parallelism


Parallelism is a great technique to improve the performance of your data integration operations. SSIS natively supports the parallel execution of packages, tasks, and transformations. The trick to successful parallelism is to configure operations within the constraints of your system resources.

Configurable settings


Within SSIS, the control flow for each package is controlled by a setting called MaxConcurrentExecutables, which specifies the maximum number of SSIS threads that can execute in parallel per package. By default, this is set to -1, which translates to the number of logical machine processors plus 2. For example, on a machine with eight logical processors, up to ten executables can run in parallel.

If SSIS runs on a dedicated server and you have a lot of operations that run in parallel, you will likely want to increase this setting if some of the operations do a lot of waiting for external systems to reply. On the other hand, if you do not have a dedicated SSIS machine and your data integration application runs alongside several other applications, you may need to reduce this setting to avoid resource conflicts.

Design approaches

As you design packages for parallelism, you need to decide whether to run some or all of the operations in the package in parallel.

Consider the tradeoffs of different design approaches that apply parallelism to a package that reads data from a source database, aggregates the data four different ways, and then loads each aggregated data set into a different destination table.

1. Parallelize Destination Operations

2. Partially Parallelize Operations

3. Parallelize All Operations

4. Optimize the Slowest



The hybrid design approach (Optimize the Slowest) demonstrates how you can apply parallelism to the specific operations that benefit the most from the performance gain, without potentially wasting machine resources.

Gaining Visibility


As you change and enhance your design, you will also want to take advantage of the ability to monitor and log metadata about package executions.

·         SSIS Logging – SSIS allows you to log both tasks and packages to various logging providers such as XML, SQL Server, or text files. Logging allows you to view the performance of a package over time and to track performance as machine resources change and data volumes increase. Be aware that SSIS provides rich logging support which can be expensive to turn on completely. Review the logging events available and only log those events necessary.

·         SSIS Performance Counters – SSIS provides several performance counters that you can use to gain visibility into how resources are utilized during package execution. For example, you can view the number of rows read and the number of buffers in use as the package executes. One specific performance counter that is particularly useful is Buffers Spooled. If Microsoft Windows® runs out of physical memory during package execution or the process executing the package runs out of virtual memory, SSIS begins to spool buffers to files. Once this occurs, performance will degrade significantly, so it is a good idea to monitor this setting and make sure that you have enough memory for your operations.

·         SQL Server Profiler – When you extract data from or load data into SQL Server, you can use SQL Server Profiler to review the operations and query plans that are happening behind the scenes in SQL Server. As you monitor database activity, you may find tuning opportunities in the SQL Server RDBMS, such as the need to modify your indexing scheme.



Conclusion


Performance tuning is a continuous balancing act between design decisions and resource availability. To help you manage this balancing act, SSIS provides a flexible data architecture that you can use to build high-performance data integration solutions. By understanding how the SSIS performance architecture leverages memory and CPU, you can make more informed design decisions that take advantage of performance opportunities and maximize your system resources. You can then more accurately understand scaling requirements as your solutions grow in the future.