Tuesday, July 28, 2015

Installing SQL Server 2012 Analysis Services Tabular Mode


In this article, I will show you how to Install SQL Server 2012 Analysis Services(Tabular Mode).
1. First of all Run SQL Server 2012 Setup and go to Installation. Then click onNew SQL Server stand-alone installation or add features to an existing installation.
1-Installing SQL Server 2012 Analysis Services Tabular Mode
2. It will run Setup Support Rules.
2-Installing SQL Server 2012 Analysis Services Tabular Mode
Then click on OK button.
3. In next window click on Install button. It will again run Setup Support Rules.
3-Installing SQL Server 2012 Analysis Services Tabular Mode
If all rules run successfully, click on Next Button.
4. In next step, select Perform a new installation of SQL Server 2012 if you want to install new installation otherwise select Add features to and existing instance of SQL Server 2012.
Here we select 2nd option. Then click on Next button.
4-Installing SQL Server 2012 Analysis Services Tabular Mode
5. In next step, Select Analysis Services from Feature Selection. Then click on Next Button.
5-Installing SQL Server 2012 Analysis Services Tabular Mode
6. In next step, it will show you Disk usage Summary. Now click on Next Button.
6-Installing SQL Server 2012 Analysis Services Tabular Mode
7. In next step, select startup mode of SQL Server Analysis Services. Here we select Automatic mode. Then click on Next Button.
7-Installing SQL Server 2012 Analysis Services Tabular Mode
8. In next step, select Tabular Mode. Also specify users which have administrative permissions to access analysis services. Then click on OK button.
8-Installing SQL Server 2012 Analysis Services Tabular Mode
9. Now it will run Installation Configuration Rules.
9-Installing SQL Server 2012 Analysis Services Tabular Mode
Then click on Next Button.
10. Now all ready to Install Analysis Services .Click on Install Button.
10-Installing SQL Server 2012 Analysis Services Tabular Mode
11. After completion of installation, you will see screen like below.
11-Installing SQL Server 2012 Analysis Services Tabular Mode
Congratulations! We successfully completed installation of Analysis Services.

Parallel execution in SSIS



Parallel execution in SSIS improves performance on computers that have multiple physical or logical processors. To support parallel execution of different tasks in a package, SSIS uses two properties: MaxConcurrentExecutables and EngineThreads.

If SSIS runs on a dedicated server and you have a lot of operations that run in parallel, you will likely want to increase this setting if some of the operations do a lot of waiting for external systems to reply. On the other hand, if you do not have a dedicated SSIS machine and your data integration application runs alongside several other applications, you may need to reduce this setting to avoid resource conflicts.

The MaxConcurrentExecutables property is a property of the package. This property defines how many tasks can run simultaneously by specifying the maximum number of executables that can execute in parallel per package. The default value is -1, which equates to the number of physical or logical processors plus 2.

Please note that if your box has hyper threading turned on, it is the logical processor rather than the physically present processor that is counted.

The EngineThreads property is a property of each Data Flow task. This property defines how many threads the data flow engine can create and run in parallel. The EngineThreads property applies equally to both the source threads that the data flow engine creates for sources and the worker threads that the engine creates for transformations and destinations. Therefore, setting EngineThreads to 10 means that the engine can create up to ten source threads and up to ten worker threads. The default is 5 in SQL Server 2005 and 10 in SQL Server 2008, with a minimum value of 2.

One thing we want to be clear about EngineThreads is that it governs both source threads (for source components) and work threads (for transformation and destination components). Source threads and work threads are both engine threads created by the Data Flow’s scheduler.

One other thing to consider: If you are using the Execute Package Task, the child package to be executed can be run in-process or out-of-process by use of the ExecuteOutOfProcess property. If a child package is executed out-of-process, you will see another dtshost.exe process start. These processes will remain “live”, using up resources, for quite a while after execution is complete.

If executing in-process, a bug in a task of the child package will cause the master package to fail. Not so if executing out-of-process. On 32-bit systems a process is able to consume up to 2GB of virtual memory. Executing out-of-process means each process can claim its own 2GB portion of virtual memory. Therefore if you are simply using many packages to structure your solution in a more modular fashion, executing in-process is probably the way to go because you don’t have the overhead of launching more processes.

A thread will process one buffer at a time, executing it against all transforms in the execution tree before working on the next buffer in the flow, at which point it would pass the current buffer to another thread executing another execution tree and it would pull a new data buffer from its buffer list which was queued from an upstream component (either a data source or the last asynchronous transform before this execution tree started).

However, the general rule is to not run more threads in parallel than the number of available processors. Running more threads than the number of available processors can hinder performance because of the frequent context-switching between threads.

MaxConcurrent

·         This is a property on the ForEachLoop which says how many instances of the loop contents can be run in parallel.

Example:

Suppose we have a package with 3 Data Flow Tasks. Each task has 10 flows in the form of “OLE DB Source -> SQL Server Destination”.

Set MaxConcurrentExecutables to 3, then all 3 Data Flow Tasks will run simultaneously.



Now whether all 10 flows in each individual Data Flow Task get started concurrently is a different story. This is controlled by the second property: EngineThreads.



The EngineThreads is a property of the Data Flow Task that defines how many work threads the scheduler will create and run in parallel. Its default value is 5.  

If we set EngineThreads to 10 on all 3 Data Flow Tasks, then all the 30 flows will start off at once.

Logging details from sysssislog


select Package,
source,StepId GroupId,
 STUFF(CONVERT(CHAR(8), DATEADD(SECOND, ABS(TimeInMinutes), '19000101'), 8), 1, 2, CAST(TimeInMinutes / 3600 AS VARCHAR(12))) Duration,
 TimeInMinutes TimeInSeconds,
 [Status]from (
 selectA.starttime, 
A.id, 
c.source Package,
 A.[source],
 DENSE_RANK() OVER (order by A.executionID ) StepID,
 CONVERT(VARCHAR(100),DATEDIFF(SECOND,A.starttime,ISNULL(B.endtime,GETDATE()))) TimeInMinutes,
 CASE WHEN B.endtime IS NULL THEN ' (running)' ELSE '' END AS [Status]
from 
(select * from sysssislog where event = 'OnPreExecute') As A LEFT JOIN 
(select * from sysssislog where event = 'OnPostExecute') As B ON A.sourceid = B.sourceid and A.executionid = B.executionid
LEFT JOIN (select * from 
(select source,executionid,Row_Number() over (PARTITION by executionid order by executionid) RowId from sysssislog where event = 'OnPreExecute' )
 where RowId=1) C on C.source=A.source and C.executionid=a.executionid
) t 
order by StepID,id

SSIS Interview Questions and Answers - Part2


SSIS Interview Questions



• What do we mean by dataflow in SSIS?
Data flow is nothing but the flow of data from the corresponding sources to the referred destinations. In this process, the data transformations make changes to the data to make it ready for the data warehouse.





• What is a breakpoint in SSIS? How is it setup? How do you disable it?
A breakpoint is a stopping point in the code. The breakpoint can give the Developer\DBA an
opportunity to review the status of the data, variables and the overall status of the SSIS package.
10 unique conditions exist for each breakpoint.
Breakpoints are setup in BIDS. In BIDS, navigate to the control flow interface. Right click on the
object where you want to set the breakpoint and select the ‘Edit Breakpoints…’ option.





• Can you name 5 or more of the native SSIS connection managers?





1) OLEDB connection – Used to connect to any data source requiring an OLEDB connection (i.e.,
SQL Server 2000)
2) Flat file connection – Used to make a connection to a single file in the File System. Required for reading information from a File System flat file
3) ADO.Net connection – Uses the .Net Provider to make a connection to SQL Server 2005 or other
connection exposed through managed code (like C#) in a custom task
4) Analysis Services connection – Used to make a connection to an Analysis Services database or project. Required for the Analysis Services DDL Task and Analysis Services Processing Task
5) File connection – Used to reference a file or folder. The options are to either use or create a file or folder
6) Excel





• What is the use of Bulk Insert Task in SSIS?
Bulk Insert Task is used to upload large amount of data from flat files into Sql Server. It supports only OLE DB connections for destination database.





• What is Conditional Split transformation in SSIS?
This is just like IF condition which checks for the given condition and based on the condition evaluation, the output will be sent to the appropriate OUTPUT path. It has ONE input and MANY outputs. Conditional Split transformation is used to send paths to different outputs based on some conditions. For example, we can organize the transform for the students in a class who have marks greater than 40 to one path and the students who score less than 40 to another path.





• How do you eliminate quotes from being uploaded from a flat file to SQL Server?
This can be done using TEXT QUALIFIER property. In the SSIS package on the Flat File Connection Manager Editor, enter quotes into the Text qualifier field then preview the data to ensure the quotes are not included.





• Can you explain how to setup a checkpoint file in SSIS?
The following items need to be configured on the properties tab for SSIS package:
CheckpointFileName – Specify the full path to the Checkpoint file that the package uses to save the value of package variables and log completed tasks. Rather than using a hard-coded path as shown above, it’s a good idea to use an expression that concatenates a path defined in a package variable and the package name.
CheckpointUsage – Determines if/how checkpoints are used. Choose from these options: Never(default), IfExists, or Always. Never indicates that you are not using Checkpoints. IfExists is the typical setting and implements the restart at the point of failure behavior. If a Checkpoint file is found it is used to restore package variable values and restart at the point of failure. If a Checkpoint file is not found the package starts execution with the first task. The Always choice raises an error if the Checkpoint file does not exist.
SaveCheckpoints – Choose from these options: True or False (default). You must select True to implement the Checkpoint behavior.





• What are the different values you can set for CheckpointUsage property ?
There are three values, which describe how a checkpoint file is used during package execution:
1) Never: The package will not use a checkpoint file and therefore will never restart.
2) If Exists: If a checkpoint file exists in the place you specified for the CheckpointFilename property, then it will be used, and the package will restart according to the checkpoints written.
3) Always: The package will always use a checkpoint file to restart, and if one does not exist, the package will fail.





• What is the ONLY Property you need to set on TASKS in order to configure CHECKPOINTS to RESTART package from failure?
The one property you have to set on the task is FailPackageOnFailure. This must be set for each task or container that you want to be the point for a checkpoint and restart. If you do not set this property to true and the task fails, no file will be written, and the next time you invoke the package, it will start from the beginning again.





• Where can we set the CHECKPOINTS, in DataFlow or ControlFlow ?
Checkpoints only happen at the Control Flow; it is not possible to checkpoint transformations or restart inside a Data Flow. The Data Flow Task can be a checkpoint, but it is treated as any other task.





• Can you explain different options for dynamic configurations in SSIS?
1) XML file
2) custom variables
3) Database per environment with the variables
4) Use a centralized database with all variables





• What is the use of Percentage Sampling transformation in SSIS?
Percentage Sampling transformation is generally used for data mining. This transformation builds a random sample of set of output rows by choosing specified percentage of input rows. For example if the input has 1000 rows and if I specify 10 as percentage sample then the transformation returns 10% of the RANDOM records from the input data.





• What is the use of Term Extraction transformation in SSIS?
Term Extraction transformation is used to extract nouns or noun phrases or both noun and noun phrases only from English text. It extracts terms from text in a transformation input column and then writes the terms to a transformation output column. It can be also used to find out the content of a dataset.





• What is Data Viewer and what are the different types of Data Viewers in SSIS?
A Data Viewer allows viewing data at a point of time at runtime. If data viewer is placed before and after the Aggregate transform, we can see data flowing to the transformation at the runtime and how it looks like after the transformation occurred. The different types of data viewers are:





1. Grid
2. Histogram
3. Scatter Plot
4. Column Chart.





• What is Ignore Failure option in SSIS?
In Ignore Failure option, the error will be ignored and the data row will be directed to continue on the next transformation. Let’s say you have some JUNK data(wrong type of data or JUNK data) flowing from source, then using this option in SSIS we can REDIRECT the junk data records to another transformation instead of FAILING the package. This helps to MOVE only valid data to destination and JUNK can be captured into separate file.





• Which are the different types of Control Flow components in SSIS?
The different types of Control Flow components are: Data Flow Tasks, SQL Server Tasks, Data Preparation Tasks, Work flow Tasks, Scripting Tasks, Analysis Services Tasks, Maintenance Tasks, Containers.





• What are containers? What are the different types of containers in SSIS?
Containers are objects that provide structures to packages and extra functionality to tasks. There are four types of containers in SSIS, they are: Foreach Loop Container, For Loop Container, Sequence Container and Task Host Container.





• What are the different types of Data flow components in SSIS?
There are 3 data flow components in SSIS.
1. Sources
2. Transformations
3. Destinations





• What are the different types of data sources available in SSIS?
There are 7 types of data sources provided by SSIS: a.) Data Reader source b.) Excel source c.) Flat file source d.) OLEDB source e.) Raw file source f.) XML source g.) Script component





• What is SSIS Designer?
It is a graphical tool for creating packages. It has 4 tabs: Control Flow, Data Flow, Event Handlers and Package Explorer.





• What is Control Flow tab?
It is the tab in SSIS designer where various Tasks can be arranged and configured. This is the tab where we provide and control the program flow of the project.





• What is Data Flow tab?
This is the tab where we do all the work related to ETL job. It is the tab in SSIS Designer where we can extract data from sources, transform the data and then load them into destinations.





• What is the function of control flow tab in SSIS?
On the control flow tab, the tasks including dataflow task, containers and precedence constraints that connect containers and tasks can be arranged and configured.





• What is the function of Event handlers tab in SSIS?
On the Event handlers tab, workflows can be configured to respond to package events.
For example, we can configure Work Flow when ANY task Failes or Stops or Starts ..





• What is the function of Package explorer tab in SSIS?
This tab provides an explorer view of the package. You can see what is happening in the package. The Package is a container at the top of the hierarchy.





• What is Solution Explorer?
It is a place in SSIS Designer where all the projects, Data Sources, Data Source Views and other miscellaneous files can be viewed and accessed for modification.





• How do we convert data type in SSIS?
The Data Conversion Transformation in SSIS converts the data type of an input column to a different data type.





• How are variables useful in ssis package?
Variables can provide communication among objects in the package. Variables can provide communication between parent and child packages. Variables can also be used in expressions and scripts. This helps in providing dynamic values to tasks.





• Explain Aggregate Transformation in SSIS?
It aggregates data, similar you do in applying TSQL functions like Group By, Min, Max, Avg, and Count. For example you get total quantity and Total line item for each product in Aggregate Transformation Editor. First you determine input columns, then output column name in Output Alias table in datagrid, and also operations for each Output Alias in Operation columns of the same datagrid. Some of operation functions listed below :
• Group By
• Average
• Count
• Count Distinct : count distinct and non null column value
• Min, Max, Sum
In Advanced tab, you can do some optimization here, such as setting up Key Scale option (low, medium, high), Count Distinct scale option (low, medium, high), Auto Extend factor and Warn On Division By Zero. If you check Warn On Division By Zero, the component will give warning instead of error. Key Scale option will optimize transformation cache to certain number of key threshold. If you set it low, optimization will target to 500,000 keys written to cache, medium can handle up to 5 million keys, and high can handle up to 25 million keys, or you can specify particular number of keys here. Default value is unspecified. Similar to number of keys for Count Distinct scale option. It is used to optimize number of distinct value written to memory, default value is unspecified. Auto Extend Factor is used when you want some portion of memory is used for this component. Default value is 25% of memory.





• Explain Audit Transformation ?
It allows you to add auditing information as required in auditing world specified by HIPPA and Sarbanes-Oxley (SOX). Auditing options that you can add to transformed data through this transformation are :
1. Execution of Instance GUID : ID of execution instance of the package
2. PackageID : ID of the package
3. PackageName
4. VersionID : GUID version of the package
5. Execution StartTime
6. MachineName
7. UserName
8. TaskName
9. TaskID : uniqueidentifier type of the data flow task that contains audit transformation.





• Explain Character Map Transformation?
It transforms some character. It gives options whether output result will override the existing column or add to new column. If you define it as new column, specify new column name. Operations available here are:
1. Uppercase
2. Lowercase
3. Byte reversal : such as from 0×1234 to 0×4321
4. Full width
5. Half width
6. Hiragana/katakana/traditional Chinese/simplified Chinese
7. Linguistic casing





• Explain Conditional split Transformation ?
It functions as if…then…else construct. It enables send input data to a satisfied conditional branch. For example you want to split product quantity between less than 500 and greater or equal to 500. You can give the conditional a name that easily identifies its purpose. Else section will be covered in Default Output Column name.
After you configure the component, it connect to subsequent transformation/destination, when connected, it pops up dialog box to let you choose which conditional options will apply to the destination transformation/destination.





• Explain Copy column Transformation?
This component simply copies a column to another new column. Just like ALIAS Column in T-Sql.





• Explain Data conversion Transformation?
This component does conversion data type, similar to TSQL function CAST or CONVERT. If you wish to convery the data from one type to another then this is the best bet. But please make sure that you have COMPATABLE data in the column.





• Explain Data Mining query Transformation?
This component does prediction on the data or fills gap on it. Some good scenarios uses this component is:
1. Take some input columns as number of children, domestic income, and marital income to predict whether someone owns a house or not.
2. Take prediction what a customer would buy based analysis buying pattern on their shopping cart.
3. Filling blank data or default values when customer doesn’t fill some items in the questionnaire.





• Explain Derived column Transformation?
Derived column creates new column or put manipulation of several columns into new column. You can directly copy existing or create a new column using more than one column also.





• Explain Merge Transformation?
Merge transformation merges two paths into single path. It is useful when you want to break out data into path that handles errors after the errors are handled, the data are merge back into downstream or you want to merge 2 data sources. It is similar with Union All transformation, but Merge has some restrictions :
1. Data should be in sorted order
2. Data type , data length and other meta data attribute must be similar before merged.





• Explain Merge Join Transformation?
Merge Join transformation will merge output from 2 inputs and doing INNER or OUTER join on the data. But if you the data come from 1 OLEDB data source, it is better you join through SQL query rather than using Merge Join transformation. Merge Join is intended to join 2 different data source.





• Explain Multicast Transformation?
This transformation sends output to multiple output paths with no conditional as Conditional Split does. Takes ONE Input and makes the COPY of data and passes the same data through many outputs. In simple Give one input and take many outputs of the same data.





• Explain Percentage and row sampling Transformations?
This transformation will take data from source and randomly sampling data. It gives you 2 outputs. First is selected data and second one is unselected data. It is used in situation where you train data mining model. These two are used to take the SAMPLE of data from the input data.





• Explain Sort Transformation?
This component will sort data, similar in TSQL command ORDER BY. Some transformations need sorted data.





• Explain Union all Transformation?
It works in opposite way to Merge transformation. It can take output from more than 2 input paths and combines into single output path.





• What r the possible locations to save SSIS package?
You can save a package wherever you want.
SQL Server
Package Store
File System





• What is a package?
A discrete executable unit of work composed of a collection of control flow and other objects, including data sources, transformations, process sequence, and rules, errors and event handling, and data destinations.





• What is a workflow in SSIS?
A workflow is a set of instructions on how to execute tasks.
(It is a set of instructions on how to execute tasks such as sessions, emails and shell commands. a workflow is created form work flow mgr.





• What is the diff between control flow Items and data flow Items?
The control flow is the highest level control process. It allows you to manage the run-time process activities of data flow and other processes within a package.
When we want to extract, transform and load data within a package. You add an SSIS dataflow task to the package control flow.





• What are the main component of SSIS(project-architecture)?
SSIS archItecture has 4 main components
1.ssis service
2.ssis runtime engine & runtime executables
3.ssis dataflow engine & dataflow components
4.ssis clients





• Different components in SSIS package?
Control flow
Data flow
Event handler
Package explorer





• What are Connection Managers?
It is a bridge b/w package object and physical data. It provides logical representation of a connection at design time the properties of the connection mgr describes the physical connection that integration services creates when the package is run.





• What is environment variable in SSIS?
An environment variable configuration sets a package property equal to the value in an environment variable.
Environmental configurations are useful for configuring properties that are dependent on the computer that is executing the package.





• How to provide securIty to packages?
We can provide security in two ways
1. Package encryption
2. Password protection.





• What are Precedence constraints?
Constraints that link executable, container, and tasks wIthin the package control flow and specify condItion that determine the sequence and condItions for determine whether executable run.





• What is Design time Deployment in SSIS ?





When you run a package from with in BIDS,it is built and temporarily deployed to the folder. By default the package will be deployed to the BIN folder in the Package’s Project folder and you can configure for custom folder for deployment. When the Package’s execution is completed and stopped in BIDS,the deployed package will be deleted and this is called as Design Time Deployment.





What are the TrasactionOptions in SSIS


•Required - if a transaction exists join it else start a new one
•Supported - if a transaction exists join it (this is the default)
•NotSupported - do not join an existing transaction








Data flow Transformation categories in SSIS





Transformations are defined as a core component in the data flow of a package in SSIS. It is that part of the data flow to which we apply our business logic to manipulate and modify the input data into the required format beforeloading it to the destination. All the Data flow transformations are broadly classified into 2 types:-





Type 1 – Synchronous Transformations.





Type 2 – Asynchronous Transformations.





What is the difference between Synchronous and Asynchronous transformations?







Synchronous Transformations


Asynchronous Transformations



Processes each incoming row, modifies according to the required format and forward it.


Stores all the rows into the memory before it begins the process of modifying input data to the required output format.



No. of input rows = No. of output rows.


No. of input rows != No. of output rows



Output rows are in sync with Input rows i.e. 1:1 relationship.


Output rows are not in sync with Input rows



Less memory is required as they work on row by row basis.


More memory is required to store the whole data set as input and output buffers do not use the same memory.



Does not block the data flow in the pipeline.


Are also known as “Blocking Transformations” as they block the data flow in the pipeline until all the input rows are read into the memory.



Runs quite faster due to less memory required.


Runs generally slow as memory requirement is very high.



E.g. – Data Conversion Transformation- Input rows flow into the memory buffers and the same buffers come out in the required data format as Output.


E.g. – Sort Transformation- where the component has to process the complete set of rows in a single operation.






Further Asynchronous transformations are divided into 2 categories:-



Partially blocking transformations creates new memory buffers for the output of the transformation such as theUnion All transformation.
Fully blocking transformations performs the same operation but cause a full block of the data such as the Sort and Aggregate Transformations.







Non-Blocking transformations


Semi-blocking transformations


Blocking transformations



Audit


Data Mining Query


Aggregate



Character Map


Merge


Fuzzy Grouping



Conditional Split


Merge Join


Fuzzy Lookup



Copy Column


Pivot


Row Sampling



Data Conversion


Unpivot


Sort



Derived Column


Term Lookup


Term Extraction



Lookup


Union All






Multicast









Percent Sampling









Row Count









Script Component









Export Column









Import Column









Slowly Changing Dimension









OLE DB Command


1.       What is BI?
Business intelligence is a collection of skills, technologies and practices, by which we can gather data, analyze data and provide data for forecasting and better decision making.

Business intelligence (BI) is a broad category of applications and technologies for gathering, storing, analyzing, and providing access to data to help enterprise users make better business decisions.
Collect data: - Data in an enterprise can be stored in various formats. Now these formats can vary from normalized structured RDBMS to excel sheets or probably unstructured file formats. So the first step in BI is to collect all these unstructured and scattered data and bring them in to one uniform format.

This can be achieved in 3 steps:-

1.       Extract: - In this step we read unorganized data from these sources by understanding their data structure.

2.       Transform: - In this step we transform the data in to a standard format.

3.       Load: - Finally we load the standard format in to a data warehouse and data mart (In the next part we have explained what data warehouse, data mart and OLAP?).



Analyze data: - Once the data is loaded in to Data ware house, you run tools, algorithms so that you can analyze and forecast information.


View data: - Once you have analyzed the data you would like view it. Now again how people want to view data can vary from simple tabular format to complex graphical chart. So in this section we would need good reporting tools to achieve the same.

2.       What is MS BI?
3.       What is SSIS?
SQL Server Integration Services (SSIS) is a component of SQL Server which can be used to perform a wide range of Data Migration and ETL operations. SSIS is a component in MSBI process of SQL Server. 
This is a platform for Integration and Workflow applications. It is known for a fast and flexible OLTP and OLAP extensions used for data extraction, transformation, and loading (
ETL). The tool may also be used to automate maintenance of SQL Server databases and multidimensional data sets.

4.    What is ETL?

5.    What is a package?
A discrete executable unit of work composed of a collection of control flow and other objects, including data sources, transformations, process sequence, and rules, errors and event handling, and data destinations.

6.    What are the main components of SSIS (project-architecture)?
SSIS architecture has 4 main components
1.SSIS service
2.SSIS runtime engine & runtime executables
3.SSIS dataflow engine & dataflow components
4.SSIS clients’

7.       What is the Data Flow Engine?
The Data Flow Engine, also called the SSIS pipeline engine, is responsible for managing the flow of data from the source to the destination and performing transformations (lookups, data cleansing etc.).  Data flow uses memory oriented architecture, called buffers, during the data flow and transformations which allows it to execute extremely fast. This means the SSIS pipeline engine pulls data from the source, stores it in buffers (in-memory), does the requested transformations in the buffers and writes to the destination. The benefit is that it provides the fastest transformation as it happens in memory and we don't need to stage the data for transformations in most cases.

8.    Different components in SSIS package?
1. Control flow
2.data flow
3.event handler
4.package explorer

9.     What are the tools associated with SSIS?
We use Business Intelligence Development Studio (BIDS) and SQL Server Management Studio (SSMS) to work with Development of SSIS Projects.
We use SSMS to manage the SSIS Packages and Projects.

10.  2. What are the Features of SSIS?
1) Improved Scripting with help of VSTA scripting engine
2) Ehnanced ADO.NET wizard
3) Import/Export Wizard used for transfering data from Excel to table and from one table to another table.

11.  What is SSIS? How it is related with SQL Server.
SQL Server Integration Services (SSIS) is a component of SQL Server which can be used to perform a wide range of Data Migration and ETL operations. SSIS is a component in MSBI process of SQL Server.
This is a platform for Integration and Workflow applications. It is known for a fast and flexible OLTP and OLAP extensions used for data extraction, transformation, and loading (
ETL). The tool may also be used to automate maintenance of SQL Server databases and multidimensional data sets.


12.  What are the differences between DTS and SSIS
Data Transformation Services
SQL Server Integration Services
Limited Error Handling
Complex and powerful Error Handling
Message Boxes in ActiveX Scripts
Message Boxes in .NET Scripting
No Deployment Wizard
Interactive Deployment Wizard
Limited Set of Transformation
Good number of Transformations
NO BI functionality
Complete BI Integration

13.  What is a workflow in SSIS 2014?
Workflow is a set of instructions on to specify the Program Executor on how to execute tasks and containers within SSIS Packages.
14. What is a Task?
A task is very much like a method of any programming language which represents or carries out an individual unit of work. There are broadly two categories of tasks in SSIS, Control Flow tasks and Database Maintenance tasks. All Control Flow tasks are operational in nature except Data Flow tasks. Although there are around 30 control flow tasks which you can use in your package you can also develop your own custom tasks with your choice of .NET programming language

15.  What is the control flow?
A control flow consists of one or more tasks and containers that execute when the package runs. To control order or define the conditions for running the next task or container in the package control flow, we use precedence constraints to connect the tasks and containers in a package. A subset of tasks and containers can also be grouped and run repeatedly as a unit within the package control flow. SQL Server Integration Services (SSIS) provides three different types of control flow elements: Containers that provide structures in packages, Tasks that provide functionality, and Precedence Constraints that connect the executables, containers, and tasks into an ordered control flow.

16.  What is the dataflow?
Data flow consists of the sources and destinations that extract and load data, the transformations that modify and extend data, and the paths that link sources, transformations, and destinations The Data Flow task is the executable within the SSIS package that creates, orders, and runs the data flow. A separate instance of the data flow engine is opened for each Data Flow task in a package. Data Sources, Transformations, and Data Destinations are the three important categories in the Data Flow.

17. Difference between control flow and data flow
Control flow deals with orderly processing of individual, isolated tasks, these tasks are linked through precedence constraints in random order. Also the output for task has finite outcome i.e., Success, Failure, or Completion. A subsequent task does not initiate unless its predecessor has completed. Data flow, on the other hand, streams the data in pipeline manner from its source to a destination and modifying it in between by applying transformations. Another distinction between them is the absence of a mechanism that would allow direct transfer of data between individual control flow tasks. On the other hand, data flow lacks nesting capabilities provided by containers.
Control Flow
Data Flow
Process Oriented
Data Oriented
Made up of
Tasks and Container
Source, Transformation and Destination
Connected through
Precedence constraint
Paths
Smallest unit
Task
Component
Outcome
Finite- Success, Failure, Completion
Not fixed


18.  What is the Difference between control flow Items and data flow Items?
the control flow is the highest level control process. It allows you to manage the run-time process the run time process activities of data flow and other processes within a package.
When we want to extract, transform and load data within a package. You add an SSIS dataflow Task to the package control flow.

19.  How does Error-Handling work in SSIS
When a data flow component applies a transformation to column data, extracts data from sources, or loads data into destinations, errors can occur. Errors frequently occur because of unexpected data values.
Type of typical Errors in SSIS:
-Data Connection Errors, which occur in case the connection manager cannot be initialized with the connection string. This applies to both Data Sources and Data Destinations along with Control Flows that use the Connection Strings.
-Data Transformation Errors, which occur while data is being transformed over a Data Pipeline from Source to Destination.
-Expression Evaluation errors, which occur if expressions that are evaluated at run time perform invalid


20.  What is environment variable in SSIS?
An environment variable configuration sets a package property equal to the value in an environment variable.
Environmental configurations are useful for configuring properties that are dependent on the computer that is executing the package.

21.    Containers: provide structure and scope to your package
Types of containers:
i. Task host container: the Task host container services a single Task.
ii. Sequence container: It can handle the flow of subset of a package and can help you drive a package into smaller more manageable process.
Uses:-

1. Grouping Tasks so that you can disable a part of the package that no longer needed.
2. Narrowing the scope of the variable to a container.
3. Managing the property of multiple Tasks in one step by setting the properties of the container.
iii. For loop container: evaluates an expression and repeats Its workflow until the expression evaluates to false.
iv. For each loop container: defines a control flow repeatedly by using an enumerator.
For each loop container repeats the control flow for each member of a specified enumerator.

22.    Tasks: It provides the functionality to your package.

  It is an individual unit of work.

23.    Event handler: It responds to raised events in your package.

24.    Precedence constraints: It provides ordinal relationship b/w various Items in your package.

25.    How to deploy the package?
To deploy the package first we need to configure some properties.

Ø Go to project tab->package properties->we get a window, configure deployment Utility as "True"

Ø  Specify the path as "bin/deployment"

26.    Connection manager:
It is a bridge b/w package object and physical data. It provides logical representation of a connection at design time the properties of the connection mgr describes the physical connection that integration services creates when the package is run.

27.     Tell the Utility to execute (run) the package?
a) In BIDS a package that can be executed in debug mode by using the debug menu or toolbar or from solution explorer.
In production, the package can be executed from the command line or from Microsoft windows Utility or It can be scheduled for automated execution by using the SQL server agent.
i). Go to->debug menu and select the start debugging button
ii).press F5 key
iii).right click the package and choose execute package.
iv).command prompts utilities


a).DTExecUI
1. To open command prompt->run->type dtexecui->press enter
2. The execute package Utility dialog box opens.
3. in that click execute to run the package.
Wait until the package has executed successfully.


b).DTExec Utility
1.open the command prompt window.
2. Command prompt window->type dtexec /followed by the DTS, SQL, or file option and the package path, including package name.
3. If the package encryption level is encrypting sensitive with password or encrypt all with password, use the decrypt option to provide the password.
If no password is included, dtexec will prompt you for the password.
4. Optionally, provide additional command-line options
5. Press enter.
6. Optionally, view logging and reporting information before closing the command prompt window.
The execute package Utility dialog box opens.
7. In the execute package Utility dialog box, click execute package.
Wait until the package has executed successfully.
v).using SQL server mgmt studio to execute package
1. In SSMS right click a package, and then click run package.
Execute package Utility opens.
2. Execute the package as described previously.

28.     How can u design SCD in SSIS?
a) Def:-SCD explains how to capture the changes over the period of time.
This is also known as change data capture.
type1: It keeps the most recent values in the target. It does not maintain the history.
type2: It keeps the full history in the target database. For every update in the source a new record is inserted in the target.
type3: It keeps current & previous information in the target.

29.    How can u handle the errors through the help of logging in SSIS?
a) To create an on error event handler to which you add the log error execute SQL Task.

30.    What is a log file and how to send log file to mgr?
a) It is especially useful when the package has been deployed to the production environment, and you cannot use BIDS and VSA to debug the package.
SSIS enables you to implement logging code through the Dts. Log method.
When the Dts. Log method is called in the script, the SSIS engine will route the message to the log providers that are configured in the containing package.

31.    What is environment variable in SSIS?
a) An environment variable configuration sets a package property equal to the value in an environment variable.
Environmental configurations are useful for configuring properties that are dependent on the computer that is executing the package.

32.     About multiple configurations?
a) It means including the xml configuration, environment variable, registry entry, parent package variable, SQL Server table, and direct and indirect configuration types.

33.    How to provide security to packages?
a) In two ways
1. Package encryption
2. Password protection.

34.    As per error handling in T/R, which one handle the better performance? Like fail component, redirect row or ignore failure?
a) Redirect row provides better performance for error handling.

35.    Staging area??
a) It is a temporary data storage location. Where various data T/R activities take place.

Staging area is a kitchen of data warehouse.

36.    Task??
a) An individual unit of work.

Types:-
1. Active x script Task
2. Analysis services execute DDL Task *
3. Analysis services processing Task *
4. Bulk insert Task *
5. Data flow Task *
6. Data mining query Task
7. Execute Dts 2000 package Task
8. Execute package Task *
9. Execute process Task
10. Execute SQL Task *
11. File system Task *
12. Ftp Task
13. Message queue Task
14. Script Task *
15. Send mail Task *
16. Web service Task
17. Wmi data reader Task
18. Wmi event Task
19. Xml Task

37.    Event handler & logging?

Even handler is the mechanism to raise a event based on specific scenario. For example if there is any failure in data load it will notify thru email or entry in error table etc.

Logging can be done based on event, in SSIS there are 12 events that can be logged at Task or package level. You can enable partial logging for one Task and enable much more detailed logging for billing Tasks.
Example:-

On error; On post validate; On progress; On warning
In SSIS different type of logging mechanism are there:

SQL profiler
Text files
SQL server
Window event log
Xml file

38.    Import & export wizard?
a) Easiest method to move data from sources like oracle, db2, SQL server.
Right click on database name->go to Task->import and export wizard
Select the source
Select the destination
Query copy of tables
Execute
Finish

39.    What solution explorer?
Once you creating project with specific project name then if you want to add
data source/-data source views/packages/ miscellaneous; then this window will help to organize different files under one structure.

40.    Precedence constraints?
a) Constraints that link executable, container, and Tasks within the package control flow and specify condition that determine the sequence
and conditions for determine whether executable run.

41.   What do we mean by dataflow in SSIS?
Data flow is nothing but the flow of data from the corresponding sources to the referred destinations. In this process, the data transformations make changes to the data to make it ready for the data warehouse.

42.  What is a breakpoint in SSIS? How is it setup? How do you disable it?
A breakpoint is a stopping point in the code. The breakpoint can give the Developer\DBA an
opportunity to review the status of the data, variables and the overall status of the SSIS package.
10 unique conditions exist for each breakpoint.
Breakpoints are setup in BIDS. In BIDS, navigate to the control flow interface. Right click on the
object where you want to set the breakpoint and select the ‘Edit Breakpoints…’ option.


43.  Can you name 5 or more of the native SSIS connection managers?
1) OLEDB connection – Used to connect to any data source requiring an OLEDB connection (i.e.,
SQL Server 2000)
2) Flat file connection – Used to make a connection to a single file in the File System. Required for reading information from a File System flat file
3) ADO.Net connection – Uses the .Net Provider to make a connection to SQL Server 2005 or other
connection exposed through managed code (like C#) in a custom task
4) Analysis Services connection – Used to make a connection to an Analysis Services database or project. Required for the Analysis Services DDL Task and Analysis Services Processing Task
5) File connection – Used to reference a file or folder. The options are to either use or create a file or folder
6) Excel
44.  What is the use of Bulk Insert Task in SSIS?
Bulk Insert Task is used to upload large amount of data from flat files into Sql Server. It supports only OLE DB connections for destination database.

45.  Can you explain how to setup a checkpoint file in SSIS?
The following items need to be configured on the properties tab for SSIS package:
CheckpointFileName – Specify the full path to the Checkpoint file that the package uses to save the value of package variables and log completed tasks. Rather than using a hard-coded path as shown above, it’s a good idea to use an expression that concatenates a path defined in a package variable and the package name.
CheckpointUsage – Determines if/how checkpoints are used. Choose from these options: Never(default), IfExists, or Always. Never indicates that you are not using Checkpoints. IfExists is the typical setting and implements the restart at the point of failure behavior. If a Checkpoint file is found it is used to restore package variable values and restart at the point of failure. If a Checkpoint file is not found the package starts execution with the first task. The Always choice raises an error if the Checkpoint file does not exist.
SaveCheckpoints – Choose from these options: True or False (default). You must select True to implement the Checkpoint behavior.

46.  What are the different values you can set for CheckpointUsage property ?
There are three values, which describe how a checkpoint file is used during package execution:
1)
 Never: The package will not use a checkpoint file and therefore will never restart.
2)
 If Exists: If a checkpoint file exists in the place you specified for the CheckpointFilename property, then it will be used, and the package will restart according to the checkpoints written.
3)
 Always: The package will always use a checkpoint file to restart, and if one does not exist, the package will fail.
47.   What is the ONLY Property you need to set on TASKS in order to configure CHECKPOINTS to RESTART package from failure?
The one property you have to set on the task is
 FailPackageOnFailure. This must be set for each task or container that you want to be the point for a checkpoint and restart. If you do not set this property to true and the task fails, no file will be written, and the next time you invoke the package, it will start from the beginning again.

48.  Where can we set the CHECKPOINTS, in DataFlow or ControlFlow ?
Checkpoints only happen at the Control Flow; it is not possible to checkpoint transformations or restart inside a Data Flow. The Data Flow Task can be a checkpoint, but it is treated as any other task.
49.  Can you explain different options for dynamic configurations in SSIS?
1) XML file
2) custom variables
3) Database per environment with the variables
4) Use a centralized database with all variables

50.  Which are the different types of Control Flow components in SSIS?
The different types of Control Flow components are: Data Flow Tasks, SQL Server Tasks, Data Preparation Tasks, Work flow Tasks, Scripting Tasks, Analysis Services Tasks, Maintenance Tasks, Containers.

51.  What is SSIS Designer?
It is a graphical tool for creating packages. It has 4 tabs: Control Flow, Data Flow, Event Handlers and Package Explorer.

52.  • What is Control Flow tab?
It is the tab in SSIS designer where various Tasks can be arranged and configured. This is the tab where we provide and control the program flow of the project.

53.  • What is Data Flow tab?
This is the tab where we do all the work related to ETL job. It is the tab in SSIS Designer where we can extract data from sources, transform the data and then load them into destinations.
54.  What is the function of control flow tab in SSIS?
On the control flow tab, the tasks including dataflow task, containers and precedence constraints that connect containers and tasks can be arranged and configured.

55.  • What is the function of Event handlers tab in SSIS?
On the Event handlers tab, workflows can be configured to respond to package events.
For example, we can configure Work Flow when ANY task Fails or Stops or Starts.

56.  • What is the function of Package explorer tab in SSIS?
This tab provides an explorer view of the package. You can see what is happening in the package. The Package is a container at the top of the hierarchy.

57.  What is Solution Explorer?
It is a place in SSIS Designer where all the projects, Data Sources, Data Source Views and other miscellaneous files can be viewed and accessed for modification.

58.  How do we convert data type in SSIS?
The Data Conversion Transformation in SSIS converts the data type of an input column to a different data type.

59.  How are variables useful in SSIS package?
Variables can provide communication among objects in the package. Variables can provide communication between parent and child packages. Variables can also be used in expressions and scripts. This helps in providing dynamic values to tasks.


60.    Data pipeline?
a) The memory based, multithreaded, buffered t/r process flow data through an SSIS data flow Task during package execution.

61.    TRANSFORMATIONS??
It is an object that generates, modifies, or passes data.
1. AGGEGATE T/R:-It applies an aggregate function to grouped records and produces new output records from aggregated results.
2. AUDIT T/R:-the t/r adds the value of a system variable, such as machine name or execution instance GUID to a new output column.
3. CHARACTER MAP T/R:-this t/r makes string data changes such as changing data from lower case to upper case.
4. CONDITIONAL SPLIT:-It separate input rows into separate output data pipelines based on the Boolean expressions configured for each output.
5. COPY COLUMN:-add a copy of column to the t/r output we can later transform the copy keeping the original for auditing personal
6.DATA CONVERSION:-converts a columns data type to another data type.
7. DATA MINING QUERY:-perform a data mining query against analysis services.
8. DERIVED COLUMN:-create a new derive column calculated from expression.
9. EXPORT COLUMN:-It allows you to export a column from the data flow to a file.
10. FUZZY GROUPING:-perform data cleansing by finding rows that are likely duplicates.
11. FUZZY LOOKUP:-matches and standardizes data based on fuzzy logic.
Ex:-transform the name jon to john
12.IMPORT COLUMN:-reads the data from a file & adds It into a dataflow.
13. LOOKUP:-perform the lookup of data to be used later in a transform.
Ex:-t/T to lookup a city based on zip code.
1. Getting a related value from a table using a key column value
2. Update slowly changing dimension table
3.to check whether records already exist in the table.
14. MERGE:-merges two sorted data sets into a single data set into a single data flow.
15. MERGE JOIN:-merges two data sets into a single dataset using a join junction.
16. MULTI CAST:-sends a copy of two data to an additional path in the workflow.
17. ROW COUNT:-stores the rows count from the data flow into a variable.
18. ROW SAMPLING:-captures the sample of data from the dataflow by using a row count of the total rows in dataflow.
19. ROW SAMPLING:-captures the sample of the data from the data flow by using a row count of the total rows in data flow.
20. UNION ALL:-merge multiple data sets into a single dataset.
21. PIVOT:-converts rows into columns
22.UNPIVOT:-converts columns into rows

62.    Batch?
a) A batch is defined as group of sessions. Those are 2 types.
1. Parallel batch processing
2. Sequential batch processing

to execute a SSIS package we will use "execute package utility"

To deploy a SSIS package we will use "package deployment Utility”


63.  What are the Transformations available in SSIS?

AGGREGATE  - It applies aggregate functions to Record Sets to produce new output records from aggregated values.
AUDIT  - Adds Package and Task level Metadata - such as Machine Name, Execution Instance, Package Name, Package ID, etc..
CHARACTERMAP - Performs SQL Server column level string operations such as changing data from lower case to upper case.
CONDITIONALSPLIT– Separates available input into separate output pipelines based on Boolean Expressions configured for each output.
COPYCOLUMN - Add a copy of column to the output we can later transform the copy keeping the original for auditing.
DATACONVERSION - Converts columns data types from one to another type. It stands for Explicit Column Conversion.
DATAMININGQUERY– Used to perform data mining query against analysis services and manage Predictions Graphs and Controls.
DERIVEDCOLUMN - Create a new (computed) column from given expressions.
EXPORTCOLUMN– Used to export a Image specific column from the database to a flat file.
FUZZYGROUPING– Used for data cleansing by finding rows that are likely duplicates.
FUZZYLOOKUP -  Used for Pattern Matching and Ranking based on fuzzy logic.
IMPORTCOLUMN - Reads image specific column from database onto a flat file.
LOOKUP - Performs the lookup (searching) of a given reference object set against a data source. It is used for exact matches only.
MERGE - Merges two sorted data sets into a single data set into a single data flow.
MERGEJOIN - Merges two data sets into a single dataset using a join junction.
MULTICAST - Sends a copy of supplied Data Source onto multiple Destinations.
ROWCOUNT - Stores the resulting row count from the data flow / transformation into a variable.
ROWSAMPLING - Captures sample data by using a row count of the total rows in dataflow specified by rows or percentage.
UNIONALL - Merge multiple data sets into a single dataset.
PIVOT– Used for Normalization of data sources to reduce analomolies by converting rows into columns
UNPIVOT– Used for denormalizing the data structure by converts columns into rows incase of building Data Warehouses.

64.  How to log SSIS Executions?
SSIS includes logging features that write log entries when run-time events occur and can also write custom messages. This is not enabled by default. Integration Services supports a diverse set of log providers, and gives you the ability to create custom log providers. The Integration Services log providers can write log entries to text files, SQL Server Profiler, SQL Server, Windows Event Log, or XML files. Logs are associated with packages and are configured at the package level. Each task or container in a package can log information to any package log. The tasks and containers in a package can be enabled for logging even if the package itself is not.

65.  How do you deploy SSIS packages?
SSIS Project BUILD provides a Deployment Manifest File. We need to run the manifest file and decide whether to deploy this onto File System or onto SQL Server [ msdb]. SQL Server Deployment is very faster and more secure then File System Deployment. Alternatively, we can also import the package from SSMS from File System or SQL Server.

66.  What are variables and what is variable scope?
Variables store values that a SSIS package and its containers, tasks, and event handlers can use at run time. The scripts in the Script task and the Script component can also use variables. The precedence constraints that sequence tasks and containers into a workflow can use variables when their constraint definitions include expressions. Integration Services supports two types of variables: user-defined variables and system variables. User-defined variables are defined by package developers, and system variables are defined by Integration Services. You can create as many user-defined variables as a package requires, but you cannot create additional system variables.


67. If you want to send some data from Access database to SQL server database. What are different component of SSIS will you use?
In the data flow, we will use one OLE DB source, data conversion transformation and one OLE DB destination or SQL server destination. OLE DB source is data source is useful for reading data from Oracle, SQL Server and Access databases. Data Conversion transformation would be needed to remove datatype abnormality since there is difference in datatype between the two databases (Access and SQL Server) mentioned. If our database server is stored on and package is run from same machine, we can use SQL Server destination otherwise we need to use OLE DB destination. The SQL Server destination is the destination that optimizes the SQL Server.  

68.  Difference and similarity between merge and merge join transformation
Merge Transofrmations
Merge Join Transformation
The data from 2 input paths are merged into one
The data from 2 inputs are merged based on some common key.
Works as
UNION ALL
JOIN (LEFT, RIGHT OR FULL)
Supports
2 Datasets 
1 Dataset
Columns
Metadata for all columns needs to be same
Key columns metadata needs to be same.
Pre-requisites
Data must be sorted.
Merged columns should have same datatype i.e. if merged column is EmployeeName with string of 25 character in Input 1, it can be of less than or equal to 25 characters for merging to happen.
Data must be sorted.
Merged columns should have same datatype i.e. if merged column is EmployeeName with string of 25 character in Input 1, it can be of less than or equal to 25 characters for merging to happen.
Limitations
Only 2 input paths can be merged.
Does not support error handling.
Does not support error handling.
Use
Merging of data from 2 data source
Can create complex datasets using nesting merge transformation,
When data from 2 tables having foreign key relationship needs to present based on common key.


69.               What are the different types of Data flow components in SSIS?
There are 3 data flow components in SSIS.
1. Sources
2. Transformations
3. Destinations
70.  Explain Audit Transformation?
It allows you to add auditing information as required in auditing world specified by HIPPA and Sarbanes-Oxley (SOX).
Auditing options that you can add to transformed data through this transformation are:
1. Execution of Instance GUID : ID of execution instance of the package
2. PackageID : ID of the package
3. PackageName
4.VersionID : GUID version of the package
5. Execution StartTime
6.MachineName
7.UserName
8.TaskName
9.TaskID : uniqueidentifier type of the data flow task that contains audit transformation

71. Explain Copy column Transformation?
This component simply copies a column to another new column. Just like ALIAS Column in T-Sql.

72. Explain Derived column Transformation?
Derived column creates new column or put manipulation of several columns into new column. You can directly copy existing or create a new column using more than one column also.

73. Explain Multicast Transformation?
This transformation sends output to multiple output paths with no conditional as Conditional Split does. Takes ONE Input and makes the COPY of data and passes the same data through many outputs. In simple Give one input and take many outputs of the same data.


74. What is a Transformation?
 A transformation simply means bringing in the data in a desired format. For example you are pulling data from the source and want to ensure only distinct records are written to the destination, so duplicates are  removed.  Anther example is if you have master/reference data and want to pull only related data from the source and hence you need some sort of lookup. There are around 30 transformation tasks available and this can be extended further with custom built tasks if needed.

75. How many difference source and destinations have you used?
It is very common to get all kinds of sources so the more the person worked with the better for you. Common ones are SQL Server, CSV/TXT, Flat Files, Excel, Access, Oracle, MySQL but also Salesforce, web data scrapping.

76.  What is Conditional Split transformation in SSIS?
This is just like IF condition which checks for the given condition and based on the condition evaluation, the output will be sent to the appropriate OUTPUT path. It has ONE input and MANY outputs. Conditional Split transformation is used to send paths to different outputs based on some conditions. For example, we can organize the transform for the students in a class who have marks greater than 40 to one path and the students who score less than 40 to another path.

77.  How do you eliminate quotes from being uploaded from a flat file to SQL Server? 
This can be done using TEXT QUALIFIER property. In the SSIS package on the Flat File Connection Manager Editor, enter quotes into the Text qualifier field then preview the data to ensure the quotes are not included.

78.  What is the use of Percentage Sampling transformation in SSIS?
Percentage sampling transformation is generally used for data mining. This transformation builds a random sample of set of output row’s by choosing specified percentage of input rows. For example if the input has 1000 rows and if I specify 10 as percentage sample then the transformation returns 10% of the RANDOM records from the input data.

79.  What is the use of Term Extraction transformation in SSIS?
Term Extraction transformation is used to extract nouns or noun phrases or both noun and noun phrases only from English text. It extracts terms from text in a transformation input column and then writes the terms to a transformation output column. It can be also used to find out the content of a dataset.

80.  What is Data Viewer and what are the different types of Data Viewers in SSIS?
A Data Viewer allows viewing data at a point of time at runtime. If data viewer is placed before and after the Aggregate transform, we can see data flowing to the transformation at the runtime and how it looks like after the transformation occurred. The different types of data viewers are:
1. Grid
2. Histogram
3. Scatter Plot
4. Column Chart.
81.   What is Ignore Failure option in SSIS?
In Ignore Failure option, the error will be ignored and the data row will be directed to continue on the next transformation. Let’s say you have some JUNK data (wrong type of data or JUNK data) flowing from source, then using this option in SSIS we can REDIRECT the junk data records to another transformation instead of FAILING the package. This helps to MOVE only valid data to destination and JUNK can be captured into separate file.


82.  Explain Aggregate Transformation in SSIS?
It aggregates data, similar you do in applying TSQL functions like Group By, Min, Max, Avg, and Count. For example you get total quantity and Total line item for each product in Aggregate Transformation Editor. First you determine input columns, then output column name in Output Alias table in datagrid, and also operations for each Output Alias in Operation columns of the same datagrid. Some of operation functions listed below :
• Group By
• Average
• Count
• Count Distinct : count distinct and non null column value
• Min, Max, Sum
In Advanced tab, you can do some optimization here, such as setting up Key Scale option (low, medium, high), Count Distinct scale option (low, medium, high), Auto Extend factor and Warn On Division By Zero. If you check Warn On Division By Zero, the component will give warning instead of error. Key Scale option will optimize transformation cache to certain number of key threshold. If you set it low, optimization will target to 500,000 keys written to cache, medium can handle up to 5 million keys, and high can handle up to 25 million keys, or you can specify particular number of keys here. Default value is unspecified. Similar to number of keys for Count Distinct scale option. It is used to optimize number of distinct value written to memory, default value is unspecified. Auto Extend Factor is used when you want some portion of memory is used for this component. Default value is 25% of memory.

83.  • Explain Audit Transformation?
It allows you to add auditing information as required in auditing world specified by HIPPA and Sarbanes-Oxley (SOX). Auditing options that you can add to transformed data through this transformation are:
1. Execution of Instance GUID : ID of execution instance of the package
2. PackageID : ID of the package
3. PackageName
4. VersionID : GUID version of the package
5. Execution StartTime
6. MachineName
7. UserName
8. TaskName
9. TaskID : uniqueidentifier type of the data flow task that contains audit transformation.


84.  Explain Character Map Transformation?
It transforms some character. It gives options whether output result will override the existing column or add to new column. If you define it as new column, specify new column name. Operations available here are:
1. Uppercase
2. Lowercase
3. Byte reversal : such as from 0×1234 to 0×4321
4. Full width
5. Half width
6. Hiragana/katakana/traditional Chinese/simplified Chinese
7. Linguistic casing

85.  Explain Conditional split Transformation ?
It functions as if…then…else construct. It enables send input data to a satisfied conditional branch. For example you want to split product quantity between less than 500 and greater or equal to 500. You can give the conditional a name that easily identifies its purpose. Else section will be covered in Default Output Column name.
After you configure the component, it connect to subsequent transformation/destination, when connected, it pops up dialog box to let you choose which conditional options will apply to the destination transformation/destination.


86.  Explain Copy column Transformation?
This component simply copies a column to another new column. Just like ALIAS Column in T-Sql.

87.  Explain Data conversion Transformation?
This component does conversion data type, similar to TSQL function CAST or CONVERT. If you wish to convert the data from one type to another then this is the best bet. But please make sure that you have COMPATABLE data in the column.


88.  Explain Data Mining query Transformation?
This component does prediction on the data or fills gap on it. Some good scenarios uses this component is:
1. Take some input columns as number of children, domestic income, and marital income to predict whether someone owns a house or not.
2. Take prediction what a customer would buy based analysis buying pattern on their shopping cart.
3. Filling blank data or default values when customer doesn’t fill some items in the questionnaire.

89.  Explain Derived column Transformation?
Derived column creates new column or put manipulation of several columns into new column. You can directly copy existing or create a new column using more than one column also.


90.  • Explain Merge Transformation?
Merge transformation merges two paths into single path. It is useful when you want to break out data into path that handles errors after the errors are handled, the data are merge back into downstream or you want to merge 2 data sources. It is similar with Union All transformation, but Merge has some restrictions :
1. Data should be in sorted order
2. Data type , data length and other meta data attribute must be similar before merged.

91.  Explain Merge Join Transformation?
Merge Join transformation will merge output from 2 inputs and doing INNER or OUTER join on the data. But if you the data come from 1 OLEDB data source, it is better you join through SQL query rather than using Merge Join transformation. Merge Join is intended to join 2 different data source.


92.  Explain Multicast Transformation?
This transformation sends output to multiple output paths with no conditional as Conditional Split does. Takes ONE Input and makes the COPY of data and passes the same data through many outputs. In simple Give one input and take many outputs of the same data.

93.  Explain Percentage and row sampling Transformations?
This transformation will take data from source and randomly sampling data. It gives you 2 outputs. First is selected data and second one is unselected data. It is used in situation where you train data mining model. These two are used to take the SAMPLE of data from the input data.

94.  Explain Sort Transformation?
This component will sort data, similar in TSQL command ORDER BY. Some transformations need sorted data.

95.  Explain Union all Transformation?
It works in opposite way to merge transformation. It can take output from more than 2 input paths and combines into single output path.

96.  What r the possible locations to save SSIS package?
You can save a package wherever you want.
SQL Server
Package Store
File System

97.  What are Connection Managers?
It is a bridge b/w package object and physical data. It provides logical representation of a connection at design time the properties of the connection mgr describes the physical connection that integration services creates when the package is run.

98.   What is Design time Deployment in SSIS?
When you run a package from within BIDS, it is built and temporarily deployed to the folder. By default the package will be deployed to the BIN folder in the Package’s Project folder and you can configure for custom folder for deployment. When the Package’s execution is completed and stopped in BIDS, the deployed package will be deleted and this is called as Design Time Deployment.

99.  What is Manifest file in SSIS?

Manifest while if the utility which can be used to deploy the package using wizard on file system and SQL
Server database.


100.                       What is File System Deployment?

File system deployment means to save package file on local or network drive.
Question: how to back up or retrieve the SSIS package
if your package is deployed on SQL Server then you can back up the MSDB database as all the package on SQL
server deploys at MSDB.

101.                      What is Transaction in SSIS package and how to implement it? (SSIS Interview Questions)
Packages use transactions to bind the database actions that tasks perform into atomic units, and by doing this maintain data integrity. All Microsoft Integration Services container types—packages, the For Loop, For each Loop, and Sequence containers, and the task hosts that encapsulate each task—can be configured to use transactions. Integration Services provides three options for configuring transactions: Not SupportedSupported, and Required.
Required indicates that the container starts a transaction, unless one is already started by its parent container. If a transaction already exists, the container joins the transaction. For example, if a package that is not configured to support transactions includes a Sequence container that uses the required option, the Sequence container would start its own transaction. If the package were configured to use the required option, the Sequence container would join the package transaction.
Supported indicates that the container does not start a transaction, but joins any transaction started by its parent container. For example, if a package with four Execute SQL tasks starts a transaction and all four tasks use the Supported option, the database updates performed by the Execute SQL tasks are rolled back if any task fails. If the package does not start a transaction, the four Execute SQL tasks are not bound by a transaction, and no database updates except the ones performed by the failed task are rolled back.
Not Supported indicates that the container does not start a transaction or join an existing transaction. A transaction started by a parent container does not affect child containers that have been configured to not support transactions. For example, if a package is configured to start a transaction and a For Loop container in the package uses the Not Supported option, none of the tasks in the For Loop can roll back if they fail.
102.                      Difference between For Loop and For Each Loop
For Loop and For Each loop both are used for looping the object or tasks in SSIS package. But both of them work differently.
For Loop is like normal for loop where you defines the EVAL expression and task loops that no of times. For example if you want to iterate some task 20 times it will loop for 20 times and then stop.
While For Each Loop works for object level let say we have a folder in which there are multiple files and these files need be dumped into database. For this you need to iterate for each file which is possible through for each loop task. For Each loop task will iterate for each file in the folder. You can say its iterates for the collection. For each loop has multiple type of iterates.

103.                      Difference between Execute TSQL Task and Execute SQL Task
In SSIS there is one tasks Execute TSQL task which is similar to Execute SQL task. Will see what the difference between two is.
Execute TSQL Task

Execute SQL Task
Takes less memory
Takes more memory compare to ETL
Parameterized queries not support
Parameterized query possible
Output into variable not supported
supported
Only ADO.net connection supported
Supported multiple type of connection
Faster performance
Comparatively low

104.                      Difference Between Merge and Union All

The Merge Transform can merge data from two paths into a single output. The transform is
useful when you wish to break out your Data Flow into a path that handles certain errors and
then merge it back into the main Data Flow downstream after the errors have been handled. It’s also useful if you wish to merge data from two Data Sources.
The data must be sorted before the Merge Transform. You can do this by using the Sort
Transform prior to the merge or by specifying an ORDER BY clause in the source connection.
The metadata must be the same between both paths. For example, the CustomerID column
can’t be a numeric column in one path and a character column in another path.
The Union All Transform works much the same way as the Merge Transform, but it does not
Require the data be sorted. It takes the outputs from multiple sources or transforms and
combines them into a single result set.

105.                      SSIS Synchronous and Asynchronous

Transformations are broken into two main categories: synchronous and asynchronous.
Synchronous transformations are components like the Derived Column and Data Conversion
Transforms where rows flow into memory buffers in the transform and the same buffers come
out. No rows are held and typically these transforms perform very quickly with minimal impact
to your Data Flow.
There are two types of asynchronous transforms: fully blocking and partial blocking.
Partial blocking transforms, such as the Union All Transform, create new memory buffers for
the output other transform than what come into the transform. Full blocking transforms, such
as the Sort and Aggregate Transforms, do the same thing but cause a full block of the data. In
order to sort the data, SSIS must first see every single row of the data. If you have a 100MB
file, then you may require 200MB of RAM in order to process the Data Flow because of a fully
blocking transform. These fully blocking transforms represent the single largest slowdown in
SSIS and architecture decisions you must make.

106.                      Config Files
Config file in Sql Server Integration Service is used to provide inputs to connection manager different properties which package use to assign values at run time dynamically.
Using config file user need to make changes to the config file which package will take automatically at run time because of using it you don't need to every time made changes in package in case you are deploying package on multiple servers or locations.
There are multiple ways in which configuration values can be store.

XML configuration file
Store the configuration as an XML file. .
Environment variable
Store the configuration in one of the environment variables.
Registry entry
Store the configuration in the registry.
Parent package variable
Store the configuration as a variable in the package that contains the task.
SQL Server
Store the configuration in a table in SQL Server.

Here are the steps to enable the config file for package.
On the SSIS package, right click and select Package Configurations.
In the Package Configuration Organizer dialog box, select Enable package configurations, and then click Add.

107.                      What configuration options have you used?

 This is an important one. Configuration should always be dynamic and usually is done using XML and/or Environment Variable and SQL Table with all configurations.

108.                      How to quickly load data into sql server table?

 Fast Load option. This option is not set by default so most developers know this answer as otherwise the load is very slow.

109.                      Give example of handling data quality issues?

Data Quality is almost always a problem and SSIS handles it very well. Examples include importing customers from different sources where customer name can be duplicates. For instance you can have as company name: SQL Server Business Intelligence but also SQL Server BI or SQL Server BI LTD or SQL Server BI Limited or intelligence (with one l). There are different ways to handle it. Robust and time consuming is to create a table with or possible scenarios and update it after each update. You can also use fuzzy grouping which is usually easy to implement and will make usually very good decisions but it is not 100% accurate so this approach has to be justified. Other typical quality issues are nulls (missing values), outliers (dates like 2999 or types like 50000 instead of 5000 especially important if someone is adjusting the value to get bigger bonus), incorrect addresses and these are either corrected during ETL, ignored, re-directed for further manual updates or it fails the packages which for big processes is usually not practised.
110.                      When to use Stored Procedures?
This was one of the requested question in comment (at the bottom of the page). This one is very important but also tricky. ALL SSIS developers have SQL Server background and that is sometime not very good if they use SQL not SSIS approach.Let's start with when you typically use SPs. This is for preparing tables (truncate), audit tasks (usually part of SSIS framework), getting configuration values for loops and a few other general tasks.
During ETL extract you usually type simple SQL because it comes from other sources and usually over complication is not a good choice (make it dynamic) because any changes usually affect the package which has to be updated as well.
During Transformation phase (business rules, cleaning, core work) you should use Transformation tasks not Stored procedures! There are loads of tasks that make the package much easier to develop but also a very important reason is readability which is very important for other people who need to change the package and obviously it reduces risks of making errors. Performance is usually very good with SSIS as it is memory/flow based approach. So when to use Stored Procedures for transformations? If you don't have strong SSIS developers or you have performance reasons to do it. In some cases SPs can be much faster (usually it only applies to very large datasets). Most important is have reasons which approach is better for the situation.

111.                What is your approach for ETL with data warehouses (how many packages you developer during typical load etc.)?

This is rather generic question. A typical approach (for me) when building ETL is to. Have a package to extract data per source with extract specific transformations (lookups, business rules, cleaning) and loads data into staging table. Then a package do a simple merge from staging to data warehouse (Stored Procedure) or a package that takes data from staging and performs extra work before loading to data warehouse. I prefer the first one and due to this approach I occasionally consider having extract stage (as well as stage phase) which gives me more flexibility with transformation (per source) and makes it simpler to follow (not everything in one go). So to summarize you usually have package per source and one package per data warehouse table destination. There are might be other approach valid as well so ask for reasons.

112.                True or False - Using a checkpoint file in SSIS is just like issuing the CHECKPOINT command against the relational engine. It commits all of the data to the database. 
False. SSIS provides a Checkpoint capability which allows a package to restart at the point of failure. 

113.                Can you explain the what the Import\Export tool does and the basic steps in the wizard? 
The Import\Export tool is accessible via BIDS or executing the dtswizard command.
The tool identifies a data source and a destination to move data either within 1 database, between instances or even from a database to a file (or vice versa).
 

114.                What are the command line tools to execute SQL Server Integration Services packages? 
DTSEXECUI - When this command line tool is run a user interface is loaded in order to configure each of the applicable parameters to execute an SSIS package. 
DTEXEC - This is a pure command line tool where all of the needed switches must be passed into the command for successful execution of the SSIS package. 

115.                Can you explain the SQL Server Integration Services functionality in Management Studio? 
You have the ability to do the following: 
Login to the SQL Server Integration Services instance 
View the SSIS log 
View the packages that are currently running on that instance 
Browse the packages stored in MSDB or the file system 
Import or export packages 
Delete packages 
Run packages

116.                Can you name some of the core SSIS components in the Business Intelligence Development Studio you work with on a regular basis when building an SSIS package? 
Connection Managers 
Control Flow 
Data Flow 
Event Handlers 
Variables window 
Toolbox window 
Output window 
Logging 
Package Configurations

117.                True or False: SSIS has a default means to log all records updated, deleted or inserted on a per table basis. 
False, but a custom solution can be built to meet these needs. 

Question 2 - What is a breakpoint in SSIS? How is it setup? How do you disable it? 
A breakpoint is a stopping point in the code. The breakpoint can give the Developer\DBA an opportunity to review the status of the data, variables and the overall status of the SSIS package. 
10 unique conditions exist for each breakpoint. 
Breakpoints are setup in BIDS. In BIDS, navigate to the control flow interface. Right click on the object where you want to set the breakpoint and select the 'Edit Breakpoints...' option. 


Question 3 - Can you name 5 or more of the native SSIS connection managers? 
OLEDB connection - Used to connect to any data source requiring an OLEDB connection (i.e., SQL Server 2000) 
Flat file connection - Used to make a connection to a single file in the File System. Required for reading information from a File System flat file 
ADO.Net connection - Uses the .Net Provider to make a connection to SQL Server 2005 or other connection exposed through managed code (like C#) in a custom task 
Analysis Services connection - Used to make a connection to an Analysis Services database or project. Required for the Analysis Services DDL Task and Analysis Services Processing Task 
File connection - Used to reference a file or folder. The options are to either use or create a file or folder 
Excel 
FTP 
HTTP 
MSMQ 
SMO 
SMTP 
SQLMobile 
WMI 


Question 4 - How do you eliminate quotes from being uploaded from a flat file to SQL Server? 
In the SSIS package on the Flat File Connection Manager Editor, enter quotes into the Text qualifier field then preview the data to ensure the quotes are not included. 
Additional information: How to strip out double quotes from an import file in SQL Server Integration Services
Question 5 - Can you name 5 or more of the main SSIS tool box widgets and their functionality? 
For Loop Container 
Foreach Loop Container 
Sequence Container 
ActiveX Script Task 
Analysis Services Execute DDL Task 
Analysis Services Processing Task 
Bulk Insert Task 
Data Flow Task 
Data Mining Query Task 
Execute DTS 2000 Package Task 
Execute Package Task 
Execute Process Task 
Execute SQL Task 
etc.

Question Difficulty = Difficult

Question 1 - Can you explain one approach to deploy an SSIS package? 
One option is to build a deployment manifest file in BIDS, then copy the directory to the applicable SQL Server then work through the steps of the package installation wizard 
A second option is using the dtutil utility to copy, paste, rename, delete an SSIS Package 
A third option is to login to SQL Server Integration Services via SQL Server Management Studio then navigate to the 'Stored Packages' folder then right click on the one of the children folders or an SSIS package to access the 'Import Packages...' or 'Export Packages...'option. 
A fourth option in BIDS is to navigate to File | Save Copy of Package and complete the interface. 



Question 2 - Can you explain how to setup a checkpoint file in SSIS? 
The following items need to be configured on the properties tab for SSIS package: 
CheckpointFileName - Specify the full path to the Checkpoint file that the package uses to save the value of package variables and log completed tasks. Rather than using a hard-coded path as shown above, it's a good idea to use an expression that concatenates a path defined in a package variable and the package name. 
CheckpointUsage - Determines if/how checkpoints are used. Choose from these options: Never (default), IfExists, or Always. Never indicates that you are not using Checkpoints. IfExists is the typical setting and implements the restart at the point of failure behavior. If a Checkpoint file is found it is used to restore package variable values and restart at the point of failure. If a Checkpoint file is not found the package starts execution with the first task. The Always choice raises an error if the Checkpoint file does not exist. 
SaveCheckpoints - Choose from these options: True or False (default). You must select True to implement the Checkpoint behavior. 

Question 3 - Can you explain different options for dynamic configurations in SSIS? 
Use an XML file 
Use custom variables 
Use a database per environment with the variables 
Use a centralized database with all variables 

Question 4 - How do you upgrade an SSIS Package? 
Depending on the complexity of the package, one or two techniques are typically used: 
Recode the package based on the functionality in SQL Server DTS 
Use the Migrate DTS 2000 Package wizard in BIDS then recode any portion of the package that is not accurate


Question 5 - Can you name five of the Perfmon counters for SSIS and the value they provide? 
SQLServer:SSIS Service 
SSIS Package Instances - Total number of simultaneous SSIS Packages running 
SQLServer:SSIS Pipeline 
BLOB bytes read - Total bytes read from binary large objects during the monitoring period. 
BLOB bytes written - Total bytes written to binary large objects during the monitoring period. 
BLOB files in use - Number of binary large objects files used during the data flow task during the monitoring period. 
Buffer memory - The amount of physical or virtual memory used by the data flow task during the monitoring period. 
Buffers in use - The number of buffers in use during the data flow task during the monitoring period. 
Buffers spooled - The number of buffers written to disk during the data flow task during the monitoring period. 
Flat buffer memory - The total number of blocks of memory in use by the data flow task during the monitoring period. 
Flat buffers in use - The number of blocks of memory in use by the data flow task at a point in time. 
Private buffer memory - The total amount of physical or virtual memory used by data transformation tasks in the data flow engine during the monitoring period. 
Private buffers in use - The number of blocks of memory in use by the transformations in the data flow task at a point in time. 
Rows read - Total number of input rows in use by the data flow task at a point in time. 
Rows written - Total number of output rows in use by the data flow task at a point in time.














1.              
2.             What is ETL?  (Video Answer)
3.             What is Business Intelligence ( BI) ?
4.             Which version of SSIS have you worked with? (Video Answer)
9.             What is the Engine Thread property of Data Flow Task?
17.          What is the difference between Checkpoint and Breakpoint in SSIS?
18.          Will my package run successfully by using SQL Server Agent if I have data viewers and Breakpoint enabled?
19.          What are different ways to execute your SSIS Package? Can I run a SSIS Package by using a Stored Procedure? (Video Answer)
20.          What types of deployment are available for a SSIS Package? Explain all. (Video Answer)
21.          What is the difference between Package Deployment and Project Deployment? (Video Answer)
23.          To run your SSIS Package, the Integration Services Server and SQL Server should be installed on the same server. Is this a correct statement?
24.          What are the different ways to run your SSIS package on a schedule? (Video Answer)
26.          How do you debug/troubleshoot your SSIS package? (Video Answer)
27.          Explain the important steps for performance tuning of your SSIS Package.
32.          What are Attunity Drivers and why do we use them with SSIS? (Video Answer)
36.          What is a TransactionOption Property at the Package level?  Is this property only available at the package level, or it is also available at the Container Level or Task level?
38.          You have an excel file and you want to clear the first cell of sheet 1, which task will you? (Answer : To perform this you have to use Script Task)
40.          You need to load 500 million records in a table; this table has cluster and non-cluster indexes already created. What will be your approach to load the data?
41.          What are Asynchronous and Synchronous data flow components?
42.          What are Non-blocking, Semi-blocking and Fully-blocking components in SSIS?
43.          What is Balanced Data Distributor( BDD)?
48.          If you need to watch a directory for a specific file to be added, which Task will you use?  (Video Answer)
49.          What is For-each Item Enumerator in For-each Loop Container?  Why would you use it? (Video Answer)
67.          What is the difference between Package Level Connection Manager and Project Level Connection Manager? (Video Answer)
74.          How would you make your SSIS Package dynamic?  (Video Answer)
75.          Sometime when you execute your SSIS Package, You get an error " File is used by another process". Why did you get this error? and How can you avoid this error? (Video Answer)
76.          How would you implement data validation in your SSIS Package?
77.          What are the best practices to test SSIS Package? 
78.          We can load data from one database tables to another database tables by using TSQL, Why do we use SSIS instead ?
92.          How would you create monthly archive folders in SSIS Package to archive your loaded files?  
(Video Answer)
94.          When you are working as SSIS developer , why it is important to know C# or Visual basis language? (Video Answer)
95.          Do you think every task should be done by using components of ETL Tool or better to use combination of SQL Objects ( such as Stored Procedure, Queries etc.) with components available in ETL Tool?  (Video Answer)
96.          You just joined a company as ETL Developer, You hear this term "Staging Database" from your ETL colleagues. What does this term mean to you?  (Video Answer)
97.          What is ODS Database? What it has to do with you as ETL Developer? 
98.          What is the difference between Full Load and Incremental Load? What is your approach to load data into Staging Database? (Video Answer)
99.          What techniques do you use to perform incremental Load in SSIS or as ETL developer? 
102.        How would you load multiple excel files with same structure to SQL Server Table and Archive them after adding datetime to them? (Video Answer)

Sources

1.       Name a few of the Sources available in SSIS.
2.       If we need to read an XML Source file, which Source will we use in SSIS?
3.       What problems have you faced when you used Excel Source?
4.       Why do we add IMEX=1 to extended properties of Excel Connection Manager? (Video Answer)
5.       If one of the Excel columns has alphanumeric data and another developer has changed the registry setting to TypeGuessRow=0, what does that mean to you?(Video Answer)
6.       How will you handle a column (such as comments) that has more than 255 characters in Excel at row number 1000.  (Video Answer)
7.       What is fast parse property in Flat File Source? What are the requirements and limitations to use this property?  (Video Answer)
8.       Can I read a csv file with Raw File Source?  If not, what type of files can I read by using Raw File Source? (Video Answer)
9.     Can we use Stored Procedure in OLE DB Source?  If yes, how do you map the parameters? (Video Answer)
10. If you need to connect to TeraData, Which Data Source will you use? 
(Video Answer)
11. 
How would you load fixed width flat file to SQL server table?  (Video Answer)


Transformations

1.       What is a SSIS Transformation? ( Video Answer)
2.       If you need to get the Username, Package Name and Package Start Time, which transformation will you use? (Video Answer)
4.       If you need to convert Data Type from String to Integer, which transformation will you use? (Video Answer)
5.       If you need to add a new column(s), which transformation will you use? (Video Answer)
8.       What is Lookup Transformation, and why do we use this transformation when we load Fact Table?
9.       What are three modes of Lookup Transformation, and what criteria should be used to choose the correct mode?
10.   Can we insert record in Lookup Table by using Lookup Transformation?
11.   Should you use a drop down to choose our reference table in Lookup?  Is this a good practice or not? (Video Answer)
16.   What alternative methods would you like to use for OLE DB Command Transformation? (Video Answer)
19.   Which transformation can be used as Source, Destination or Transformation? (Video Answer)
20.   What is Slowly Changing Dimension, and which transformation can you use to load SCD tables?
22.   How will you load an SCD2 type Table by using SSIS?
24.   Which Transformation requires us to use SORT Transformation with it? (Video Answer)
25.   What are the alternatives of SORT Transformation if our source is SQL Server and we need to use Merge Join for multiple sources?
26.   What is IsSorted Property, and why do we use it? (Video Answer)
28.   Let’s say we have some reference data in Excel, and we want to use that Excel data in Lookup Transformation.  How can we achieve that without loading that data into a staging table or temp table?
29.   What is Cache Transformation, and which transformation can use Cache Transformation-loaded data?  (Answer- Cache Transformation)
32.   What is the difference between Copy Column Transformation and Derived Column Transformation? (Video Answer)
33.   Let’s say my table contains images, and I am reading data from a table in Data Flow task. Which transformation can help me to save those images to files?
35.   I have a source file that contains 1000 records, and I want to insert 15% of those records in TableA and the remaining records in TableB.  Which transformation should I use? (Video Answer)
36.   What is the difference between Row Sampling and Percent Sampling Transformations? (Video Answer)
37.   Which transformation can be used to extract nouns only, noun phrases only, or both nouns and noun phases from a text-input column?
38.   How is Term Lookup different from Lookup Transformation?
39.   To Pivot or Unpivot input data, which transformations are available in SSIS?  (Pivot Answer) - Unpivot Answer
41.   If we have source S1 and Source S2, and we need to merge them so that we get sorted output, which transformation would we use? (Video Answer)
42.   What is the difference between Merge and Union All Transformation? (Video Answer)
43.   In Merge Join Transformation, we can use Inner Join, Left Join and Full Outer Join.  If I have to use Cross Join, which transformation would I use? (Video Answer)
44.   What is FindString function in Derived Column Transformation? (Video Answer)

Destinations

1.       What is Raw File Destination, where and why do we use it? (Video Answer)
2.       What is the difference between OLE DB Destination and SQL Server Destination? (Video Answer)
3.       What is Recordset Destination and where do we need use?  (Video Answer)
6.       What is "Rows Per batch" and "Maximum Insert Commit Size" mean to you in OLE DB Destination? How do we use them and why do we need to change values from default values?
7.       If there is trigger on a table and we are loading data in that table by using SSIS, Access Mode properties is set to “ Table or View-fast Load”  , Will trigger fire? If not, then which Mode (
Table or View,
Table or View-Fast Load,
Table Name or view name variable,
Table name or view name variable –fast load,
SQL Command) can we  choose to load data and Trigger get fire as well?  (Video Answer)
8. 
How would you create Fixed Width column Text file by using SSIS?  (Video Answer)

Variables and Expressions

1.       What is variable in SSIS, what are data types available for variable?
3.       What are expressions in SSIS? Where and Why do we need to use them? (Video Answer)
5.       Can we write expression on Connection Managers? If yes, where would you need to do that? (Video Answer)
6.       What are System Variables in SSIS? Name few of them those you have used. (Video Answer)
7.       Can we write expression on Precedence Constraint? if yes, how would you do it? (Video Answer)
10.   What are Parameters in SSIS 2012? how are they different from Variables?
11. Can you create two variables with same in name in SSIS Package? 
(Video Answer)


    Logging

3.       If you use SQL Server Logging, which table will be used to store log information? (Video Answer)
4.       What are the few column names that sysssislog have to store log information? (Video Answer)
5.       Do you log all the events or prefer to choose few of them? (Video Answer)
6.       Name few of Events you like to use in logging? (Video Answer)
7.       What is custom Logging and how is it different from built in Logging? (Video Answer)
  9.       If we are using SQL Server Logging, how often the records get deleted from syssislog table? or do we  have to create some purge process? (Video Answer)
10- Junior ETL developer stops to your desk and asked your suggest or best practice which Logging type  he/she should use? Explain which logging type in SSIS package you will suggest to him/her and why? 
(Video Answer)
11- How to Enable Windows Event Log type logging in SSIS Package and What's your suggestion, should we use it not ? 
(Video Answer)
 12- Is it possible to have two types of logging e.g. text file logging and SQL Server Logging in same SSIS  Package? 
(Video Answer)
13-Explain different methods to make the log information created by SSIS Packages available to your developers?
 (Video Answer)
 14- What type of logging create .trc file when you enable Logging in SSIS Package? 
(Video Answer)
 15- Can we create xml log file with datetime on each SSIS Package execution, How would you do that?      
(Video Answer) 
 16- Quick demo how to provide permission to non admin account on Event Viewer in Windows 2008R2/2012. This can be used when we enable Windows Event Log in SSIS Package and need to provide read permissions to developers to read Event Viewer entries for SSIS events. ( Video Answer)

Configuration

1.       What is Configuration in SQL Server Integration Services(SSIS)? (Video Answer)
2.       Why do we even need Configuration? (Video Answer)
3.       What are configuration types available for us to use? (Video Answer)
4.       What is the difference between Direct and Indirect Configuration? (Video Answer)
5.       If we use SQL Server Configuration, what is table name created by configuration Wizard? (Video Answer)
6.       How many columns configuration table do have? which one are important for us?
(Video Answer)
7.       For Indirect Configuration, do you create User variable or System variable to hold connection string?  (Video Answer)
8.       What is Parent Child Configuration and where do we need to use that? ( Video Answer)
10.   Let’s say we have SQL Server Connection Name ‘MyConnection’,  For configuration which properties of  Connection manager will you choose (Description,Name, ServerName,Scope,ConnectionString etc.) ? (Video Answer)


  Control Flow Task

4.       If your database in Full Recovery mode, can you use Bulk Insert Task to load data? What are the requirements to use Bulk Insert Task? (Video Answer)
6.       What is the difference between Script task and Script component? (Video Answer)
7.       What script languages are available for you to do scripting in Script task and script component? (Video Answer)
8.       What is the difference between ReadOnlyVariables and ReadWriteVariable in Script task? (Video Answer)
9.       Can you do debugging in Script task or Script component in SSIS 2012?
10.   You have create 5 packages, you want to call all of them in one package, which task can you use to call them? (Video Answer)
11.   You have deployed your SSIS Package to Integration Services Server, Can you still call them in Parent Package? (Video Answer)
13.   Which task can you use to delete, rename, move files and folders? (Video Answer)
16.   Before you create your SSIS Package and load data into destination, you want to analyze your data, which task will help you to achieve that?
17.   What is WSDL and in which task you have to use it?
18.   You have loaded email addresses in Object type variable, now you want to send email to each of the address you have in Object type variable, which task will you use to loop through and send email one by one? (Video Answer)
19.   There are 10 files in folder, we want to load them one by one, which tasks are required to perform this job? (Video Answer)
20.   You have a lot of Tasks sitting in Control Flow task , you want to group them, which container can help you to achieve that? (Video Answer)
21.   You got this assignment where you want to perform the same task 10 times, which loop will you use to do that? (Video Answer)
22.   Name few of the tasks those are available to create Maintenance plan or those can perform DBA tasks such as Create indexes, take backup etc. (Video Answer)
23.   In Execute SQL Task, what is Result Set ( Single Row, Full Result Set,XML) ? (Video Answer)
24. 
Which task would you use to execute dot batch files in SSIS? (Video Answer)
25. What type of Containers are available in SSIS Package? Explain two of them in detail. 
(Video Answer)
26. 
You need to create a directory with Date in SSIS Package, Which Task would you use that?  (Video Answer)


      Event Handlers

1.             What are Event Handlers? (Video Answer)
2.             What type of Tasks we can use in Event Handler pane? (Video Answer)
3.             What is the relationship between Executable and Event Handler in Event Handler Pane? (Video Answer)
4.             If your SSIS Package failed on any task, You want to send an email, how would you use the event handler to do this? (Video Answer)
5.             You have multiple Tasks in your SSIS Package such as Task1,Task2 and Task 3. If Task 1 or Task 2 fails, You want to send an email to users but if Task 3 fails then you want to truncate all the tables which are loaded by your SSIS Package, How would you do that? (Video Answer)
6.             We can run any tasks in Control Flow Pane on Failure of a Task/s by using Precedence constraint. What are the advantages of using Event Handler over Precedence Constraints? (Video Answer)

      Security

1.             What are the different ways to secure your SSIS Packages?
2.             Who should have the permission to access SSIS Packages in Production, While you were working as developer, Did you had permission to access packages in Production environment?
3.             What are different ways to store the Packages in different environments such as DEV,QA,UAT and Production?
4.             Your SSIS Package contains important information such as Server Name, Passwords etc. You want to set a password to protect this SSIS Package so every time it opens it ask for password. How would you do that?
5.             What are three integration services security roles available in SSIS 2008/R2?
6.             Which Package property will help you to encrypt package to restrict access to its contents?
SQL Server Integration Services Interview Questions and Answers
1.