SSIS Interview Questions
• What do we mean by dataflow in SSIS?
Data flow is nothing but the flow of data from the corresponding sources to the referred destinations. In this process, the data transformations make changes to the data to make it ready for the data warehouse.
• What is a breakpoint in SSIS? How is it setup? How do you disable it?
A breakpoint is a stopping point in the code. The breakpoint can give the Developer\DBA an
opportunity to review the status of the data, variables and the overall status of the SSIS package.
10 unique conditions exist for each breakpoint.
Breakpoints are setup in BIDS. In BIDS, navigate to the control flow interface. Right click on the
object where you want to set the breakpoint and select the ‘Edit Breakpoints…’ option.
• Can you name 5 or more of the native SSIS connection managers?
1) OLEDB connection – Used to connect to any data source requiring an OLEDB connection (i.e.,
SQL Server 2000)
2) Flat file connection – Used to make a connection to a single file in the File System. Required for reading information from a File System flat file
3) ADO.Net connection – Uses the .Net Provider to make a connection to SQL Server 2005 or other
connection exposed through managed code (like C#) in a custom task
4) Analysis Services connection – Used to make a connection to an Analysis Services database or project. Required for the Analysis Services DDL Task and Analysis Services Processing Task
5) File connection – Used to reference a file or folder. The options are to either use or create a file or folder
6) Excel
• What is the use of Bulk Insert Task in SSIS?
Bulk Insert Task is used to upload large amount of data from flat files into Sql Server. It supports only OLE DB connections for destination database.
• What is Conditional Split transformation in SSIS?
This is just like IF condition which checks for the given condition and based on the condition evaluation, the output will be sent to the appropriate OUTPUT path. It has ONE input and MANY outputs. Conditional Split transformation is used to send paths to different outputs based on some conditions. For example, we can organize the transform for the students in a class who have marks greater than 40 to one path and the students who score less than 40 to another path.
• How do you eliminate quotes from being uploaded from a flat file to SQL Server?
This can be done using TEXT QUALIFIER property. In the SSIS package on the Flat File Connection Manager Editor, enter quotes into the Text qualifier field then preview the data to ensure the quotes are not included.
• Can you explain how to setup a checkpoint file in SSIS?
The following items need to be configured on the properties tab for SSIS package:
CheckpointFileName – Specify the full path to the Checkpoint file that the package uses to save the value of package variables and log completed tasks. Rather than using a hard-coded path as shown above, it’s a good idea to use an expression that concatenates a path defined in a package variable and the package name.
CheckpointUsage – Determines if/how checkpoints are used. Choose from these options: Never(default), IfExists, or Always. Never indicates that you are not using Checkpoints. IfExists is the typical setting and implements the restart at the point of failure behavior. If a Checkpoint file is found it is used to restore package variable values and restart at the point of failure. If a Checkpoint file is not found the package starts execution with the first task. The Always choice raises an error if the Checkpoint file does not exist.
SaveCheckpoints – Choose from these options: True or False (default). You must select True to implement the Checkpoint behavior.
• What are the different values you can set for CheckpointUsage property ?
There are three values, which describe how a checkpoint file is used during package execution:
1) Never: The package will not use a checkpoint file and therefore will never restart.
2) If Exists: If a checkpoint file exists in the place you specified for the CheckpointFilename property, then it will be used, and the package will restart according to the checkpoints written.
3) Always: The package will always use a checkpoint file to restart, and if one does not exist, the package will fail.
• What is the ONLY Property you need to set on TASKS in order to configure CHECKPOINTS to RESTART package from failure?
The one property you have to set on the task is FailPackageOnFailure. This must be set for each task or container that you want to be the point for a checkpoint and restart. If you do not set this property to true and the task fails, no file will be written, and the next time you invoke the package, it will start from the beginning again.
• Where can we set the CHECKPOINTS, in DataFlow or ControlFlow ?
Checkpoints only happen at the Control Flow; it is not possible to checkpoint transformations or restart inside a Data Flow. The Data Flow Task can be a checkpoint, but it is treated as any other task.
• Can you explain different options for dynamic configurations in SSIS?
1) XML file
2) custom variables
3) Database per environment with the variables
4) Use a centralized database with all variables
• What is the use of Percentage Sampling transformation in SSIS?
Percentage Sampling transformation is generally used for data mining. This transformation builds a random sample of set of output rows by choosing specified percentage of input rows. For example if the input has 1000 rows and if I specify 10 as percentage sample then the transformation returns 10% of the RANDOM records from the input data.
• What is the use of Term Extraction transformation in SSIS?
Term Extraction transformation is used to extract nouns or noun phrases or both noun and noun phrases only from English text. It extracts terms from text in a transformation input column and then writes the terms to a transformation output column. It can be also used to find out the content of a dataset.
• What is Data Viewer and what are the different types of Data Viewers in SSIS?
A Data Viewer allows viewing data at a point of time at runtime. If data viewer is placed before and after the Aggregate transform, we can see data flowing to the transformation at the runtime and how it looks like after the transformation occurred. The different types of data viewers are:
1. Grid
2. Histogram
3. Scatter Plot
4. Column Chart.
• What is Ignore Failure option in SSIS?
In Ignore Failure option, the error will be ignored and the data row will be directed to continue on the next transformation. Let’s say you have some JUNK data(wrong type of data or JUNK data) flowing from source, then using this option in SSIS we can REDIRECT the junk data records to another transformation instead of FAILING the package. This helps to MOVE only valid data to destination and JUNK can be captured into separate file.
• Which are the different types of Control Flow components in SSIS?
The different types of Control Flow components are: Data Flow Tasks, SQL Server Tasks, Data Preparation Tasks, Work flow Tasks, Scripting Tasks, Analysis Services Tasks, Maintenance Tasks, Containers.
• What are containers? What are the different types of containers in SSIS?
Containers are objects that provide structures to packages and extra functionality to tasks. There are four types of containers in SSIS, they are: Foreach Loop Container, For Loop Container, Sequence Container and Task Host Container.
• What are the different types of Data flow components in SSIS?
There are 3 data flow components in SSIS.
1. Sources
2. Transformations
3. Destinations
• What are the different types of data sources available in SSIS?
There are 7 types of data sources provided by SSIS: a.) Data Reader source b.) Excel source c.) Flat file source d.) OLEDB source e.) Raw file source f.) XML source g.) Script component
• What is SSIS Designer?
It is a graphical tool for creating packages. It has 4 tabs: Control Flow, Data Flow, Event Handlers and Package Explorer.
• What is Control Flow tab?
It is the tab in SSIS designer where various Tasks can be arranged and configured. This is the tab where we provide and control the program flow of the project.
• What is Data Flow tab?
This is the tab where we do all the work related to ETL job. It is the tab in SSIS Designer where we can extract data from sources, transform the data and then load them into destinations.
• What is the function of control flow tab in SSIS?
On the control flow tab, the tasks including dataflow task, containers and precedence constraints that connect containers and tasks can be arranged and configured.
• What is the function of Event handlers tab in SSIS?
On the Event handlers tab, workflows can be configured to respond to package events.
For example, we can configure Work Flow when ANY task Failes or Stops or Starts ..
• What is the function of Package explorer tab in SSIS?
This tab provides an explorer view of the package. You can see what is happening in the package. The Package is a container at the top of the hierarchy.
• What is Solution Explorer?
It is a place in SSIS Designer where all the projects, Data Sources, Data Source Views and other miscellaneous files can be viewed and accessed for modification.
• How do we convert data type in SSIS?
The Data Conversion Transformation in SSIS converts the data type of an input column to a different data type.
• How are variables useful in ssis package?
Variables can provide communication among objects in the package. Variables can provide communication between parent and child packages. Variables can also be used in expressions and scripts. This helps in providing dynamic values to tasks.
• Explain Aggregate Transformation in SSIS?
It aggregates data, similar you do in applying TSQL functions like Group By, Min, Max, Avg, and Count. For example you get total quantity and Total line item for each product in Aggregate Transformation Editor. First you determine input columns, then output column name in Output Alias table in datagrid, and also operations for each Output Alias in Operation columns of the same datagrid. Some of operation functions listed below :
• Group By
• Average
• Count
• Count Distinct : count distinct and non null column value
• Min, Max, Sum
In Advanced tab, you can do some optimization here, such as setting up Key Scale option (low, medium, high), Count Distinct scale option (low, medium, high), Auto Extend factor and Warn On Division By Zero. If you check Warn On Division By Zero, the component will give warning instead of error. Key Scale option will optimize transformation cache to certain number of key threshold. If you set it low, optimization will target to 500,000 keys written to cache, medium can handle up to 5 million keys, and high can handle up to 25 million keys, or you can specify particular number of keys here. Default value is unspecified. Similar to number of keys for Count Distinct scale option. It is used to optimize number of distinct value written to memory, default value is unspecified. Auto Extend Factor is used when you want some portion of memory is used for this component. Default value is 25% of memory.
• Explain Audit Transformation ?
It allows you to add auditing information as required in auditing world specified by HIPPA and Sarbanes-Oxley (SOX). Auditing options that you can add to transformed data through this transformation are :
1. Execution of Instance GUID : ID of execution instance of the package
2. PackageID : ID of the package
3. PackageName
4. VersionID : GUID version of the package
5. Execution StartTime
6. MachineName
7. UserName
8. TaskName
9. TaskID : uniqueidentifier type of the data flow task that contains audit transformation.
• Explain Character Map Transformation?
It transforms some character. It gives options whether output result will override the existing column or add to new column. If you define it as new column, specify new column name. Operations available here are:
1. Uppercase
2. Lowercase
3. Byte reversal : such as from 0×1234 to 0×4321
4. Full width
5. Half width
6. Hiragana/katakana/traditional Chinese/simplified Chinese
7. Linguistic casing
• Explain Conditional split Transformation ?
It functions as if…then…else construct. It enables send input data to a satisfied conditional branch. For example you want to split product quantity between less than 500 and greater or equal to 500. You can give the conditional a name that easily identifies its purpose. Else section will be covered in Default Output Column name.
After you configure the component, it connect to subsequent transformation/destination, when connected, it pops up dialog box to let you choose which conditional options will apply to the destination transformation/destination.
• Explain Copy column Transformation?
This component simply copies a column to another new column. Just like ALIAS Column in T-Sql.
• Explain Data conversion Transformation?
This component does conversion data type, similar to TSQL function CAST or CONVERT. If you wish to convery the data from one type to another then this is the best bet. But please make sure that you have COMPATABLE data in the column.
• Explain Data Mining query Transformation?
This component does prediction on the data or fills gap on it. Some good scenarios uses this component is:
1. Take some input columns as number of children, domestic income, and marital income to predict whether someone owns a house or not.
2. Take prediction what a customer would buy based analysis buying pattern on their shopping cart.
3. Filling blank data or default values when customer doesn’t fill some items in the questionnaire.
• Explain Derived column Transformation?
Derived column creates new column or put manipulation of several columns into new column. You can directly copy existing or create a new column using more than one column also.
• Explain Merge Transformation?
Merge transformation merges two paths into single path. It is useful when you want to break out data into path that handles errors after the errors are handled, the data are merge back into downstream or you want to merge 2 data sources. It is similar with Union All transformation, but Merge has some restrictions :
1. Data should be in sorted order
2. Data type , data length and other meta data attribute must be similar before merged.
• Explain Merge Join Transformation?
Merge Join transformation will merge output from 2 inputs and doing INNER or OUTER join on the data. But if you the data come from 1 OLEDB data source, it is better you join through SQL query rather than using Merge Join transformation. Merge Join is intended to join 2 different data source.
• Explain Multicast Transformation?
This transformation sends output to multiple output paths with no conditional as Conditional Split does. Takes ONE Input and makes the COPY of data and passes the same data through many outputs. In simple Give one input and take many outputs of the same data.
• Explain Percentage and row sampling Transformations?
This transformation will take data from source and randomly sampling data. It gives you 2 outputs. First is selected data and second one is unselected data. It is used in situation where you train data mining model. These two are used to take the SAMPLE of data from the input data.
• Explain Sort Transformation?
This component will sort data, similar in TSQL command ORDER BY. Some transformations need sorted data.
• Explain Union all Transformation?
It works in opposite way to Merge transformation. It can take output from more than 2 input paths and combines into single output path.
• What r the possible locations to save SSIS package?
You can save a package wherever you want.
SQL Server
Package Store
File System
• What is a package?
A discrete executable unit of work composed of a collection of control flow and other objects, including data sources, transformations, process sequence, and rules, errors and event handling, and data destinations.
• What is a workflow in SSIS?
A workflow is a set of instructions on how to execute tasks.
(It is a set of instructions on how to execute tasks such as sessions, emails and shell commands. a workflow is created form work flow mgr.
• What is the diff between control flow Items and data flow Items?
The control flow is the highest level control process. It allows you to manage the run-time process activities of data flow and other processes within a package.
When we want to extract, transform and load data within a package. You add an SSIS dataflow task to the package control flow.
• What are the main component of SSIS(project-architecture)?
SSIS archItecture has 4 main components
1.ssis service
2.ssis runtime engine & runtime executables
3.ssis dataflow engine & dataflow components
4.ssis clients
• Different components in SSIS package?
Control flow
Data flow
Event handler
Package explorer
• What are Connection Managers?
It is a bridge b/w package object and physical data. It provides logical representation of a connection at design time the properties of the connection mgr describes the physical connection that integration services creates when the package is run.
• What is environment variable in SSIS?
An environment variable configuration sets a package property equal to the value in an environment variable.
Environmental configurations are useful for configuring properties that are dependent on the computer that is executing the package.
• How to provide securIty to packages?
We can provide security in two ways
1. Package encryption
2. Password protection.
• What are Precedence constraints?
Constraints that link executable, container, and tasks wIthin the package control flow and specify condItion that determine the sequence and condItions for determine whether executable run.
• What is Design time Deployment in SSIS ?
When you run a package from with in BIDS,it is built and temporarily deployed to the folder. By default the package will be deployed to the BIN folder in the Package’s Project folder and you can configure for custom folder for deployment. When the Package’s execution is completed and stopped in BIDS,the deployed package will be deleted and this is called as Design Time Deployment.
What are the TrasactionOptions in SSIS
•Required - if a transaction exists join it else start a new one
•Supported - if a transaction exists join it (this is the default)
•NotSupported - do not join an existing transaction
Data flow Transformation categories in SSIS
Transformations are defined as a core component in the data flow of a package in SSIS. It is that part of the data flow to which we apply our business logic to manipulate and modify the input data into the required format beforeloading it to the destination. All the Data flow transformations are broadly classified into 2 types:-
Type 1 – Synchronous Transformations.
Type 2 – Asynchronous Transformations.
What is the difference between Synchronous and Asynchronous transformations?
Synchronous Transformations
Asynchronous Transformations
Processes each incoming row, modifies according to the required format and forward it.
Stores all the rows into the memory before it begins the process of modifying input data to the required output format.
No. of input rows = No. of output rows.
No. of input rows != No. of output rows
Output rows are in sync with Input rows i.e. 1:1 relationship.
Output rows are not in sync with Input rows
Less memory is required as they work on row by row basis.
More memory is required to store the whole data set as input and output buffers do not use the same memory.
Does not block the data flow in the pipeline.
Are also known as “Blocking Transformations” as they block the data flow in the pipeline until all the input rows are read into the memory.
Runs quite faster due to less memory required.
Runs generally slow as memory requirement is very high.
E.g. – Data Conversion Transformation- Input rows flow into the memory buffers and the same buffers come out in the required data format as Output.
E.g. – Sort Transformation- where the component has to process the complete set of rows in a single operation.
Further Asynchronous transformations are divided into 2 categories:-
Partially blocking transformations creates new memory buffers for the output of the transformation such as theUnion All transformation.
Fully blocking transformations performs the same operation but cause a full block of the data such as the Sort and Aggregate Transformations.
Non-Blocking transformations
Semi-blocking transformations
Blocking transformations
Audit
Data Mining Query
Aggregate
Character Map
Merge
Fuzzy Grouping
Conditional Split
Merge Join
Fuzzy Lookup
Copy Column
Pivot
Row Sampling
Data Conversion
Unpivot
Sort
Derived Column
Term Lookup
Term Extraction
Lookup
Union All
Multicast
Percent Sampling
Row Count
Script Component
Export Column
Import Column
Slowly Changing Dimension
OLE DB Command
1. What is BI?
Business
intelligence is a collection of skills, technologies and practices, by which we
can gather data, analyze data and provide data for forecasting and better
decision making.
Business
intelligence (BI) is a broad category of applications and technologies for
gathering, storing, analyzing, and providing access to data to help enterprise
users make better business decisions.
Collect data: - Data
in an enterprise can be stored in various formats. Now these formats can vary
from normalized structured RDBMS to excel sheets or probably unstructured file
formats. So the first step in BI is to collect all these unstructured and
scattered data and bring them in to one uniform format.
This can be achieved in 3 steps:-
1.
Extract: - In this step we read unorganized data from
these sources by understanding their data structure.
2.
Transform: - In this step we transform the data in to a
standard format.
3.
Load: - Finally we load the standard format in to a data
warehouse and data mart (In the next part we have explained what data
warehouse, data mart and OLAP?).
Analyze data: - Once
the data is loaded in to Data ware house, you run tools, algorithms so that you
can analyze and forecast information.
View data: - Once you have analyzed the data you would like view
it. Now again how people want to view data can vary from simple tabular format
to complex graphical chart. So in this section we would need good reporting
tools to achieve the same.
2. What is MS BI?
3. What is SSIS?
SQL
Server Integration Services (SSIS) is a component of SQL Server which can be
used to perform a wide range of Data Migration and ETL operations. SSIS is a
component in MSBI process of SQL Server.
This is a platform for Integration and Workflow applications. It is known for a fast and flexible OLTP and OLAP extensions used for data extraction, transformation, and loading (ETL). The tool may also be used to automate maintenance of SQL Server databases and multidimensional data sets.
This is a platform for Integration and Workflow applications. It is known for a fast and flexible OLTP and OLAP extensions used for data extraction, transformation, and loading (ETL). The tool may also be used to automate maintenance of SQL Server databases and multidimensional data sets.
4. What is
ETL?
5. What is a
package?
A discrete executable unit of work composed of a collection of control flow and other objects, including data sources, transformations, process sequence, and rules, errors and event handling, and data destinations.
A discrete executable unit of work composed of a collection of control flow and other objects, including data sources, transformations, process sequence, and rules, errors and event handling, and data destinations.
6. What are
the main components of SSIS (project-architecture)?
SSIS architecture has 4 main components
1.SSIS service
2.SSIS runtime engine & runtime executables
3.SSIS dataflow engine & dataflow components
4.SSIS clients’
SSIS architecture has 4 main components
1.SSIS service
2.SSIS runtime engine & runtime executables
3.SSIS dataflow engine & dataflow components
4.SSIS clients’
7.
What is the Data Flow Engine?
The Data Flow Engine, also called
the SSIS pipeline engine, is responsible for managing the flow of data from the
source to the destination and performing transformations (lookups, data
cleansing etc.). Data flow uses memory oriented architecture, called buffers,
during the data flow and transformations which allows it to execute extremely
fast. This means the SSIS pipeline engine pulls data from the source, stores it
in buffers (in-memory), does the requested transformations in the buffers and
writes to the destination. The benefit is that it provides the fastest
transformation as it happens in memory and we don't need to stage the data for
transformations in most cases.
8. Different
components in SSIS package?
1. Control flow
2.data flow
3.event handler
4.package explorer
1. Control flow
2.data flow
3.event handler
4.package explorer
9.
What are the tools associated with SSIS?
We use Business Intelligence Development Studio (BIDS) and SQL Server Management Studio (SSMS) to work with Development of SSIS Projects.
We use SSMS to manage the SSIS Packages and Projects.
We use Business Intelligence Development Studio (BIDS) and SQL Server Management Studio (SSMS) to work with Development of SSIS Projects.
We use SSMS to manage the SSIS Packages and Projects.
10. 2. What are the Features of SSIS?
1) Improved Scripting with help of VSTA scripting engine
2) Ehnanced ADO.NET wizard
3) Import/Export Wizard used for transfering data from Excel to table and from one table to another table.
1) Improved Scripting with help of VSTA scripting engine
2) Ehnanced ADO.NET wizard
3) Import/Export Wizard used for transfering data from Excel to table and from one table to another table.
11. What is SSIS? How it is related with
SQL Server.
SQL Server Integration Services (SSIS) is a component of SQL Server which can be used to perform a wide range of Data Migration and ETL operations. SSIS is a component in MSBI process of SQL Server.
This is a platform for Integration and Workflow applications. It is known for a fast and flexible OLTP and OLAP extensions used for data extraction, transformation, and loading (ETL). The tool may also be used to automate maintenance of SQL Server databases and multidimensional data sets.
SQL Server Integration Services (SSIS) is a component of SQL Server which can be used to perform a wide range of Data Migration and ETL operations. SSIS is a component in MSBI process of SQL Server.
This is a platform for Integration and Workflow applications. It is known for a fast and flexible OLTP and OLAP extensions used for data extraction, transformation, and loading (ETL). The tool may also be used to automate maintenance of SQL Server databases and multidimensional data sets.
12. What are the differences between DTS and SSIS
Data Transformation
Services
|
SQL Server
Integration Services
|
Limited Error
Handling
|
Complex and
powerful Error Handling
|
Message Boxes in
ActiveX Scripts
|
Message Boxes in
.NET Scripting
|
No Deployment
Wizard
|
Interactive
Deployment Wizard
|
Limited Set of
Transformation
|
Good number of
Transformations
|
NO BI functionality
|
Complete BI
Integration
|
13. What is a workflow in SSIS 2014?
Workflow is a set of instructions on to specify the Program Executor on how to execute tasks and containers within SSIS Packages.
Workflow is a set of instructions on to specify the Program Executor on how to execute tasks and containers within SSIS Packages.
14. What is a Task?
A task is very much like a method of any
programming language which represents or carries out an individual unit of
work. There are broadly two categories of tasks in SSIS, Control Flow tasks and
Database Maintenance tasks. All Control Flow tasks are operational in nature
except Data Flow tasks. Although there are around 30 control flow tasks which
you can use in your package you can also develop your own custom tasks with
your choice of .NET programming language
15. What is the control flow?
A control flow consists of one or more tasks and
containers that execute when the package runs. To control order or define the
conditions for running the next task or container in the package control flow,
we use precedence constraints to connect the tasks and containers in a package.
A subset of tasks and containers can also be grouped and run repeatedly as a
unit within the package control flow. SQL Server Integration Services (SSIS)
provides three different types of control flow elements: Containers that
provide structures in packages, Tasks that provide functionality, and
Precedence Constraints that connect the executables, containers, and tasks into
an ordered control flow.
16. What is the dataflow?
Data flow consists of the sources and destinations that extract and load data, the transformations that modify and extend data, and the paths that link sources, transformations, and destinations The Data Flow task is the executable within the SSIS package that creates, orders, and runs the data flow. A separate instance of the data flow engine is opened for each Data Flow task in a package. Data Sources, Transformations, and Data Destinations are the three important categories in the Data Flow.
Data flow consists of the sources and destinations that extract and load data, the transformations that modify and extend data, and the paths that link sources, transformations, and destinations The Data Flow task is the executable within the SSIS package that creates, orders, and runs the data flow. A separate instance of the data flow engine is opened for each Data Flow task in a package. Data Sources, Transformations, and Data Destinations are the three important categories in the Data Flow.
17. Difference between
control flow and data flow
Control flow deals with orderly processing of
individual, isolated tasks, these tasks are linked through precedence
constraints in random order. Also the output for task has finite outcome i.e.,
Success, Failure, or Completion. A subsequent task does not initiate unless its
predecessor has completed. Data flow, on the other hand, streams the data in
pipeline manner from its source to a destination and modifying it in between by
applying transformations. Another distinction between them is the absence of a
mechanism that would allow direct transfer of data between individual control
flow tasks. On the other hand, data flow lacks nesting capabilities provided by
containers.
Control Flow
|
Data Flow
|
|
Process Oriented
|
Data Oriented
|
|
Made up of
|
Tasks and Container
|
Source, Transformation and Destination
|
Connected through
|
Precedence constraint
|
Paths
|
Smallest unit
|
Task
|
Component
|
Outcome
|
Finite- Success, Failure, Completion
|
Not fixed
|
18. What is the Difference
between control flow Items and data flow Items?
the control flow is the highest level control process. It allows you to manage the run-time process the run time process activities of data flow and other processes within a package.
When we want to extract, transform and load data within a package. You add an SSIS dataflow Task to the package control flow.
the control flow is the highest level control process. It allows you to manage the run-time process the run time process activities of data flow and other processes within a package.
When we want to extract, transform and load data within a package. You add an SSIS dataflow Task to the package control flow.
19. How does Error-Handling work in SSIS
When a data flow component applies a transformation to column data, extracts data from sources, or loads data into destinations, errors can occur. Errors frequently occur because of unexpected data values.
Type of typical Errors in SSIS:
-Data Connection Errors, which occur in case the connection manager cannot be initialized with the connection string. This applies to both Data Sources and Data Destinations along with Control Flows that use the Connection Strings.
-Data Transformation Errors, which occur while data is being transformed over a Data Pipeline from Source to Destination.
-Expression Evaluation errors, which occur if expressions that are evaluated at run time perform invalid
When a data flow component applies a transformation to column data, extracts data from sources, or loads data into destinations, errors can occur. Errors frequently occur because of unexpected data values.
Type of typical Errors in SSIS:
-Data Connection Errors, which occur in case the connection manager cannot be initialized with the connection string. This applies to both Data Sources and Data Destinations along with Control Flows that use the Connection Strings.
-Data Transformation Errors, which occur while data is being transformed over a Data Pipeline from Source to Destination.
-Expression Evaluation errors, which occur if expressions that are evaluated at run time perform invalid
20. What is environment variable in SSIS?
An environment variable configuration sets a package property equal to the value in an environment variable.
Environmental configurations are useful for configuring properties that are dependent on the computer that is executing the package.
An environment variable configuration sets a package property equal to the value in an environment variable.
Environmental configurations are useful for configuring properties that are dependent on the computer that is executing the package.
21. Containers: provide
structure and scope to your package
Types of containers:
i. Task host container: the Task host container services a single Task.
ii. Sequence container: It can handle the flow of subset of a package and can help you drive a package into smaller more manageable process.
Uses:-
Types of containers:
i. Task host container: the Task host container services a single Task.
ii. Sequence container: It can handle the flow of subset of a package and can help you drive a package into smaller more manageable process.
Uses:-
1. Grouping
Tasks so that you can disable a part of the package that no longer needed.
2. Narrowing the scope of the variable to a container.
3. Managing the property of multiple Tasks in one step by setting the properties of the container.
iii. For loop container: evaluates an expression and repeats Its workflow until the expression evaluates to false.
iv. For each loop container: defines a control flow repeatedly by using an enumerator.
For each loop container repeats the control flow for each member of a specified enumerator.
2. Narrowing the scope of the variable to a container.
3. Managing the property of multiple Tasks in one step by setting the properties of the container.
iii. For loop container: evaluates an expression and repeats Its workflow until the expression evaluates to false.
iv. For each loop container: defines a control flow repeatedly by using an enumerator.
For each loop container repeats the control flow for each member of a specified enumerator.
22.
Tasks: It provides
the functionality to your package.
It is an
individual unit of work.
23.
Event handler: It responds to
raised events in your package.
24.
Precedence constraints: It provides
ordinal relationship b/w various Items in your package.
25.
How to deploy the package?
To deploy the package first we need to configure some properties.
To deploy the package first we need to configure some properties.
Ø Go to project
tab->package properties->we get a window, configure deployment Utility as
"True"
Ø Specify the
path as "bin/deployment"
26.
Connection manager:
It is a bridge b/w package object and physical data. It provides logical representation of a connection at design time the properties of the connection mgr describes the physical connection that integration services creates when the package is run.
It is a bridge b/w package object and physical data. It provides logical representation of a connection at design time the properties of the connection mgr describes the physical connection that integration services creates when the package is run.
27.
Tell the Utility to execute (run) the
package?
a) In BIDS a package that can be executed in debug mode by using the debug menu or toolbar or from solution explorer.
In production, the package can be executed from the command line or from Microsoft windows Utility or It can be scheduled for automated execution by using the SQL server agent.
i). Go to->debug menu and select the start debugging button
ii).press F5 key
iii).right click the package and choose execute package.
iv).command prompts utilities
a) In BIDS a package that can be executed in debug mode by using the debug menu or toolbar or from solution explorer.
In production, the package can be executed from the command line or from Microsoft windows Utility or It can be scheduled for automated execution by using the SQL server agent.
i). Go to->debug menu and select the start debugging button
ii).press F5 key
iii).right click the package and choose execute package.
iv).command prompts utilities
a).DTExecUI
1. To open command prompt->run->type dtexecui->press enter
2. The execute package Utility dialog box opens.
3. in that click execute to run the package.
Wait until the package has executed successfully.
b).DTExec Utility
1.open the command prompt window.
2. Command prompt window->type dtexec /followed by the DTS, SQL, or file option and the package path, including package name.
3. If the package encryption level is encrypting sensitive with password or encrypt all with password, use the decrypt option to provide the password.
If no password is included, dtexec will prompt you for the password.
4. Optionally, provide additional command-line options
5. Press enter.
6. Optionally, view logging and reporting information before closing the command prompt window.
The execute package Utility dialog box opens.
7. In the execute package Utility dialog box, click execute package.
Wait until the package has executed successfully.
v).using SQL server mgmt studio to execute package
1. In SSMS right click a package, and then click run package.
Execute package Utility opens.
2. Execute the package as described previously.
28.
How can u design SCD in SSIS?
a) Def:-SCD explains how to capture the changes over the period of time.
This is also known as change data capture.
type1: It keeps the most recent values in the target. It does not maintain the history.
type2: It keeps the full history in the target database. For every update in the source a new record is inserted in the target.
type3: It keeps current & previous information in the target.
a) Def:-SCD explains how to capture the changes over the period of time.
This is also known as change data capture.
type1: It keeps the most recent values in the target. It does not maintain the history.
type2: It keeps the full history in the target database. For every update in the source a new record is inserted in the target.
type3: It keeps current & previous information in the target.
29.
How can u handle the errors through the help of
logging in SSIS?
a) To create an on error event handler to which you add the log error execute SQL Task.
a) To create an on error event handler to which you add the log error execute SQL Task.
30.
What is a log file and how to send log file to mgr?
a) It is especially useful when the package has been deployed to the production environment, and you cannot use BIDS and VSA to debug the package.
SSIS enables you to implement logging code through the Dts. Log method.
When the Dts. Log method is called in the script, the SSIS engine will route the message to the log providers that are configured in the containing package.
a) It is especially useful when the package has been deployed to the production environment, and you cannot use BIDS and VSA to debug the package.
SSIS enables you to implement logging code through the Dts. Log method.
When the Dts. Log method is called in the script, the SSIS engine will route the message to the log providers that are configured in the containing package.
31.
What is environment variable in SSIS?
a) An environment variable configuration sets a package property equal to the value in an environment variable.
Environmental configurations are useful for configuring properties that are dependent on the computer that is executing the package.
a) An environment variable configuration sets a package property equal to the value in an environment variable.
Environmental configurations are useful for configuring properties that are dependent on the computer that is executing the package.
32.
About multiple configurations?
a) It means including the xml configuration, environment variable, registry entry, parent package variable, SQL Server table, and direct and indirect configuration types.
a) It means including the xml configuration, environment variable, registry entry, parent package variable, SQL Server table, and direct and indirect configuration types.
33.
How to provide security to packages?
a) In two ways
1. Package encryption
2. Password protection.
a) In two ways
1. Package encryption
2. Password protection.
34.
As per error handling in T/R,
which one handle the better performance? Like fail component, redirect row or
ignore failure?
a) Redirect row provides better performance for error handling.
a) Redirect row provides better performance for error handling.
35.
Staging area??
a) It is a temporary data storage location. Where various data T/R activities take place.
a) It is a temporary data storage location. Where various data T/R activities take place.
Staging area is a kitchen
of data warehouse.
36.
Task??
a) An individual unit of work.
a) An individual unit of work.
Types:-
1. Active x script Task
2. Analysis services execute DDL Task *
3. Analysis services processing Task *
4. Bulk insert Task *
5. Data flow Task *
6. Data mining query Task
7. Execute Dts 2000 package Task
8. Execute package Task *
9. Execute process Task
10. Execute SQL Task *
11. File system Task *
12. Ftp Task
13. Message queue Task
14. Script Task *
15. Send mail Task *
16. Web service Task
17. Wmi data reader Task
18. Wmi event Task
19. Xml Task
2. Analysis services execute DDL Task *
3. Analysis services processing Task *
4. Bulk insert Task *
5. Data flow Task *
6. Data mining query Task
7. Execute Dts 2000 package Task
8. Execute package Task *
9. Execute process Task
10. Execute SQL Task *
11. File system Task *
12. Ftp Task
13. Message queue Task
14. Script Task *
15. Send mail Task *
16. Web service Task
17. Wmi data reader Task
18. Wmi event Task
19. Xml Task
37.
Event handler & logging?
Even handler is the mechanism to raise a event based on specific scenario. For example if there is any failure in data load it will notify thru email or entry in error table etc.
Even handler is the mechanism to raise a event based on specific scenario. For example if there is any failure in data load it will notify thru email or entry in error table etc.
Logging can be done based
on event, in SSIS there are 12 events that can be logged at Task or package
level. You can enable partial logging for one Task and enable much more
detailed logging for billing Tasks.
Example:-
Example:-
On error; On post validate;
On progress; On warning
In SSIS different type of logging mechanism are there:
In SSIS different type of logging mechanism are there:
SQL profiler
Text files
SQL server
Window event log
Xml file
Text files
SQL server
Window event log
Xml file
38.
Import & export wizard?
a) Easiest method to move data from sources like oracle, db2, SQL server.
Right click on database name->go to Task->import and export wizard
Select the source
Select the destination
Query copy of tables
Execute
Finish
a) Easiest method to move data from sources like oracle, db2, SQL server.
Right click on database name->go to Task->import and export wizard
Select the source
Select the destination
Query copy of tables
Execute
Finish
39.
What solution explorer?
Once you creating project with specific project name then if you want to add
data source/-data source views/packages/ miscellaneous; then this window will help to organize different files under one structure.
Once you creating project with specific project name then if you want to add
data source/-data source views/packages/ miscellaneous; then this window will help to organize different files under one structure.
40.
Precedence constraints?
a) Constraints that link executable, container, and Tasks within the package control flow and specify condition that determine the sequence
and conditions for determine whether executable run.
a) Constraints that link executable, container, and Tasks within the package control flow and specify condition that determine the sequence
and conditions for determine whether executable run.
41. What do we mean by dataflow in
SSIS?
Data flow is nothing but the flow of data from the corresponding sources to the referred destinations. In this process, the data transformations make changes to the data to make it ready for the data warehouse.
Data flow is nothing but the flow of data from the corresponding sources to the referred destinations. In this process, the data transformations make changes to the data to make it ready for the data warehouse.
42. What is a breakpoint in SSIS?
How is it setup? How do you disable it?
A breakpoint is a stopping point in the code. The breakpoint can give the Developer\DBA an
opportunity to review the status of the data, variables and the overall status of the SSIS package.
10 unique conditions exist for each breakpoint.
Breakpoints are setup in BIDS. In BIDS, navigate to the control flow interface. Right click on the
object where you want to set the breakpoint and select the ‘Edit Breakpoints…’ option.
A breakpoint is a stopping point in the code. The breakpoint can give the Developer\DBA an
opportunity to review the status of the data, variables and the overall status of the SSIS package.
10 unique conditions exist for each breakpoint.
Breakpoints are setup in BIDS. In BIDS, navigate to the control flow interface. Right click on the
object where you want to set the breakpoint and select the ‘Edit Breakpoints…’ option.
43. Can you name 5 or more of the
native SSIS connection managers?
1) OLEDB connection – Used to connect to
any data source requiring an OLEDB connection (i.e.,
SQL Server 2000)
2) Flat file connection – Used to make a connection to a single file in the File System. Required for reading information from a File System flat file
3) ADO.Net connection – Uses the .Net Provider to make a connection to SQL Server 2005 or other
connection exposed through managed code (like C#) in a custom task
4) Analysis Services connection – Used to make a connection to an Analysis Services database or project. Required for the Analysis Services DDL Task and Analysis Services Processing Task
5) File connection – Used to reference a file or folder. The options are to either use or create a file or folder
6) Excel
SQL Server 2000)
2) Flat file connection – Used to make a connection to a single file in the File System. Required for reading information from a File System flat file
3) ADO.Net connection – Uses the .Net Provider to make a connection to SQL Server 2005 or other
connection exposed through managed code (like C#) in a custom task
4) Analysis Services connection – Used to make a connection to an Analysis Services database or project. Required for the Analysis Services DDL Task and Analysis Services Processing Task
5) File connection – Used to reference a file or folder. The options are to either use or create a file or folder
6) Excel
44. What is the use of Bulk Insert
Task in SSIS?
Bulk Insert Task is used to upload large amount of data from flat files into Sql Server. It supports only OLE DB connections for destination database.
Bulk Insert Task is used to upload large amount of data from flat files into Sql Server. It supports only OLE DB connections for destination database.
45. Can you explain how to setup a
checkpoint file in SSIS?
The following items need to be configured on the properties tab for SSIS package:
CheckpointFileName – Specify the full path to the Checkpoint file that the package uses to save the value of package variables and log completed tasks. Rather than using a hard-coded path as shown above, it’s a good idea to use an expression that concatenates a path defined in a package variable and the package name.
CheckpointUsage – Determines if/how checkpoints are used. Choose from these options: Never(default), IfExists, or Always. Never indicates that you are not using Checkpoints. IfExists is the typical setting and implements the restart at the point of failure behavior. If a Checkpoint file is found it is used to restore package variable values and restart at the point of failure. If a Checkpoint file is not found the package starts execution with the first task. The Always choice raises an error if the Checkpoint file does not exist.
SaveCheckpoints – Choose from these options: True or False (default). You must select True to implement the Checkpoint behavior.
The following items need to be configured on the properties tab for SSIS package:
CheckpointFileName – Specify the full path to the Checkpoint file that the package uses to save the value of package variables and log completed tasks. Rather than using a hard-coded path as shown above, it’s a good idea to use an expression that concatenates a path defined in a package variable and the package name.
CheckpointUsage – Determines if/how checkpoints are used. Choose from these options: Never(default), IfExists, or Always. Never indicates that you are not using Checkpoints. IfExists is the typical setting and implements the restart at the point of failure behavior. If a Checkpoint file is found it is used to restore package variable values and restart at the point of failure. If a Checkpoint file is not found the package starts execution with the first task. The Always choice raises an error if the Checkpoint file does not exist.
SaveCheckpoints – Choose from these options: True or False (default). You must select True to implement the Checkpoint behavior.
46. What are the different values
you can set for CheckpointUsage property ?
There are three values, which describe how a checkpoint file is used during package execution:
1) Never: The package will not use a checkpoint file and therefore will never restart.
2) If Exists: If a checkpoint file exists in the place you specified for the CheckpointFilename property, then it will be used, and the package will restart according to the checkpoints written.
3) Always: The package will always use a checkpoint file to restart, and if one does not exist, the package will fail.
There are three values, which describe how a checkpoint file is used during package execution:
1) Never: The package will not use a checkpoint file and therefore will never restart.
2) If Exists: If a checkpoint file exists in the place you specified for the CheckpointFilename property, then it will be used, and the package will restart according to the checkpoints written.
3) Always: The package will always use a checkpoint file to restart, and if one does not exist, the package will fail.
47. What is the ONLY Property you need to set on
TASKS in order to configure CHECKPOINTS to RESTART package from failure?
The one property you have to set on the task is FailPackageOnFailure. This must be set for each task or container that you want to be the point for a checkpoint and restart. If you do not set this property to true and the task fails, no file will be written, and the next time you invoke the package, it will start from the beginning again.
The one property you have to set on the task is FailPackageOnFailure. This must be set for each task or container that you want to be the point for a checkpoint and restart. If you do not set this property to true and the task fails, no file will be written, and the next time you invoke the package, it will start from the beginning again.
48. Where can we set the
CHECKPOINTS, in DataFlow or ControlFlow ?
Checkpoints only happen at the Control Flow; it is not possible to checkpoint transformations or restart inside a Data Flow. The Data Flow Task can be a checkpoint, but it is treated as any other task.
Checkpoints only happen at the Control Flow; it is not possible to checkpoint transformations or restart inside a Data Flow. The Data Flow Task can be a checkpoint, but it is treated as any other task.
49. Can you explain different
options for dynamic configurations in SSIS?
1) XML file
2) custom variables
3) Database per environment with the variables
4) Use a centralized database with all variables
1) XML file
2) custom variables
3) Database per environment with the variables
4) Use a centralized database with all variables
50. Which are the different types
of Control Flow components in SSIS?
The different types of Control Flow components are: Data Flow Tasks, SQL Server Tasks, Data Preparation Tasks, Work flow Tasks, Scripting Tasks, Analysis Services Tasks, Maintenance Tasks, Containers.
The different types of Control Flow components are: Data Flow Tasks, SQL Server Tasks, Data Preparation Tasks, Work flow Tasks, Scripting Tasks, Analysis Services Tasks, Maintenance Tasks, Containers.
51. What is SSIS Designer?
It is a graphical tool for creating packages. It has 4 tabs: Control Flow, Data Flow, Event Handlers and Package Explorer.
It is a graphical tool for creating packages. It has 4 tabs: Control Flow, Data Flow, Event Handlers and Package Explorer.
52. • What is Control Flow tab?
It is the tab in SSIS designer where various Tasks can be arranged and configured. This is the tab where we provide and control the program flow of the project.
It is the tab in SSIS designer where various Tasks can be arranged and configured. This is the tab where we provide and control the program flow of the project.
53. • What is Data Flow tab?
This is the tab where we do all the work related to ETL job. It is the tab in SSIS Designer where we can extract data from sources, transform the data and then load them into destinations.
This is the tab where we do all the work related to ETL job. It is the tab in SSIS Designer where we can extract data from sources, transform the data and then load them into destinations.
54. What is the function of control
flow tab in SSIS?
On the control flow tab, the tasks including dataflow task, containers and precedence constraints that connect containers and tasks can be arranged and configured.
On the control flow tab, the tasks including dataflow task, containers and precedence constraints that connect containers and tasks can be arranged and configured.
55. • What is the function of Event
handlers tab in SSIS?
On the Event handlers tab, workflows can be configured to respond to package events.
For example, we can configure Work Flow when ANY task Fails or Stops or Starts.
On the Event handlers tab, workflows can be configured to respond to package events.
For example, we can configure Work Flow when ANY task Fails or Stops or Starts.
56. • What is the function of
Package explorer tab in SSIS?
This tab provides an explorer view of the package. You can see what is happening in the package. The Package is a container at the top of the hierarchy.
This tab provides an explorer view of the package. You can see what is happening in the package. The Package is a container at the top of the hierarchy.
57. What is Solution Explorer?
It is a place in SSIS Designer where all the projects, Data Sources, Data Source Views and other miscellaneous files can be viewed and accessed for modification.
It is a place in SSIS Designer where all the projects, Data Sources, Data Source Views and other miscellaneous files can be viewed and accessed for modification.
58. How do we convert data type in
SSIS?
The Data Conversion Transformation in SSIS converts the data type of an input column to a different data type.
The Data Conversion Transformation in SSIS converts the data type of an input column to a different data type.
59. How are variables useful in
SSIS package?
Variables can provide communication among objects in the package. Variables can provide communication between parent and child packages. Variables can also be used in expressions and scripts. This helps in providing dynamic values to tasks.
Variables can provide communication among objects in the package. Variables can provide communication between parent and child packages. Variables can also be used in expressions and scripts. This helps in providing dynamic values to tasks.
60.
Data pipeline?
a) The memory based, multithreaded, buffered t/r process flow data through an SSIS data flow Task during package execution.
a) The memory based, multithreaded, buffered t/r process flow data through an SSIS data flow Task during package execution.
61.
TRANSFORMATIONS??
It is an object that generates, modifies, or passes data.
1. AGGEGATE T/R:-It applies an aggregate function to grouped records and produces new output records from aggregated results.
2. AUDIT T/R:-the t/r adds the value of a system variable, such as machine name or execution instance GUID to a new output column.
3. CHARACTER MAP T/R:-this t/r makes string data changes such as changing data from lower case to upper case.
4. CONDITIONAL SPLIT:-It separate input rows into separate output data pipelines based on the Boolean expressions configured for each output.
5. COPY COLUMN:-add a copy of column to the t/r output we can later transform the copy keeping the original for auditing personal
6.DATA CONVERSION:-converts a columns data type to another data type.
7. DATA MINING QUERY:-perform a data mining query against analysis services.
8. DERIVED COLUMN:-create a new derive column calculated from expression.
9. EXPORT COLUMN:-It allows you to export a column from the data flow to a file.
10. FUZZY GROUPING:-perform data cleansing by finding rows that are likely duplicates.
11. FUZZY LOOKUP:-matches and standardizes data based on fuzzy logic.
Ex:-transform the name jon to john
12.IMPORT COLUMN:-reads the data from a file & adds It into a dataflow.
13. LOOKUP:-perform the lookup of data to be used later in a transform.
Ex:-t/T to lookup a city based on zip code.
1. Getting a related value from a table using a key column value
2. Update slowly changing dimension table
3.to check whether records already exist in the table.
14. MERGE:-merges two sorted data sets into a single data set into a single data flow.
15. MERGE JOIN:-merges two data sets into a single dataset using a join junction.
16. MULTI CAST:-sends a copy of two data to an additional path in the workflow.
17. ROW COUNT:-stores the rows count from the data flow into a variable.
18. ROW SAMPLING:-captures the sample of data from the dataflow by using a row count of the total rows in dataflow.
19. ROW SAMPLING:-captures the sample of the data from the data flow by using a row count of the total rows in data flow.
20. UNION ALL:-merge multiple data sets into a single dataset.
21. PIVOT:-converts rows into columns
22.UNPIVOT:-converts columns into rows
It is an object that generates, modifies, or passes data.
1. AGGEGATE T/R:-It applies an aggregate function to grouped records and produces new output records from aggregated results.
2. AUDIT T/R:-the t/r adds the value of a system variable, such as machine name or execution instance GUID to a new output column.
3. CHARACTER MAP T/R:-this t/r makes string data changes such as changing data from lower case to upper case.
4. CONDITIONAL SPLIT:-It separate input rows into separate output data pipelines based on the Boolean expressions configured for each output.
5. COPY COLUMN:-add a copy of column to the t/r output we can later transform the copy keeping the original for auditing personal
6.DATA CONVERSION:-converts a columns data type to another data type.
7. DATA MINING QUERY:-perform a data mining query against analysis services.
8. DERIVED COLUMN:-create a new derive column calculated from expression.
9. EXPORT COLUMN:-It allows you to export a column from the data flow to a file.
10. FUZZY GROUPING:-perform data cleansing by finding rows that are likely duplicates.
11. FUZZY LOOKUP:-matches and standardizes data based on fuzzy logic.
Ex:-transform the name jon to john
12.IMPORT COLUMN:-reads the data from a file & adds It into a dataflow.
13. LOOKUP:-perform the lookup of data to be used later in a transform.
Ex:-t/T to lookup a city based on zip code.
1. Getting a related value from a table using a key column value
2. Update slowly changing dimension table
3.to check whether records already exist in the table.
14. MERGE:-merges two sorted data sets into a single data set into a single data flow.
15. MERGE JOIN:-merges two data sets into a single dataset using a join junction.
16. MULTI CAST:-sends a copy of two data to an additional path in the workflow.
17. ROW COUNT:-stores the rows count from the data flow into a variable.
18. ROW SAMPLING:-captures the sample of data from the dataflow by using a row count of the total rows in dataflow.
19. ROW SAMPLING:-captures the sample of the data from the data flow by using a row count of the total rows in data flow.
20. UNION ALL:-merge multiple data sets into a single dataset.
21. PIVOT:-converts rows into columns
22.UNPIVOT:-converts columns into rows
62.
Batch?
a) A batch is defined as group of sessions. Those are 2 types.
1. Parallel batch processing
2. Sequential batch processing
to execute a SSIS package we will use "execute package utility"
To deploy a SSIS package we will use "package deployment Utility”
a) A batch is defined as group of sessions. Those are 2 types.
1. Parallel batch processing
2. Sequential batch processing
to execute a SSIS package we will use "execute package utility"
To deploy a SSIS package we will use "package deployment Utility”
63. What are the Transformations available in SSIS?
AGGREGATE - It applies aggregate functions to Record Sets to produce new output records from aggregated values.
AUDIT - Adds Package and Task level Metadata - such as Machine Name, Execution Instance, Package Name, Package ID, etc..
CHARACTERMAP - Performs SQL Server column level string operations such as changing data from lower case to upper case.
CONDITIONALSPLIT– Separates available input into separate output pipelines based on Boolean Expressions configured for each output.
COPYCOLUMN - Add a copy of column to the output we can later transform the copy keeping the original for auditing.
DATACONVERSION - Converts columns data types from one to another type. It stands for Explicit Column Conversion.
DATAMININGQUERY– Used to perform data mining query against analysis services and manage Predictions Graphs and Controls.
DERIVEDCOLUMN - Create a new (computed) column from given expressions.
EXPORTCOLUMN– Used to export a Image specific column from the database to a flat file.
FUZZYGROUPING– Used for data cleansing by finding rows that are likely duplicates.
FUZZYLOOKUP - Used for Pattern Matching and Ranking based on fuzzy logic.
IMPORTCOLUMN - Reads image specific column from database onto a flat file.
LOOKUP - Performs the lookup (searching) of a given reference object set against a data source. It is used for exact matches only.
MERGE - Merges two sorted data sets into a single data set into a single data flow.
MERGEJOIN - Merges two data sets into a single dataset using a join junction.
MULTICAST - Sends a copy of supplied Data Source onto multiple Destinations.
ROWCOUNT - Stores the resulting row count from the data flow / transformation into a variable.
ROWSAMPLING - Captures sample data by using a row count of the total rows in dataflow specified by rows or percentage.
UNIONALL - Merge multiple data sets into a single dataset.
PIVOT– Used for Normalization of data sources to reduce analomolies by converting rows into columns
UNPIVOT– Used for denormalizing the data structure by converts columns into rows incase of building Data Warehouses.
AGGREGATE - It applies aggregate functions to Record Sets to produce new output records from aggregated values.
AUDIT - Adds Package and Task level Metadata - such as Machine Name, Execution Instance, Package Name, Package ID, etc..
CHARACTERMAP - Performs SQL Server column level string operations such as changing data from lower case to upper case.
CONDITIONALSPLIT– Separates available input into separate output pipelines based on Boolean Expressions configured for each output.
COPYCOLUMN - Add a copy of column to the output we can later transform the copy keeping the original for auditing.
DATACONVERSION - Converts columns data types from one to another type. It stands for Explicit Column Conversion.
DATAMININGQUERY– Used to perform data mining query against analysis services and manage Predictions Graphs and Controls.
DERIVEDCOLUMN - Create a new (computed) column from given expressions.
EXPORTCOLUMN– Used to export a Image specific column from the database to a flat file.
FUZZYGROUPING– Used for data cleansing by finding rows that are likely duplicates.
FUZZYLOOKUP - Used for Pattern Matching and Ranking based on fuzzy logic.
IMPORTCOLUMN - Reads image specific column from database onto a flat file.
LOOKUP - Performs the lookup (searching) of a given reference object set against a data source. It is used for exact matches only.
MERGE - Merges two sorted data sets into a single data set into a single data flow.
MERGEJOIN - Merges two data sets into a single dataset using a join junction.
MULTICAST - Sends a copy of supplied Data Source onto multiple Destinations.
ROWCOUNT - Stores the resulting row count from the data flow / transformation into a variable.
ROWSAMPLING - Captures sample data by using a row count of the total rows in dataflow specified by rows or percentage.
UNIONALL - Merge multiple data sets into a single dataset.
PIVOT– Used for Normalization of data sources to reduce analomolies by converting rows into columns
UNPIVOT– Used for denormalizing the data structure by converts columns into rows incase of building Data Warehouses.
64. How to log SSIS Executions?
SSIS includes logging features that write log entries when run-time events occur and can also write custom messages. This is not enabled by default. Integration Services supports a diverse set of log providers, and gives you the ability to create custom log providers. The Integration Services log providers can write log entries to text files, SQL Server Profiler, SQL Server, Windows Event Log, or XML files. Logs are associated with packages and are configured at the package level. Each task or container in a package can log information to any package log. The tasks and containers in a package can be enabled for logging even if the package itself is not.
SSIS includes logging features that write log entries when run-time events occur and can also write custom messages. This is not enabled by default. Integration Services supports a diverse set of log providers, and gives you the ability to create custom log providers. The Integration Services log providers can write log entries to text files, SQL Server Profiler, SQL Server, Windows Event Log, or XML files. Logs are associated with packages and are configured at the package level. Each task or container in a package can log information to any package log. The tasks and containers in a package can be enabled for logging even if the package itself is not.
65. How do you deploy SSIS packages?
SSIS Project BUILD provides a Deployment Manifest File. We need to run the manifest file and decide whether to deploy this onto File System or onto SQL Server [ msdb]. SQL Server Deployment is very faster and more secure then File System Deployment. Alternatively, we can also import the package from SSMS from File System or SQL Server.
SSIS Project BUILD provides a Deployment Manifest File. We need to run the manifest file and decide whether to deploy this onto File System or onto SQL Server [ msdb]. SQL Server Deployment is very faster and more secure then File System Deployment. Alternatively, we can also import the package from SSMS from File System or SQL Server.
66. What are variables and what is variable scope?
Variables store values that a SSIS package and its containers, tasks, and event handlers can use at run time. The scripts in the Script task and the Script component can also use variables. The precedence constraints that sequence tasks and containers into a workflow can use variables when their constraint definitions include expressions. Integration Services supports two types of variables: user-defined variables and system variables. User-defined variables are defined by package developers, and system variables are defined by Integration Services. You can create as many user-defined variables as a package requires, but you cannot create additional system variables.
Variables store values that a SSIS package and its containers, tasks, and event handlers can use at run time. The scripts in the Script task and the Script component can also use variables. The precedence constraints that sequence tasks and containers into a workflow can use variables when their constraint definitions include expressions. Integration Services supports two types of variables: user-defined variables and system variables. User-defined variables are defined by package developers, and system variables are defined by Integration Services. You can create as many user-defined variables as a package requires, but you cannot create additional system variables.
67. If you want to send some data from Access database
to SQL server database. What are different component of SSIS will you use?
In the data flow, we will use one OLE DB source,
data conversion transformation and one OLE DB destination or SQL server
destination. OLE DB source is data source is useful for reading data from Oracle,
SQL Server and Access databases. Data Conversion transformation would be needed
to remove datatype abnormality since there is difference in datatype between
the two databases (Access and SQL Server) mentioned. If our database server is
stored on and package is run from same machine, we can use SQL Server
destination otherwise we need to use OLE DB destination. The SQL Server
destination is the destination that optimizes the SQL Server.
68.
Difference and similarity between
merge and merge join transformation
Merge Transofrmations
|
Merge Join Transformation
|
|
The data from 2 input paths are
merged into one
|
The data from 2 inputs are merged
based on some common key.
|
|
Works as
|
UNION ALL
|
JOIN (LEFT, RIGHT OR FULL)
|
Supports
|
2 Datasets
|
1 Dataset
|
Columns
|
Metadata for all columns needs to be
same
|
Key columns metadata needs to be
same.
|
Pre-requisites
|
Data must be sorted.
Merged columns should have same
datatype i.e. if merged column is EmployeeName with string of 25 character in
Input 1, it can be of less than or equal to 25 characters for merging to
happen.
|
Data must be sorted.
Merged columns should have same
datatype i.e. if merged column is EmployeeName with string of 25 character in
Input 1, it can be of less than or equal to 25 characters for merging to
happen.
|
Limitations
|
Only 2 input paths can be merged.
Does not support error handling.
|
Does not support error handling.
|
Use
|
Merging of data from 2 data source
Can create complex datasets using
nesting merge transformation,
|
When data from 2 tables having
foreign key relationship needs to present based on common key.
|
69.
What are the different types of Data flow
components in SSIS?
There are 3 data flow components in SSIS.
1. Sources
2. Transformations
3. Destinations
1. Sources
2. Transformations
3. Destinations
70. Explain Audit
Transformation?
It allows you to add auditing information as
required in auditing world specified by HIPPA and Sarbanes-Oxley (SOX).
Auditing options that you can add to transformed
data through this transformation are:
1. Execution of Instance GUID : ID of execution instance of the package
2. PackageID : ID of the package
3. PackageName
4.VersionID : GUID version of the package
5. Execution StartTime
6.MachineName
7.UserName
8.TaskName
9.TaskID : uniqueidentifier type of the data flow task that contains audit transformation
1. Execution of Instance GUID : ID of execution instance of the package
2. PackageID : ID of the package
3. PackageName
4.VersionID : GUID version of the package
5. Execution StartTime
6.MachineName
7.UserName
8.TaskName
9.TaskID : uniqueidentifier type of the data flow task that contains audit transformation
71. Explain Copy column
Transformation?
This component simply copies a column to another
new column. Just like ALIAS Column in T-Sql.
72. Explain Derived
column Transformation?
Derived column creates new column or put
manipulation of several columns into new column. You can directly copy existing
or create a new column using more than one column also.
73. Explain Multicast
Transformation?
This transformation sends output to multiple output
paths with no conditional as Conditional Split does. Takes ONE Input and makes
the COPY of data and passes the same data through many outputs. In simple Give
one input and take many outputs of the same data.
74. What is a
Transformation?
A
transformation simply means bringing in the data in a desired format. For
example you are pulling data from the source and want to ensure only distinct
records are written to the destination, so duplicates are removed.
Anther example is if you have master/reference data and want to pull only
related data from the source and hence you need some sort of lookup. There are
around 30 transformation tasks available and this can be extended further with
custom built tasks if needed.
75. How many difference
source and destinations have you used?
It is very common to get all kinds of
sources so the more the person worked with the better for you. Common ones are
SQL Server, CSV/TXT, Flat Files, Excel, Access, Oracle, MySQL but also
Salesforce, web data scrapping.
76. What is Conditional Split
transformation in SSIS?
This is just like IF condition which checks for the given condition and based on the condition evaluation, the output will be sent to the appropriate OUTPUT path. It has ONE input and MANY outputs. Conditional Split transformation is used to send paths to different outputs based on some conditions. For example, we can organize the transform for the students in a class who have marks greater than 40 to one path and the students who score less than 40 to another path.
This is just like IF condition which checks for the given condition and based on the condition evaluation, the output will be sent to the appropriate OUTPUT path. It has ONE input and MANY outputs. Conditional Split transformation is used to send paths to different outputs based on some conditions. For example, we can organize the transform for the students in a class who have marks greater than 40 to one path and the students who score less than 40 to another path.
77. How do you eliminate quotes
from being uploaded from a flat file to SQL Server?
This can be done using TEXT QUALIFIER property. In the SSIS package on the Flat File Connection Manager Editor, enter quotes into the Text qualifier field then preview the data to ensure the quotes are not included.
This can be done using TEXT QUALIFIER property. In the SSIS package on the Flat File Connection Manager Editor, enter quotes into the Text qualifier field then preview the data to ensure the quotes are not included.
78. What is the use of Percentage
Sampling transformation in SSIS?
Percentage sampling transformation is generally used for data mining. This transformation builds a random sample of set of output row’s by choosing specified percentage of input rows. For example if the input has 1000 rows and if I specify 10 as percentage sample then the transformation returns 10% of the RANDOM records from the input data.
Percentage sampling transformation is generally used for data mining. This transformation builds a random sample of set of output row’s by choosing specified percentage of input rows. For example if the input has 1000 rows and if I specify 10 as percentage sample then the transformation returns 10% of the RANDOM records from the input data.
79. What is the use of Term
Extraction transformation in SSIS?
Term Extraction transformation is used to extract nouns or noun phrases or both noun and noun phrases only from English text. It extracts terms from text in a transformation input column and then writes the terms to a transformation output column. It can be also used to find out the content of a dataset.
Term Extraction transformation is used to extract nouns or noun phrases or both noun and noun phrases only from English text. It extracts terms from text in a transformation input column and then writes the terms to a transformation output column. It can be also used to find out the content of a dataset.
80. What is Data Viewer and what
are the different types of Data Viewers in SSIS?
A Data Viewer allows viewing data at a point of time at runtime. If data viewer is placed before and after the Aggregate transform, we can see data flowing to the transformation at the runtime and how it looks like after the transformation occurred. The different types of data viewers are:
A Data Viewer allows viewing data at a point of time at runtime. If data viewer is placed before and after the Aggregate transform, we can see data flowing to the transformation at the runtime and how it looks like after the transformation occurred. The different types of data viewers are:
1. Grid
2. Histogram
3. Scatter Plot
4. Column Chart.
2. Histogram
3. Scatter Plot
4. Column Chart.
81. What is Ignore Failure option in SSIS?
In Ignore Failure option, the error will be ignored and the data row will be directed to continue on the next transformation. Let’s say you have some JUNK data (wrong type of data or JUNK data) flowing from source, then using this option in SSIS we can REDIRECT the junk data records to another transformation instead of FAILING the package. This helps to MOVE only valid data to destination and JUNK can be captured into separate file.
In Ignore Failure option, the error will be ignored and the data row will be directed to continue on the next transformation. Let’s say you have some JUNK data (wrong type of data or JUNK data) flowing from source, then using this option in SSIS we can REDIRECT the junk data records to another transformation instead of FAILING the package. This helps to MOVE only valid data to destination and JUNK can be captured into separate file.
82. Explain Aggregate
Transformation in SSIS?
It aggregates data, similar you do in applying TSQL functions like Group By, Min, Max, Avg, and Count. For example you get total quantity and Total line item for each product in Aggregate Transformation Editor. First you determine input columns, then output column name in Output Alias table in datagrid, and also operations for each Output Alias in Operation columns of the same datagrid. Some of operation functions listed below :
• Group By
• Average
• Count
• Count Distinct : count distinct and non null column value
• Min, Max, Sum
In Advanced tab, you can do some optimization here, such as setting up Key Scale option (low, medium, high), Count Distinct scale option (low, medium, high), Auto Extend factor and Warn On Division By Zero. If you check Warn On Division By Zero, the component will give warning instead of error. Key Scale option will optimize transformation cache to certain number of key threshold. If you set it low, optimization will target to 500,000 keys written to cache, medium can handle up to 5 million keys, and high can handle up to 25 million keys, or you can specify particular number of keys here. Default value is unspecified. Similar to number of keys for Count Distinct scale option. It is used to optimize number of distinct value written to memory, default value is unspecified. Auto Extend Factor is used when you want some portion of memory is used for this component. Default value is 25% of memory.
It aggregates data, similar you do in applying TSQL functions like Group By, Min, Max, Avg, and Count. For example you get total quantity and Total line item for each product in Aggregate Transformation Editor. First you determine input columns, then output column name in Output Alias table in datagrid, and also operations for each Output Alias in Operation columns of the same datagrid. Some of operation functions listed below :
• Group By
• Average
• Count
• Count Distinct : count distinct and non null column value
• Min, Max, Sum
In Advanced tab, you can do some optimization here, such as setting up Key Scale option (low, medium, high), Count Distinct scale option (low, medium, high), Auto Extend factor and Warn On Division By Zero. If you check Warn On Division By Zero, the component will give warning instead of error. Key Scale option will optimize transformation cache to certain number of key threshold. If you set it low, optimization will target to 500,000 keys written to cache, medium can handle up to 5 million keys, and high can handle up to 25 million keys, or you can specify particular number of keys here. Default value is unspecified. Similar to number of keys for Count Distinct scale option. It is used to optimize number of distinct value written to memory, default value is unspecified. Auto Extend Factor is used when you want some portion of memory is used for this component. Default value is 25% of memory.
83. • Explain Audit Transformation?
It allows you to add auditing information as required in auditing world specified by HIPPA and Sarbanes-Oxley (SOX). Auditing options that you can add to transformed data through this transformation are:
1. Execution of Instance GUID : ID of execution instance of the package
2. PackageID : ID of the package
3. PackageName
4. VersionID : GUID version of the package
5. Execution StartTime
6. MachineName
7. UserName
8. TaskName
9. TaskID : uniqueidentifier type of the data flow task that contains audit transformation.
It allows you to add auditing information as required in auditing world specified by HIPPA and Sarbanes-Oxley (SOX). Auditing options that you can add to transformed data through this transformation are:
1. Execution of Instance GUID : ID of execution instance of the package
2. PackageID : ID of the package
3. PackageName
4. VersionID : GUID version of the package
5. Execution StartTime
6. MachineName
7. UserName
8. TaskName
9. TaskID : uniqueidentifier type of the data flow task that contains audit transformation.
84. Explain Character Map
Transformation?
It transforms some character. It gives options whether output result will override the existing column or add to new column. If you define it as new column, specify new column name. Operations available here are:
1. Uppercase
2. Lowercase
3. Byte reversal : such as from 0×1234 to 0×4321
4. Full width
5. Half width
6. Hiragana/katakana/traditional Chinese/simplified Chinese
7. Linguistic casing
It transforms some character. It gives options whether output result will override the existing column or add to new column. If you define it as new column, specify new column name. Operations available here are:
1. Uppercase
2. Lowercase
3. Byte reversal : such as from 0×1234 to 0×4321
4. Full width
5. Half width
6. Hiragana/katakana/traditional Chinese/simplified Chinese
7. Linguistic casing
85. Explain Conditional split
Transformation ?
It functions as if…then…else construct. It enables send input data to a satisfied conditional branch. For example you want to split product quantity between less than 500 and greater or equal to 500. You can give the conditional a name that easily identifies its purpose. Else section will be covered in Default Output Column name.
After you configure the component, it connect to subsequent transformation/destination, when connected, it pops up dialog box to let you choose which conditional options will apply to the destination transformation/destination.
It functions as if…then…else construct. It enables send input data to a satisfied conditional branch. For example you want to split product quantity between less than 500 and greater or equal to 500. You can give the conditional a name that easily identifies its purpose. Else section will be covered in Default Output Column name.
After you configure the component, it connect to subsequent transformation/destination, when connected, it pops up dialog box to let you choose which conditional options will apply to the destination transformation/destination.
86. Explain Copy column
Transformation?
This component simply copies a column to another new column. Just like ALIAS Column in T-Sql.
This component simply copies a column to another new column. Just like ALIAS Column in T-Sql.
87. Explain Data conversion
Transformation?
This component does conversion data type, similar to TSQL function CAST or CONVERT. If you wish to convert the data from one type to another then this is the best bet. But please make sure that you have COMPATABLE data in the column.
This component does conversion data type, similar to TSQL function CAST or CONVERT. If you wish to convert the data from one type to another then this is the best bet. But please make sure that you have COMPATABLE data in the column.
88. Explain Data Mining query
Transformation?
This component does prediction on the data or fills gap on it. Some good scenarios uses this component is:
1. Take some input columns as number of children, domestic income, and marital income to predict whether someone owns a house or not.
2. Take prediction what a customer would buy based analysis buying pattern on their shopping cart.
3. Filling blank data or default values when customer doesn’t fill some items in the questionnaire.
This component does prediction on the data or fills gap on it. Some good scenarios uses this component is:
1. Take some input columns as number of children, domestic income, and marital income to predict whether someone owns a house or not.
2. Take prediction what a customer would buy based analysis buying pattern on their shopping cart.
3. Filling blank data or default values when customer doesn’t fill some items in the questionnaire.
89. Explain Derived column
Transformation?
Derived column creates new column or put manipulation of several columns into new column. You can directly copy existing or create a new column using more than one column also.
Derived column creates new column or put manipulation of several columns into new column. You can directly copy existing or create a new column using more than one column also.
90. • Explain Merge Transformation?
Merge transformation merges two paths into single path. It is useful when you want to break out data into path that handles errors after the errors are handled, the data are merge back into downstream or you want to merge 2 data sources. It is similar with Union All transformation, but Merge has some restrictions :
1. Data should be in sorted order
2. Data type , data length and other meta data attribute must be similar before merged.
Merge transformation merges two paths into single path. It is useful when you want to break out data into path that handles errors after the errors are handled, the data are merge back into downstream or you want to merge 2 data sources. It is similar with Union All transformation, but Merge has some restrictions :
1. Data should be in sorted order
2. Data type , data length and other meta data attribute must be similar before merged.
91. Explain Merge Join
Transformation?
Merge Join transformation will merge output from 2 inputs and doing INNER or OUTER join on the data. But if you the data come from 1 OLEDB data source, it is better you join through SQL query rather than using Merge Join transformation. Merge Join is intended to join 2 different data source.
Merge Join transformation will merge output from 2 inputs and doing INNER or OUTER join on the data. But if you the data come from 1 OLEDB data source, it is better you join through SQL query rather than using Merge Join transformation. Merge Join is intended to join 2 different data source.
92. Explain Multicast
Transformation?
This transformation sends output to multiple output paths with no conditional as Conditional Split does. Takes ONE Input and makes the COPY of data and passes the same data through many outputs. In simple Give one input and take many outputs of the same data.
This transformation sends output to multiple output paths with no conditional as Conditional Split does. Takes ONE Input and makes the COPY of data and passes the same data through many outputs. In simple Give one input and take many outputs of the same data.
93. Explain Percentage and row
sampling Transformations?
This transformation will take data from source and randomly sampling data. It gives you 2 outputs. First is selected data and second one is unselected data. It is used in situation where you train data mining model. These two are used to take the SAMPLE of data from the input data.
This transformation will take data from source and randomly sampling data. It gives you 2 outputs. First is selected data and second one is unselected data. It is used in situation where you train data mining model. These two are used to take the SAMPLE of data from the input data.
94. Explain Sort Transformation?
This component will sort data, similar in TSQL command ORDER BY. Some transformations need sorted data.
This component will sort data, similar in TSQL command ORDER BY. Some transformations need sorted data.
95. Explain Union all
Transformation?
It works in opposite way to merge transformation. It can take output from more than 2 input paths and combines into single output path.
It works in opposite way to merge transformation. It can take output from more than 2 input paths and combines into single output path.
96. What r the possible locations
to save SSIS package?
You can save a package wherever you want.
SQL Server
Package Store
File System
You can save a package wherever you want.
SQL Server
Package Store
File System
97. What are Connection Managers?
It is a bridge b/w package object and physical data. It provides logical representation of a connection at design time the properties of the connection mgr describes the physical connection that integration services creates when the package is run.
It is a bridge b/w package object and physical data. It provides logical representation of a connection at design time the properties of the connection mgr describes the physical connection that integration services creates when the package is run.
98. What is Design time Deployment in SSIS?
When you
run a package from within BIDS, it is built and temporarily deployed to the
folder. By default the package will be deployed to the BIN folder in the
Package’s Project folder and you can configure for custom folder for
deployment. When the Package’s execution is completed and stopped in BIDS, the
deployed package will be deleted and this is called as Design Time Deployment.
99. What is Manifest file
in SSIS?
Manifest while if the
utility which can be used to deploy the package using wizard on file system and
SQL
Server database.
Server database.
100.
What is File System Deployment?
File system deployment means to save
package file on local or network drive.
Question: how to back up or retrieve the SSIS package
if your package is deployed on SQL Server then you can back up the MSDB database as all the package on SQL
server deploys at MSDB.
Question: how to back up or retrieve the SSIS package
if your package is deployed on SQL Server then you can back up the MSDB database as all the package on SQL
server deploys at MSDB.
101.
What is
Transaction in SSIS package and how to implement it? (SSIS Interview Questions)
Packages use
transactions to bind the database actions that tasks perform into atomic units,
and by doing this maintain data integrity. All Microsoft Integration
Services container types—packages, the For Loop, For each Loop, and Sequence
containers, and the task hosts that encapsulate each task—can be configured to
use transactions. Integration Services provides three options for configuring
transactions: Not
Supported, Supported, and Required.
Required indicates that the
container starts a transaction, unless one is already started by its parent
container. If a transaction already exists, the container joins the
transaction. For example, if a package that is not configured to support
transactions includes a Sequence container that uses the required option,
the Sequence container would start its own transaction. If the package were
configured to use the required option, the Sequence container would
join the package transaction.
Supported indicates that the
container does not start a transaction, but joins any transaction started by
its parent container. For example, if a package with four Execute SQL tasks
starts a transaction and all four tasks use the Supported option, the
database updates performed by the Execute SQL tasks are rolled back if any task
fails. If the package does not start a transaction, the four Execute SQL tasks
are not bound by a transaction, and no database updates except the ones
performed by the failed task are rolled back.
Not Supported indicates that the
container does not start a transaction or join an existing transaction. A
transaction started by a parent container does not affect child containers that
have been configured to not support transactions. For example, if a package is
configured to start a transaction and a For Loop container in the package uses
the Not Supported option, none of the tasks in the For Loop can roll
back if they fail.
102.
Difference
between For Loop and For Each Loop
For Loop and For Each loop both are used for looping the object or tasks
in SSIS package. But both of them work differently.
For Loop is like normal for loop where you defines the EVAL expression
and task loops that no of times. For example if you want to iterate some task
20 times it will loop for 20 times and then stop.
While For Each Loop works for object level let say we have a folder in
which there are multiple files and these files need be dumped into database.
For this you need to iterate for each file which is possible through for each
loop task. For Each loop task will iterate for each file in the folder. You can
say its iterates for the collection. For each loop has multiple type of
iterates.
103.
Difference between Execute
TSQL Task and Execute SQL Task
In SSIS there is one tasks Execute TSQL task which is similar to Execute
SQL task. Will see what the difference between two is.
Execute TSQL Task
|
Execute SQL Task
|
Takes less memory
|
Takes more memory compare to ETL
|
Parameterized queries not support
|
Parameterized query possible
|
Output into variable not supported
|
supported
|
Only ADO.net connection supported
|
Supported multiple type of connection
|
Faster performance
|
Comparatively low
|
104.
Difference Between Merge and Union
All
The Merge Transform can merge data
from two paths into a single output. The transform is
useful when you wish to break out
your Data Flow into a path that handles certain errors and
then merge it back into the main Data Flow downstream after the errors have been handled. It’s also useful if you wish to merge data from two Data Sources.
then merge it back into the main Data Flow downstream after the errors have been handled. It’s also useful if you wish to merge data from two Data Sources.
The data must be
sorted before the Merge Transform. You can do this by using the Sort
Transform prior to the merge or by
specifying an ORDER BY clause in the source connection.
The metadata must be the same between both paths. For example, the CustomerID column
can’t be a numeric column in one path and a character column in another path.
The metadata must be the same between both paths. For example, the CustomerID column
can’t be a numeric column in one path and a character column in another path.
The Union All
Transform works much the same way as the Merge Transform, but it does not
Require the data be
sorted. It takes the outputs from multiple sources or transforms and
combines them into a single result set.
combines them into a single result set.
105.
SSIS Synchronous and
Asynchronous
Transformations are broken into two
main categories: synchronous and asynchronous.
Synchronous transformations are
components like the Derived Column and Data Conversion
Transforms where rows flow into memory buffers in the transform and the same buffers come
out. No rows are held and typically these transforms perform very quickly with minimal impact
to your Data Flow.
Transforms where rows flow into memory buffers in the transform and the same buffers come
out. No rows are held and typically these transforms perform very quickly with minimal impact
to your Data Flow.
There are two types of asynchronous
transforms: fully blocking and partial blocking.
Partial blocking transforms, such as
the Union All Transform, create new memory buffers for
the output other transform than what come into the transform. Full blocking transforms, such
as the Sort and Aggregate Transforms, do the same thing but cause a full block of the data. In
order to sort the data, SSIS must first see every single row of the data. If you have a 100MB
file, then you may require 200MB of RAM in order to process the Data Flow because of a fully
blocking transform. These fully blocking transforms represent the single largest slowdown in
SSIS and architecture decisions you must make.
the output other transform than what come into the transform. Full blocking transforms, such
as the Sort and Aggregate Transforms, do the same thing but cause a full block of the data. In
order to sort the data, SSIS must first see every single row of the data. If you have a 100MB
file, then you may require 200MB of RAM in order to process the Data Flow because of a fully
blocking transform. These fully blocking transforms represent the single largest slowdown in
SSIS and architecture decisions you must make.
106.
Config Files
Config file in Sql
Server Integration Service is used to provide inputs to connection manager
different properties which package use to assign values at run time
dynamically.
Using config file user need to make changes to the config file which package will take automatically at run time because of using it you don't need to every time made changes in package in case you are deploying package on multiple servers or locations.
There are multiple ways in which configuration values can be store.
XML configuration file
Store the configuration as an XML file. .
Environment variable
Store the configuration in one of the environment variables.
Registry entry
Store the configuration in the registry.
Parent package variable
Store the configuration as a variable in the package that contains the task.
SQL Server
Store the configuration in a table in SQL Server.
Here are the steps to enable the config file for package.
On the SSIS package, right click and select Package Configurations.
In the Package Configuration Organizer dialog box, select Enable package configurations, and then click Add.
Using config file user need to make changes to the config file which package will take automatically at run time because of using it you don't need to every time made changes in package in case you are deploying package on multiple servers or locations.
There are multiple ways in which configuration values can be store.
XML configuration file
Store the configuration as an XML file. .
Environment variable
Store the configuration in one of the environment variables.
Registry entry
Store the configuration in the registry.
Parent package variable
Store the configuration as a variable in the package that contains the task.
SQL Server
Store the configuration in a table in SQL Server.
Here are the steps to enable the config file for package.
On the SSIS package, right click and select Package Configurations.
In the Package Configuration Organizer dialog box, select Enable package configurations, and then click Add.
107.
What configuration options
have you used?
This is an
important one. Configuration should always be dynamic and usually is done using
XML and/or Environment Variable and SQL Table with all configurations.
108.
How to quickly load data into
sql server table?
Fast Load
option. This option is not set by default so most developers know this answer
as otherwise the load is very slow.
109.
Give example of handling data
quality issues?
Data Quality is
almost always a problem and SSIS handles it very well. Examples include
importing customers from different sources where customer name can be
duplicates. For instance you can have as company name: SQL Server Business
Intelligence but also SQL Server BI or SQL Server BI LTD or SQL Server BI
Limited or intelligence (with one l). There are different ways to handle it.
Robust and time consuming is to create a table with or possible scenarios and
update it after each update. You can also use fuzzy grouping which is usually
easy to implement and will make usually very good decisions but it is not 100%
accurate so this approach has to be justified. Other typical quality issues are
nulls (missing values), outliers (dates like 2999 or types like 50000 instead
of 5000 especially important if someone is adjusting the value to get bigger
bonus), incorrect addresses and these are either corrected during ETL,
ignored, re-directed for further manual updates or it fails the packages which
for big processes is usually not practised.
110.
When to use Stored
Procedures?
This was one of the
requested question in comment (at the bottom of the page). This one is very
important but also tricky. ALL SSIS developers have SQL Server background and
that is sometime not very good if they use SQL not SSIS approach.Let's start
with when you typically use SPs. This is for preparing tables (truncate), audit
tasks (usually part of SSIS framework), getting configuration values for loops
and a few other general tasks.
During ETL extract you usually type simple SQL because it comes from other sources and usually over complication is not a good choice (make it dynamic) because any changes usually affect the package which has to be updated as well.
During Transformation phase (business rules, cleaning, core work) you should use Transformation tasks not Stored procedures! There are loads of tasks that make the package much easier to develop but also a very important reason is readability which is very important for other people who need to change the package and obviously it reduces risks of making errors. Performance is usually very good with SSIS as it is memory/flow based approach. So when to use Stored Procedures for transformations? If you don't have strong SSIS developers or you have performance reasons to do it. In some cases SPs can be much faster (usually it only applies to very large datasets). Most important is have reasons which approach is better for the situation.
During ETL extract you usually type simple SQL because it comes from other sources and usually over complication is not a good choice (make it dynamic) because any changes usually affect the package which has to be updated as well.
During Transformation phase (business rules, cleaning, core work) you should use Transformation tasks not Stored procedures! There are loads of tasks that make the package much easier to develop but also a very important reason is readability which is very important for other people who need to change the package and obviously it reduces risks of making errors. Performance is usually very good with SSIS as it is memory/flow based approach. So when to use Stored Procedures for transformations? If you don't have strong SSIS developers or you have performance reasons to do it. In some cases SPs can be much faster (usually it only applies to very large datasets). Most important is have reasons which approach is better for the situation.
111.
What is your approach for ETL
with data warehouses (how many packages you developer during typical load
etc.)?
This is rather
generic question. A typical approach (for me) when building ETL is to. Have a
package to extract data per source with extract specific transformations
(lookups, business rules, cleaning) and loads data into staging table. Then a
package do a simple merge from staging to data warehouse (Stored Procedure) or
a package that takes data from staging and performs extra work before loading
to data warehouse. I prefer the first one and due to this approach I
occasionally consider having extract stage (as well as stage phase) which gives
me more flexibility with transformation (per source) and makes it simpler to
follow (not everything in one go). So to summarize you usually have package per
source and one package per data warehouse table destination. There are might be
other approach valid as well so ask for reasons.
112.
True or False -
Using a checkpoint file in SSIS is just like issuing the CHECKPOINT command
against the relational engine. It commits all of the data to the database.
False. SSIS provides a Checkpoint capability which allows a package to restart at the point of failure.
False. SSIS provides a Checkpoint capability which allows a package to restart at the point of failure.
113.
Can you explain the
what the Import\Export tool does and the basic steps in the wizard?
The Import\Export tool is accessible via BIDS or executing the dtswizard command.
The tool identifies a data source and a destination to move data either within 1 database, between instances or even from a database to a file (or vice versa).
The Import\Export tool is accessible via BIDS or executing the dtswizard command.
The tool identifies a data source and a destination to move data either within 1 database, between instances or even from a database to a file (or vice versa).
114.
What are the
command line tools to execute SQL Server Integration Services packages?
DTSEXECUI - When this command line tool is run a user interface is loaded in order to configure each of the applicable parameters to execute an SSIS package.
DTEXEC - This is a pure command line tool where all of the needed switches must be passed into the command for successful execution of the SSIS package.
DTSEXECUI - When this command line tool is run a user interface is loaded in order to configure each of the applicable parameters to execute an SSIS package.
DTEXEC - This is a pure command line tool where all of the needed switches must be passed into the command for successful execution of the SSIS package.
115.
Can you explain the
SQL Server Integration Services functionality in Management Studio?
You have the ability to do the following:
Login to the SQL Server Integration Services instance
View the SSIS log
View the packages that are currently running on that instance
Browse the packages stored in MSDB or the file system
Import or export packages
Delete packages
Run packages
You have the ability to do the following:
Login to the SQL Server Integration Services instance
View the SSIS log
View the packages that are currently running on that instance
Browse the packages stored in MSDB or the file system
Import or export packages
Delete packages
Run packages
116.
Can you name some
of the core SSIS components in the Business Intelligence Development Studio you
work with on a regular basis when building an SSIS package?
Connection Managers
Control Flow
Data Flow
Event Handlers
Variables window
Toolbox window
Output window
Logging
Package Configurations
Connection Managers
Control Flow
Data Flow
Event Handlers
Variables window
Toolbox window
Output window
Logging
Package Configurations
117.
True or False: SSIS
has a default means to log all records updated, deleted or inserted on a per
table basis.
False, but a custom solution can be built to meet these needs.
Question 2 - What is a breakpoint in SSIS? How is it setup? How do you disable it?
A breakpoint is a stopping point in the code. The breakpoint can give the Developer\DBA an opportunity to review the status of the data, variables and the overall status of the SSIS package.
10 unique conditions exist for each breakpoint.
Breakpoints are setup in BIDS. In BIDS, navigate to the control flow interface. Right click on the object where you want to set the breakpoint and select the 'Edit Breakpoints...' option.
Question 3 - Can you name 5 or more of the native SSIS connection managers?
OLEDB connection - Used to connect to any data source requiring an OLEDB connection (i.e., SQL Server 2000)
Flat file connection - Used to make a connection to a single file in the File System. Required for reading information from a File System flat file
ADO.Net connection - Uses the .Net Provider to make a connection to SQL Server 2005 or other connection exposed through managed code (like C#) in a custom task
Analysis Services connection - Used to make a connection to an Analysis Services database or project. Required for the Analysis Services DDL Task and Analysis Services Processing Task
File connection - Used to reference a file or folder. The options are to either use or create a file or folder
Excel
FTP
HTTP
MSMQ
SMO
SMTP
SQLMobile
WMI
Question 4 - How do you eliminate quotes from being uploaded from a flat file to SQL Server?
In the SSIS package on the Flat File Connection Manager Editor, enter quotes into the Text qualifier field then preview the data to ensure the quotes are not included.
Additional information: How to strip out double quotes from an import file in SQL Server Integration Services
Question 5 - Can you name 5 or more of the main SSIS tool box widgets and their functionality?
For Loop Container
Foreach Loop Container
Sequence Container
ActiveX Script Task
Analysis Services Execute DDL Task
Analysis Services Processing Task
Bulk Insert Task
Data Flow Task
Data Mining Query Task
Execute DTS 2000 Package Task
Execute Package Task
Execute Process Task
Execute SQL Task
etc.
Question Difficulty = Difficult
Question 1 - Can you explain one approach to deploy an SSIS package?
One option is to build a deployment manifest file in BIDS, then copy the directory to the applicable SQL Server then work through the steps of the package installation wizard
A second option is using the dtutil utility to copy, paste, rename, delete an SSIS Package
A third option is to login to SQL Server Integration Services via SQL Server Management Studio then navigate to the 'Stored Packages' folder then right click on the one of the children folders or an SSIS package to access the 'Import Packages...' or 'Export Packages...'option.
A fourth option in BIDS is to navigate to File | Save Copy of Package and complete the interface.
Question 2 - Can you explain how to setup a checkpoint file in SSIS?
The following items need to be configured on the properties tab for SSIS package:
CheckpointFileName - Specify the full path to the Checkpoint file that the package uses to save the value of package variables and log completed tasks. Rather than using a hard-coded path as shown above, it's a good idea to use an expression that concatenates a path defined in a package variable and the package name.
CheckpointUsage - Determines if/how checkpoints are used. Choose from these options: Never (default), IfExists, or Always. Never indicates that you are not using Checkpoints. IfExists is the typical setting and implements the restart at the point of failure behavior. If a Checkpoint file is found it is used to restore package variable values and restart at the point of failure. If a Checkpoint file is not found the package starts execution with the first task. The Always choice raises an error if the Checkpoint file does not exist.
SaveCheckpoints - Choose from these options: True or False (default). You must select True to implement the Checkpoint behavior.
Question 3 - Can you explain different options for dynamic configurations in SSIS?
Use an XML file
Use custom variables
Use a database per environment with the variables
Use a centralized database with all variables
Question 4 - How do you upgrade an SSIS Package?
Depending on the complexity of the package, one or two techniques are typically used:
Recode the package based on the functionality in SQL Server DTS
Use the Migrate DTS 2000 Package wizard in BIDS then recode any portion of the package that is not accurate
Question 5 - Can you name five of the Perfmon counters for SSIS and the value they provide?
SQLServer:SSIS Service
SSIS Package Instances - Total number of simultaneous SSIS Packages running
SQLServer:SSIS Pipeline
BLOB bytes read - Total bytes read from binary large objects during the monitoring period.
BLOB bytes written - Total bytes written to binary large objects during the monitoring period.
BLOB files in use - Number of binary large objects files used during the data flow task during the monitoring period.
Buffer memory - The amount of physical or virtual memory used by the data flow task during the monitoring period.
Buffers in use - The number of buffers in use during the data flow task during the monitoring period.
Buffers spooled - The number of buffers written to disk during the data flow task during the monitoring period.
Flat buffer memory - The total number of blocks of memory in use by the data flow task during the monitoring period.
Flat buffers in use - The number of blocks of memory in use by the data flow task at a point in time.
Private buffer memory - The total amount of physical or virtual memory used by data transformation tasks in the data flow engine during the monitoring period.
Private buffers in use - The number of blocks of memory in use by the transformations in the data flow task at a point in time.
Rows read - Total number of input rows in use by the data flow task at a point in time.
Rows written - Total number of output rows in use by the data flow task at a point in time.
False, but a custom solution can be built to meet these needs.
Question 2 - What is a breakpoint in SSIS? How is it setup? How do you disable it?
A breakpoint is a stopping point in the code. The breakpoint can give the Developer\DBA an opportunity to review the status of the data, variables and the overall status of the SSIS package.
10 unique conditions exist for each breakpoint.
Breakpoints are setup in BIDS. In BIDS, navigate to the control flow interface. Right click on the object where you want to set the breakpoint and select the 'Edit Breakpoints...' option.
Question 3 - Can you name 5 or more of the native SSIS connection managers?
OLEDB connection - Used to connect to any data source requiring an OLEDB connection (i.e., SQL Server 2000)
Flat file connection - Used to make a connection to a single file in the File System. Required for reading information from a File System flat file
ADO.Net connection - Uses the .Net Provider to make a connection to SQL Server 2005 or other connection exposed through managed code (like C#) in a custom task
Analysis Services connection - Used to make a connection to an Analysis Services database or project. Required for the Analysis Services DDL Task and Analysis Services Processing Task
File connection - Used to reference a file or folder. The options are to either use or create a file or folder
Excel
FTP
HTTP
MSMQ
SMO
SMTP
SQLMobile
WMI
Question 4 - How do you eliminate quotes from being uploaded from a flat file to SQL Server?
In the SSIS package on the Flat File Connection Manager Editor, enter quotes into the Text qualifier field then preview the data to ensure the quotes are not included.
Additional information: How to strip out double quotes from an import file in SQL Server Integration Services
Question 5 - Can you name 5 or more of the main SSIS tool box widgets and their functionality?
For Loop Container
Foreach Loop Container
Sequence Container
ActiveX Script Task
Analysis Services Execute DDL Task
Analysis Services Processing Task
Bulk Insert Task
Data Flow Task
Data Mining Query Task
Execute DTS 2000 Package Task
Execute Package Task
Execute Process Task
Execute SQL Task
etc.
Question Difficulty = Difficult
Question 1 - Can you explain one approach to deploy an SSIS package?
One option is to build a deployment manifest file in BIDS, then copy the directory to the applicable SQL Server then work through the steps of the package installation wizard
A second option is using the dtutil utility to copy, paste, rename, delete an SSIS Package
A third option is to login to SQL Server Integration Services via SQL Server Management Studio then navigate to the 'Stored Packages' folder then right click on the one of the children folders or an SSIS package to access the 'Import Packages...' or 'Export Packages...'option.
A fourth option in BIDS is to navigate to File | Save Copy of Package and complete the interface.
Question 2 - Can you explain how to setup a checkpoint file in SSIS?
The following items need to be configured on the properties tab for SSIS package:
CheckpointFileName - Specify the full path to the Checkpoint file that the package uses to save the value of package variables and log completed tasks. Rather than using a hard-coded path as shown above, it's a good idea to use an expression that concatenates a path defined in a package variable and the package name.
CheckpointUsage - Determines if/how checkpoints are used. Choose from these options: Never (default), IfExists, or Always. Never indicates that you are not using Checkpoints. IfExists is the typical setting and implements the restart at the point of failure behavior. If a Checkpoint file is found it is used to restore package variable values and restart at the point of failure. If a Checkpoint file is not found the package starts execution with the first task. The Always choice raises an error if the Checkpoint file does not exist.
SaveCheckpoints - Choose from these options: True or False (default). You must select True to implement the Checkpoint behavior.
Question 3 - Can you explain different options for dynamic configurations in SSIS?
Use an XML file
Use custom variables
Use a database per environment with the variables
Use a centralized database with all variables
Question 4 - How do you upgrade an SSIS Package?
Depending on the complexity of the package, one or two techniques are typically used:
Recode the package based on the functionality in SQL Server DTS
Use the Migrate DTS 2000 Package wizard in BIDS then recode any portion of the package that is not accurate
Question 5 - Can you name five of the Perfmon counters for SSIS and the value they provide?
SQLServer:SSIS Service
SSIS Package Instances - Total number of simultaneous SSIS Packages running
SQLServer:SSIS Pipeline
BLOB bytes read - Total bytes read from binary large objects during the monitoring period.
BLOB bytes written - Total bytes written to binary large objects during the monitoring period.
BLOB files in use - Number of binary large objects files used during the data flow task during the monitoring period.
Buffer memory - The amount of physical or virtual memory used by the data flow task during the monitoring period.
Buffers in use - The number of buffers in use during the data flow task during the monitoring period.
Buffers spooled - The number of buffers written to disk during the data flow task during the monitoring period.
Flat buffer memory - The total number of blocks of memory in use by the data flow task during the monitoring period.
Flat buffers in use - The number of blocks of memory in use by the data flow task at a point in time.
Private buffer memory - The total amount of physical or virtual memory used by data transformation tasks in the data flow engine during the monitoring period.
Private buffers in use - The number of blocks of memory in use by the transformations in the data flow task at a point in time.
Rows read - Total number of input rows in use by the data flow task at a point in time.
Rows written - Total number of output rows in use by the data flow task at a point in time.
1.
3.
What is Business Intelligence ( BI) ?
7.
What is parallel execution in SSIS, and how many Data Flow Tasks
can a package run in parallel? (Video Answer)
9.
What is the Engine Thread property of Data Flow Task?
10.
What are the Precedence Constraints in SSIS, and where and why
have you used them? (Video Answer)
11.
What is the difference between the Success and the Completion
value of Precedence Constraint? (Video Answer)
12.
What is the DelayValidation property of Data Flow Task? Why does
one use this property? (Video Answer)
13.
What is RetainSameConnection Property on Connection Manager in
SSIS Package? Why is it used? (Video
Answer)
14.
If we create a temp table in SSIS Package and want to use it in
other tasks, which properties do we need to use? (Video Answer)
15.
What is data Viewer in SSIS? Is data viewer available in
ControlFlow or Data Flow? (Video Answer)
16.
I am running my package for debugging and I do not want to load
datainto any destination. Which transformation can I use to ensure that the
data goes nowhere else? (Video Answer)
18.
Will my package run successfully by using SQL Server Agent if I
have data viewers and Breakpoint enabled?
19.
What are different ways to execute your SSIS Package? Can I run a SSIS Package by using a Stored
Procedure? (Video Answer)
22.
Which version of SSIS can track versions of a SSIS Package
deployed to the Server? (Video Answer)
23.
To run your SSIS Package, the Integration Services Server and
SQL Server should be installed on the same server. Is this a correct statement?
27.
Explain the important steps for performance tuning of your SSIS
Package.
28.
If your package is scheduled to run every night at 10 P.M., and
the package fails in production, where will you begin your search for the error
details? (Video Answer)
29.
You have created a variable in SSIS, and you have used that
variable in Data Flow Task in Row Count Transformation. If you want to display
or see that value of variable after Data Flow execution, how will you do that? (Video Answer)
30.
You are looking at Control Flow Item, and you cannot find
Execute SQL task. How will you bring it back to Control Flow Items? (Video
Answer)
34.
What is Annotation? Is it only available in Control Flow
Pane oravailable in other Panes as well? (Video Answer)
36.
What is a TransactionOption Property at the Package level?
Is this property only available at the package level, or it is also
available at the Container Level or Task level?
38.
You have an excel file and you want to clear the first cell of
sheet 1, which task will you? (Answer : To perform this you have to use Script
Task)
39.
A third-party software is available that you need to execute by
using SSIS. Which task can be used to run EXE file? (Video Answer)
40.
You need to load 500 million records in a table; this table has
cluster and non-cluster indexes already created. What will be your approach to
load the data?
41.
What are Asynchronous and Synchronous data flow components?
42.
What are Non-blocking, Semi-blocking and Fully-blocking
components in SSIS?
43.
What is Balanced Data Distributor( BDD)?
44.
What is Error Output? Can you redirect rows from
Sources,Transformations and Destinations in SSIS? (Video
Answer)
46.
Let’s say you have created an Excel File by using Excel
Destination. If you have to make the Header row bold, how will you do that in
SSIS? (Video Answer)
47.
If you need to send an HTML email, can you use the built-in Send
MailTask? What other options do you have? (Video Answer)
48.
If you need to watch a directory for a specific file to be
added, which Task will you use? (Video Answer)
49.
What is For-each Item Enumerator in For-each Loop Container?
Why would you use it? (Video Answer)
52.
You have redirected records due to Truncation or data conversion
errors from some transformation. How will you find which column created
the error? How do you detect the name of the column? (Video Answer)
54.
You have received a big Excel or flat file, but you only want to
load the first 100 records in a table. How will you do that? (Video Answer)
55.
You have received an Excel Source file, but the first 10 rows
have only company information; actual data starts from Row 11. How will
you skip the first 10 rows and start reading from row 11? (Video
Answer)
56.
If you need to read only one cell value from an Excel file in
SSIS, which task will you use? (Video Answer)
57.
You are extracting data from a view. The definition of
View can change anytime. We want to load this data to an Excel file by using
this view. How do you create a dynamic package so that you don't have to
re-do the mapping? (Video Answer)
58.
You are running different SSIS Packages on your server. How do
you determine how much time each package took for execution? (Video Answer)
61.
After developing your SSIS Package, you want to write a
technical document. What are important contents would you like to cover
in that document? (Video Answer)
62.
There are so many files in our folders, we want to save the file
name, file created date and size of each file in an Audit table. Which
tasks will you use to do that? (Video Answer)
63.
We have .sql files sitting in a folder, and we want to execute
all of them. How can you run them by using SSIS Package? (Video Answer)
64.
We have a table which contains different types of files that we
want to export to folder. Which transformation do we need to use? (Video
Answer)
65.
There are 100 files in a directory. All have the same structure.
You need to load the most recent file to table. How will you do that? (Video Answer)
66.
Is backwards compatibility possible for an SSIS Package? In
other words, if you have created your SSIS Package in SSIS 2008, can you
downgrade to SSIS 2005? (Video Answer)
67.
What is the difference between Package Level Connection Manager
and Project Level Connection Manager? (Video Answer)
71.
How would you find and replace object name in SSIS Package or
SSIS Packages in Solution? (Video Answer)
75.
Sometime when you execute your SSIS Package, You get an error
" File is used by another process". Why did you get this error? and
How can you avoid this error? (Video Answer)
76.
How would you implement data validation in your SSIS Package?
77.
What are the best practices to test SSIS Package?
78.
We can load data from one database tables to another database
tables by using TSQL, Why do we use SSIS instead ?
79.
How would you change the value of variable in SSIS Package
during debugging to test different scenarios? (Video Answer)
82.
Which Control Flow Task you will use if you need to convert
excel file to csv file? (Video Answer)
84.
You have different SSIS Packages in one of the folder, How would
you find the version of each SSIS Package? (Video Answer)
85.
Your SSIS Package consist of multiple Data Flow Tasks. You need
to execute only first 2 Data Flow Task, How would you do that? (Video Answer)
87.
If you have to perform Case Statement on one of the column data,
Which Transformation would you use and how will the expressions look like? (Video
Answer)
88.
If you need to copy all the tables from a SQL server Database to
flat files, how would you do that? (Video Answer)
90.
How would you convert Month Number into Month Full Name and
Month Short Name in SSIS Package? (Video Answer)
92.
How would you create monthly archive folders in SSIS Package to
archive your loaded files?
(Video Answer)
(Video Answer)
94.
When you are working as SSIS developer , why it is important to
know C# or Visual basis language? (Video Answer)
95.
Do you think every task should be done by using components of
ETL Tool or better to use combination of SQL Objects ( such as Stored
Procedure, Queries etc.) with components available in ETL Tool? (Video Answer)
96.
You just joined a company as ETL Developer, You hear this term
"Staging Database" from your ETL colleagues. What does this term mean
to you? (Video Answer)
97.
What is ODS Database? What it has to do with you as ETL
Developer?
98.
What is the difference between Full Load and Incremental Load?
What is your approach to load data into Staging Database? (Video Answer)
99.
What techniques do you use to perform incremental Load in SSIS
or as ETL developer?
102.
How would you load multiple excel files with same structure to
SQL Server Table and Archive them after adding datetime to them? (Video Answer)
Sources
1. Name a few of the Sources
available in SSIS.
2. If we need to read an XML Source
file, which Source will we use in SSIS?
3. What problems have you faced when
you used Excel Source?
5. If one of the Excel columns has
alphanumeric data and another developer has changed the registry setting
to TypeGuessRow=0, what
does that mean to you?(Video Answer)
6. How will you handle a column
(such as comments) that has more than 255 characters in Excel at row number
1000. (Video Answer)
7. What is fast parse property in
Flat File Source? What are the requirements and limitations to use this
property? (Video Answer)
8. Can I read a csv file with Raw
File Source? If not, what type of files can I read by using Raw File
Source? (Video Answer)
9. Can we use Stored Procedure in
OLE DB Source? If yes, how do you map the parameters? (Video Answer)
10. If you need to connect to TeraData, Which Data Source will you use? (Video Answer)
11. How would you load fixed width flat file to SQL server table? (Video Answer)
10. If you need to connect to TeraData, Which Data Source will you use? (Video Answer)
11. How would you load fixed width flat file to SQL server table? (Video Answer)
Transformations
2. If you need to get the Username,
Package Name and Package Start Time, which transformation will you use? (Video Answer)
4. If you need to convert Data Type
from String to Integer, which transformation will you use? (Video Answer)
6. You are reading data from a source by using a SSIS Package. If
some records have Null values and you want to replace Null values to Unknow
before inserting the data into a table, which Transformation will you use? (Video Answer)
8. What is Lookup Transformation,
and why do we use this transformation when we load Fact Table?
9. What are three modes of Lookup
Transformation, and what criteria should be used to choose the correct mode?
10. Can we insert record in Lookup
Table by using Lookup Transformation?
11. Should you use a drop down to
choose our reference table in Lookup? Is this a good practice or not? (Video Answer)
14. What is the major difference between ConditionalSplit and
Multicast Transformation? (Video Answer)
15. What is OLE DB Command Transformation? Where would you use
it and what are the disadvantages/advantages of using the OLE DB Command
Transformation? (Video Answer)
16. What alternative methods would
you like to use for OLE DB Command Transformation? (Video Answer)
17. If you need to get the Count for the number of records loaded
from Source to Destination, which transformation will you use? (Video Answer)
18. If you need to create a sequence number for input records, which
transformation will you use? (Video Answer)
20. What is Slowly Changing Dimension,
and which transformation can you use to load SCD tables?
22. How will you load an SCD2 type
Table by using SSIS?
25. What are the alternatives of SORT
Transformation if our source is SQL Server and we need to use Merge Join for
multiple sources?
27. How will you remove duplicate records in SSIS? Which
transformation can help with this task? (Video Answer)
28. Let’s say we have some reference
data in Excel, and we want to use that Excel data in Lookup Transformation.
How can we achieve that without loading that data into a staging table or
temp table?
29. What is Cache Transformation, and
which transformation can use Cache Transformation-loaded data? (Answer- Cache Transformation)
31. Which transformation can be used to change columndata to Lower
Case or Upper Case? (Video Answer)
32. What is the difference between
Copy Column Transformation and Derived Column Transformation? (Video Answer)
33. Let’s say my table contains
images, and I am reading data from a table in Data Flow task. Which
transformation can help me to save those images to files?
35. I have a source file that
contains 1000 records, and I want to insert 15% of those records in TableA and
the remaining records in TableB. Which transformation should I use? (Video Answer)
36. What is the difference between
Row Sampling and Percent Sampling Transformations? (Video Answer)
37. Which transformation can be used
to extract nouns only, noun phrases only, or both nouns and noun phases from a
text-input column?
38. How is Term Lookup different from
Lookup Transformation?
39. To Pivot or Unpivot input data,
which transformations are available in SSIS? (Pivot Answer) - Unpivot Answer
40. There is no Union Transformation in SSIS. How do you
perform UNION operation using built-in Transformation? (Video Answer)
41. If we have source S1 and Source
S2, and we need to merge them so that we get sorted output, which
transformation would we use? (Video Answer)
43. In Merge Join Transformation, we
can use Inner Join, Left Join and Full Outer Join. If I have to use Cross
Join, which transformation would I use? (Video Answer)
45. Do we have MonthName and DayName functions available in SSIS
which can be used in expressions or in Derived Column Transformation? (Video Answer)
46. How would you split single column data into multiple columns in SSIS Package?
(Video Answer)
47. How would you filter Null value rows in Data Flow Task? (Video Answer)
46. How would you split single column data into multiple columns in SSIS Package?
(Video Answer)
47. How would you filter Null value rows in Data Flow Task? (Video Answer)
Destinations
6. What is "Rows Per
batch" and "Maximum Insert Commit Size" mean to you in OLE DB
Destination? How do we use them and why do we need to change values from
default values?
7. If there is trigger on a table
and we are loading data in that table by using SSIS, Access Mode properties is
set to “ Table or View-fast Load” , Will trigger fire? If not, then which
Mode (
Table or
View,
Table or
View-Fast Load,
Table
Name or view name variable,
Table
name or view name variable –fast load,
SQL
Command) can we choose to load data and Trigger get fire as well? (Video Answer)
8. How would you create Fixed Width column Text file by using SSIS? (Video Answer)
8. How would you create Fixed Width column Text file by using SSIS? (Video Answer)
Variables and Expressions
1. What is variable in SSIS, what
are data types available for variable?
2. What is
Scope of Variable? How can we changeScope of variable in SSIS 2008 and SSIS
2012? (Video Answer)
5. Can we write expression on
Connection Managers? If yes, where would you need to do that? (Video Answer)
10. What are Parameters in SSIS 2012?
how are they different from Variables?
11. Can you create two variables with same in name in SSIS Package? (Video Answer)
11. Can you create two variables with same in name in SSIS Package? (Video Answer)
Logging
1. What is
logging in SSIS? How many types oflogging available in SSIS? Which one have you
used? (Video Answer)
8. If you
use timestamp in your text file logging,it creates multiple log files with each
execution, how to avoid that and whysingle execution create more than one log
file? (Video Answer)
9. If we are using SQL Server
Logging, how often the records get deleted from syssislog table? or do we
have to create some purge process? (Video Answer)
10- Junior ETL developer stops to your desk and asked your suggest or best practice which Logging type he/she should use? Explain which logging type in SSIS package you will suggest to him/her and why? (Video Answer)
11- How to Enable Windows Event Log type logging in SSIS Package and What's your suggestion, should we use it not ? (Video Answer)
12- Is it possible to have two types of logging e.g. text file logging and SQL Server Logging in same SSIS Package? (Video Answer)
13-Explain different methods to make the log information created by SSIS Packages available to your developers? (Video Answer)
14- What type of logging create .trc file when you enable Logging in SSIS Package? (Video Answer)
15- Can we create xml log file with datetime on each SSIS Package execution, How would you do that? (Video Answer)
10- Junior ETL developer stops to your desk and asked your suggest or best practice which Logging type he/she should use? Explain which logging type in SSIS package you will suggest to him/her and why? (Video Answer)
11- How to Enable Windows Event Log type logging in SSIS Package and What's your suggestion, should we use it not ? (Video Answer)
12- Is it possible to have two types of logging e.g. text file logging and SQL Server Logging in same SSIS Package? (Video Answer)
13-Explain different methods to make the log information created by SSIS Packages available to your developers? (Video Answer)
14- What type of logging create .trc file when you enable Logging in SSIS Package? (Video Answer)
15- Can we create xml log file with datetime on each SSIS Package execution, How would you do that? (Video Answer)
16-
Quick demo how to provide permission to non admin account on Event Viewer in
Windows 2008R2/2012. This can be used when we enable Windows Event Log in SSIS
Package and need to provide read permissions to developers to read Event Viewer
entries for SSIS events. ( Video Answer)
Configuration
5. If we use SQL Server
Configuration, what is table name created by configuration Wizard? (Video Answer)
7. For Indirect Configuration, do
you create User variable or System variable to hold connection string? (Video Answer)
10. Let’s say we have SQL Server
Connection Name ‘MyConnection’, For configuration which properties
of Connection manager will you choose (Description,Name,
ServerName,Scope,ConnectionString etc.) ? (Video Answer)
Control Flow
Task
2. If you need to run some SQL script( Stored Procedure, DML,DDL ).
Which task will you use in Control Flow? (Video Answer)
3. If you have created object type variable that you want to use in
script task later, how will you load data into Object Typevariable in Control
flow? Which task will help? (Video
Answer)
4. If your database in Full Recovery
mode, can you use Bulk Insert Task to load data? What are the requirements to
use Bulk Insert Task? (Video Answer)
5. If your company is using Ftp site to receivedaily data files,
which task will you use to download/upload/Delete files onFTP Site?
7. What script languages are
available for you to do scripting in Script task and script component? (Video Answer)
8. What is the difference between
ReadOnlyVariables and ReadWriteVariable in Script task? (Video Answer)
9. Can you do debugging in Script
task or Script component in SSIS 2012?
10. You have create 5 packages, you
want to call all of them in one package, which task can you use to call them? (Video Answer)
11. You have deployed your SSIS
Package to Integration Services Server, Can you still call them in Parent
Package? (Video
Answer)
12. If you load set of big files, after each load you want to zip
them and put them into archive folder. Which task would you need to use to Zip
them? (Video Answer)
14. Can you rename and move file by using one task? which is that
task and how will you do that? (Video Answer)
15. If your data flow task fails in Control Flow,which task will you
use to send email on Error? (Video Answer)
16. Before you create your SSIS
Package and load data into destination, you want to analyze your data, which
task will help you to achieve that?
17. What is WSDL and in which task
you have to use it?
18. You have loaded email addresses
in Object type variable, now you want to send email to each of the address you
have in Object type variable, which task will you use to loop through and send
email one by one? (Video Answer)
19. There are 10 files in folder, we
want to load them one by one, which tasks are required to perform this job? (Video Answer)
20. You have a lot of Tasks sitting
in Control Flow task , you want to group them, which container can help you to
achieve that? (Video Answer)
21. You got this assignment where you
want to perform the same task 10 times, which loop will you use to do that? (Video Answer)
22. Name few of the tasks those are
available to create Maintenance plan or those can perform DBA tasks such as
Create indexes, take backup etc. (Video Answer)
23. In Execute SQL Task, what is
Result Set ( Single Row, Full Result Set,XML) ? (Video Answer)
24. Which task would you use to execute dot batch files in SSIS? (Video Answer)
25. What type of Containers are available in SSIS Package? Explain two of them in detail. (Video Answer)
26. You need to create a directory with Date in SSIS Package, Which Task would you use that? (Video Answer)
24. Which task would you use to execute dot batch files in SSIS? (Video Answer)
25. What type of Containers are available in SSIS Package? Explain two of them in detail. (Video Answer)
26. You need to create a directory with Date in SSIS Package, Which Task would you use that? (Video Answer)
Event Handlers
3.
What is the relationship between Executable and Event Handler in
Event Handler Pane? (Video Answer)
4.
If your SSIS Package failed on any task, You want to send an
email, how would you use the event handler to do this? (Video Answer)
5.
You have multiple Tasks in your SSIS Package such as Task1,Task2
and Task 3. If Task 1 or Task 2 fails, You want to send an email to users but
if Task 3 fails then you want to truncate all the tables which are loaded by
your SSIS Package, How would you do that? (Video Answer)
6.
We can run any tasks in Control Flow Pane on Failure of a Task/s
by using Precedence constraint. What are the advantages of using Event Handler
over Precedence Constraints? (Video Answer)
Security
1.
What are the different ways to secure your SSIS Packages?
2.
Who should have the permission to access SSIS Packages in
Production, While you were working as developer, Did you had permission to
access packages in Production environment?
3.
What are different ways to store the Packages in different
environments such as DEV,QA,UAT and Production?
4.
Your SSIS Package contains important information such as Server
Name, Passwords etc. You want to set a password to protect this SSIS Package so
every time it opens it ask for password. How would you do that?
5.
What are three integration services security roles available in
SSIS 2008/R2?
6.
Which Package property will help you to encrypt package to
restrict access to its contents?
SQL Server Integration Services
Interview Questions and Answers
1.