Friday, November 21, 2014

MCTS SSIS 2012 Implementing a Data Warehouse

  1. Lesson 1: Implementing Dimensions and Fact Tables
    1. Creating a Data Warehouse Database
    2. Implementing Dimensions
    3. Implementing Fact Tables
      1. Practice: Implementing Dimensions and Fact Tables
    4. Lesson Summary
    5. Lesson Review
  2. Lesson 2: Managing the Performance of a Data Warehouse
    1. Indexing Dimensions and Fact Tables
    2. Indexed Views
      1. Using Appropriate Query Techniques
    3. Data Compression
    4. Columnstore Indexes and Batch Processing
      1. Practice: Loading Data and Using Data Compression and Columnstore Indexes
    5. Lesson Summary
    6. Lesson Review
  3. Lesson 3: Loading and Auditing Loads
    1. Using Partitions
    2. Data Lineage
      1. Practice: Performing Table Partitioning
    3. Lesson Summary
    4. Lesson Review
  4. Case Scenarios
    1. Case Scenario 1: Slow DW Reports
    2. Case Scenario 2: DW Administration Problems
  5. Suggested Practices
    1. Test Different Indexing Methods
    2. Test Table Partitioning

Lesson Summary
  1. Lesson 1: Implementing Dimensions and Fact Tables

■■ In this lesson, you learned about implementing a data warehouse.
■■ For a data warehouse database, you should use the Simple recovery model.
■■ When creating a database, allocate enough space for data files and log files to prevent autogrowth of the files.
■■ Use surrogate keys in dimensions in which you expect SCD Type 2 changes.
■■ Use computed columns.
Lesson Review
1. Which database objects and object properties can you use for autonumbering?
(Choose all that apply.)
A. IDENTITY property
B. SEQUENCE object
C. PRIMARY KEY constraint
D. CHECK constraint
2. What columns do you add to a table to support Type 2 SCD changes? (Choose all that
apply.)
A. Member properties
B. Current row flag
C. Lineage columns
D. Surrogate key
3. What is an inferred member?
A. A row in a fact table added during dimension load
B. A row with aggregated values
C. A row in a dimension added during fact table load
D. A computed column in a fact table
Answers:
Lesson 1
1. Correct Answers: A and B
A. Correct: The IDENTITY property autonumbers rows.
B. Correct: You can use the new SQL Server 2012 SEQUENCE object for
autonumbering.
C. Incorrect: Primary keys are used to uniquely identify rows, not for
autonumbering.
D. Incorrect: Check constraints are used to enforce data integrity, not for
autonumbering.
2. Correct Answers: B and D
A. Incorrect: Member properties are dimension columns used for additional information
on reports only.
B. Correct: You need a current flag for denoting the current row when you implement
Type 2 SCD changes.
C. Incorrect: Lineage columns are used, as their name states, to track the lineage
information.
D. Correct: You need a new, surrogate key when you implement Type 2 SCD changes.
3. Correct Answer: C
A. Incorrect: You do not add rows to a fact table during dimension load.
B. Incorrect: You do not create rows with aggregated values.
C. Correct: A row in a dimension added during fact table load is called an inferred
member.
D. Incorrect: A computed column is just a computed column, not an inferred member.


■■ In this lesson, you learned how to optimize data warehouse query performance.
■■ In a DW, you should not use many nonclustered indexes.
■■ Use small, integer surrogate columns for clustered primary keys.
■■ Use indexed views.
■■ Use columnstore indexes and exploit batch processing.
Lesson Review
1. Which types of data compression are supported by SQL Server? (Choose all that apply.)
A. Bitmap
B. Unicode(my word it automatically applied when u choose row or page compression)
C. Row
D. Page
2. Which operators can benefit from batch processing? (Choose all that apply.)
A. Hash Join
B. Merge Join
C. Scan
D. Nested Loops Join
E. Filter
3. Why would you use indexed views? (Choose all that apply.)
A. To speed up queries that aggregate data
B. To speed up data load
C. To speed up selective queries
D. To speed up queries that involve multiple joins

Answers:
1. Correct Answers: B, C, and D
A. Incorrect: SQL Server does not support bitmap compression.
B. Correct: SQL Server supports Unicode compression. It is applied automatically
when you use either row or page compression.
C. Correct: SQL Server supports row compression.
D. Correct: SQL Server supports page compression.
2. Correct Answers: A, C, and E
A. Correct: Hash joins can use batch processing.
B. Incorrect: Merge joins do not use batch processing.
C. Correct: Scan operators can benefit from batch processing.
D. Incorrect: Nested loops joins do not use batch processing.
E. Correct: Filter operators use batch processing as well.
3. Correct Answers: A and D
A. Correct: Indexed views are especially useful for speeding up queries that aggregate
data.
B. Incorrect: As with any indexes, indexed views only slow down data load.
C. Incorrect: For selective queries, nonclustered indexes are more appropriate.
D. Correct: Indexed views can also speed up queries that perform multiple joins.


■■ Table partitioning is extremely useful for large fact tables with columnstore indexes.
■■ Partition switch is a metadata operation only if an index is aligned with its base table.
■■ You can add lineage information to your dimensions and fact tables to audit changes
to your DW on a row level.
Lesson Review
1. The database object that maps partitions of a table to filegroups is called a(n)
A. Aligned index
B. Partition function
C. Partition column
D. Partition scheme
2. If you want to switch content from a nonpartitioned table to a partition of a partitioned
table, what conditions must the nonpartitioned table meet? (Choose all that apply.)
A. It must have the same constraints as the partitioned table.
B. It must have the same compression as the partitioned table.
C. It must be in a special PartitionedTables schema.
D. It must have a check constraint on the partitioning column that guarantees that all
of the data goes to exactly one partition of the partitioned table.
E. It must have the same indexes as the partitioned table.
3. Which of the following T-SQL functions is not very useful for capturing lineage
information?
A. APP_NAME()
B. USER_NAME()
C. DEVICE_STATUS()
D. SUSER_SNAME()

Test Table Partitioning
In order to understand table partitioning thoroughly, you should test it with aligned and
nonaligned indexes.
■■ Practice 1 Partition the FactInternetSales table in the AdventureWorkDW2012 sample
database. Create aligned nonclustered indexes on all foreign keys of the fact table
included in joins of the query from the previous practice. Run the query and check the
execution plan.
■■ Practice 2 Create nonaligned nonclustered indexes on all foreign keys of the fact
table included in joins of the query from the previous practice. Run the query and
check the execution plan again.

Answers:
1. Correct Answer: D
A. Incorrect: Aligned indexes are indexes with the same partitioning as their base
table.
B. Incorrect: The partition function does logical partitioning.
C. Incorrect: The partition column is the column used for partitioning.
D. Correct: The partition scheme does physical partitioning.
2. Correct Answers: A, B, D, and E
A. Correct: It must have the same constraints as the partitioned table.
B. Correct: It must have the same compression as the partitioned table.
C. Incorrect: There is no special schema for partitioned tables.
D. Correct: It must have a check constraint to guarantee that all data goes to a single
partition.
E. Correct: It must have the same indexes as the partitioned table.
3. Correct Answer: C
A. Incorrect: The APP_NAME() function can be useful for capturing lineage
information.
B. Incorrect: The USER_NAME() function can be useful for capturing lineage
information.
C. Correct: There is no DEVICE_STATUS() function in T-SQL.
D. Incorrect: The SUSER_SNAME() function can be useful for capturing lineage
information.



Prepared interview question on SSIS 2012

CheckPoint:
What do you need to set in order to use checkpoint properties at the package level, after you have turned the checkpoint properties on? 
Ans:You need to set the FailPackageOnFailure property to True for tasks to write to the checkpoint file. However, if you want to rerun any successful tasks that occur before the failure task, you need to use a sequence container around the group of related tasks that require transactions.
If you set the CheckpointUsage to Always, the checkpoint file must be present or the package will not start.
BEST PRACTICES Managing multiple tasks with transactions and restartability

To do this, you need to enable checkpoints in the package and then use a Sequence Container that holds all the associated tasks. The Sequence Container needs to have the transactions turned on by setting the TransactionOption to Required at the container level.

When a package has the restartability settings enabled and checkpoints set, it will manage the checkpoint file during its execution. The following processes happen behind the scenes for a package that has checkpoints set:
1.        When the package loads for execution, the checkpoint file is created.
2.        For every task or container that is configured for checkpoints, the checkpoint file is updated to include the latest runtime information such as variable values and the successful tasks.
3.        If the package fails, then the file remains in place so that the next time the package is executed, it will read the checkpoint file for execution. Alternatively, if you want the package to begin from the first task or container, the file can be manually deleted.
4.        If the package execution is successful, then the checkpoint file will be deleted automatically.
Note: not all systems support DTC, and a transaction cannot be forced on a non-compliant system.
How does SSIS Validation Works
SSIS Validation is the process of ensuring the package will execute successfully during runtime with its current property settings. There are two types of package validation available in SSIS.
• Design Time Validation (Early Validation) – Validates all of components and metadata of the package prior to executing the package
• Run Time Validation (Late validation) – Validates the individual components and metadata of the package prior to running the component
Designtime validation
Runtime
Workoffline(Project level)(ssis2012)
when you open a package, takes place
When you execute a package, the SSIS design time and run time validation takes place. In essence, during run time the package is actually validated twice.

This will speed up package opening and eliminate and early validation while working in the BIDS designer.
On the down side, the Work Offline property prevents the following.
·         You cannot run or debug any packages while in Work Offline property = “TRUE”
·         Once you close and save the package, the Work Offline setting is set to default setting of “FALSE
·         This is a project level setting, not a package level setting
You can override the default Work Offline setting by hacking the dtsproj.user file in the project.

SSIS Validations Properties
let take a look at the properties and options to manage validation at a project, package and individual component level.
Two step validation process utilized by SSIS.
Package level validation
Individual component level
Delay Validation: The delay validation property is available at the package level and at the individual component level. The Delay Validation property setting is not available for individual data flow components.
Package level Delay validation
Individual component level delay validation
is only applicable at runtime, not during design time(true)
Ex:execute sql task, required data connection
True for executesql task then the data connection error is removed in SSDT designer,so it is applied at design time,if we set the delayvalidation=true.
I would not recommend setting early Delay Validation = “TRUE” at the package level under any circumstances.

I would only change Delay Validation setting at the component level under the following conditions
·         I have a large package which uses a data source multiple times and the connection is slow or is a remote connection.
·         I have a database object that is created during run time and is referenced in another downstream component. (i.e. creation of a temporary table)  I would turn off early validation at the downstream component level to allow the package to run and delay early validation
If I am dynamically setting properties of a data connection which is not available during the early validation property.
Validate External Metadata
The Validate External Metadata property is similar to the Delay validation property but is available only in individual Data Flow Components only. The default setting for the property = “False”. Changing the Validate External Metadata property to true eliminates early package validation but has no effect on run time validation in SSIS.
I would recommend change the Validate External Metadata from its default setting under the following conditions.
·         Getting data from a data flow source task using a flat file or excel  that does not exist at runtime
·         A source or destination tasks where the database components is not available at runtime
·         Work Offline (Connection Level)
·         SQL 2012 introduces the ability to Work Offline at the package level and connection level.
Package parameters allow you to modify package execution without having to edit and redeploy the package.
Package deployment model:
Step1: Deploy package (contains parameters using the package deployment model)
Step2: execute package
Parameters
Package parameter
Project parameters
The parameters are not called during execution
If the package contains package parameters and expressions within the package use the parameters, the resulting values are applied at runtime.
If the package contains project parameters, the package execution may fail.


2. Typically, the principal difference between development and production environments is
in the configuration of data stores. In development, all data can reside on the same server
(even in the same database). In fact, because for development a subset of data is usually all
that is needed (or available) to the developer, all stored development data could easily be
placed on the developer’s personal computer. Therefore, you should account for the following
differences between the development and the production environments when developing
SSIS solutions:
■■ Connections:source &destinations
Data platforms:sql server versions diff on prod and dev
Security
3. What is the principal difference between a success constraint and a completion constraint?
A success constraint will only allow the process to continue to the following task if the preceding task completed successfully, whereas a completion constraint will allow the process to continue as soon as the preceding task has completed, regardless of the outcome.
4. What is the difference between the Union All and the Merge transformation?
The Merge transformation is similar to Union All, but with Merge, the sources have to be sorted and the sort position is preserved.
5. What is the difference between the Lookup and Merge Join transformations?
The Lookup transformation does not need a sorted input; it is a non-blocking transformation, and in cases when more matching rows exist from the referenced dataset, only the first one will be retrieved. This means that the transformation will never retrieve more rows than exist in the input rows. With the Merge Join transformation more rows can be retrieved, because all matching data is retrieved.
6. There is an important difference between lineage and other columns:
lineage columns are never exposed to end users and are never shown on end users’ reports.
To summarize, a dimension may contain the following types of columns:
■■ Keys Used to identify entities
■■ Name columns Used for human names of entities
■■ Attributes Used for pivoting in analyses
■■ Member properties Used for labels in a report
■■ Lineage columns Used for auditing, and never exposed to end users

7. You should understand the differences between individual SSISDB logging levels in order to be able to choose the most appropriate one for a particular purpose. For instance, by turning logging off, you could slightly improve execution performance at the cost of losing all ability to detect execution errors. On the other hand, using verbose logging will provide a lot of information about the execution but will lead to less-than-optimal performance.

·         Quick Check

1. You add a sequence container to a package that contains several tasks, one of which calls a command on a legacy system and another of which a Data Flow Task imports data into SQL Server. Even with DTC started and transactions turned on, your sequence container fails before the tasks even run. What is the problem?

2. What do you need to set in order to use checkpoint properties at the package level, after you have turned the checkpoint properties on?

Quick Check Answers

1. The transactions featured in SSIS use the DTC service. However, not all systems support DTC, and a transaction cannot be forced on a non-compliant system, so the container will fail. You should remove the legacy task from the sequence container that has the transaction.

2. You need to set the FailPackageOnFailure property to True for tasks to write to the checkpoint file. However, if you want to rerun any successful tasks that occur before the failure task, you need to use a sequence container around the group of related tasks that require transactions.
·          
·         Quick Check

1. A Data Conversion transformation is failing in the middle of the data flow execution, and you need to determine what is causing the error. How should you proceed?

Ans: . To determine what is causing the error, configure the Data Conversion transformation error path to Flat File so that any rows that are failing conversion are sent out to a file. Then, create a data viewer on the error path, and run the package in BIDS. This technique will capture the errors in a file and display the rows in the designer for troubleshooting.

2. Your package contains a string variable that you are updating, using a Script Task, to be a file path and file name. Your package is failing at a File System Task that is configured to use the variable to move the file to a different folder on the server. How do you troubleshoot the package? Ans: . Because the Script Task can contain embedded breakpoints in the code, set a breakpoint in the script so that you will be able to execute the package and step through the lines of code, observing the value of the variable to check the code and accuracy.
3. You would like to log all the package errors to a custom database table you have created for auditing purposes. How can you accomplish this task?
Ans: By using the OnError event handler assigned to the package level, you can also use an Execute SQL Task that calls a stored procedure, passing in the SourceName and ErrorDescription variable values. The procedure can then track these details into a metadata storage table for auditing.

Friday, November 14, 2014

SSIS INTERVIEW QUESTIONS

General

  1. What is ETL? (Answer)
  2. What are the important components of an ETL Tool ?
  3. What is Business Intelligence ( BI) ?
  4. Which ETL Tool have you used as a Developer ?
  5. Which version of SSIS have you worked with?
  6. What is DTS, and how is it different from SSIS?
  7. Name some of the features available in SSIS 2008 that were not available in SSIS 2005?
  8. What additional features are available in SSIS 2012 that were not in SSIS 2008? (Answer )
  9. What is the difference between Control  Flow and Data Flow in SSIS? (Answer)
  10. What is parallel execution in SSIS, and how many Data Flow Tasks can a package run in parallel?(Answer)
  11. What is the MaxConcurrentExecutables property on a Package level? (Answer)
  12. What is the Engine Thread property of Data Flow Task?
  13. What are the Precedence Constraints in SSIS, and where and why have you used them?  (Answer)
  14. What is the difference between the Success and the Completion value of Precedence Constraint? (Answer)
  15. What is the DelayValidation property of Data Flow Task? Why does one use this property? ( Answer )
  16. What is RetainSameConnection Property on Connection? Why is it used? (Answer
  17. If we create a temp table in SSIS Package and want to use it in other tasks, which properties do we need to use? (Answer)
  18. What is data Viewer in SSIS? Is data viewer available in Control Flow or Data Flow? (Answer)
  19. I am running my package for debugging and I do not want to load data into any destination. Which transformation can I use to ensure that the data goes nowhere else?  (Answer)
  20. What is the difference between Checkpoint and Breakpoint in SSIS? (Answer-BreakPoint)
  21. Will my package run successfully by using SQL Server Agent if I have data viewers and Breakpoint enabled?
  22. What are different ways to execute your SSIS Package? Can I run a SSIS Package by using a Stored Procedure?
  23. What types of deployment are available for a SSIS Package? Explain all.
  24. What is the difference between Package Deployment and Project Deployment?
  25. Which version of SSIS can track versions of a SSIS Package deployed to the Server? (Answer)
  26. To run your SSIS Package, the Integration Services Server and SQL Server should be installed on the same server. Is this a correct statement?
  27. What are the different ways to run your SSIS package on a schedule?
  28. What are Event Handlers in SSIS? On which events can you run the event handler part? Name a few of the Events available?
  29. Let’s say you have configured Event Handler to send an email to report an error for Data Flow Task inside For Each Loop. If error occurred in a data flow task, you will get multiple emails. Why is that? How we can prevent those series of emails coming for one error? ( Answer )
  30. How do you debug your SSIS package?
  31. Explain the important steps for performance tuning of your SSIS Package.
  32. If your package is scheduled to run every night at 10 P.M., and the package fails in production, where will you begin your search for the error details? (Answer )
  33. You have created a variable in SSIS, and you have used that variable in Data Flow Task in Row Count Transformation. If you want to display or see that value of variable after Data Flow execution, how will you do that? ( Answer )
  34. You are looking at Control Flow Item, and you cannot find Execute SQL task.  How will you bring it back to Control Flow Items? (Answer)
  35. Your company is using some third party transformations and tasks, and you need to add them to Control Flow Item and Data Flow Items.  How will you do that? (Answer)
  36. What are Attunity Drivers and why do we use them with SSIS?
  37. If you do not want to use Slowly Changing Transformation, Which T-SQL statement can help you to load  an SCD type table in one statement? ( Answer )
  38. What is Annotation?  Is it only available in Control Flow Pane or available in other Panes as well? (Answer)
  39. Let’s say we have two Execute SQL Tasks in a Sequence container.  If any of them fail, we want to roll back the transaction. How can we achieve this? (Answer)
  40. What is a TransactionOption Property at the Package level?  Is this property only available at the package level, or it is also available at the Container Level or Task level?
  41. Can you create  an SSIS Package without using BIDS? (Answer)
  42. You have an excel file and you want to clear the first cell of sheet 1, which task will you?
  43. A third-party software is available that you need to execute by using SSIS.  Which task can be used to run EXE file?  (Answer)
  44. You need to load 500 million records in a table; this table has cluster and non-cluster indexes already created. What will be your approach to load the data?
  45. What are Asynchronous and Synchronous data flow components?
  46. What are Non-blocking, Semi-blocking and Fully-blocking components in SSIS?
  47. What is Balanced Data Distributor( BDD)?
  48. What is Error Output? Can you redirect rows from Sources, Transformations and Destinations in SSIS? (Answer)
  49. If you need to check that a file exists in a folder, which task(s) will you use? (Answer)
  50. Let’s say you have created an Excel File by using Excel Destination. If you have to make the Header row bold, how will you do that in SSIS? (Answer)
  51. If you need to send an HTML email, can you use the built-in Send Mail Task? What other options do you have? (Answer)
  52. If you are using SQL Server Log Provider, can you make the SQL Server connection dynamic?
  53. If you need to watch a directory for a specific file to be added, which Task will you use?
  54. What is For-each Item Enumerator in For-each Loop Container?  Why would you use it?
  55. If you need to move a file and rename it at the same time, how will you do that? (Answer)
  56. Which Package property will help you to encrypt package to restrict access to its contents?
  57. We have received Excel Source File with multiple sheets but with the same meta data.  How will you design your package to load all of the sheets to same table?  (Answer)
  58. You have redirected records due to Truncation or data conversion errors from some transformation.  How will you find which column created the error?  How do you detect the name of the column?  (Answer)
  59. You have created different types of SSIS Packages.  How do you determine all the variable names, connection managers, logging information or expression used in those Packages?  (Answer)
  60. You have received a big Excel or flat file, but you only want to load the first 100 records in a table.  How will you  do that?  (Answer)
  61. You have received an Excel Source file, but the first 10 rows have only company information; actual data starts from Row 11.  How will you skip the first 10 rows and start reading from row 11?  (Answer)
  62. If you need to read only one cell value from an Excel file in SSIS, which task will you use? (Answer)
  63. You are extracting data from a view.  The definition of View can change anytime. We want to load this data to an Excel file by using this view.  How do you create a dynamic package so that you don't have to re-do the mapping?   (Answer)
  64. You are running different SSIS Packages on your server. How do you determine how much time each package took for execution?   (Answer)
  65. Is it possible to run the SSIS Package from Excel by using Excel Macros? (Answer)
  66. If you need to purge old files, which tasks will you use in your SSIS Package ? (Answer)
  67. After developing your SSIS Package, you want to write a technical document.  What are important contents would you like to cover in that document?  (Answer)
  68. There are so many files in our folders, we want to save the file name, file created date and size of each file in an Audit table.  Which tasks will you use to do that?  (Answer)
  69. We have .sql files sitting in a folder, and we want to execute all of them.  How can you run them by using SSIS Package? (Answer) 
  70. We have a table which contains different types of files that we want to export to folder.  Which transformation do we need to use? (Answer)
  71. There are 100 files in a directory. All have the same structure. You need to load the most recent file to table. How will you do that? (Answer)
  72. Is backwards compatibility possible for an SSIS Package? In other words, if  you have created your SSIS Package in SSIS 2008, can you downgrade to SSIS 2005? (Answer)
  73. What is the difference between Package Level Connection Manager and Project Level Connection Manager?
  74. How would you get the oldest file from a Folder? (Answer)
  75. How would you find which SSIS Packages are used by SQL Server Agent Jobs? (Answer)
  76. How would you find out if a Stored Procedure is used in SSIS Package? Answer)
  77. How would you find and replace object name in SSIS Package or SSIS Packages in Solution? Answer)
  78. How would you create Load Summary Email logic in SSIS Package? (Answer )
  79. You have different folders sitting in Parent Folder. You want to delete old folder which are older than 7 days. How would you do that in SSIS Package? (Answer)
  80. How would you make your SSIS Package dynamic? 
  81. Sometime when you execute your SSIS Package, You get an error " File is used by another process". Why did you get this error? and How can you avoid this error?
  82. How would you implement data validation in your SSIS Package?
  83. What are the best practices to test SSIS Package? 
  84. We can load data from one database tables to another database tables by using TSQL, Why do we use SSIS instead ?
  85. What approach do you follow to troubleshoot your SSIS Package?
  86. How would you change the value of variable in SSIS Package during debugging to test different scenarios? ( Answer )
  87. Is it possible to save Stored Procedure Output Parameter value to SSIS variable ? (Answer )
  88. What is the purpose of naming convention in SSIS ? (Answer)
  89. Which Control Flow Task you will use if you need to convert excel file to csv file? (Answer)
  90. You need to delete Top N Rows from Flat File, How would you do that? ((Answer)
  91. You have different SSIS Packages in one of the folder, How would you find the version of each SSIS Package? (Answer)
  92. Your SSIS Package consist of multiple Data Flow Tasks. You need to execute only first 2 Data Flow Task, How would you do that? (Answer)
  93. Explain an example where you had to build your Query dynamically in SSIS Package and how did you do that? (Answer)
  94. If you have to perform Case Statement on one of the column data, Which Transformation would you use and how will the expressions look like? (Answer)
  95. If you need to copy all the tables from a SQL server Database to flat files, how would you do that?

Sources

1.       Name a few of the Sources available in SSIS.
2.       If we need to read an XML Source file, which Source will we use in SSIS?
3.       What problems have you faced when you used Excel Source?
4.       Why do we add IMEX=1 to extended properties of Excel Connection Manager?
5.       If one of the Excel columns has alphanumeric data and another developer has changed the registry setting to TypeGuessRow=0, what does that mean to you?
6.       How will you handle a column (such as comments) that has more than 255 characters in Excel at row number 1000.
7.       What is fast parse property in Flat File Source? What are the requirements and limitations to use this property?
8.       Can I read a csv file with Raw File Source?  If not, what type of files can I read by using Raw File Source?
9.       Can we use Stored Procedure in OLE DB Source?  If yes, how do you map the parameters?
10. If you need to connect to TeraData, Which Data Source will you use?
11. How would you load fixed width flat file to SQL server table? (Answer)


Transformations

1.       What is a SSIS Transformation?
2.       If you need to get the Username, Package Name and Package Start Time, which transformation will you use?
3.       Which Transformation can perform operations such as Sum, Count and Group By? (Answer)
4.       If you need to convert Data Type from String to Integer, which transformation will you use?
5.       If you need to add a new column(s), which transformation will you use?
6.       You are reading data from a source by using a SSIS Package. If some records have Null values and you want to replace Null values to Unknow before inserting the data into a table, which Transformation will you use?
7.       What is syntax for writing IF ELSE in Derived column Transformation? (Answer)
8.       What is Lookup Transformation, and why do we use this transformation when we load Fact Table?
9.       What are three modes of Lookup Transformation, and what criteria should be used to choose the correct mode?
10.   Can we insert record in Lookup Table by using Lookup Transformation?
11.   Should you use a drop down to choose our reference table in Lookup?  Is this a good practice or not?
12.   What is multicast Transformation in SSIS?  (Answer)
13.   What is conditional Split Transformation in SSIS?  (Answer)
14.   What is the major difference between Conditional Split and Multicast Transformation? ( Answer )
15.   What is OLE DB Command Transformation?  Where would you use it and what are the disadvantages/advantages of using the OLE DB Command Transformation?  (Answer)
16.   What alternative methods would you like to use for OLE DB Command Transformation?
17.   If you need to get the Count for the number of records loaded from Source to Destination, which transformation will you use?  (Answer)
18.   If you need to create a sequence number for input records, which transformation will you use?  (Answer)
19.   Which transformation can be used as Source, Destination or Transformation?
20.   What is Slowly Changing Dimension, and which transformation can you use to load SCD tables?
21.   How will you load an SCD1 type table by using SSIS? (Answer)
22.   How will you load an SCD2 type Table by using SSIS?
23.   What are the best practices when you load huge SCD type tables?  (Answer)
24.   Which Transformation requires us to use SORT Transformation with it?
25.   What are the alternatives of SORT Transformation if our source is SQL Server and we need to use Merge Join for multiple sources?
26.   What is IsSorted Property, and why do we use it?
27.   How will you remove duplicate records in SSIS?  Which transformation can help with this task? (Answer)
28.   Let’s say we have some reference data in Excel, and we want to use that Excel data in Lookup Transformation.  How can we achieve that without loading that data into a staging table or temp table?
29.   What is Cache Transformation, and which transformation can use Cache Transformation-loaded data?  (Answer- Cache Transformation)
30.   Why do we need to use Cache Transformation? (Answer)
31.   Which transformation can be used to change column data to Lower Case or Upper Case? (Answer)
32.   What is the difference between Copy Column Transformation and Derived Column Transformation?
33.   Let’s say my table contains images, and I am reading data from a table in Data Flow task. Which transformation can help me to save those images to files?
34.   Which transformation can I use in SSIS to import image files to a Table?  (Answer)
35.   I have a source file that contains 1000 records, and I want to insert 15% of those records in TableA and the remaining records in TableB.  Which transformation should I use?
36.   What is the difference between Row Sampling and Percent Sampling Transformations?
37.   Which transformation can be used to extract nouns only, noun phrases only, or both nouns and noun phases from a text-input column?
38.   How is Term Lookup different from Lookup Transformation?
39.   To Pivot or Unpivot input data, which transformations are available in SSIS?  (Pivot Answer) - Unpivot Answer
40.   There is no Union Transformation in SSIS.  How do you perform UNION operation using built-in Transformation? (Answer)
41.   If we have source S1 and Source S2, and we need to merge them so that we get sorted output, which transformation would we use?
42.   What is the difference between Merge and Union All Transformation?
43.   In Merge Join Transformation, we can use Inner Join, Left Join and Full Outer Join.  If I have to use Cross Join, which transformation would I use?
44.   What is FindString function in Derived Column Transformation?
45.   Do we have MonthName and DayName functions available in SSIS which can be used in expressions or in Derived Column Transformation? (Answer)
46. How would you split single column data into multiple columns in SSIS Package? (Answer)

Destinations

1.       What is Raw File Destination, where and why do we use it?
2.       What is the difference between OLE DB Destination and SQL Server Destination?
3.       What is Recordset Destination and where do we need use?
4.       Can we redirect rows from OLE DB Destination? If yes, what are the important things need to remember while we configure that? ( Answer )
5.       What does “Keep Identity”,”Keep Nulls”,”Table lock” and “Check Constraints” Check box means on OLE DB Destination? (Answer)
6.       What is "Rows Per batch" and "Maximum Insert Commit Size" mean to you in OLE DB Destination? How do we use them and why do we need to change values from default values?
7.       If there is trigger on a table and we are loading data in that table by using SSIS, Access Mode properties is set to “ Table or View-fast Load”  , Will trigger fire? If not, then which Mode(
Table or View,
Table or View-Fast Load,
Table Name or view name variable,
Table name or view name variable –fast load,
SQL Command) can we  choose to load data and Trigger get fire as well?
8. How would you create Fixed Width column Text file by using SSIS ? (Answer)

Variables and Expressions

1.       What is variable in SSIS, what are data types available for variable?
2.       What is Scope of Variable? How can we change Scope of variable in SSIS 2008 and SSIS 2012? (Answer)
3.       What are expressions in SSIS? Where and Why do we need to use them?
4.       Can we write expression on Variable? If yes, how and where can you write them? (Answer)
5.       Can we write expression on Connection Managers? If yes, where would you need to do that?
6.       What are System Variables in SSIS? Name few of them those you have used.
7.       Can we write expression on Precedence Constraint? if yes, how would you do it?
8.       What is the maximum length of expressions in SSIS 2008 and SSIS 2012? (Answer )
9.       What is new Expression Task in SSIS 2012? (Answer )
10.   What are Parameters in SSIS 2012? how are they different from Variables?
11. Can you create two variables with same in name in SSIS Package?


Logging

1.       What is logging in SSIS? How many types of logging available in SSIS? Which one have you used? (Answer)
2.       If you need to create a text type log file with timestamp, how would you do that? (Answer)
3.       If you use SQL Server Logging, which table will be used to store log information?
4.       What are the few column names that sysssislog have to store log information?
5.       Do you log all the events or prefer to choose few of them?
6.       Name few of Events you like to use in logging?
7.       What is custom Logging and how is it different from built in Logging?
8.       If you use timestamp in your text file logging, it creates multiple log files with each execution, how to avoid that and why single execution create more than one log file? (Answer)
9.       If we are using SQL Server Logging, how often the records get deleted from syssislog table? or do we have to create some purge process?
10- Junior ETL developer stops to your desk and asked your suggest or best practice which Logging type he/she should use? Explain which logging type in SSIS package you will suggest to him/her and why?
11- Why don't you suggest to use Windows Event Log in Production for SSIS Package?
12- Is it possible to have two types of logging e.g. text file logging and SQL Server Logging in same SSIS Package?
13-Explain different methods to make the log information created by SSIS Packages available to your developers?

Configuration

1.       What is Configuration in SQL Server Integration Services(SSIS)?
2.       Why do we even need Configuration?
3.       What are configuration types available for us to use?
4.       What is the difference between Direct and Indirect Configuration?
5.       If we use SQL Server Configuration, what is table name created by configuration Wizard?
6.       How many columns configuration table do have? which one are important for us?
7.       For Indirect Configuration, do you create User variable or System variable to hold connection string?
8.       What is Parent Child Configuration and where do we need to use that?
9.       Explain the best practices for Integration Services Configuration.
10.   Let’s say we have SQL Server Connection Name ‘MyConnection’,  For configuration which properties of  Connection manager will you choose (Description,EvaluateAsExpression,Expression,IncludeInDebug,Name,NameSpace,RaiseChangedEvent,ReadOnly,Value) ?


Control Flow Task

1.       What is Data Flow Task? ( Answer )
2.       If you need to run some SQL script( Stored Procedure, DML,DDL ). Which task will you use in Control Flow? (Answer)
3.       If you have created object type variable that you want to use in script task later, how will you load data into Object Type variable in Control flow? Which task will help? (Answer)
4.       If your database in Full Recovery mode, can you use Bulk Insert Task to load data? What are the requirements to use Bulk Insert Task?
5.       If your company is using Ftp site to receive daily data files, which task will you use to download/upload/Delete files on FTP Site? (Answer)
6.       What is the difference between Script task and Script component?
7.       What script languages are available for you to do scripting in Script task and script component?
8.       What is the difference between ReadOnlyVariables and ReadWriteVariable in Script task?
9.       Can you do debugging in Script task or Script component in SSIS 2012?
10.   You have create 5 packages, you want to call all of them in one package, which task can you use to call them?
11.   You have deployed your SSIS Package to Integration Services Server, Can you still call them in Parent Package?
12.   If you load set of big files, after each load you want to zip them and put them into archive folder. Which task would you need to use to Zip them? (Answer)
13.   Which task can you use to delete, rename, move files and folders?
14.   Can you rename and move file by using one task? which is that task and how will you do that? ( Answer )
15.   If your data flow task fails in Control Flow, which task will you use to send email on Error? (Answer)
16.   Before you create your SSIS Package and load data into destination, you want to analyze your data, which task will help you to achieve that?
17.   What is WSDL and in which task you have to use it?
18.   You have loaded email addresses in Object type variable, now you want to send email to each of the address you have in Object type variable, which task will you use to loop through and send email one by one?
19.   There are 10 files in folder, we want to load them one by one, which tasks are required to perform this job?
20.   You have a lot of Tasks sitting in Control Flow task , you want to group them, which container can help you to achieve that?
21.   You got this assignment where you want to perform the same task 10 times, which loop will you use to do that?
22.   Name few of the tasks those are available to create Maintenance plan or those can perform DBA tasks such as Create indexes, take backup etc.
23.   In Execute SQL Task, what is Result Set ( Single Row, Full Result Set,XML) ?
24. Which task would you use to execute dot batch files in SSIS (Answer)
25. What type of Containers are available in SSIS Package? Explain two of them in detail.
26. You need to create a directory with Date in SSIS Package, Which Task would you use that? (Answer)

 Event Handlers

  1. What are Event Handlers?
  2. What type of Tasks we can use in Event Handler pane?
  3. What is the relationship between Executable and Event Handler in Event Handler Pane?
  4. If your SSIS Package failed on any task, You want to send an email, how would you use the event handler to do this?
  5. You have multiple Tasks in your SSIS Package such as Task1,Task2 and Task 3. If Task 1 or Task 2 fails, You want to send an email to users but if Task 3 fails then you want to truncate all the tables which are loaded by your SSIS Package, How would you do that?
  6. We can run any tasks in Control Flow Task on Failure by using Precedence constraint. What are the advantages of using Event Handler over Precedence Constraint in Control Flow Task?
  7. Name couple of Events which are available in Event Handler on which we can perform different actions.