Thursday, November 13, 2014

SSIS Very Useful


Very very useful for ssis internals@ http://www.citagus.com/citagus/blog/sql-server-integration-services-best-practices-part-i/

control flow defines the tasks used in performing data management operations;
it determines the order in which these tasks are executed and the conditions of their execution.

Dataflow:is a special contral flow task  used specifically in data movement operations and data transformations.

Lineage columns:used for auditing and never exposed to end users


implement Staging table and other objects in a sepearte dbase.

Usage Staging table:temporarily store source data before cleansing it or merging it with data from other sources.

staging table-->serve as intermediate layer (b/w DW and source tables) and then starts reqular ETL process.

DSA:The part of a DW containing staging tables is called the data staging area(DSA).



Implementing Fact Tables:It is not necessary that all foreign keys together uniquely identify each row of a fact table.

inferred member:A row in a demension added during fact table load is called an inferred member

You can add lineage information to your dimensions and fact tables to audit changes to your DW on a row level(summary)

Tasks representing logical units of work can be grouped in containers.

SSIS does not support receiving email messages.

 A failure constraint will allow the following task to begin if the preceding
task has encountered errors and failed.

The External Process tasks will be executed in sequence, and the process  will continue, even if one or more of them fail.

Use a Raw File destination if you have to temporarily store data to be used by SSIS
later.
■■ Use the Resolve References dialog box to solve mapping errors.

 Which data flow transformation would you use if had to combine data from two different
database tables that exist on two different servers?


Variables and expressions can also be used to extend the elementary functionality of
precedence constraints based on information available at run time that usually is not
available at design time.

Lookup: more matching rows exist from the referenced dataset,only first one will be retreived
Mergejoin:more rows will retreived,because all matching data is retreving

The Merge transformation is similar to Union All, but with Merge, the sources have to be sorted and the sort position is preserved.

Multicast(N):creates a logical copy of data.

Is it possible to change the settings of an SSIS object at run time?

When are property expressions evaluated as a package is running?

Answer Unlike parameters that are read at the start of package execution, property expressions are updated when the property is accessed by the package during package execution.
A property expression can change the value of a property in the middle of package execution, so that the new value is read when the property is needed by the package.

Like:aggregate,sort transformation,Fuzzy Group,Fuzzy Lookup(Blocking)-->Remember which transformations are fully blocked and try to use them only when absolutely
Necessary, because they often require more memory and processor capacity. If you
are aggregating or sorting a data source input that will not fit into the server memory, the

performance will degrade by a factor of 100, because swapping to disk will occur.


1 comment:

  1. I feel SSIS is the most useful and interesting aspect to look for more of complex operations.

    SSIS Postgresql Write

    ReplyDelete