pushdown optimization

4
What is Pushdown Optimization? Pushdown optimization is a way o f load-balancing among servers in order to achieve optimal performance. Veteran ETL developers often come across issues when they need to determine the appropriate place to perform ETL logic. Suppose an E TL logic needs to filter out data based o n some condition. One can either do it in database by using WHERE condition in the SQL query or inside Informatica by using Informatica Filter transformation. Sometimes, we can even "push" some transformation logic to the target database instead of doing it in the source side (Especially in the case of EL-T rather than ETL). Such optimization is crucial for overall ETL performance. How does Push-Down Optimization work? One can push transformation logic to t he source or target database using pushdown o ptimization. The Integration Service translates the transformation logic into S QL queries and sends the SQL queries to the source or t he target database which executes the SQL queries to process the transformations. The amount of transformation logic one can push to the database depends on the database, transformation logic, and mapping and session configuration. The Integration Service analyzes the transformation logic it can push to the database and executes t he SQL statement generated against the source o r target tables, and it processes any transformation logic that it cannot push to the database. Using Pushdown Optimization Use the Pushdown Optimization Viewer to preview the SQL statements and mapping logic that the Integration Service can push to the source or target database. You can also use the Pushdown Optimization Viewer to view the messages related to p ushdown optimization. Let us take an example: Filter Condition used in this mapping is: DEPTNO>40 Suppose a mapping contains a Filter transformation that fil ters o ut all employees except those with a DEPTNO greater than 40. The Integration Service can push the t ransformation logic to the database. It generates the following SQL statement to process the transformation logic: INSERT INTO EMP_TGT(EMPNO, ENAME, SAL, COMM, DEPTNO) SELECT EMP_SRC.EMPNO, EMP_SRC.ENAME, EMP_SRC.SAL, EMP_SRC.COMM,

Upload: shravan-tankasala

Post on 06-Apr-2018

217 views

Category:

Documents


0 download

TRANSCRIPT

8/3/2019 Pushdown Optimization

http://slidepdf.com/reader/full/pushdown-optimization 1/4

What is Pushdown Optimization?

Pushdown optimization is a way of load-balancing among servers in order toachieve optimal performance. Veteran ETL developers often come acrossissues when they need to determine the appropriate place to perform ETL

logic. Suppose an ETL logic needs to filter out data based on some condition.One can either do it in database by using WHERE condition in the SQL queryor inside Informatica by using Informatica Filter transformation. Sometimes,we can even "push" some transformation logic to the target database instead of

doing it in the source side (Especially in the case of EL-T rather than ETL). Such optimization iscrucial for overall ETL performance.

How does Push-Down Optimization work?

One can push transformation logic to the source or target database using pushdown optimization.The Integration Service translates the transformation logic into SQL queries and sends the SQL

queries to the source or the target database which executes the SQL queries to process thetransformations. The amount of transformation logic one can push to the database depends on thedatabase, transformation logic, and mapping and session configuration. The Integration Serviceanalyzes the transformation logic it can push to the database and executes the SQL statementgenerated against the source or target tables, and it processes any transformation logic that itcannot push to the database.

Using Pushdown Optimization

Use the Pushdown Optimization Viewer to preview the SQL statements and mapping logic thatthe Integration Service can push to the source or target database. You can also use the Pushdown

Optimization Viewer to view the messages related to pushdown optimization.

Let us take an example:

Filter Condition used in this mapping is: DEPTNO>40

Suppose a mapping contains a Filter transformation that filters out all employees except thosewith a DEPTNO greater than 40. The Integration Service can push the transformation logic to thedatabase. It generates the following SQL statement to process the transformation logic:

INSERT INTO EMP_TGT(EMPNO, ENAME, SAL, COMM, DEPTNO)SELECTEMP_SRC.EMPNO,EMP_SRC.ENAME,EMP_SRC.SAL,EMP_SRC.COMM,

8/3/2019 Pushdown Optimization

http://slidepdf.com/reader/full/pushdown-optimization 2/4

EMP_SRC.DEPTNOFROM EMP_SRCWHERE (EMP_SRC.DEPTNO >40)

The Integration Service generates an INSERT SELECT statement and it filters the data using aWHERE clause. The Integration Service does not extract data from the database at this time.

We can configure pushdown optimization in the following ways:

Using source-side pushdown optimization:

The Integration Service pushes as much transformation logic as possible to the source database.The Integration Service analyzes the mapping from the source to the target or until it reaches adownstream transformation it cannot push to the source database and executes the correspondingSELECT statement.

Using target-side pushdown optimization:

The Integration Service pushes as much transformation logic as possible to the target database.The Integration Service analyzes the mapping from the target to the source or until it reaches anupstream transformation it cannot push to the target database. It generates an INSERT, DELETE,or UPDATE statement based on the transformation logic for each transformation it can push tothe database and executes the DML.

Using full pushdown optimization:

The Integration Service pushes as much transformation logic as possible to both source andtarget databases. If you configure a session for full pushdown optimization, and the IntegrationService cannot push all the transformation logic to the database, it performs source-side or target-side pushdown optimization instead. Also the source and target must be on the samedatabase. The Integration Service analyzes the mapping starting with the source and analyzeseach transformation in the pipeline until it analyzes the target.

When it can push all transformation logic to the database, it generates an INSERT SELECTstatement to run on the database. The statement incorporates transformation logic from all thetransformations in the mapping. If the Integration Service can push only part of thetransformation logic to the database, it does not fail the session, it pushes as much transformationlogic to the source and target database as possible and then processes the remainingtransformation logic.

For example, a mapping contains the following transformations:

SourceDefn -> SourceQualifier -> Aggregator -> Rank -> Expression -> TargetDefn

SUM(SAL), SUM(COMM) Group by DEPTNORANK PORT on SALTOTAL = SAL+COMM

8/3/2019 Pushdown Optimization

http://slidepdf.com/reader/full/pushdown-optimization 3/4

The Rank transformation cannot be pushed to the database. If the session is configured for full

pushdown optimization, the Integration Service pushes the Source Qualifier transformation andthe Aggregator transformation to the source, processes the Rank transformation, and pushes theExpression transformation and target to the target database.

When we use pushdown optimization, the Integration Service converts the expression in thetransformation or in the workflow link by determining equivalent operators, variables, andfunctions in the database. If there is no equivalent operator, variable, or function, the IntegrationService itself processes the transformation logic. The Integration Service logs a message in theworkflow log and the Pushdown Optimization Viewer when it cannot push an expression to thedatabase. Use the message to determine the reason why it could not push the expression to thedatabase.

How does Integration Service handle Push Down Optimization

To push transformation logic to a database, the Integration Service might create temporaryobjects in the database. The Integration Service creates a temporary sequence object in thedatabase to push Sequence Generator transformation logic to the database. The IntegrationService creates temporary views in the database while pushing a Source Qualifier transformationor a Lookup transformation with a SQL override to the database, an unconnected relationallookup, filtered lookup.

1. To push Sequence Generator transformation logic to a database, we must configure the

session for pushdown optimization with Sequence .2. To enable the Integration Service to create the view objects in the database we mustconfigure the session for pushdown optimization with View .

After the database transaction completes, the Integration Service drops sequence and viewobjects created for pushdown optimization.

Configuring Parameters for Pushdown Optimization

Depending on the database workload, we might want to use source-side, target-side, or full pushdown optimization at different times and for that we can use the $$PushdownConfig

mapping parameter. The settings in the $$PushdownConfig parameter override the pushdownoptimization settings in the session properties. Create $$PushdownConfig parameter in theMapping Designer , in session property for Pushdown Optimization attribute select$$PushdownConfig and define the parameter in the parameter file.

The possible values may be,

1. none i.e the integration service itself processes all the transformations.

8/3/2019 Pushdown Optimization

http://slidepdf.com/reader/full/pushdown-optimization 4/4

2. Source [Seq View], 3. T arget [Seq View], 4. F ull [Seq View]

Using Pushdown Optimization Viewer

Use the Pushdown Optimization Viewer to examine the transformations that can be pushed to thedatabase. Select a pushdown option or pushdown group in the Pushdown Optimization Viewer toview the corresponding SQL statement that is generated for the specified selections. When weselect a pushdown option or pushdown group, we do not change the pushdown configuration. Tochange the configuration, we must update the pushdown option in the session properties.

Database that supports Informatica Pushdown Optimization

We can configure sessions for pushdown optimization having any of the databases like Oracle,IBM DB2, Teradata, Microsoft SQL Server, Sybase ASE or Databases that use ODBC drivers.

When we use native drivers, the Integration Service generates SQL statements using nativedatabase SQL. When we use ODBC drivers, the Integration Service generates SQL statementsusing ANSI SQL. The Integration Service can generate more functions when it generates SQLstatements using native language instead of ANSI SQL.

Pushdown Optimization Error Handling

When the Integration Service pushes transformation logic to the database, it cannot track errorsthat occur in the database.

When the Integration Service runs a session configured for full pushdown optimization and anerror occurs, the database handles the errors. When the database handles errors, the IntegrationService does not write reject rows to the reject file.

If we configure a session for full pushdown optimization and the session fails, the IntegrationService cannot perform incremental recovery because the database processes thetransformations. Instead, the database rolls back the transactions. If the database server fails, itrolls back transactions when it restarts. If the Integration Service fails, the database server rolls

back the transaction.