DWH Interview Questions - Tech



    Interview Questions:

    INFORMATICA:

1. How many mappings have you done and what are they?
Ans:

2. What is a surrogate key?
Ans:

A surrogate key is a system-generated primary key value that allows historical records to be maintained. To keep track of a slowly changing dimension, a new key column is created; this column is called the surrogate key.
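A minimal SQL sketch of a dimension keyed by a surrogate key (the table, column, and sequence names are invented for illustration):

-- Hypothetical customer dimension keyed by a surrogate key.
CREATE TABLE customer_dim (
    customer_sk    NUMBER PRIMARY KEY,   -- surrogate key, system generated
    customer_id    VARCHAR2(20),         -- natural/business key from the source
    customer_name  VARCHAR2(100),
    city           VARCHAR2(50),
    version_no     NUMBER,
    effective_date DATE,
    end_date       DATE,
    current_flag   CHAR(1)
);

CREATE SEQUENCE customer_dim_seq START WITH 1 INCREMENT BY 1;

-- Every new or changed version of a customer receives a fresh surrogate key.
INSERT INTO customer_dim (customer_sk, customer_id, customer_name, city,
                          version_no, effective_date, end_date, current_flag)
VALUES (customer_dim_seq.NEXTVAL, 'C1001', 'Acme Ltd', 'Boston',
        1, SYSDATE, NULL, 'Y');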

3. Can a surrogate key be applied to a fact table? If so, how?
Ans:
A surrogate key relates to a dimension table, not to a fact table. In a fact table we only have foreign keys (which reference the dimension surrogate keys).

4. What does a mapping consist of?
Ans:

A mapping consists of components such as source definitions (the characteristics of a source table or file), transformations (which modify data to perform different functions), target definitions (which define target tables or files), and connectors. It passes the data according to the business requirement.

5. What is a mapplet, and what is the difference between a mapping and a mapplet?
Ans:

A mapplet is a reusable object that represents a set of transformations. It allows you to reuse transformation logic and can contain as many transformations as we need.

Mappings represent the data flow between sources and targets. When the server runs a session, it uses the instructions configured in the mapping to read, transform, and write data. A mapping cannot be reused, whereas a mapplet is reusable in other mappings. We cannot use a Normalizer transformation in a mapplet, whereas we can use it in a mapping.

6. What are Slowly Changing Dimensions (SCDs)? How do you achieve this?
Ans:
SCDs are Slowly Changing Dimensions. These are dimension tables that have slowly increasing dimension data, as well as updates to existing dimensions. We can achieve this with Type 1, Type 2, and Type 3 mappings.

Type 1 Dimension mapping:
Loads a slowly changing dimension table by inserting new dimensions and overwriting existing dimensions. This mapping is used when we do not want a history of previous dimension data.

    Type 2 Dimension/Version Data mapping:

Loads a slowly changing dimension table by inserting new and changed dimensions using a version number and an incremented primary key to track changes. This mapping is used when we want to keep a full history of dimension data and to track the progression of changes.

Type 2 Dimension/Flag Current mapping:
Loads a slowly changing dimension table by inserting new and changed dimensions using a flag to mark current dimension data and an incremented primary key to track changes. This mapping is used when we want to keep a full history of dimension data, tracking the progression of changes while flagging only the current dimension.

Type 2 Dimension/Effective Date Range mapping:

  • 7/30/2019 DWH Interview Questions-Tech

    2/21

Loads a slowly changing dimension table by inserting new and changed dimensions using a date range to define current dimension data. This mapping is used when we want to keep a full history of dimension data, tracking changes with an exact effective date range.

Type 3 Dimension mapping:
Loads a slowly changing dimension table by inserting new dimensions and updating values in existing dimensions. This mapping is used when we want to keep the current and previous dimension values in the dimension table.
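Outside Informatica, the Type 2 pattern can be sketched in plain SQL as "close the old version, insert the new version". This is only an illustration; the table and column names are the invented ones from the surrogate-key example above:

-- Expire the current version of a changed customer.
UPDATE customer_dim
   SET current_flag = 'N',
       end_date     = SYSDATE
 WHERE customer_id  = 'C1001'
   AND current_flag = 'Y';

-- Insert the new version with a fresh surrogate key and an incremented version number.
INSERT INTO customer_dim (customer_sk, customer_id, customer_name, city,
                          version_no, effective_date, end_date, current_flag)
VALUES (customer_dim_seq.NEXTVAL, 'C1001', 'Acme Ltd', 'Chicago',
        2, SYSDATE, NULL, 'Y');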

7. How does the Update Strategy work?
Ans:
It determines whether to insert, delete, update, or reject records. Depending on whether the given condition evaluates to true or false, you can flag the record for insert, delete, update, or reject.

We can set the Update Strategy at two levels:
1. Within a session
2. Within a mapping

1. Within a session: When we configure a session, we can instruct the Informatica server either to treat all the records in the same way (for example, treat all records as inserts) or to use instructions coded into the session mapping to flag records for different database operations. This applies the operation at the session level.
2. Within a mapping: Here we can flag records for insert, delete, update, or reject. This applies the operation at the row level.

8. If data is changed, how is it reflected in your repository?
Ans:
Once we save to the repository, the change is reflected.

9. How many records are there in your project after performing a transformation?
Ans:

That depends on the valid records coming in from the respective regions. Generally the load occurs only once a month, and the volume of data is somewhere between 1 million and 1 billion rows in each target.

10. What are the ports available in Informatica?
Ans:
Input, Output, Variable, Lookup, Group By, Rank, Return, and Sorted ports.

11. What is a conditional clause?
Ans:

Any condition, such as IIF or WHERE, that we use in transformations to specify conditions on the data.

12. How do you generate reports on Informatica output data?
Ans:
Using PowerPlay reports.

13. In the Lookup transformation, how is the dynamic cache option enabled, and what is its advantage over a static cache?
Ans:
The advantage of using a cache is that the server creates cache files and queries the cache instead of the lookup table, which improves performance.


With a static cache, once the session is completed the cache no longer exists. A static cache is used when there are no updates to the lookup table. To keep the cache, we need to use the option Rebuild cache from database or a static persistent cache. A static persistent cache keeps the cache files in the cache directory so that we can reuse the cache in other transformations or mappings. If the cache would otherwise have to be rebuilt again and again, it is better to query the lookup table directly.

A dynamic cache can be enabled from the Lookup properties. With a dynamic cache, the first time we run the session the rows are inserted into both the cache and the target table. The next time we run it, if there are new or updated records from the source, the server queries the cache, first updates or inserts into the cache, and then loads the same rows into the target table.

A dynamic cache is used when we want to look up data in the target. Because we are querying the latest cache, there is no need to look up the data again and again; it checks only the newly inserted or updated records instead of scanning the whole cache. This improves performance.

14. What are lookup ports?
Ans:
Lookup ports are used to look up data in the specified lookup table.

15. What is the code written in the Update Strategy to insert records? Or, what are the types of update commands?
Ans:
DD_INSERT - 0, DD_UPDATE - 1, DD_DELETE - 2, DD_REJECT - 3

16. Can we insert changed records at the session level? Then why use an Update Strategy?
Ans:
We can insert records at the session level, but this acts at the table level. If we do this at the mapping level, it acts at the row level.

17. In the Stored Procedure transformation, what is the use of connected and unconnected stored procedures?
Ans:
In connected mode, data flows through the Stored Procedure transformation. All data entering the transformation through the input ports affects the stored procedure. We can use this when we


need data from an input port sent as an input parameter to the stored procedure, or the results of a stored procedure sent as an output parameter to another transformation.
In unconnected mode, the Stored Procedure transformation is not connected directly to the flow of the mapping. It either runs before or after the session, or is called by an expression in another transformation in the mapping.

18. What is the toughest transformation you have done?
Ans:
I would say the Lookup transformation, as our data has to be looked up in several other tables to find the matching records based on the lookup condition.

19. If the source has an Oracle table that consists of one record, with which transformation can we break it up into 10 duplicate records of the same record in the target?
Ans:
If the source is from Oracle, create 10 target instances and map the output from one Expression transformation to the targets.
If it is a COBOL source, use the Normalizer transformation.

20. Which transformations affect the performance of a data warehouse?
Ans:

The Joiner, Aggregator, and Filter transformations affect performance.
If the sources are from different databases, the Joiner transformation takes a long time to run and decreases performance. It is better to optimize the query or increase the Joiner cache sizes.

If performance is low at the Aggregator, it is better to use a Sorter transformation so that the data is sorted first and the Aggregator performs its calculations on the sorted data.

Instead of using Filter transformations, it is better to use a filter condition in the Source Qualifier transformation, which increases performance.
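As a rough illustration of pushing the filter into the source read (in Informatica this would correspond to the Source Qualifier's source filter or SQL override), the condition is applied at extraction time instead of in a downstream Filter transformation. The table and column names below are invented:

-- Filter at extraction time rather than in a downstream Filter transformation.
SELECT order_id, customer_id, order_date, amount
  FROM orders_src
 WHERE status = 'ACTIVE'
   AND order_date >= TO_DATE('2019-01-01', 'YYYY-MM-DD');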

21. What are rapidly changing dimensions?
Ans:

22. As an ETL developer, do you recommend unloading the data and performing the transformations externally, or using SQL queries?
Ans:
I would suggest SQL queries within the tool. If we use SQL queries, there is no need to perform the tasks externally; we can do the same using the ETL tool as well.

23. What is the difference between connected and unconnected Lookup transformations?
Ans:
Connected:

1. Receives input directly from the pipeline.
2. Uses a dynamic or static cache.
3. The cache includes the lookup condition columns and the output port columns.
4. It can return multiple columns from the same row or insert into the dynamic lookup cache.
5. If there is no match for the lookup condition, it returns default values from the output ports.
6. If there is a match for the lookup condition, the Informatica server returns the result of the lookup condition for all the lookup/output ports. For a dynamic cache, it either updates the row in the cache or leaves it unchanged.
7. Supports user-defined default values.

Unconnected:
1. Receives input from the result of a :LKP expression in another transformation.
2. Uses a static cache.


3. The cache includes all lookup condition ports, all lookup output ports, and the lookup return port.
4. Returns one column for each row.
5. If there is no match, it returns NULL.
6. If there is a match, it returns the result of the lookup condition through the return port.
7. Does not support user-defined default values.
8. These are used for SCDs to update changed dimensions.

24. What is the syntax you used to call the unconnected Lookup transformation?
Ans:
:LKP.lookup_transformation_name(argument, argument, ...)

25. In the Joiner transformation, how do you identify the master and detail tables if one is from Oracle and the other is from Sybase?
Ans:

26. Which transformation cannot be used in a mapplet?
Ans:
The Normalizer transformation.

27. How do you fine-tune a Lookup?

Ans:
We can improve the performance by:
1. Indexing the columns in the lookup condition. We can improve lookup initialization time by adding an index to the lookup table. This is important for very large lookup tables. Since the Informatica server needs to query, sort, and compare values in these columns, the index needs to include every column used in a lookup condition.
2. Placing conditions with an equality operator (=) first. If a Lookup transformation specifies several conditions, you can improve lookup performance by placing all the conditions that use the equality operator first in the list of conditions that appear under the Condition tab.
3. Joining tables in the database. If the lookup table is on the same database as the source table in your mapping and caching is not feasible, join the tables in the source database rather than using a Lookup transformation.
4. Overriding the ORDER BY statement for cached lookups. By default, the Informatica server generates an ORDER BY statement for a cached lookup that contains all lookup ports. To increase performance, we can suppress the default ORDER BY statement and enter an override ORDER BY with fewer columns.

To override the default ORDER BY statement, complete the following steps:

Generate the lookup query in the Lookup transformation.

Enter an ORDER BY statement that contains the condition ports in the same order they appear in the lookup condition.

Place a comment notation after the ORDER BY statement, such as two dashes '--'.
5. Using a persistent lookup cache for static lookup tables.
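A small SQL sketch of such an override (the lookup table and columns are invented): the ORDER BY lists only the condition port, and the trailing two dashes turn the ORDER BY clause that the server appends to the end of the query into a comment.

SELECT item_id, item_desc, unit_price
  FROM item_lkp
 ORDER BY item_id --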

28. Can we have any number of mappings in one session, and can we have any number of sessions for one mapping?
Ans:
A session runs exactly one mapping, but a single mapping can be used by any number of sessions.

29. How do you unlock a session if it is locked?
Ans:


We can unlock objects, versions, and folders by identifying the user connection associated with the lock and terminating that user connection. We must have the Super User privilege to unlock. In the Repository Manager, use the Show Locks and Edit Connections options to unlock.

30. What are the Rank and Router transformations?
Ans:
Rank transformation:
The Rank transformation is used to select only the top or bottom rank of data. We can use this transformation to return the largest or smallest numeric value in a port or a group.
Router transformation:
The Router transformation is like a Filter transformation. A Filter transformation tests data for one condition and drops the rows of data that do not meet the condition. The Router transformation tests data against one or more conditions and gives the option to route rows of data that do not meet any of the conditions to a default group.
Example: If we want to test data against three conditions, we need only one Router transformation instead of three Filter transformations to perform the task.

31. What is the command to start a workflow from the command prompt?
Ans:

pmcmd startworkflow -s serveraddress:port_no -u username -p password Workflowname

32. Which transformations operate on groups of rows?
Ans:

    Aggregator and Sorter transformations.

33. In which transformation are DML statements used?
Ans:
The Update Strategy transformation.

34. What is the Sorter transformation and what is its advantage?
Ans:
The Sorter transformation is used to sort data (ascending or descending).

If performance is low at the Aggregator, it is better to use a Sorter transformation so that the data is sorted first and the Aggregator performs its calculations on the sorted data.

35. If 2 source tables are joined and we want to perform some calculations, filter the records, and load the data into 15 targets, how do we achieve this? Can a SQL query be used to increase performance?
Ans:
Use a Joiner transformation and an Aggregator transformation to perform the calculations, filter the records in the Source Qualifier transformation, and load the data into the targets.

36. What is the Workflow Manager?
Ans:

Using the Workflow Manager we can connect to a repository and create workflows and tasks. We can also schedule the workflows.

37. What is the difference between Informatica 5.1 and 6.2?
Ans:
5.1:
1. Session and batch concept to run the sessions.
2. It has the Server Manager.
3. It cannot be installed on the Linux operating system.
4. No option to copy objects.


6.2:
1. Using tasks, we can run the sessions either sequentially or concurrently.
2. It has the Workflow Manager and Workflow Monitor.
3. It can run on the Linux operating system.
4. We can copy and paste objects.
5. We can create flat file target definitions to output data to flat files.
6. Pre- and post-session SQL. We can use pre-session SQL to issue commands to the database, such as dropping indexes before extracting data, and post-session SQL to issue commands such as re-creating indexes.
7. View Dependencies to see the list of dependent objects.
8. External loaders to load data into DB2 or Teradata.

38. How do you create flat file output in Informatica?
Ans:

We can create flat file targets in the Designer: go to Targets > Create and select the database type as flat file.

    39. How to implement Incremental Loading?

    Ans:

40. Why do we enable the lookup cache? What are the types of lookup caches?
Ans:
If we do not select the Lookup cache enabled option, the cache files are not created and we cannot query the cache. Every time we run the session, it queries the lookup table instead of the cache, which decreases performance.
Refer to Q # 13.

41. How do you migrate Designer and Workflow Manager folders (mappings, shortcuts, procedures, sessions) from development to testing and finally to production?
Ans:

We have separate repositories for the Development, Testing, and Production environments. Once development is complete, all the objects are moved to the Testing environment.

42. What is the difference between PowerCenter and PowerMart?
Ans:
PowerCenter:

    Distributed metadata (repository).

Global repository; can register multiple Informatica servers. Metadata can be shared across repositories.

    Can connect to varied sources like Peoplesoft, SAP etc.

Has bridges which can transport metadata from other tools (such as Erwin).

    PowerMart:

Informatica PowerMart includes all features except distributed metadata and multiple registered servers. We can connect to only one repository and can register only one Informatica server.

Cannot connect to varied sources like PeopleSoft or SAP.

43. After a load is done to my target table, I need to insert a last row, which might be formed in some Expression transformation and brought to the target. Do I have to have a


separate mapping to be run after the first mapping? Or how can this be achieved in a single mapping?
Ans:

44. What are shortcuts and when are they used?
Ans:
Shortcuts allow metadata to be used across folders without making copies, ensuring uniform metadata. A shortcut inherits all properties of the object to which it points. Once you create a shortcut, you can configure the shortcut name and description.
For example, if you have a shortcut to a target definition and you add a column to the definition, the shortcut automatically inherits the additional column.

You can create the following types of shortcuts:
Local shortcut: A shortcut created in the same repository as the original object.
Global shortcut: A shortcut created in a local repository that references an object in a global repository.

45. Whenever the table structure of a source or target table changes, what are the side effects, and do we need to import these tables again and rework?
Ans:

If we are not using shortcuts for the targets, all the mappings that use these targets become invalid and the sessions fail. To avoid this, it is better to create the targets using shortcuts, so that each target inherits all the object properties and there is no need to import the table again and again whenever it changes.

46. What is the architecture of Informatica 6.2.1 and 5.1?
Ans:
Informatica 5.1 architecture: see the diagram placeholder below (after the list of active transformations).

Informatica 6.2.1 architecture:
The Informatica 6.2.1 architecture has the Workflow Manager instead of the Server Manager, and it has the Repository Server Administration Console in which we can register new servers.

47. What is the difference between active and passive transformations, with examples?
Ans:

Active transformation:
An active transformation can change the number of records (rows) passed through it.
Example: A Filter transformation removes rows that do not meet the filter condition defined in the transformation.

Active transformations that might change the record count are:
1. Advanced External Procedure
2. Aggregator
3. Filter
4. Joiner

[Diagram for question 46 - Informatica 5.1 architecture: the Informatica Client (Repository Manager, Server Manager, and the Designer with its Source Analyzer, Warehouse Designer, Transformation Developer, Mapping Designer, and Mapplet Designer tools) works against the Repository, and the Informatica Server moves data from sources to targets.]


Passive transformation:
A passive transformation does not change the number of rows that pass through it.
Example: An Expression transformation performs a calculation on data and passes all rows through the transformation.

Passive transformations that never change the record count are:
1. Lookup
2. Expression
3. Sequence Generator
4. Stored Procedure

48. What is the difference between normal and bulk loading? Which one is recommended?
Ans:
Normal loading:
Recovery is possible because log files are maintained for the data loaded into the target, so we can perform a rollback operation. Because log files are created, performance is lower.

Bulk loading:
It bypasses the database log while loading into the target and writes directly to the data files. As it does not maintain the log, recovery is not possible.

Informatica supports bulk loading on databases such as Sybase, Oracle, and Microsoft SQL Server. It improves the performance of a session that inserts a large amount of data into the target database.

When bulk loading, the Informatica server invokes the database bulk utility and bypasses the database log, which speeds performance. Without writing to the database log, however, the target database cannot perform a rollback. As a result, you may not be able to perform recovery. Therefore, you must weigh the importance of improved session performance against the ability to recover an incomplete session.
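As a database-side analogy only (Informatica's bulk mode is a session property, not SQL you write), Oracle shows the same trade-off with a direct-path insert into a NOLOGGING table: minimal redo is generated, so the load is fast but harder to recover without a fresh backup. Table names here are invented:

-- Direct-path insert: fast, minimal redo logging, weaker recoverability.
ALTER TABLE sales_fact NOLOGGING;

INSERT /*+ APPEND */ INTO sales_fact
SELECT * FROM sales_stage;

COMMIT;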

49. What is incremental aggregation and when should it be implemented?

Ans:
If the source changes only incrementally and we need to capture those changes, we can configure the session to process only those changes. This allows the Informatica server to update the target incrementally, rather than forcing it to process the entire source and recalculate the same calculations each time the session runs.

We can use incremental aggregation if:

The mapping includes an aggregate function.

The source changes only incrementally.

We want to capture incremental changes. We might do this by filtering source data by timestamp (see the sketch after this list).

Before implementing incremental aggregation, consider the following issues:

Whether it is appropriate for the session.

What to do before enabling incremental aggregation.

When to reinitialize the aggregate caches.
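A minimal SQL sketch of the "filter the source by timestamp" idea (the table, columns, and the :last_load_time bind variable are assumptions; in practice the value would come from a mapping variable or a load-control table):

-- Pull only the rows changed since the last successful load.
SELECT order_id, customer_id, amount, last_update_ts
  FROM orders_src
 WHERE last_update_ts > :last_load_time;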

50. Explain sessions and batches.
Ans:
A session consists of a set of instructions through which we run a mapping; it tells the Informatica server how to get the work done.


Batches are collections (sets) of sessions.

51. What is the difference between a shortcut to an object and a copy of an object?
Ans:
With a shortcut we inherit the object's properties. With a copy of an object, we need to re-import the object every time there is a change.

52. What is the advantage of a persistent cache? When should it be used?
Ans:
When the lookup cache is saved by the Lookup transformation, it is called a persistent cache. The first time the session runs the cache is saved to disk, and it can be reused in subsequent runs of the session. It is used when the lookup table is static, i.e. does not change frequently.

A static persistent cache keeps the cache files in the cache directory, so that we can reuse the cache in other transformations or mappings.

53. What is the difference between connected and unconnected Lookups?
Ans: Refer to Q # 23.

54. Are two different repositories possible in one database? If so, how?

Ans:
Yes.

55. What is the architecture of the repository? What is the difference between the Repository Server and the Repository Agent?
Ans:
The Repository Server stores the information related to connectivity, which is required for the Repository Agent process to connect to the repository database. The Repository Server uses one Repository Agent process for each repository database. The Repository Agent process uses native drivers to communicate directly with the repository database.

The Repository Agent process supports multiple connections from a client application. Each client tool opens a separate connection to the Repository Agent process. The Informatica server opens additional connections to the Repository Agent process for each Load Manager and DTM
process.

56. What are the challenges you have experienced in using ETL Informatica?
Ans:

57. What are the different types of partitions? Why are they used?
Ans:
Partitioning:

A partition is a pipeline stage that executes in a single reader thread, transformation thread, and writer thread. Partitioning improves session performance; if we increase the number of partitions, session performance improves.

    There are 4 types of Partitions:

1. Round-Robin Partitioning:
In round-robin partitioning, the Informatica server distributes rows of data evenly to all partitions. Each partition processes approximately the same number of rows.

In a pipeline that reads data from file sources of different sizes, we can use round-robin partitioning so that all the partitions receive the same number of rows.

    2. Hash Partitioning:


In hash partitioning, the Informatica server uses a hash function to group rows of data among partitions. The Informatica server groups the data based on a partition key.

We can use hash partitioning when we want the Informatica server to distribute rows to the partitions by group.
Example: we need to sort items by item ID, but we do not know how many items have a particular ID number.

    There are two types of hash partitioning.

Hash Auto-Keys Partitioning:
We can use hash auto-keys partitioning at or before Rank, Sorter, and unsorted Aggregator transformations to ensure that rows are grouped properly before they enter these transformations. The Informatica server uses all grouped or sorted ports as a compound partition key.

If items with the same description exist in more than one source file, each partition will contain items with the same description. Without hash auto-keys partitioning, the Aggregator transformation might calculate average costs and prices for each item incorrectly.

To prevent errors in the cost and price calculations, set a partition point at the Sorter transformation and set the partition type to hash auto-keys. When you do this, the Informatica server redistributes the data so that all items with the same description reach the Sorter and
Aggregator transformations in a single partition.

Hash User Keys Partitioning:
In hash user keys partitioning, the Informatica server uses a hash function to group rows of data among partitions based on a user-defined partition key. We can choose the ports that define the partition key.

If we select hash user keys partitioning at any partition point, we must specify a hash key. The Informatica server uses the hash key to distribute rows to the appropriate partition according to
group.

Key Range Partitioning:
With key range partitioning, the Informatica server distributes rows of data based on a port or set of ports that we specify as the partition key. For each port, we define a range of values. The Informatica server uses the key and ranges to send rows to the appropriate partition.

Key range partitioning is used in mappings where the source and target tables are partitioned by key range.

    To optimize writing to the target table, perform the following tasks:

    1. Set the partition type at the target instance to key range.

2. Create three partitions.
3. Choose the partition key. The Informatica server uses this key to pass data to the appropriate partition.
4. Set the key ranges for all the partitions.

Pass-Through Partitioning:
In pass-through partitioning, the Informatica server processes data without redistributing rows among partitions. Therefore, all rows in a single partition stay in that partition after crossing a pass-through partition point.



When you add a partition point to a pipeline, the master thread creates an additional pipeline stage. Use pass-through partitioning when you want to increase data throughput but do not want to (or cannot) increase the number of partitions.

58. Can an Update Strategy be used without a Lookup transformation?
Ans:
Yes, it can.

59. How do you tune performance?
Ans:
Performance can be tuned at different levels. These are categorized as:
1. Source
2. Target
3. Transformation
4. Session

60. How many unconnected transformations are there in Informatica?
Ans:
There are 3 unconnected transformations:
1. External Procedure transformation
2. Stored Procedure transformation
3. Lookup transformation

61. Is the Sorter an active or a passive transformation?
Ans:
The Sorter is an active transformation.

62. What is the life cycle of a DWH?
Ans:

63. What are the top-down and bottom-up approaches?
Ans:
Top-Down Approach:

[Diagrams for question 63: information-access architecture (ETL from OLTP systems and external data storage into the data warehouse and data marts, accessed through reporting tools, web browsers, OLAP, and mining) and a flow from business requirements through mapping data sources, reverse engineering, mapping requirements to OLTP, logical modeling, and model refinement into the data warehouse.]


In this approach the data warehouse contains all the business information in a single data warehouse. We can identify what keeps management awake.
Bottom-Up Approach:
In this approach, data marts are created first and then the data warehouse is loaded from them. We can easily identify which data sources are available.

64. In the Aggregator properties, if you enable Sorted Input but do not actually sort the data (for example with a Sorter transformation), will the output be correct?
Ans:
No. The Aggregator expects pre-sorted input when Sorted Input is enabled, so the results will not be correct (the session may fail or produce wrong aggregates).

65. Can we use a mapplet variable in a mapping, or a mapping variable in a mapplet?
Ans:

66. What is the difference between source-based commit and target-based commit?
Ans:
Target-based commit:
In a target-based commit, when the server reaches the commit interval it continues to fill the writer buffer. When the block is filled, the Informatica server issues a commit command. As a result, the amount of data committed at the commit point generally exceeds the commit interval.

Source-based commit:
In a source-based commit, the Informatica server commits data to the target based on the number of rows from an active source in a single pipeline.

An active source can be any of the active transformations except the Filter, Router, and Update Strategy transformations.

67. Why is staging required for your project?
Ans:

68. Scenario: in a Lookup transformation, one condition has a SQL override (Empno 1000), which is dynamic. How will you resolve this situation?

    Ans:

69. What is the Source Qualifier transformation? What is its use, and which activities can be performed with the Source Qualifier?
Ans:
It is used to represent the data read from relational and flat file sources.

70. Is a default condition used in the Filter transformation?
Ans: No.

71. What is the use of the Update Strategy at the mapping level and the session level?
Ans:
Mapping level: It performs updates at the row level.

Session level: It performs updates at the table level.

72. For an unconnected Lookup, where do we get the output for the target? Where do we specify the connection to get the output?
Ans:
For an unconnected Lookup there is no connection to the pipeline; we call it with the condition in some other transformation:

    Condition :LKP.lookup_transformation_name(argument, argument, ...)


73. For an unconnected Lookup, is it possible to use a dynamic cache? If not, why?
Ans:
No, because an unconnected Lookup uses only a static cache. The cache does not change during the session run.

74. How do you set the language option on the Informatica server? What is it used for?
Ans:

75. Whenever a stored procedure runs during a mapping, the session must wait for the stored procedure to complete in the database. Can we avoid this?
Ans:

76. How do you get an error-message popup window when an error occurs during a session run?
Ans:

77. How do you trap NULL values in a Lookup port?
Ans:
In the Lookup transformation properties there is an Ignore Null option; we can handle NULL values with this option.

78. What are the different phases in your project?
Ans:
Refer to Q # 101.

79. What SDLC have you followed?
Ans:

80. In a Lookup transformation, how do you handle 2 duplicate records?
Ans:
In the Lookup properties there is an option to pick the first or the last value among the matching rows; we can choose either one.

81. What are the types of mapping variables? What are they used for?
Ans:
Mapping variables:
These are user-defined variables. They can be used differently in different transformations.
Variable functions:

    SetCountVariable

    SetMaxVariable

    SetMinVariable

    SetVariable

82. Scenario: 2 Oracle tables and one flat file; how many Joiners are needed?
Ans:
If the Oracle tables are from the same schema, one Joiner will do (the two Oracle tables can be joined in the Source Qualifier and the result joined to the flat file); otherwise we need 2 Joiners.

83. What is partitioning and in which situations do we use it?
Ans:
Partitioning:

A partition is a pipeline stage that executes in a single reader thread, transformation thread, and writer thread. Partitioning improves session performance; if we increase the number of partitions, session performance improves. When performance is low, we can use partitioning to increase it.


85. What is referential integrity?
Ans:
Referential integrity keeps the relationships among tables consistent; the integrity of the data is maintained through foreign key constraints.
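A small SQL sketch of referential integrity enforced by a foreign key (the fact and dimension names reuse the invented examples above):

-- Every customer_sk in the fact table must exist in the customer dimension.
ALTER TABLE sales_fact
  ADD CONSTRAINT fk_sales_customer
  FOREIGN KEY (customer_sk) REFERENCES customer_dim (customer_sk);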

86. What is a factless fact table and in which situations do we use it?
Ans:
It contains no measures and has only foreign keys. These tables are used to check whether a given combination of the dimensions (foreign keys) exists or not.
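A hypothetical factless fact table in SQL (all table and column names are invented); an event is recorded purely by the existence of a row:

-- Factless fact table: only foreign keys, no measures.
CREATE TABLE attendance_fact (
    date_sk     NUMBER NOT NULL REFERENCES date_dim (date_sk),
    student_sk  NUMBER NOT NULL REFERENCES student_dim (student_sk),
    course_sk   NUMBER NOT NULL REFERENCES course_dim (course_sk),
    CONSTRAINT pk_attendance PRIMARY KEY (date_sk, student_sk, course_sk)
);
-- "Did this student attend this course on this date?" is answered by whether a row exists.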

87. PL/SQL code for restricting the salary so that it cannot decrease but can increase from the existing salary?
Ans:
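One possible sketch (not the document's original answer), assuming an EMP table with a SAL column:

CREATE OR REPLACE TRIGGER trg_no_salary_decrease
BEFORE UPDATE OF sal ON emp
FOR EACH ROW
BEGIN
  -- Allow increases; block any decrease of the existing salary.
  IF :NEW.sal < :OLD.sal THEN
    RAISE_APPLICATION_ERROR(-20001, 'Salary cannot be decreased');
  END IF;
END;
/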

88. What are the NewLookupRow values?
Ans:
NewLookupRow values are returned by a connected Lookup with a dynamic cache.

0 -> The Informatica server does not update or insert the row in the cache.
1 -> The Informatica server inserts the row into the cache.
2 -> The Informatica server updates the row in the cache.

89. What are the types of tracing levels? Explain. Which is best for performance?
Ans:
There are 4 types of tracing levels:
1. Normal
2. Terse
3. Verbose Initialization
4. Verbose Data

Normal or Terse are best for performance. If we use the other two, the workflows take much longer to run.

90. What is the difference between OLAP metadata and OLTP metadata? What are the advantages of OLAP over OLTP?
Ans:
OLAP metadata: Online Analytical Processing.
Features:

1. It is a multi-dimensional model.
2. Standard aggregation methods.
3. Data re-load mechanism.
4. Defining special dimensions (Time, Currency, etc.).
5. Rule-based consolidation.

OLTP metadata: Online Transaction Processing. The metadata has to be created explicitly. This data is used in day-to-day transactions, such as bank account information.
Example: the Oracle data dictionary.

Features:
1. Relational model.
2. Multi-threading.
3. Data retrieval mechanism.

91. What is the DTM process?
Ans:


When the workflow reaches a session, the Load Manager starts the Data Transformation Manager (DTM) process. The DTM process is associated with the session task. The Load Manager creates one DTM process for each session in the workflow. The DTM process performs the following tasks:

    Reads session information from the repository. (Mapping & Session)

    Expands the server and session variables and parameters.

Creates the session log file. (Log files can be used to troubleshoot system and session problems.)

    Validates source and target code pages.

    Verifies connection object permissions.

    Runs pre-session shell commands, stored procedures and SQL.

Creates and runs mapping, reader, writer, and transformation threads to extract, transform, and load data.

    Runs post-session stored procedures, SQL, and shell commands.

    Sends post-session email.

92. How do you handle errors in Informatica?
Ans:
In the task properties, we can handle errors using attributes such as:
Stop on errors
Override Tracing: tracing level
On Stored Procedure error
On Pre-session command task error
On Pre-Post SQL error

93. Look at the advanced session properties in Informatica that deal with error handling, how to get a popup window showing the possible error cause, and how to kill a session at run time when an error occurs.
Ans:
By default a static cache has the "Report Error" option for multiple matches. If the Lookup transformation
finds multiple matches, it reports the error, or returns the first row or the last row.

94. In a dynamic Lookup, what are named and unnamed caches?
Ans:
Named caches have a name that can be referenced; unnamed caches do not. We can name the cache files under the lookup cache directory ($PMCacheDir) if it is a persistent cache, and we can reuse these cache files in other mappings and transformations as well.

The name is a file-name prefix for the persisted cache files, and the Lookup reads from the named cache files. If the named persistent cache files exist, the Informatica server builds the memory cache from the files. If the named persistent cache files do not exist, the Informatica server rebuilds the persistent cache files.
Named caches can be used in other mappings as well, whereas unnamed caches cannot be used in
other mappings; they are used only in that transformation.

95. What is the Metadata Exchange view?
Ans:
Metadata Exchange (MX) views provide a set of relational views that allow easy SQL access to the Informatica metadata repository. The Repository Manager generates these views when we create or upgrade a repository.

96. Session-level update strategy.
Ans:


Mapping level: It performs updates at the row level.
Session level: It performs updates at the table level.

97. What is the difference between Abort and Stop?
Ans:

STOP:
1. It stops reading.
2. It continues processing the data already read.
3. It continues writing and committing data to the targets.
4. If the Informatica server cannot finish processing and committing data, you need to issue the Abort command to stop the session.

ABORT:
1. It stops reading.
2. It continues processing the data already read.
3. It continues writing and committing data to the targets.
4. If the Informatica server cannot finish processing and committing data within 60 seconds, it kills the DTM process.

98. How do you implement configuration management in Informatica?
Ans:
There are several methods. Some of them are: taking a backup of the repository as a binary file and treating it as a configurable item, and implementing the folder versioning utility in Informatica.

99. A mapping contains:
Source table S_Time (Start_Year, End_Year)
Target table Time_Dim (Date, Day, Month, Year, Quarter)

Stored Procedure transformation: a procedure has two input parameters, I_Start_Year and I_End_Year, and output parameters O_Date, Day, Month, Year, and Quarter. If this session runs, how many rows will be available in the target, and why?

Ans:
Only one row: the last date of the End_Year.

Each subsequent row overwrites the previous row.

100. Discuss two approaches for updating a target table in Informatica and how they differ.
Ans:
Update Strategy transformation: we can write our own logic, so it is flexible.

Normal insert/update/delete (with the appropriate variation of the update option): it can be configured in the session properties.

Any change in the row will cause an update. This approach is inflexible.

    101. Development Methodology?


102. What are the different threads used by the DTM process?
Ans:
Master thread: the main thread of the DTM process. It creates and manages all other threads and handles stop and abort requests from the Load Manager.
Mapping thread: one thread for each session. It fetches session and mapping information, compiles the mapping, and cleans up after session execution.
Session thread: one thread each to perform pre-session and post-session operations.
Reader thread: one thread for each partition for each source pipeline. It reads the sources; relational sources use relational threads, and file sources use file threads.
Writer thread: one thread for each partition, if a target exists in the source pipeline. It writes to the targets.
Transformation thread: one or more transformation threads for each partition.

[Diagram for question 101 - development methodology phases: Business Requirements Planning, Conceptualization, Business Requirements Definition, Design & Prototyping, Architecture, Construction, Unit Testing, Implementation, and Integration & System Testing.]


DATA MODELLING:

1. What is a star schema? How many star schemas are in your project?
Ans:

2. What are facts and dimensions? Differentiate between them.
Ans:

3. What is a fact table and what is a dimension table?
Ans:

4. What is a factless fact table?

Ans:
It contains no measures and has only foreign keys. These tables are used to check whether a given combination of the dimensions (foreign keys) exists or not.

5. How many fact tables and dimension tables are in your project? How much time did it take to load?
Ans:

6. What is the size of the fact tables?
Ans:


7. What is a snowflake schema? What is the difference between a star schema and a snowflake schema?
Ans:

8. Are there any other schemas?
Ans:
Constellation schema.

9. What are rapidly changing dimensions?
Ans:

10. How is data modeling done?
Ans:

11. Do fact tables contain string values?
Ans:

12. Can 2 fact tables share the same dimension tables?
Ans:

    ORACLE:

    1. How to work with Triggers and Procedures in Oracle?

2. What is the difference between a Package and a Procedure in PL/SQL?
Ans:

    3. What is the difference between Primary Key and Indexes?

Ans:
Primary Key:

    Indexes:

    4. What are all the types of Constraints in Oracle?

5. How do you write a stored procedure?

6. What is a view? What is the difference between a materialized view and a view?


7. What is an index? What are the types of indexes? What are their disadvantages?
Ans:
B-tree and bitmap indexes. Indexes speed up queries but slow down DML (insert/update/delete) because they must be maintained.
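A short SQL sketch of the two index types on a hypothetical EMP table (the STATUS column is invented):

-- B-tree index: suits high-cardinality columns (many distinct values).
CREATE INDEX idx_emp_ename ON emp (ename);

-- Bitmap index: suits low-cardinality columns in a data warehouse (few distinct values).
CREATE BITMAP INDEX idx_emp_status ON emp (status);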

8. Did you use B-tree or bitmap indexes, and what are their advantages?

    9. What is a Stored procedure?

10. What is a correlated subquery? In a subquery, does the outer query execute first or the inner query?

    11. What are Triggers and types of Triggers?

12. Does the INSTEAD OF trigger exist in Oracle version 8.0?

13. Can we use a trigger while inserting data? How?

14. Can a SQL query be written against a view?

    15. Different types of Locks in Oracle?

16. What is a flag and why is it used?

17. What are the types of Oracle joins?

18. What are NVL and CURRVAL?

    19. What is a Cursor? Different types of Cursors?

    20. How to create users in Oracle using SQL?

    21. Types of Partitions in Oracle

Ans:
Range partitions
List partitions
Hash partitions
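A minimal Oracle sketch of a range-partitioned table (names are invented); list and hash partitioning follow the same CREATE TABLE ... PARTITION BY pattern:

CREATE TABLE sales_fact_p (
    sale_date DATE,
    region    VARCHAR2(10),
    amount    NUMBER
)
PARTITION BY RANGE (sale_date) (
    PARTITION p_2018 VALUES LESS THAN (TO_DATE('2019-01-01', 'YYYY-MM-DD')),
    PARTITION p_2019 VALUES LESS THAN (TO_DATE('2020-01-01', 'YYYY-MM-DD')),
    PARTITION p_max  VALUES LESS THAN (MAXVALUE)
);

-- Hash partitioning spreads rows evenly when there is no natural range key:
--   PARTITION BY HASH (customer_sk) PARTITIONS 4
-- List partitioning groups rows by discrete values:
--   PARTITION BY LIST (region) (PARTITION p_east VALUES ('EAST'), PARTITION p_west VALUES ('WEST'))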

    Business Objects: