bods interview questions

8/13/2019 BODS Interview Questions


1. What is the use of BusinessObjects Data Services?
Answer: BusinessObjects Data Services provides a graphical interface that allows you to easily create jobs that extract data from heterogeneous sources, transform that data to meet the business requirements of your organization, and load the data into a single location.

2. Define Data Services components.
Answer: Data Services includes the following standard components:
- Designer
- Repository
- Job Server
- Engines
- Access Server
- Adapters
- Real-time Services
- Address Server
- Cleansing Packages, Dictionaries, and Directories
- Management Console

3. What are the steps included in the data integration process?
Answer:
- Stage data in an operational datastore, data warehouse, or data mart.
- Update staged data in batch or real-time modes.
- Create a single environment for developing, testing, and deploying the entire data integration platform.
- Manage a single metadata repository to capture the relationships between different extraction and access methods and provide integrated lineage and impact analysis.

4. Define the terms Job, Workflow, and Dataflow.
Answer:
- A job is the smallest unit of work that you can schedule independently for execution.
- A work flow defines the decision-making process for executing data flows.
- Data flows extract, transform, and load data. Everything having to do with data, including reading sources, transforming data, and loading targets, occurs inside a data flow.


5. Arrange these objects in order by their hierarchy: Dataflow, Job, Project, and Workflow.
Answer: Project, Job, Workflow, Dataflow.

6. What are reusable objects in Data Services?
Answer: Job, Workflow, Dataflow.

7. What is a transform?
Answer: A transform enables you to control how datasets change in a dataflow.

8. What is a Script?
Answer: A script is a single-use object that is used to call functions and assign values in a workflow.

9. What is a real-time Job?
Answer: Real-time jobs "extract" data from the body of the real-time message received and from any secondary sources used in the job.

10. What is an Embedded Dataflow?
Answer: An Embedded Dataflow is a dataflow that is called from inside another dataflow.

11. What is the difference between a datastore and a database?
Answer: A datastore is a connection to a database.

12. How many types of datastores are present in Data Services?
Answer: Three.
- Database datastores: provide a simple way to import metadata directly from an RDBMS.
- Application datastores: let users easily import metadata from most Enterprise Resource Planning (ERP) systems.
- Adapter datastores: can provide access to an application's data and metadata or just metadata.

13. What is the use of Compact Repository?
Answer: It removes redundant and obsolete objects from the repository tables.

14. What are Memory Datastores?
Answer: Data Services also allows you to create a database datastore using Memory as the Database type. Memory Datastores are designed to enhance processing performance of data flows executing in real-time jobs.

15. What are file formats?
Answer: A file format is a set of properties describing the structure of a flat file (ASCII). File formats describe the metadata structure. File format objects can describe files in:
- Delimited format: characters such as commas or tabs separate each field.
- Fixed width format: the column width is specified by the user.
- SAP ERP and R/3 format.
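
To make the difference between the first two layouts concrete, here is a small sketch in plain Python (not Data Services code). The sample records and the column widths are invented for illustration.

```python
import csv
import io

# Delimited format: a separator character (here a comma) marks field boundaries.
delimited_text = "101,Alice,Paris\n102,Bob,Rome\n"
delimited_rows = list(csv.reader(io.StringIO(delimited_text)))

# Fixed width format: the user states how many characters each column occupies.
# These widths are hypothetical: id=3, name=6, city=5.
WIDTHS = [3, 6, 5]


def parse_fixed_width(line, widths):
    """Slice one record into fields using the given column widths."""
    fields = []
    start = 0
    for width in widths:
        fields.append(line[start:start + width].strip())
        start += width
    return fields


fixed_lines = ["101Alice Paris", "102Bob   Rome "]
fixed_rows = [parse_fixed_width(line, WIDTHS) for line in fixed_lines]

print(delimited_rows)
print(fixed_rows)
```

Both layouts carry the same records; only the rule for finding field boundaries differs.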

16. Which is NOT a datastore type?
Answer: File Format

17. What is a repository? List the types of repositories.
Answer: The Data Services repository is a set of tables that holds user-created and predefined system objects, source and target metadata, and transformation rules. There are 3 types of repositories:
- A local repository
- A central repository
- A profiler repository

18. What is the difference between a Repository and a Datastore?
Answer: A Repository is a set of tables that hold system objects, source and target metadata, and transformation rules. A Datastore is an actual connection to a database that holds data.

19. What is the difference between a Parameter and a Variable?
Answer: A Parameter is an expression that passes a piece of information to a work flow, data flow, or custom function when it is called in a job. A Variable is a symbolic placeholder for values.

20. When would you use a global variable instead of a local variable?
Answer:
- When the variable will need to be used multiple times within a job.
- When you want to reduce the development time required for passing values between job components.
- When you need to create a dependency between a job-level global variable name and job components.

21. What is a Substitution Parameter?

[...]

Answer: Adapters are additional Java-based programs that can be installed on the job server to provide connectivity to other systems such as Salesforce.com or the Java Messaging Queue. There is also a Software Development Kit (SDK) to allow customers to create adapters for custom applications.

28. List the Data Integrator transforms.
Answer:
- Data_Transfer
- Date_Generation
- Effective_Date
- Hierarchy_Flattening
- History_Preserving
- Key_Generation
- Map_CDC_Operation
- Pivot
- Reverse Pivot
- Table_Comparison
- XML_Pipeline

29. List the Data Quality transforms.
Answer:
- Global_Address_Cleanse
- Data_Cleanse
- Match
- Associate
- Country_id
- USA_Regulatory_Address_Cleanse

30. What are Cleansing Packages?
Answer: These are packages that enhance the ability of Data Cleanse to accurately process various forms of global data by including language-specific reference data and parsing rules.

31. What is Data Cleanse?
Answer: The Data Cleanse transform identifies and isolates specific parts of mixed data, and standardizes your data based on information stored in the parsing dictionary, business rules defined in the rule file, and expressions defined in the pattern file.



32. What is the difference between Dictionary and Directory?
Answer: Directories provide information on addresses from postal authorities. Dictionary files are used to identify, parse, and standardize data such as names, titles, and firm data.

33. Give some examples of how data can be enhanced through the Data Cleanse transform, and describe the benefit of those enhancements.
Answer:

Enhancement       Benefit
Gender Codes      Determine gender distributions and target marketing campaigns
Match Standards   Provide fields for improving matching results

34. A project requires the parsing of names into given and family, validating address information, and finding duplicates across several systems. Name the transforms needed and the task they will perform.
Answer:
- Data Cleanse: Parse names into given and family.
- Address Cleanse: Validate address information.
- Match: Find duplicates.

35. Describe when to use the USA Regulatory and Global Address Cleanse transforms.
Answer: Use the USA Regulatory transform if USPS certification and/or additional options such as DPV and Geocode are required. Global Address Cleanse should be utilized when processing multi-country data.

36. Give two examples of how the Data Cleanse transform can enhance (append) data.
Answer: The Data Cleanse transform can generate name match standards and greetings. It can also assign gender codes and prenames such as Mr. and Mrs.

37. What are name match standards and how are they used?
Answer: Name match standards illustrate the multiple ways a name can be represented. They are used in the match process to greatly increase match results.

38. What are the different strategies you can use to avoid duplicate rows of data when re-loading a job?
Answer:
- Using the auto-correct load option in the target table.
- Including the Table Comparison transform in the data flow.
- Designing the data flow to completely replace the target table during each execution.
- Including a preload SQL statement to execute before the table loads.

39. What is the use of Auto Correct Load?
Answer: It prevents duplicate rows from entering the target table. It works like a Type 1 load: rows that do not match an existing target row are inserted, and rows that match an existing target row are updated.
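
The insert-else-update behaviour can be sketched in a few lines of plain Python. This is only an illustration of the idea, not how Data Services implements it; the dictionary standing in for the target table and the `id` key column are assumptions.

```python
def auto_correct_load(target, rows, key="id"):
    """Insert rows whose key is new; overwrite rows whose key already exists."""
    for row in rows:
        # Assigning by key covers both cases: a new key inserts, an existing key updates.
        target[row[key]] = dict(row)
    return target


# Hypothetical target table, keyed by primary key.
target_table = {1: {"id": 1, "city": "Paris"}}
incoming = [{"id": 1, "city": "Lyon"}, {"id": 2, "city": "Rome"}]

auto_correct_load(target_table, incoming)
auto_correct_load(target_table, incoming)  # re-running the load adds no duplicates

print(target_table)
```

Running the load twice leaves two rows in the target, which is the property that makes the option useful when a job is re-run.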

40. What is the use of Array fetch size?
Answer: Array fetch size indicates the number of rows retrieved in a single request to a source database. The default value is 1000. Higher numbers reduce the number of requests, which lowers network traffic and can improve performance. The maximum value is 5000.
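
A quick back-of-the-envelope calculation shows why a larger fetch size means fewer requests. The sketch below is plain Python arithmetic; the one-million-row source table is an invented figure.

```python
import math


def fetch_requests(total_rows, array_fetch_size):
    """Number of requests needed to read total_rows at the given fetch size."""
    return math.ceil(total_rows / array_fetch_size)


# Hypothetical source table of one million rows.
for size in (100, 1000, 5000):
    print(size, fetch_requests(1_000_000, size))
```

The calculation only counts requests. Whether the job actually runs faster also depends on row width, network latency, and available memory.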

41. What is the difference between the Row-by-row select, Cached comparison table, and Sorted input options in the Table Comparison transform?
Answer:
- Row-by-row select: looks up the target table using SQL every time it receives an input row. This option is best if the target table is large.
- Cached comparison table: loads the comparison table into memory. This option is best when the table fits into memory and you are comparing the entire target table.
- Sorted input: reads the comparison table in the order of the primary key column(s) using sequential read. This option improves performance because Data Integrator reads the comparison table only once. To use it, add a query between the source and the Table_Comparison transform; then, from the query's input schema, drag the primary key columns into the Order By box of the query.
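
As a rough illustration of the cached comparison approach, the sketch below loads the comparison table into a dictionary once and then flags each input row. It is plain Python with made-up sample rows and column names, not the transform's actual implementation. A row-by-row variant would instead query the target once per input row.

```python
def table_comparison(input_rows, comparison_rows, key, compare_cols):
    """Return (opcode, row) pairs for input rows that are new or changed."""
    # Cached comparison: the whole comparison table is read into memory once.
    cache = {row[key]: row for row in comparison_rows}
    output = []
    for row in input_rows:
        existing = cache.get(row[key])
        if existing is None:
            output.append(("INSERT", row))
        elif any(row[col] != existing[col] for col in compare_cols):
            output.append(("UPDATE", row))
        # Unchanged rows produce no output in this sketch.
    return output


# Hypothetical sample data.
target = [{"id": 1, "city": "Paris"}, {"id": 2, "city": "Rome"}]
incoming = [
    {"id": 1, "city": "Paris"},
    {"id": 2, "city": "Milan"},
    {"id": 3, "city": "Oslo"},
]

result = table_comparison(incoming, target, "id", ["city"])
print(result)
```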

42. What is the use of Number of loaders in the target table?
Answer: Loading with one loader is known as single loader loading. Loading when the number of loaders is greater than one is known as parallel loading. The default number of loaders is 1. The maximum number of loaders is 5.

43. What is the use of Rows per commit?
Answer: It specifies the transaction size in number of rows. If set to 1000, Data Integrator sends a commit to the underlying database every 1000 rows.
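
The effect of the setting can be imitated with Python's built-in `sqlite3` module. This is a sketch of the batching idea only; the `target` table, its columns, and the 2500 sample rows are assumptions made for the example.

```python
import sqlite3


def load_with_commits(conn, rows, rows_per_commit):
    """Insert rows, committing after every rows_per_commit rows; return the commit count."""
    commits = 0
    pending = 0
    for row in rows:
        conn.execute("INSERT INTO target (id, name) VALUES (?, ?)", row)
        pending += 1
        if pending == rows_per_commit:
            conn.commit()
            commits += 1
            pending = 0
    if pending:
        # Commit the final partial batch.
        conn.commit()
        commits += 1
    return commits


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER, name TEXT)")
sample_rows = [(i, f"name{i}") for i in range(2500)]

commits = load_with_commits(conn, sample_rows, 1000)
row_count = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
print(commits, row_count)
```

With 2500 rows and a setting of 1000, the loader commits twice on full batches and once for the remaining 500 rows.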

44. What is the difference between lookup(), lookup_ext(), and lookup_seq()?
Answer:
- lookup(): returns a single value based on a single condition.
- lookup_ext(): returns multiple values based on single or multiple conditions.
- lookup_seq(): returns multiple values based on a sequence number.

45. What is the use of the History_Preserving transform?
Answer: The History_Preserving transform allows you to produce a new row in your target rather than updating an existing row. You can indicate in which columns the transform identifies changes to be preserved. If the value of any of those columns changes, this transform creates a new row for each row flagged as UPDATE in the input data set.
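
The core rule can be sketched as follows, in plain Python. The input shape (an opcode with before and after images of the row) and the column names are assumptions for illustration, and the sketch leaves out details such as validity dates that a real history-preserving load would usually maintain.

```python
def history_preserving(flagged_rows, tracked_cols):
    """Convert UPDATE rows into INSERT rows when a tracked column has changed."""
    output = []
    for opcode, before, after in flagged_rows:
        changed = opcode == "UPDATE" and any(
            before[col] != after[col] for col in tracked_cols
        )
        # A change in a tracked column yields a new row instead of an overwrite.
        output.append(("INSERT" if changed else opcode, after))
    return output


# Hypothetical input: (opcode, before image, after image).
flagged = [
    ("INSERT", None, {"id": 3, "city": "Oslo", "phone": "3"}),
    ("UPDATE", {"id": 1, "city": "Paris", "phone": "1"}, {"id": 1, "city": "Lyon", "phone": "1"}),
    ("UPDATE", {"id": 2, "city": "Rome", "phone": "2"}, {"id": 2, "city": "Rome", "phone": "9"}),
]

preserved = history_preserving(flagged, tracked_cols=["city"])
print([opcode for opcode, _ in preserved])
```

Only the row whose `city` changed becomes a new row. The row where only `phone` changed stays an ordinary update because `phone` is not a tracked column here.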

46. What is the use of the Map_Operation transform?
Answer: The Map_Operation transform allows you to change operation codes on data sets to produce the desired output. Operation codes: INSERT, UPDATE, DELETE, NORMAL, or DISCARD.
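
A minimal sketch of the idea in plain Python: each row carries an operation code, and a mapping rewrites or discards it. The tuple representation and the sample mapping are assumptions for illustration.

```python
def map_operation(rows, mapping):
    """Rewrite each row's operation code; rows mapped to DISCARD are dropped."""
    output = []
    for opcode, row in rows:
        new_opcode = mapping.get(opcode, opcode)  # unmapped codes pass through
        if new_opcode != "DISCARD":
            output.append((new_opcode, row))
    return output


# Hypothetical input rows with operation codes.
rows = [
    ("INSERT", {"id": 1}),
    ("UPDATE", {"id": 2}),
    ("DELETE", {"id": 3}),
]

# Example mapping: treat updates as inserts and ignore deletes.
mapped = map_operation(rows, {"UPDATE": "INSERT", "DELETE": "DISCARD"})
print(mapped)
```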

47. What is Hierarchy Flattening?
Answer: The Hierarchy_Flattening transform constructs a complete hierarchy from parent/child relationships, and then produces a description of the hierarchy in vertically or horizontally flattened format.

Parent Column, Child Column, Parent Attributes, Child Attributes.
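
The two output shapes can be approximated in plain Python as below. This is a simplified sketch that assumes an acyclic hierarchy given as (parent, child) pairs; the sample data is invented, and the exact columns the real transform emits are not reproduced here.

```python
def build_children(edges):
    """Map each parent to the list of its direct children."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)
    return children


def flatten_vertical(edges):
    """One row per (ancestor, descendant, depth) pair."""
    children = build_children(edges)
    pairs = []

    def walk(ancestor, node, depth):
        for child in children.get(node, []):
            pairs.append((ancestor, child, depth))
            walk(ancestor, child, depth + 1)

    for node in sorted({n for edge in edges for n in edge}):
        walk(node, node, 1)
    return pairs


def flatten_horizontal(edges):
    """One row per root-to-leaf path, one column per level."""
    children = build_children(edges)
    roots = sorted({p for p, _ in edges} - {c for _, c in edges})
    paths = []

    def walk(path):
        kids = children.get(path[-1], [])
        if not kids:
            paths.append(path)
        for kid in kids:
            walk(path + [kid])

    for root in roots:
        walk([root])
    return paths


# Hypothetical parent/child pairs.
edges = [("World", "Europe"), ("Europe", "France"), ("Europe", "Italy")]
vertical = flatten_vertical(edges)
horizontal = flatten_horizontal(edges)
print(vertical)
print(horizontal)
```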

48. What is the use of the Case transform?
Answer: Use the Case transform to simplify branch logic in data flows by consolidating case or decision-making logic into one transform. The transform allows you to split a data set into smaller sets based on logical branches.
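
The splitting behaviour can be sketched in plain Python as a list of labelled conditions. The labels, the `region` column, and the choice that the first matching branch wins are assumptions made for this example.

```python
def case_transform(rows, branches, default_label="DEFAULT"):
    """Route each row to the first branch whose condition is true, else to the default."""
    outputs = {label: [] for label, _ in branches}
    outputs[default_label] = []
    for row in rows:
        for label, condition in branches:
            if condition(row):
                outputs[label].append(row)
                break  # first matching branch wins in this sketch
        else:
            outputs[default_label].append(row)
    return outputs


# Hypothetical rows and branch conditions.
orders = [
    {"id": 1, "region": "EMEA"},
    {"id": 2, "region": "APAC"},
    {"id": 3, "region": "LATAM"},
]
branches = [
    ("EMEA", lambda row: row["region"] == "EMEA"),
    ("APAC", lambda row: row["region"] == "APAC"),
]

split = case_transform(orders, branches)
print({label: [row["id"] for row in rows] for label, rows in split.items()})
```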

49. What must you define in order to audit a data flow?
Answer: You must define audit points and audit rules when you want to audit a data flow.
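
One way to picture the two pieces: audit points collect statistics at chosen places in the flow, and audit rules are boolean checks over those statistics. The sketch below is plain Python; the point names, the rule name, and the simulated dropped row are all hypothetical.

```python
def run_audit(audit_points, audit_rules):
    """Evaluate every rule against the collected statistics; return names of failed rules."""
    return [name for name, rule in audit_rules.items() if not rule(audit_points)]


source_rows = [{"id": 1}, {"id": 2}, {"id": 3}]
target_rows = [row for row in source_rows if row["id"] != 3]  # simulate a dropped row

# Audit points: statistics collected at the source and at the target.
audit_points = {
    "source_count": len(source_rows),
    "target_count": len(target_rows),
}

# Audit rule: the row counts at both points must agree.
audit_rules = {
    "counts_match": lambda points: points["source_count"] == points["target_count"],
}

failed = run_audit(audit_points, audit_rules)
print(failed)
```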

50. List some factors for performance tuning in Data Services.
Answer: The following are ways you can adjust Data Integrator performance:

Source-based performance options
- Using array fetch size
- Caching data
- Join ordering
- Minimizing extracted data

Target-based performance options
- Loading method and rows per commit
- Staging tables to speed up auto-correct loads

Job design performance options
- Improving throughput
- Maximizing the number of pushed-down operations
- Minimizing data type conversion
- Minimizing locale conversion
- Improving Informix repository performance
