cloveretl training sample

12
Training course goals Develop and support solutions based on CloverETL technology Compose and debug transformations in CloverETL Designer Connect to any number of data sources or sinks (files, databases, web/cloud…) Detect and react to errors in the data Use CloverETL Server to process large amounts of data Design and develop Job Flows to manage complex processes Support existing solutions using CloverETL © 2013 Javlin; All rights reserved 1

Upload: cloveretl

Post on 23-Dec-2014

3.373 views

Category:

Devices & Hardware


0 download

DESCRIPTION

A sample from CloverETL training available on-site or online

TRANSCRIPT

Page 1: CloverETL Training Sample

Training course goals

• Develop and support solutions based on CloverETL technology– Compose and debug transformations in CloverETL Designer– Connect to any number of data sources or sinks (files,

databases, web/cloud…)– Detect and react to errors in the data– Use CloverETL Server to process large amounts of data– Design and develop Job Flows to manage complex

processes– Support existing solutions using CloverETL

© 2013 Javlin; All rights reserved1

Page 2: CloverETL Training Sample

Course modules: basics

1. CloverETL introductionCloverETL product family, basic terminology

2. First steps in CloverETL DesignerBuilding transformations with basic components, reading and writing data

3. Error handlingProperly handling input data errors and runtime errors

4. Common components and CTL programmingCommonly used components and business rule development in CTL

5. DatabasesConnecting to databases and using them as data sources and targets

© 2013 Javlin; All rights reserved2

Page 3: CloverETL Training Sample

Course modules: advanced

6. Structured dataHandling complex data formats like XML or JSON, using web services

7. Advanced graph designComplex transformation components, Java transformations and more

8. CloverETL ServerIntroduction to CloverETL Server, its user interface and execution environment

9. JobflowsBuilding jobflows to manage your processes on CloverETL Server

10. Advanced CloverETL ServerAdvanced graph scheduling, using Launch Services and CloverETL Cluster

© 2013 Javlin; All rights reserved3

Page 4: CloverETL Training Sample

CloverETL product family

• CloverETL is a whole family of products– Support for broad range of usage scenarios– Purely Java-based – supported on many operating

systems• Windows, *nix, Linux, Mac

© 2013 Javlin; All rights reserved4

Java

CloverETL Engine

CloverETL Designer

CloverETL Server

Page 5: CloverETL Training Sample

Metadata

• Metadata describe record structure and format– Required for each edge used in the graph to define the format of the data flowing

through that edge

• Structure defines fields in the record– Unique (within record) field names– Data types to determine type of information which can be stored in the record.– Flat structure – no nesting is allowed

• Format defines rules for data input and output– Format of the record: delimited, fixed-length or mixed

• Delimiters only apply when working with files

– Parsing rules for readers and formatting rules for writers• Special formatting for numbers, date fields, …

© 2013 Javlin; All rights reserved5

Page 6: CloverETL Training Sample

© 2013 Javlin; All rights reserved6

Metadata types and fields

• Record type determines how to find the fields– Delimited: fields are separated by delimiters– Fixed-length: each fields has predefined number

of characters– Mixed: both types of fields in single record

• Fields can be of various types– Numeric: integer, long, number, decimal– Text: string– Boolean values: boolean– Date and time: date– Other: byte, cbyte– Containers: list or map of a primitive type

Transaction

transactionType

string

transactionId long

accountNumber long

timestamp date

amount decimal(20, 3)

3

1

2

5

4

Page 7: CloverETL Training Sample

© 2013 Javlin; All rights reserved7

1340817132,3293200814,D,59.940,201001021252431340817156,5357054331,C,6.720,20100116080136

1340817746,4270100470,D,194.920,20100323100706

Field ordering matters

• Ordering of the fields is very important– For parsing– For output formatting

• Data is read/written in the same order in which the fields are defined.

Transaction

transactionType

string

transactionId long

accountNumber long

timestamp date

amount decimal(20, 3)

3

1

2

5

4

Page 8: CloverETL Training Sample

Reformat and CTL code

• Transformation in Reformat can be written directly in CTL without using Visual Mode– Use all CTL features: control structures, error handling, logging…– Write comments explaining the complex parts of the code

• Editor supports syntax highlighting, autocomplete and on-the-fly code validation

© 2013 Javlin; All rights reserved8

Page 9: CloverETL Training Sample

© 2013 Javlin; All rights reserved9

Reformat code workflowinit

preExecute

backbacktransform

postExecute

transformOnError

Called during component initialization

Called before the first record is processed.

Main part of the transformation. Called once for each input record.Return value determines which port (if any) receives the result.

Called after the last record is processed immediately before component finishes.

Begin

End

Next

reco

rd

Error

Called only if transform caused an error.

Page 10: CloverETL Training Sample

© 2013 Javlin; All rights reserved10

Data denormalization

accountId customerId

balance created closed

9804568699

27345 2300.56 2011-11-14

1108193472

27345 -1739.05 2005-07-22

6054951154

27345 4500.60 2009-09-01 2010-04-30

9459175447

27345 3200.80 2011-03-08

customerId

totalBalance

accounts

27345 8262.91 [9804568699, 1108193472, 6054951154, 9459175447]

CustomerAccounts

AccountOriginal dataMultiple records grouped based on the key.

Denormalized dataSingle record containing values determined by processing the whole input group.

Denormalize

Page 11: CloverETL Training Sample

© 2013 Javlin; All rights reserved11

Denormalizer

• Converts data into denormalized form– Combine multiple records in a group into one output record– Output usually uses different metadata

• Required configuration– Transformation code

• Only CTL can be used, visual mode is not available

– Grouping• Group can be defined based on a key or group size• If key is used, data has to be sorted

Page 12: CloverETL Training Sample

© 2013 Javlin; All rights reserved12

Denormalize code workflowinit

preExecute

backbacktransform

postExecute

transformOnError

transform and append have their own error handler. Each handler interrupts the group and resumes processing as if the group was processed as a whole.append is called once for each

record in a group. It is typically used to update global variables which are then used in transform function.

transform is called once per group and is the only function which generates output records.

clean is called after each transform and can be used to clean-up internal variables.

Begin

End

Next

reco

rd

Error

backbackappend

back backclean

appendOnError

Next

gro

up