Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Tips & Tricks
DESCRIPTION
Introduction to Talend Open Studio for Data Integration, focusing on job architecture, metadata, workspaces, connection types and commonly used components. Includes a rich Tips & Tricks section.
TRANSCRIPT
![Page 1: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/1.jpg)
Talend Open Studio Fundamentals
gabrielebaldassarre.com
![Page 2: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/2.jpg)
What is Talend for Data Integration?
❏ Eclipse-based visual programming IDE for ETL
applications
❏ Java code generator
❏ 600+ connectors for open and proprietary data systems
❏ Easily embeddable in custom applications
❏ Cross-platform
❏ Central metadata repository
❏ Available in both open source and premium flavours
![Page 3: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/3.jpg)
What does ETL stand for?
It summarizes every operation that loads, retrieves,
digests, consumes, transforms and shapes data:
❏ Extract - get the data from different sources.
From flat files, RDBMS, Big Data systems, web services, business...
❏ Transform - convert it into a form suitable for the destination
data system.
Aggregate, transform, combine, reshape, clean, filter, improve quality...
❏ Load - move to target destination in a suitable way.
Write the data in the target format.
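The three steps can be pictured in plain Java. This is only a minimal sketch under stated assumptions: the class `EtlSketch`, the `name;amount` line format and the filter rule are hypothetical, and TOS generates far more elaborate code than this.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class EtlSketch {
    // Extract: parse raw "name;amount" lines (a stand-in for a flat file).
    static List<String[]> extract(List<String> lines) {
        List<String[]> rows = new ArrayList<>();
        for (String line : lines) {
            rows.add(line.split(";"));
        }
        return rows;
    }

    // Transform: drop rows with a non-positive amount and normalise the name.
    static List<String[]> transform(List<String[]> rows) {
        List<String[]> out = new ArrayList<>();
        for (String[] row : rows) {
            if (Integer.parseInt(row[1].trim()) > 0) {
                out.add(new String[] { row[0].trim().toUpperCase(), row[1].trim() });
            }
        }
        return out;
    }

    // Load: write rows in the target format (here, comma-separated strings).
    static List<String> load(List<String[]> rows) {
        List<String> target = new ArrayList<>();
        for (String[] row : rows) {
            target.add(String.join(",", row));
        }
        return target;
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList("alice; 10", "bob; 0", "carol; 7");
        System.out.println(load(transform(extract(source))));
    }
}
```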
![Page 4: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/4.jpg)
Talend Open Studio
❏ It’s the open source, free to use, community-supported
version of Talend for Data Integration;
❏ Often abbreviated to “TOS”, to distinguish it from the premium
version (“TIS”);
❏ Feature-light, but still completely usable:
❏ Same set of connectors and components as the premium
version;
❏ It lacks teamwork and Enterprise capabilities such as
SVN, scheduling, process orchestration and a monitoring
console.
![Page 5: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/5.jpg)
Hands on!
❏ Download Talend Open Studio for Data Integration
❏ https://www.talend.com/download/data-integration
❏ Download the user manual as well
❏ Install it!
❏ Optional:
❏ Prepare a quick MySQL stack for a ready-to-start
database and other conveniences
❏ https://github.com/r8/vagrant-lamp is worth a try
![Page 6: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/6.jpg)
Say hello to TOS!
![Page 7: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/7.jpg)
TOS Interface: Designer
The Designer is the “canvas” where you “draw” your ETL job, graphically connecting components to each other using different kinds of connections.
![Page 8: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/8.jpg)
TOS Interface: Components Palette
The Palette on the right hosts the complete set of 600+ available components, both custom and built-in.
Use the search field to quickly filter the palette views and find the component you need at a glance.
![Page 9: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/9.jpg)
TOS Interface: Opened Jobs
Currently opened jobs are shown as tabs at the top...
![Page 10: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/10.jpg)
TOS Interface: Repository Pane
The Repository pane hosts all the metadata, such as DB connection credentials, external delimited file schemas and parameters, as well as the whole set of ETL jobs themselves.
![Page 11: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/11.jpg)
TOS Interface: Parameters Pane
The Parameters pane hosts the settings of the selected component, job settings and parameters, debug status and the diagnostic tab.
![Page 12: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/12.jpg)
TOS Interface: Perspectives
...and different Perspectives are available in the top-right corner.
Both TOS and standard Eclipse perspectives are available here.
![Page 13: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/13.jpg)
Workspaces
A Workspace is a container of Projects that share the
same TOS version and the same components palette.
As in Eclipse, you can choose which one to use when the
program starts.
❏ In TOS, it’s a folder on the local drive.
![Page 14: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/14.jpg)
Projects
❏ A Project is a set of jobs and involved metadata;
❏ It’s stored in a subfolder of the Workspace;
❏ Both TOS and Eclipse Preferences are Project-based;
❏ In other words, different projects in the same Workspace
have different settings;
❏ Internally, it’s a mix of XML, .item and .properties files
in classic Eclipse fashion.
![Page 15: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/15.jpg)
Metadata: General Principles
❏ TOS requires a preliminary definition and
description of jobs using metadata.
The Repository holds this information.
❏ There are 8 types of metadata,
although custom components can
define their own. We’ll look at the most
important ones in detail:
❏ Business Models, Job Designs, Contexts,
Code, Metadata.
![Page 16: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/16.jpg)
Metadata: Business Models
❏ It stores diagrams used to
conveniently describe business models
and to associate them with ETL jobs;
❏ It offers a small set of drawing
capabilities in UML fashion;
❏ It’s not widely used, but it has proven
useful for quickly sketching
transformation goals and for
self-documenting ETL.
![Page 17: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/17.jpg)
Metadata: Jobs
❏ It’s the beating heart of the TOS Repository:
the jobs themselves;
❏ Here you’ll store all the metadata you
need for graphically describing the jobs;
❏ Components used, connectors, signals,
parameters, colors and other presentation
details are hosted here.
❏ You can (you should!) organize them in
a folder tree for better clarity.
![Page 18: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/18.jpg)
Metadata: Contexts
❏ It stores context groups, which are
sets of parameters that can be used by
any job in the current Project.
❏ A group is a set of initialized Java
variables of one of the allowed types in
the global scope.
❏ Groups are for presentation only: there
is no limit on how many context variables
you use in jobs, or on how you use them.
![Page 19: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/19.jpg)
Metadata: Code
❏ It stores routines written in Java;
❏ These routines are typically a set of
static methods inside a class.
❏ If your routine is going to be too
complex, consider writing a custom
component instead.
❏ Consider using Maven and Git while
creating a routine for better reliability.
❏ https://github.com/theclue/talend-routine-collection
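A routine, as described above, can be as small as a class of static helpers. The following is a minimal sketch: the class name `StringRoutines` and both methods are hypothetical examples, and the package declaration a real TOS routine would carry is left out to keep the block self-contained.

```java
// Hypothetical user routine: a class holding only static helper methods.
class StringRoutines {

    // Returns the trimmed input, or an empty string when the input is null.
    static String safeTrim(String value) {
        return value == null ? "" : value.trim();
    }

    // Upper-cases the first letter and lower-cases the rest.
    static String capitalize(String value) {
        String v = safeTrim(value);
        if (v.isEmpty()) {
            return v;
        }
        return v.substring(0, 1).toUpperCase() + v.substring(1).toLowerCase();
    }

    public static void main(String[] args) {
        System.out.println(capitalize("  tALEND "));
    }
}
```

Inside a job, such a method would be called from any Java expression, for instance `StringRoutines.capitalize(row1.name)`, where `row1.name` is an assumed column.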
![Page 20: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/20.jpg)
Metadata: ...Metadata?
❏ It stores a heterogeneous set of
reusable, atomic elements for jobs.
❏ They include database parameters and
credentials, external files schema, web
service interfaces, business
applications connections and so on.
❏ User components often add their
metadata types to the list, but this
often breaks compatibility
![Page 21: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/21.jpg)
Anatomy of a Job
❏ A Job is a visual set of components graphically
connected using different connections;
❏ From the visual canvas and the connection topology,
TOS in turn generates Java code;
❏ This code is procedural by design and not really
object-oriented:
❏ It’s fast…
❏ ...but debugging it is a pain in the neck for the experienced
programmer.
![Page 22: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/22.jpg)
Anatomy of a job
❏ Drag and drop components from the Palette onto the canvas,
then visually connect them to each other.
❏ You cannot make closed paths in your jobs!
❏ It’ll become clear later why.
![Page 23: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/23.jpg)
Anatomy of a job: Subjobs
❏ A set of connected components is part of a subjob if they are
all enclosed by a light-blue background;
❏ You can have as many subjobs as you need in a given job.
![Page 24: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/24.jpg)
Anatomy of a job: Starting Point
❏ The starting point component of a subjob is the one with a
green background;
❏ Parallel execution is made using unconnected subjobs, but
you won’t be able to predict the execution order!
![Page 25: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/25.jpg)
Anatomy of a job: Main Connections
❏ The Main connections are those that dictate the data flow;
❏ They carry vectors of data (one vector per row/tuple);
![Page 26: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/26.jpg)
Anatomy of a job: Main Connections
❏ When you have a split, the order of the outputs dictates which branch comes first.
You may change it from the contextual menu.
![Page 27: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/27.jpg)
Anatomy of a job: Lookup Connections
❏ Lookup connections, as the name suggests, make data
available for fast lookup (i.e. join or match operations).
❏ Typically, lookup data vectors are stored in memory during
job processing. So watch out for memory shortage!
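Why lookups cost memory becomes clearer with a small model of the idea: the whole lookup side is loaded into a map, then each main row is matched against it. This is a sketch under assumptions, not the code TOS generates; `LookupJoinSketch` and the country/city data are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LookupJoinSketch {

    // Joins main rows {countryCode, city} against an in-memory lookup
    // (countryCode -> countryName). The lookup map must fit in the heap.
    static List<String> join(List<String[]> mainRows, Map<String, String> lookup) {
        List<String> out = new ArrayList<>();
        for (String[] row : mainRows) {
            String country = lookup.get(row[0]); // one hash probe per main row
            if (country != null) {               // inner join: unmatched rows are dropped
                out.add(row[1] + " (" + country + ")");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> lookup = new HashMap<>();
        lookup.put("IT", "Italy");
        List<String[]> mainRows = new ArrayList<>();
        mainRows.add(new String[] { "IT", "Rome" });
        mainRows.add(new String[] { "XX", "Nowhere" });
        System.out.println(join(mainRows, lookup));
    }
}
```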
![Page 28: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/28.jpg)
Anatomy of a job: Endpoints
❏ Endpoints are components that have no outgoing
connections.
❏ A given subjob can have as many endpoints as needed (think
about what happens after a split operation like the one above).
![Page 29: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/29.jpg)
Signals and Data Connections
❏ There are three types of connections in standard TOS:
❏ Row
❏ Trigger
❏ Iterator
❏ You may select which connection to use from the
contextual menu of any component instance.
![Page 30: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/30.jpg)
Row
❏ Rows are connections that carry data, one tuple at a
time;
❏ Their content is defined by a Schema;
❏ They are used to connect components;
❏ Components connected this way will end up in the same
subjob;
❏ Main, Lookup, Filter, Merge are all data connections;
❏ Custom components can define their own Data
Connection.
![Page 31: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/31.jpg)
Schema
❏ The Schema is an important core concept in TOS design;
❏ Each Row connection must have a non-null schema
declaration, which defines the dimensionality of the
vector of data going into and out of a given
component;
❏ Several primitive Java types are supported.
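One way to picture a schema is as a plain Java class with one typed field per column. The sketch below is only an illustration: the class `CustomerRow` and its columns are hypothetical, and the row classes TOS actually generates may look different.

```java
import java.util.Date;

// Hypothetical row type: one field per schema column.
class CustomerRow {
    Integer id;      // nullable integer column
    String name;     // string column
    Double balance;  // nullable decimal column
    Date createdAt;  // date column

    // Number of columns, i.e. the "dimensionality" of the row vector.
    static final int COLUMN_COUNT = 4;

    static CustomerRow of(Integer id, String name, Double balance, Date createdAt) {
        CustomerRow row = new CustomerRow();
        row.id = id;
        row.name = name;
        row.balance = balance;
        row.createdAt = createdAt;
        return row;
    }

    public static void main(String[] args) {
        CustomerRow row = of(1, "Ada", 12.5, new Date());
        System.out.println(row.id + " " + row.name + " " + row.balance);
    }
}
```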
![Page 32: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/32.jpg)
Triggers
❏ Triggers, as the name suggests, don’t carry data:
they are actually signals.
❏ They are usually used to connect subjobs.
❏ They come in two main flavours, depending on their
scope: Sub Job Triggers and Component Triggers.
❏ They’re typically Go/No-Go events that trigger the execution
of one or more subjobs.
![Page 33: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/33.jpg)
Sub Job Triggers
❏ Sub Job Triggers are the most
widely used in practice;
❏ They are used to connect the
starting points of subjobs;
❏ When connected this way,
subjobs will execute sequentially,
forcing an execution order;
❏ You’ll end up having only one
starting point for the whole chain.
![Page 34: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/34.jpg)
Run If Triggers
❏ A Run If Trigger is a special type of trigger that is fired
only if the embedded expression evaluates to true.
❏ The expression must be written in Java and have a
boolean outcome.
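A typical condition checks a value left behind by an earlier component. The sketch below wraps such an expression in a method so it can run on its own; the key `tFileInputDelimited_1_NB_LINE` is used as an assumed example of a row counter stored in the Global Map, and `RunIfSketch` is a hypothetical name.

```java
import java.util.HashMap;
import java.util.Map;

class RunIfSketch {

    // Stand-in for the boolean expression typed into a Run If trigger:
    // fire only when the previous component read at least one line.
    static boolean shouldRun(Map<String, Object> globalMap) {
        Integer lines = (Integer) globalMap.get("tFileInputDelimited_1_NB_LINE");
        return lines != null && lines > 0;
    }

    public static void main(String[] args) {
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tFileInputDelimited_1_NB_LINE", 3);
        System.out.println(shouldRun(globalMap));
    }
}
```

The null check matters: if the counter was never set, unboxing a null `Integer` in the comparison would throw a NullPointerException.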
![Page 35: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/35.jpg)
Iterators
❏ Iterators stand midway between Data
Connections and Triggers;
❏ They don’t carry data like Rows…
❏ ...but they’re not fired only once like Triggers.
❏ Think of them as Triggers that are fired once for
each incoming row.
❏ They are connected to starting points, like Sub Job
Triggers, but originate from standard components, like
Row Connections.
![Page 36: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/36.jpg)
Component Parameters
❏ When you select a component instance, the Parameters
pane shows the relevant fields for you to fill in;
❏ Several types of parameters are allowed: dropdowns,
radio buttons, schemas, text fields...
❏ Text fields often end up writing their value into the
generated Java code as-is, so be sure to write them
properly:
❏ Enclose strings in double quotes;
❏ Be sure to match the expected type, or cast
otherwise.
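Because the field content is pasted into Java source, it helps to read each field as the right-hand side of a `return` statement. A small sketch, with a hypothetical class and made-up values:

```java
class ParameterFieldSketch {

    // A "file name" field expects a String, so the field content must be
    // a valid String expression, quotes included.
    static String fileName() {
        return "/tmp/out_" + 2024 + ".csv";
    }

    // A numeric field expects an int: a value held as text has to be
    // converted explicitly.
    static int limit() {
        return Integer.parseInt("100");
    }

    public static void main(String[] args) {
        System.out.println(fileName() + " " + limit());
    }
}
```

Typing `/tmp/out.csv` without quotes into a String field would be pasted as bare tokens and fail to compile.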
![Page 37: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/37.jpg)
Components and Repository
❏ Very often, Components allow you to select relevant
metadata from the Repository;
❏ Doing so, you will be able to keep parameters “in sync”
between jobs and component instances;
❏ However, this is not mandatory and at any time you
can detach the component from the Repository.
❏ This puts the component in the “built in” state, which
means that its parameters are locally defined and won’t
be updated anymore when the Repository is.
![Page 38: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/38.jpg)
The Context
❏ The Context holds parameters defined at compile time.
❏ Those parameters are grouped in Context Groups and
defined in the Repository as primitive Java types.
❏ They end up as public attributes of the
context object inside the code.
❏ For example, a parameter named “foo” is referenced
using the syntax context.foo in code and parameter
fields.
❏ Just like parameters, a “built in” Context can be defined,
too, to scope it to the local job only.
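The `context.foo` syntax can be modelled with an object exposing public attributes. This is a simplified sketch: the `Context` class and the variables `dbHost`, `dbPort` and `dbName` are hypothetical, and the generated context class is more involved.

```java
class ContextSketch {

    // Hypothetical stand-in for the generated context object:
    // each context variable becomes a public attribute.
    static class Context {
        public String dbHost = "localhost";
        public Integer dbPort = 3306;
        public String dbName = "sales";
    }

    // The kind of expression you might type into a parameter field.
    static String jdbcUrl(Context context) {
        return "jdbc:mysql://" + context.dbHost + ":" + context.dbPort + "/" + context.dbName;
    }

    public static void main(String[] args) {
        System.out.println(jdbcUrl(new Context()));
    }
}
```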
![Page 39: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/39.jpg)
The Global Map
❏ The Global Map holds parameters defined at runtime.
❏ Those parameters live in a pure Java space.
❏ It’s a key-value Map used to store generic Objects:
❏ globalMap.put("key", object) to store an Object
❏ globalMap.get("key") to get an Object
❏ Since it’s a Java Map of Objects, you must explicitly
cast to the proper type when getting the object back.
❏ It proves very handy when used in conjunction with
Iterators, as they cannot carry data on their own.
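The put/get/cast cycle and its use with iterations can be sketched as follows. The loop plays the role of an Iterator connection and the key `currentFile` is a hypothetical name; a plain `HashMap` stands in for the job's Global Map.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class GlobalMapSketch {

    static List<String> processAll(List<String> fileNames) {
        Map<String, Object> globalMap = new HashMap<>();
        List<String> log = new ArrayList<>();
        for (String fileName : fileNames) {           // one pass per iteration
            globalMap.put("currentFile", fileName);   // stored as a generic Object
            // Downstream code reads the value back and must cast it explicitly.
            String current = (String) globalMap.get("currentFile");
            log.add("processing " + current);
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(processAll(Arrays.asList("a.csv", "b.csv")));
    }
}
```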
![Page 40: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/40.jpg)
Talend Open Studio Common-use Components
gabrielebaldassarre.com
![Page 41: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/41.jpg)
Which component to use…?
❏ TOS comes with more than 600 general-use items;
❏ This is because it must ensure connectivity with tons of
different data sources (e.g. RDBMS, appliances…);
❏ Leaving the rarely needed ones aside, you’ll end up with a very
small subset of life-saving components. We can group
the most important ones into families and look at them in detail:
❏ Database, File, Custom Code, Processing, Orchestration
![Page 42: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/42.jpg)
File Components
❏ These components are used for input and
output from/to local files;
❏ Notable features include the archiving
capabilities and a complete set of file
system management operations, like copy, delete
or directory listing;
❏ Under Linux, you can use named pipes for
streaming data into TOS directly from a
calling shell.
![Page 43: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/43.jpg)
Database Components
❏ These components are used for performing
operations on RDBMS;
❏ Notable features include the components
for SCD and cloud support (e.g. AWS
Redshift);
❏ Unfortunately, because of licensing issues, you often
have to download the JDBC driver from
the RDBMS vendor yourself in order to
use it in TOS; quite annoying!
![Page 44: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/44.jpg)
Custom Code Components
❏ These components allow you to write
Java code directly into your Job;
❏ Although quite hard to manage, these are
real life-savers in a lot of different situations;
❏ A typical use case is when you want to import
and use an external Java library or method.
❏ Several components are available for
different scopes, e.g. generating data flows,
processing rows, etc.
![Page 45: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/45.jpg)
Processing Components
❏ These are probably the most important
components of all;
❏ They include sort, filter, aggregation, join,
sampling, XML traversing;
❏ But the most important component ever is
the tMap;
❏ It’s a general-purpose multi-input,
multi-output mapper component.
❏ We’ll look at it in detail...
![Page 46: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/46.jpg)
tMap in a typical Job
❏ Basically,
think of it as a set of
joins, a set of splits,
and transformations
in the middle.
❏ That’s why it has a
special user interface.
![Page 47: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/47.jpg)
Say hello to tMap
![Page 48: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/48.jpg)
Say hello to tMap
Here come the Input Data Connections with their own Schemas. Only one is the Main connection; the others are all Lookup connections. Here you’ll set the join conditions. Clicking the wrench reveals more options, like the join type and how to load the lookup tables.
![Page 49: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/49.jpg)
Say hello to tMap
On the right pane we have the Output Data Connections, each of them with its own Schema, too. Again, the wrench reveals more options, for example whether the connection must catch rows where the join has failed, and more...
![Page 50: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/50.jpg)
Say hello to tMap
Each output field is a Java expression. This means you can call methods on it, call user routines, combine expressions and more. Click on it to open the powerful Expression Wizard.
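An output field expression is ordinary Java evaluated once per row. In the sketch below the expression is wrapped in a method so it can be run standalone; `Row1`, `firstName` and `lastName` are hypothetical names for an input connection and its columns.

```java
class TMapExpressionSketch {

    // Hypothetical input row with two String columns.
    static class Row1 {
        String firstName;
        String lastName;
    }

    // The part after "return" is the kind of one-line expression
    // an output field could hold: null-safe trimming plus concatenation.
    static String fullName(Row1 row1) {
        return (row1.firstName == null ? "" : row1.firstName.trim() + " ")
             + (row1.lastName == null ? "" : row1.lastName.trim().toUpperCase());
    }

    public static void main(String[] args) {
        Row1 row1 = new Row1();
        row1.firstName = " Ada";
        row1.lastName = "lovelace ";
        System.out.println(fullName(row1));
    }
}
```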
![Page 51: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/51.jpg)
Say hello to tMap
As a convenience, you have the Var pane for adding temporary variables. Use it if your inner transformations cannot be easily handled in a single-line Java expression.
![Page 52: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/52.jpg)
Say hello to tMap
The Schema Editor is for both input and output connections. Check and set here the data types, the length, the nullable flag for each field.
![Page 53: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/53.jpg)
Orchestration Components
❏ These components, as the name states, are
used to “bring order” inside and outside the
jobs;
❏ They allow you to call a TOS job from
another, to put a job in a wait state and more.
❏ Here you will find two components to switch
between Row and Iterator Connections;
❏ A typical use case is when you want to trigger an
event for each row in the incoming connection.
![Page 54: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/54.jpg)
Other useful components
❏ tPreJob and tPostJob are two special starting
points that are triggered, respectively, before
and after all other subjobs in the current job;
❏ tLogRow logs the content of a given Row
connection to the console;
❏ tHashInput and tHashOutput are useful to
define reusable buffers of data inside a job;
❏ tLibraryLoad imports external jars into
the classpath of the current job.
![Page 55: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/55.jpg)
Talend Open Studio Tips and Tricks
gabrielebaldassarre.com
![Page 56: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/56.jpg)
Tips and Tricks
❏ Use Repository metadata when possible:
it’ll make your design more robust.
❏ Generic Schema metadata, as the name
suggests, are useful to define schemas that you
don’t want to be format- and platform-
dependent, like file schemas or database table
schemas.
❏ Always document your jobs: the documentation can then
be exported to a ready-to-use document!
![Page 57: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/57.jpg)
Tips and Tricks
❏ Clicking “Sync Schema” propagates the
current schema forward, changing any
schema to “built in” along the way.
❏ Built in Schemas won’t get updated when
the Repository changes!
❏ If you have large lookup, sort or aggregate
operations, you may need to raise the amount
of RAM devoted to the JVM in Job Parameters.
❏ You may get a Java heap error otherwise.
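The settings involved are the standard JVM heap options, for example `-Xms256M -Xmx1024M` (the values are only an example). To check what a running JVM was actually granted, a small helper like the hypothetical `HeapCheck` below can be used:

```java
class HeapCheck {

    // Maximum heap the JVM will attempt to use, in megabytes.
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```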
![Page 58: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/58.jpg)
Tips and Tricks
❏ Every transformation is a Java expression
in Talend!
❏ Handle null values properly to avoid Java
NullPointerExceptions;
❏ Use primitive wrappers when possible (e.g.
‘Integer’ instead of ‘int’);
❏ Use methods, not operators (e.g. .equals() and
.concat()).
❏ Perform filtering as early as possible to
reduce memory consumption.
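The null-handling and equals() advice above can be sketched in a few helpers. The class and method names are hypothetical; the Java behaviour they rely on (wrappers can be null, `==` on Strings compares references) is standard.

```java
class NullSafetySketch {

    // A wrapper type can hold null for a missing value; unboxing a null
    // Integer would throw a NullPointerException, so default it first.
    static int orZero(Integer value) {
        return value == null ? 0 : value;
    }

    // Putting the constant first keeps equals() safe when status is null.
    static boolean isActive(String status) {
        return "ACTIVE".equals(status);
    }

    // equals() compares content, while == on Strings compares references.
    static boolean sameText(String a, String b) {
        return a != null && a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(orZero(null) + " " + isActive(null) + " "
                + sameText(new String("x"), new String("x")));
    }
}
```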
![Page 59: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/59.jpg)
Getting Help
❏ Talend Forge: forum, custom components, tutorials,
bug trackers, example jobs
❏ Stack Overflow
❏ http://stackoverflow.com/questions/tagged/talend
❏ Books from Packt Publishing
❏ “Getting Started with Talend Open Studio for Data
Integration” by Jonathan Bowen;
❏ “Talend Open Studio Cookbook” by Rick D. Barton.
![Page 60: Talend Open Studio Fundamentals #1: Workspaces, Jobs, Metadata and Trips & Tricks](https://reader035.vdocuments.us/reader035/viewer/2022081720/557d5f3fd8b42abf3d8b4ffd/html5/thumbnails/60.jpg)
Contacts
❏ Tutorials
❏ Custom components
❏ Ready-made jobs
❏ Use Cases
http://gabrielebaldassarre.com
Need help? Questions? Consulting needs?
http://gabrielebaldassarre.com/contacts/
@cerealping