Talend Data Fabric Security architecture overview


Post on 19-Jan-2022


Contents

Summary
Talend architecture
  Talend Management Console
  Talend Data Inventory
  Talend Data Preparation
  Talend Data Stewardship
  Talend API Designer
  Talend API Tester
  Talend Pipeline Designer
Hybrid infrastructure
Data Fabric infrastructure
  Computation resources
    Talend Management Console and Talend Pipeline Designer
    Talend Data Preparation
  Data storage
    Data that we collect
    Data that customers use with Talend Data Fabric
  Network
Data flows
  Data flows between Talend Studio and Talend Data Fabric
  Data flows between Talend Studio jobs and Talend Data Fabric
  Data flows between Remote Engine and Talend Data Fabric
    MSG service URL
    Repository service URL
    Pair service URL
    DTS service URL
    Remote Engine service URL
    Vault gateway service URL
  Data flows in hybrid deployment between Talend Data Inventory, Talend Data Preparation, Talend Data Stewardship, Talend Data Quality, and Talend Data Fabric
  Public APIs
Security at Talend
  Physical security
  Security training
  Secure software development
  Cloud workload protection and monitoring
  Authentication, authorization, and access control
    Standard access
    Administrative access
    Password management
  Key management
    On AWS
    On Azure
  Vulnerability management
  Backups
  Disaster recovery and business continuity
  Security certifications


Summary

Talend Data Fabric is a managed cloud integration platform that makes it easy for developers and data constituents to collect, transform, and clean data. Talend leverages security and privacy best practices to protect both the Talend platform and Talend, the company. Talend implements a combination of policies, procedures, and technologies to ensure your data is protected and secured. Talend’s chief information security officer (CISO) defines the Talend security strategy, architecture, and program. This document provides an overview of the Talend internal architecture and our policies and procedures as they pertain to employee, physical, network, infrastructure, platform, architecture, and data security.

Talend is SOC 2 Type 2 and HIPAA (Health Insurance Portability and Accountability Act) certified.


Talend architecture

Talend Data Fabric is a multi-tenant platform. All managed components are hosted on either Amazon Web Services (AWS) or Microsoft Azure, according to customer preference.

Talend Data Fabric comprises seven applications:

• Talend Management Console
• Talend Data Inventory
• Talend Data Preparation
• Talend Data Stewardship
• Talend API Designer
• Talend API Tester
• Talend Pipeline Designer

Additionally, Talend Studio, which runs on a local workstation, allows users to design data integration flows (or Talend Jobs) and publish them to Talend Data Fabric.

Here is an overview of Talend’s functional architecture.

Figure 1: Talend functional architecture


The table below summarizes what applications are available or can be installed where. All Talend Data Fabric applications are available on AWS. Some have been released on Azure. Some components can optionally be installed in a hybrid configuration, residing on customer infrastructure. Please refer to the Hybrid Infrastructure section below for more details about hybrid configurations.

Component                    Amazon Web Services   Azure   Hybrid installation
Talend Management Console    Yes                   Yes     N/A
Talend Data Inventory        Yes                   Yes     N/A
Talend Data Preparation      Yes                   -       Yes
Talend Data Stewardship      Yes                   Yes     Yes
Talend API Designer          Yes                   -       N/A
Talend API Tester            -                     -       Yes
Talend Pipeline Designer     Yes                   Yes     N/A

Each of the following sections briefly describes a Talend Data Fabric application and gives an overview of its functional architecture. Please refer to our website at www.talend.com for more details about each application and terms used throughout the document.

Talend Management Console

Talend Management Console (TMC) is a browser-based application that provides access to all Talend Data Fabric applications and components as well as the administrative features and configurations that surround them.

TMC lets users schedule the execution of Talend Jobs via discrete components called execution engines. There are two types of engines:

• Cloud Engines are fully managed components that are provisioned, deployed, and controlled by Talend within our platform. Cloud Engines do not share jobs from multiple tenants; they are provisioned at execution time (per job schedules), per tenant.

• Remote Engines are execution agents deployed and managed by customers on their own systems, within their own physical or virtual (cloud) networks.


Talend Data Inventory

Talend Data Inventory (TDI) provides automated tools for dataset documentation, quality proofing, and promotion. It identifies data silos across data sources and targets to provide visualization of reusable and shareable data assets.

Figure 2: Talend Data Inventory functional architecture


Talend Data Preparation

Talend Data Preparation (TDP) allows customers to simplify and speed up the process of preparing data for analysis and other tasks. TDP allows customers to create, update, remove, and share datasets, then create preparations on top of the datasets that can be incorporated into Talend Jobs with Talend Studio.

Figure 3: Talend Data Preparation functional architecture

Figure 4: Talend Data Preparation functional architecture in hybrid deployment


Talend Data Stewardship

Talend Data Stewardship (TDS) allows customers to collaboratively curate, validate, and resolve conflicts in data, as well as address potential data integrity issues.

Figure 5: Talend Data Stewardship functional architecture

Figure 6: Talend Data Stewardship functional architecture in hybrid deployment


Figure 7: Talend API Services functional architecture

Talend API Designer

Talend API Designer lets users design APIs collaboratively and visually, then run simulations to test APIs and generate reference documentation.

Talend API Tester

Talend API Tester lets users automatically generate test cases from API contracts, then field test APIs by grouping tests together that simulate real-world examples. Users can integrate unit tests into a managed CI/CD process to ensure quality.


Talend Pipeline Designer

Talend Pipeline Designer (TPD) allows customers to design and run data pipelines in the cloud.

• A data pipeline is a data integration process: a series of transformation steps applied to data. It extracts data from customer-specified sources, transforms it step by step using prebuilt processors, and loads it into other datasets (destinations).
• Data pipelines can be started directly from TPD or scheduled in Talend Management Console.
• Data pipelines can be executed on Cloud Engines or Remote Engines.

Figure 8: Talend Pipeline Designer functional architecture


Hybrid infrastructure

Some organizations use Talend in a hybrid configuration, with some components running on local devices and others running on cloud platforms. The only required component for running Talend in a hybrid environment is the Talend Studio development environment, which is installed on local workstations and offers similar functionality to the cloud-native Talend Pipeline Designer. Users may install additional applications or components in a hybrid configuration:

• Talend Cloud API Tester: a web browser extension.
• Remote Engine: a Java-based runtime to execute Talend Jobs on-premises or on a cloud platform that you control. If you do not install Remote Engine, you will use Cloud Engine.
• Remote Engine Gen2: a Docker-based runtime to execute Talend Pipeline Designer data pipelines on-premises or on a cloud platform that you control. If you do not install Remote Engine Gen2, you will use Cloud Engine for Design.


Data Fabric infrastructure

Talend Data Fabric is a multitenant integration environment that allows you to design, manage, and check data pipelines. All managed components are hosted on either Amazon Web Services (AWS) or Microsoft Azure, according to customer preference.

Secrets such as passwords, keys, and certificates are managed via third-party technologies and products. We go into more detail about this in the Key Management section below.

Computation resources

Talend Management Console, Talend Data Preparation, and Talend Pipeline Designer are the only Data Fabric applications that give separate computation resources to each tenant. Each is a multitenant application that is hosted and runs on AWS or (except for Talend Data Preparation) Azure.

Talend Management Console and Talend Pipeline Designer

Remote Engines are deployed by customers on their own systems and therefore given computation resources that they manage and control.

Cloud Engines are deployed within Talend Data Fabric as separate tenant-specific AWS EC2 or Azure VM instances and never shared with other tenants. Each tenant gets its own Cloud Engine instance on AWS or Azure.

The live preview feature of Talend Pipeline Designer, which allows users to preview the output of processors while designing a pipeline, is executed in a dedicated Remote Engine or Cloud Engine.


Talend Data Preparation

Data Preparation process computations are isolated in separate threads for each tenant. Tenants can choose where the computation results are stored:

1. In an AWS S3 bucket that the tenant manages. AWS S3 credentials are not stored in Talend.

2. In Talend. In this case, results are stored in the bucket/folder specified by the Configuration service, an internal Talend service.

Data storage

Talend works with two general types of data: data that we collect and data that customers use with the software.

Data that we collect

Talend, across its cloud applications, collects only customer information that it needs to provide its services or to manage customer accounts.

All personally identifiable information that we collect (e.g. name, country, and email address) is protected with best security practices: It is encrypted at rest via AES-256 and in transit via TLS 1.3.
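As an illustration of the in-transit requirement, a client connecting to a TLS endpoint could refuse any protocol version older than TLS 1.3. The following is a minimal Python sketch using only the standard `ssl` module. The function name `make_tls13_context` is hypothetical and is not part of any Talend product; how Talend enforces this on its own endpoints is not described in this document.

```python
import ssl


def make_tls13_context() -> ssl.SSLContext:
    """Return a client-side context that rejects anything older than TLS 1.3."""
    # create_default_context() keeps certificate and hostname verification on.
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_3
    return context


context = make_tls13_context()
```

A context built this way can be passed to standard-library clients, for example as the `context` argument of `urllib.request.urlopen`.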

No payment information is stored in Talend Data Fabric. We rely on third-party vendors to collect and manage payment information.

Data that customers use with Talend Data Fabric

Whether customers use Remote Engines or Cloud Engines, their datasets remain on systems and data repositories that they manage. Metadata, Designs, Talend Jobs, Artifacts, and any other objects that Talend stores to provide services or for security reasons are isolated via tenant-specific schemas and tenant-specific data encryption keys.

Network

To function properly and deliver its services, Talend Data Fabric may need to communicate with external third-party solutions. All communications between Talend Data Fabric and such external solutions need to be authorized and initiated by Talend Data Fabric. No external solution can communicate with Talend Data Fabric unless the communication was initiated by Talend Data Fabric.

Talend networks and systems are protected via network and application firewalling, visibility mechanisms, and micro segmentation strategies.


Data flows

This section gives an overview of the data flows between Talend Data Fabric applications and components.

Data flows between Talend Studio and Talend Data Fabric

The types of data that can be exchanged between Talend Studio and Talend Data Fabric include:

a) Task artifact binaries

b) Task artifact metadata (e.g. context variables and parameters)

c) Talend API Designer definitions

Users’ credentials (e.g. login name and password or API token generated in TMC) are required to authorize the transfer.

Figure 9: Talend data flows when using Cloud Engines (diagram labels: Firewall, Talend Studio, Cloud Engine, Talend Data Fabric, on-premises apps and databases, cloud files, data warehouse, SaaS apps; flows: status and logs over HTTPS, metadata in transit over HTTPS, customer data in transit)


Metadata is transferred to Talend Cloud via the following URLs:

Cloud   Region         Talend Inventory service URL
AWS     US             https://tmc.us.cloud.talend.com/inventory
AWS     Europe         https://ipaas.eu.cloud.talend.com/ipaas-services/services/inventory
AWS     Asia-Pacific   https://ipaas.ap.cloud.talend.com/ipaas-services/services/inventory
Azure   US             https://tmc.us.cloud.talend.com/inventory
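When Talend Studio sits behind an outbound proxy or firewall, the inventory URLs above translate into a short list of hostnames to allow over HTTPS. The sketch below, in Python, derives that list from the table. The URLs are copied from the table; the names `INVENTORY_URLS` and `inventory_hosts` are illustrative only.

```python
from urllib.parse import urlparse

# Values copied from the inventory service table; the structure is illustrative.
INVENTORY_URLS = {
    ("AWS", "US"): "https://tmc.us.cloud.talend.com/inventory",
    ("AWS", "Europe"): "https://ipaas.eu.cloud.talend.com/ipaas-services/services/inventory",
    ("AWS", "Asia-Pacific"): "https://ipaas.ap.cloud.talend.com/ipaas-services/services/inventory",
    ("Azure", "US"): "https://tmc.us.cloud.talend.com/inventory",
}


def inventory_hosts() -> list[str]:
    """Return the unique hostnames behind the inventory URLs, sorted."""
    return sorted({urlparse(url).hostname for url in INVENTORY_URLS.values()})
```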

API designs are retrieved using the following secured endpoints:

Cloud   Region         API Design service URL
AWS     US             https://api-apid-service.us.cloud.talend.com/external/projects
                       https://api-apid-service.us.cloud.talend.com/external/projects/{projectId}
AWS     Europe         https://api-apid-service.eu.cloud.talend.com/external/projects
                       https://api-apid-service.eu.cloud.talend.com/external/projects/{projectId}
AWS     Asia-Pacific   https://api-apid-service.ap.cloud.talend.com/external/projects
                       https://api-apid-service.ap.cloud.talend.com/external/projects/{projectId}
Azure   US             https://api-apid-service.us.cloud.talend.com/external/projects
                       https://api-apid-service.us.cloud.talend.com/external/projects/{projectId}
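The two URL shapes in the table (the project list and a single project identified by `{projectId}`) can be built from a region code, as in this Python sketch. The URL patterns are taken from the table; the function name `api_design_url`, the region mapping, and the percent-encoding of the project identifier are assumptions made for illustration. Authentication is not shown.

```python
from typing import Optional
from urllib.parse import quote

# Region codes as they appear in the hostnames of the table.
API_DESIGN_REGIONS = {"US": "us", "Europe": "eu", "Asia-Pacific": "ap"}


def api_design_url(region: str, project_id: Optional[str] = None) -> str:
    """Build the project-list URL, or a single-project URL when project_id is given."""
    code = API_DESIGN_REGIONS[region]
    base = f"https://api-apid-service.{code}.cloud.talend.com/external/projects"
    if project_id is None:
        return base
    # Percent-encode the identifier so it stays a single path segment.
    return f"{base}/{quote(project_id, safe='')}"
```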

By default, Talend Studio uploads Talend Jobs using the following pre-signed URLs:

Cloud   Region         S3 pre-signed URL
AWS     US             *-talend-com.s3.us-east-1.amazonaws.com
AWS     Europe         *-talend-com.s3.eu-central-1.amazonaws.com
AWS     Asia-Pacific   *-talend-com.s3.ap-northeast-1.amazonaws.com
Azure   US             minio.us-west.cloud.talend.com
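Because three of the upload hosts are wildcard patterns, an egress filter has to match them by pattern rather than by exact name. A minimal Python sketch of such a check follows. The patterns are copied from the table; the function name `is_upload_host` is hypothetical, and the bucket names used to call it are made up.

```python
from fnmatch import fnmatchcase

# Host patterns copied from the pre-signed URL table.
UPLOAD_HOST_PATTERNS = [
    "*-talend-com.s3.us-east-1.amazonaws.com",
    "*-talend-com.s3.eu-central-1.amazonaws.com",
    "*-talend-com.s3.ap-northeast-1.amazonaws.com",
    "minio.us-west.cloud.talend.com",
]


def is_upload_host(hostname: str) -> bool:
    """Return True if hostname matches one of the listed upload host patterns."""
    host = hostname.lower()  # hostnames are case-insensitive
    return any(fnmatchcase(host, pattern) for pattern in UPLOAD_HOST_PATTERNS)
```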


Data flows between Talend Studio jobs and Talend Data Fabric

Talend Studio has three components that can communicate with Talend Data Preparation on Talend Data Fabric:

1. tDatasetInput: calls Talend to retrieve the content of a dataset.

2. tDatasetOutput: calls Talend to post the content of a dataset.

3. tDataprepRun, in its DI and Spark variants:

• DI component: calls Talend to list preparations and, at runtime, to retrieve each row that needs to be prepared.

• Spark component: calls Talend to list preparations and, at runtime, to retrieve the chosen preparation steps, lookup datasets, and semantic types.

Data flows between Remote Engine and Talend Data Fabric

Figure 10: Talend data flows when using Remote Engines (diagram labels: Firewall, Talend Studio, Remote Engine, Talend Data Fabric, on-premises apps and databases, cloud files, data warehouse, SaaS apps; flows: status and logs over HTTPS, metadata in transit over HTTPS, customer data in transit)


As mentioned earlier, Talend never initiates connections with Remote Engines. Remote Engines must initiate all connections to Talend. Once a connection is established, all data is sent encrypted over HTTPS.

Here are the types of data that can be exchanged between Remote Engines and Talend:

a) Status information and metrics

b) Lifecycle commands

c) Task artifact metadata

d) Logs

e) Task artifact binaries

The following sections describe, for each data type, the REST service URLs that are targeted and the systems behind them. There are URLs for the US, Europe, and Asia-Pacific regions.

MSG service URL

This is the service URL of the primary gateway to Talend’s ActiveMQ cluster. Data of types a) to d) in the list above is sent on this service channel. Remote Engine status information and lifecycle commands are the first data sent over the wire. This path is a control path used to schedule flow deployments and capture execution status (success or failure). Other information transferred includes the number of rows successfully processed or rejected, and the final success message.

Cloud   Region         Msg service URL
AWS     US             msg.us.cloud.talend.com
AWS     Europe         msg.eu.cloud.talend.com
AWS     Asia-Pacific   msg.ap.cloud.talend.com
Azure   US             msg.us-west.cloud.talend.com


Repository service URL

This is the service URL of the primary access point for Talend Job and action binaries. This REST service provides access to Nexus repositories. These repositories are accessible only via HTTPS, with unique Nexus credentials that are created during Remote Engine pairing.

Cloud   Region         Repo service URL
AWS     US             repo.us.cloud.talend.com
AWS     Europe         repo.eu.cloud.talend.com
AWS     Asia-Pacific   repo.ap.cloud.talend.com
Azure   US             repo.us-west.cloud.talend.com

Pair service URL

This is the service URL used during initial pairing of the Remote Engine to its account. It is used to send the heartbeat, availability, and status of the engine itself. It is only accessible via HTTPS.

Cloud   Region         DTS service URL
AWS     US             dts.us.cloud.talend.com
AWS     Europe         dts.eu.cloud.talend.com
AWS     Asia-Pacific   dts.ap.cloud.talend.com
Azure   US             dts.us-west.cloud.talend.com

DTS service URL

This is the service URL of the Talend token generation service. It is used to create one-time, time-limited tokens to authorize file uploads from the Remote Engine to Talend. The file uploads are HTTPS POSTs with logs or resource files.

Cloud   Region         DTS service URL
AWS     US             dts.us.cloud.talend.com
AWS     Europe         dts.eu.cloud.talend.com
AWS     Asia-Pacific   dts.ap.cloud.talend.com
Azure   US             dts.us-west.cloud.talend.com
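The idea of a one-time, time-limited upload token can be sketched in a few lines of Python. This is not Talend's DTS implementation, which this document does not describe; the class `OneTimeTokenStore`, its default lifetime, and its in-memory storage are assumptions chosen to show the two properties: a token expires after a fixed lifetime, and it can be redeemed only once.

```python
import secrets
import time


class OneTimeTokenStore:
    """Hypothetical in-memory issuer of single-use, expiring tokens."""

    def __init__(self, ttl_seconds: float = 60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._expiry_by_token = {}

    def issue(self) -> str:
        """Create a random token that is valid for ttl_seconds."""
        token = secrets.token_urlsafe(32)
        self._expiry_by_token[token] = self._clock() + self._ttl
        return token

    def redeem(self, token: str) -> bool:
        """Accept a token at most once, and only before it expires."""
        # pop() removes the token whether or not it is still valid.
        expiry = self._expiry_by_token.pop(token, None)
        return expiry is not None and self._clock() < expiry
```

An upload handler built on this sketch would call `redeem` before accepting the HTTPS POST and reject the request when it returns `False`.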


Remote Engine service URL

This is the service URL of the Talend token generation service. It is used to create one-time, time-limited tokens to authorize file uploads from the Remote Engine to Talend. The file uploads are HTTPS POSTs with logs or resource files.

Cloud   Region         Remote Engine service URL
AWS     US             engine.us.cloud.talend.com
AWS     Europe         engine.eu.cloud.talend.com
AWS     Asia-Pacific   engine.ap.cloud.talend.com
Azure   US             engine.us-west.cloud.talend.com

Vault gateway service URL

This service is used with Remote Engine Gen2 to decrypt each tenant’s sensitive data.

Cloud   Region         Vault Gateway service URL
AWS     US             vault-gateway.us.cloud.talend.com
AWS     Europe         vault-gateway.eu.cloud.talend.com
AWS     Asia-Pacific   vault-gateway.ap.cloud.talend.com
Azure   US             vault-gateway.us-west.cloud.talend.com
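Since Remote Engines initiate every connection, the hostnames in the service tables are what an outbound firewall in front of a Remote Engine needs to allow. The hostnames follow a regular `service.region.cloud.talend.com` pattern, which the Python sketch below uses to generate the list for one region. Only service prefixes whose hostnames appear in the tables are included; the names `REGION_CODES`, `SERVICES`, and `remote_engine_hosts` are illustrative.

```python
# Region codes and service prefixes as they appear in the service URL tables.
REGION_CODES = {
    ("AWS", "US"): "us",
    ("AWS", "Europe"): "eu",
    ("AWS", "Asia-Pacific"): "ap",
    ("Azure", "US"): "us-west",
}
SERVICES = ["msg", "repo", "dts", "engine", "vault-gateway"]


def remote_engine_hosts(cloud: str, region: str) -> list[str]:
    """Return the service hostnames listed for the given cloud and region."""
    code = REGION_CODES[(cloud, region)]
    return [f"{service}.{code}.cloud.talend.com" for service in SERVICES]
```

For example, `remote_engine_hosts("AWS", "Europe")` yields the five `*.eu.cloud.talend.com` hostnames from the tables.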


Data flows in hybrid deployment between Talend Data Inventory, Talend Data Preparation, Talend Data Stewardship, Talend Data Quality, and Talend Data Fabric

Guiding principle — Talend applications and components initiate HTTPS connections. Talend Data Fabric never initiates any connection to these applications.

Here are the types of data that can be exchanged between these applications and Talend Data Fabric:

a) During user login: The client ID and client secret (as defined in the OIDC specification) of the installed application are used to authorize its communication with Talend Data Fabric.

b) After user login: A JSON Web Token (JWT) that represents the user’s identity, metadata, and claims is transferred back to the application.

Public APIs

In addition to the data flows between Talend applications, Talend exposes public APIs that let developers automate workflows. These APIs are secured with Personal Access Tokens generated with TMC.

Security at Talend

Talend’s security organization consists of a dedicated team of security experts distributed across the company who work closely with the Talend CISO. Their mission is to protect Talend and its clients with security best practices. This team supports all aspects of Talend business, including Talend development and operations. The responsibility of Talend security rolls up to the CISO, who also defines Talend security strategy, architecture, and program.

Physical security

Talend maintains security controls to prevent unauthorized physical access to buildings and data centers and to protect its systems and software, and by extension the Talend environment, from damage, interruption, misuse, or theft.

Authorizations are reviewed regularly, and access is monitored continuously.

Security training

All Talend employees are trained on security best practices. All Talend employees involved in the Talend development lifecycle, from creation to deployment and operation, are guided through trainings, reviews, and drills.

Secure software development

Talend’s security organization is involved throughout the creation of any new application, capability, or feature.

Our security experts conduct architecture, design, and code reviews.

Software composition analysis (SCA) and static application security testing (SAST) scans are integrated into the software development lifecycle.

Talend implements an Open Web Application Security Project (OWASP) Top 10 awareness program during application development, and schedules regular internal and external audits to assess compliance with OWASP best practices.

Cloud workload protection and monitoring

We use a combination of security services from third-party vendors to protect Talend Data Fabric.

Our security experts use external scanning tools to ensure that systems and containers are hardened, configured, and patched according to Talend guidelines and best practices.

Our deployments leverage the built-in segmentation capabilities of AWS EC2 Security groups or Microsoft Azure Network Security groups to restrict inter-resource communication.

We use web application firewalls to inspect north/south and east/west traffic flows to our applications.


Our SOC monitors all security relevant events captured in our SIEM.

We leverage the built-in threat detection capabilities of AWS GuardDuty and Azure Advanced Threat Protection to detect malicious activity and unauthorized behavior.

Authentication, authorization, and access control

Standard access

Tenant users are authenticated with their own unique credentials: username plus password.

Talend issues X.509 public key certificates, which must be used to secure and encrypt all communications between user systems and Talend Data Fabric. Talend Data Fabric supports HTTPS over TLS.

The authentication process follows the OpenID Connect standard and uses either the authorization code or the implicit flow. Once connected, a session is managed using either cookies or a JWT.

Talend never accesses users’ credentials. Starting in 2020, Talend is progressively introducing a new identity manager based on the Auth0 platform. Auth0 is a third-party service provider that complies with Talend Security standards and certifications. This migration is part of a global security strategy to better enable Talend to concentrate on our core domain by working with trusted third-party security vendors.

Within each operational region, Talend pairs with dedicated Auth0 private instances, ensuring best performance and compliance with local data sovereignty laws.

Administrative access

Talend Data Fabric administrative access requires management review and approval. Elevated privilege access requires the same level of approval by management.

Access to any management console, Talend Data Fabric, AWS, or Azure requires multifactor authentication (credentials plus secret keys).

Access to the AWS console is restricted to select members of the Talend Site Reliability Engineering (SRE) or Information Security teams. New account creation follows a strict approval process. Accounts are reviewed quarterly.

System access is provided via SSH private keys. Public keys are automatically deployed with the Talend configuration management tool.

Password management

Talend maintains a password management policy that all employees must comply with. It ensures the creation of strong passwords, the protection of those passwords, and a reasonable frequency of password change.

All system-level passwords (e.g., root, enable, application administration accounts, etc.) must be changed on at least a quarterly basis.

All production system-level passwords must be part of the Talend IT administered secrets server.

All user-level passwords (e.g., email, web, desktop computer, etc.) must be changed at least every three months.


Key management

Talend key management currently differs between AWS and Azure. Talend plans to manage keys on AWS in the same way as keys on Azure.

On AWS

Talend relies on AWS-managed Customer Master Keys (CMK) for encryption. Talend uses its own AWS CMK to generate unique Data Encryption Keys (DEK).

Most DEKs are tenant-specific and are managed (including rotation) by Talend. DEKs that do not need to be tenant-specific are managed via the AWS Encryption SDK.
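To make the idea of tenant-specific, rotated DEKs concrete, here is a small Python sketch of a per-tenant key ring. It is an in-memory stand-in, not Talend's or AWS's mechanism: the class `TenantKeyRing` is hypothetical, keys are generated with the standard `secrets` module rather than by a CMK, and no wrapping or encryption is performed. It only shows that each tenant has its own keys, that rotation adds a new version, and that older versions stay available so previously encrypted data remains readable.

```python
import secrets


class TenantKeyRing:
    """Hypothetical in-memory key ring; a real system would store DEKs wrapped by a master key."""

    def __init__(self) -> None:
        self._keys: dict[str, list[bytes]] = {}

    def rotate(self, tenant_id: str) -> int:
        """Create a fresh 256-bit key for the tenant and return its version number."""
        versions = self._keys.setdefault(tenant_id, [])
        versions.append(secrets.token_bytes(32))
        return len(versions)

    def current(self, tenant_id: str) -> tuple[int, bytes]:
        """Return (version, key) for the tenant's newest key."""
        versions = self._keys[tenant_id]  # raises KeyError if the tenant has no key yet
        return len(versions), versions[-1]

    def get(self, tenant_id: str, version: int) -> bytes:
        """Return a specific key version, for data written before a rotation."""
        if version < 1:
            raise KeyError(version)
        try:
            return self._keys[tenant_id][version - 1]
        except IndexError:
            raise KeyError(version) from None
```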

Front-end TLS endpoints are managed through the AWS Certificate Manager (ACM). The private key is generated by Talend and the associated certificate signed by Talend’s approved Certificate Authority (CA), GoDaddy. The certificates are then published as part of the Certificate Transparency program and uploaded to the ACM.

On Azure

Talend applications and components deployed on Azure obtain and use tenant-specific Master Keys from HashiCorp Vault to encrypt tenant-related data.

Front-end TLS endpoints are managed through Traefik (Edge Router running as a Kubernetes service) and Kubernetes Secrets. Private keys are generated by Talend and certificates are signed by Talend’s approved Certificate Authority (CA), GoDaddy. The certificates are then published as part of the Certificate Transparency program and uploaded to Traefik configuration as Kubernetes secrets.

Vulnerability management

All applications are tested by Talend’s security experts at least twice a year, through dynamic application security testing (DAST) and penetration tests.

In addition, Talend leverages internal and third-party security services to perform external penetration tests.

Third-party penetration tests are scheduled twice a year and prior to any new Talend Data Fabric release and deployment. The penetration tests cover a wide range of security aspects of the application and address modern web best practices.

All detected vulnerabilities are logged by the Talend Quality Assurance team and analyzed by the Talend Information Security team, which then supports, tracks, and tests their remediation.

Talend follows the Security Content Automation Protocol (SCAP) framework. Vulnerabilities are rated according to the Common Vulnerability Scoring System (CVSS) v3.0 equation. Vulnerabilities are resolved depending on their severity rating and their potential impact on the infrastructure.
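For reference, the CVSS v3.0 specification defines a qualitative severity rating scale on top of the numeric base score: None (0.0), Low (0.1 to 3.9), Medium (4.0 to 6.9), High (7.0 to 8.9), and Critical (9.0 to 10.0). The Python sketch below maps a score to its rating. The function name `cvss_v3_severity` is illustrative, and this document does not state how Talend maps each rating to a remediation timeline.

```python
def cvss_v3_severity(score: float) -> str:
    """Map a CVSS v3.0 base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("CVSS v3.0 base scores range from 0.0 to 10.0")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"
```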

Third-party penetration test reports are available upon request at Talend’s discretion.

Backups

Talend uses various AWS and Azure data storage services. All data storage services are regularly and automatically backed up and mirrored to a remote site. Most backups occur every hour.

Backup executions are monitored.

Integrity checks are systematically made one week following every new production deployment.


Disaster recovery and business continuity

Talend maintains a disaster recovery/business continuity (DR/BC) policy that is reviewed, updated, and tested annually.

Talend operates in multiple AWS and Azure regions globally. Any Talend instance in any public cloud region can fail over to another region of the same public cloud vendor.

We are in close contact with both vendors and carefully monitor their service levels to make sure that they meet our required service levels.

Our development team spans six geographical locations: one in the US, four in Europe, and one in Asia. Every development function can be fulfilled by at least two developers.

Our operations team spans five geographical locations: two in the US, three in Europe, and one in Asia. Every operations function can be fulfilled by at least two members of the team.

Security certifications

Talend is SOC 2 Type 2 and HIPAA certified.

We use the Cloud Security Alliance (CSA) Security Trust Assurance and Risk (STAR) program to assess our security practices and validate the security posture of our cloud offerings.

Please refer to AWS and Azure websites for more details about their security certifications and compliance information.


About Talend

Talend (NASDAQ: TLND), a leader in data integration and data integrity, enables every company to find clarity amidst the data chaos.

Talend is the only company to bring together in a single platform all the necessary capabilities that ensure enterprise data is complete, clean, compliant, and readily available to everyone who needs it throughout the organization. With Talend, organizations are able to deliver exceptional customer experiences, make smarter decisions in the moment, drive innovation, and improve operations.

To learn more, please visit www.talend.com