Talend Data Fabric
Security architecture overview
Contents
Summary
Talend architecture
  Talend Management Console
  Talend Data Inventory
  Talend Data Preparation
  Talend Data Stewardship
  Talend API Designer
  Talend API Tester
  Talend Pipeline Designer
Hybrid infrastructure
Data Fabric infrastructure
  Computation resources
    Talend Management Console and Talend Pipeline Designer
    Talend Data Preparation
  Data storage
    Data that we collect
    Data that customers use with Talend Data Fabric
  Network
Data flows
  Data flows between Talend Studio and Talend Data Fabric
    Metadata is transferred to Talend Cloud via the following URLs
    API designs are retrieved using the following secured endpoints
    Talend Studio defaults to uploads of Talend Jobs using the following pre-signed URLs
  Data flows between Talend Studio jobs and Talend Data Fabric
  Data flows between Remote Engine and Talend Data Fabric
    MSG service URL
    Repository service URL
    Pair service URL
    DTS service URL
    Remote Engine service URL
    Vault gateway service URL
  Data flows in hybrid deployment between Talend Data Inventory, Talend Data Preparation, Talend Data Stewardship, Talend Data Quality, and Talend Data Fabric
  Public APIs
Security at Talend
  Physical security
  Security training
  Secure software development
  Cloud workload protection and monitoring
  Authentication, authorization, and access control
    Standard access
    Administrative access
    Password management
  Key management
    On AWS
    On Azure
  Vulnerability management
  Backups
  Disaster recovery and business continuity
  Security certifications
Summary
Talend Data Fabric is a managed cloud integration platform that makes it easy for developers and data constituents to collect, transform, and clean data. Talend leverages security and privacy best practices to protect both the Talend platform and Talend, the company. Talend implements a combination of policies, procedures, and technologies to ensure your data is protected and secured. Talend’s chief information security officer (CISO) defines the Talend security strategy, architecture, and program. This document provides an overview of the Talend internal architecture and our policies and procedures as they pertain to employee, physical, network, infrastructure, platform, architecture, and data security.
Talend is SOC 2 Type 2 and HIPAA (Health Insurance Portability and Accountability Act) certified.
Talend architecture
Talend Data Fabric is a multi-tenant platform. All managed components are hosted on either Amazon Web Services (AWS) or Microsoft Azure, according to customer preference.
Talend Data Fabric comprises seven applications:
• Talend Management Console
• Talend Data Inventory
• Talend Data Preparation
• Talend Data Stewardship
• Talend API Designer
• Talend API Tester
• Talend Pipeline Designer
Additionally, Talend Studio, which runs on a local workstation, allows users to design data integration flows (or Talend Jobs) and publish them to Talend Data Fabric.
Here is an overview of Talend’s functional architecture.
Figure 1: Talend functional architecture
The table below summarizes which applications are available or can be installed where. All Talend Data Fabric applications are available on AWS. Some have been released on Azure. Some components can optionally be installed in a hybrid configuration, residing on customer infrastructure. Please refer to the Hybrid infrastructure section below for more details about hybrid configurations.
Component | Amazon Web Services | Azure | Hybrid installation
Talend Management Console | Yes | Yes | N/A
Talend Data Inventory | Yes | Yes | N/A
Talend Data Preparation | Yes | - | Yes
Talend Data Stewardship | Yes | Yes | Yes
Talend API Designer | Yes | - | N/A
Talend API Tester | - | - | Yes
Talend Pipeline Designer | Yes | Yes | N/A
Each of the following sections briefly describes a Talend Data Fabric application and gives an overview of its functional architecture. Please refer to our website at www.talend.com for more details about each application and terms used throughout the document.
Talend Management Console
Talend Management Console (TMC) is a browser-based application that provides access to all Talend Data Fabric applications and components as well as the administrative features and configurations that surround them.
TMC lets users schedule the execution of Talend Jobs via discrete components called execution engines. There are two types of engines:
• Cloud Engines are fully managed components that are provisioned, deployed, and controlled by Talend within our platform. Cloud Engines do not share jobs from multiple tenants; they are provisioned at execution time (per job schedules), per tenant.
• Remote Engines are execution agents deployed and managed by customers on their own systems, within their own physical or virtual (cloud) networks.
Talend Data Inventory
Talend Data Inventory (TDI) provides automated tools for dataset documentation, quality proofing, and promotion. It identifies data silos across data sources and targets to provide visualization of reusable and shareable data assets.
Figure 2: Talend Data Inventory functional architecture
Talend Data Preparation
Talend Data Preparation (TDP) allows customers to simplify and speed up the process of preparing data for analysis and other tasks. TDP allows customers to create, update, remove, and share datasets, then create preparations on top of the datasets that can be incorporated into Talend Jobs with Talend Studio.
Figure 3: Talend Data Preparation functional architecture
Figure 4: Talend Data Preparation functional architecture in hybrid deployment
Talend Data Stewardship
Talend Data Stewardship (TDS) allows customers to collaboratively curate, validate, and resolve conflicts in data, as well as address potential data integrity issues.
Figure 5: Talend Data Stewardship functional architecture
Figure 6: Talend Data Stewardship functional architecture in hybrid deployment
Talend API Designer
Talend API Designer lets users design APIs collaboratively and visually, then run simulations to test APIs and generate reference documentation.
Talend API Tester
Talend API Tester lets users automatically generate test cases from API contracts, then field test APIs by grouping together tests that simulate real-world examples. Users can integrate unit tests into a managed CI/CD process to ensure quality.
Figure 7: Talend API Services functional architecture
Talend Pipeline Designer
Talend Pipeline Designer (TPD) allows customers to design and run data pipelines in the cloud.
• A data pipeline is a data integration process: a series of transformation steps applied to data. It extracts data from customer-specified sources, transforms it step by step using prebuilt processors, and loads it into other datasets (destinations).
• Data pipelines can be started directly from TPD or scheduled in Talend Management Console.
• Data pipelines can be executed on Cloud Engines or Remote Engines.
Figure 8: Talend Pipeline Designer functional architecture
Hybrid infrastructure
Some organizations use Talend in a hybrid configuration, with some components running on local devices and others running on cloud platforms. The only required component for running Talend in a hybrid environment is the Talend Studio development environment, which is installed on local workstations and offers similar functionality to the cloud-native Talend Pipeline Designer. Users may install additional applications or components in a hybrid configuration:
• Talend Cloud API Tester: a web browser extension.
• Remote Engine: a Java-based runtime to execute Talend Jobs on-premises or on a cloud platform that you control. If you do not install Remote Engine, you will use Cloud Engine.
• Remote Engine Gen2: a Docker-based runtime to execute Talend Pipeline Designer data pipelines on-premises or on a cloud platform that you control. If you do not install Remote Engine Gen2, you will use Cloud Engine for Design.
Data Fabric infrastructure
Talend Data Fabric is a multitenant integration environment that allows you to design, manage, and check data pipelines. All managed components are hosted on either Amazon Web Services (AWS) or Microsoft Azure, according to customer preference.
Secrets such as passwords, keys, and certificates are managed via third-party technologies and products. We go into more detail about this in the Key management section below.
Computation resources
Talend Management Console, Talend Data Preparation, and Talend Pipeline Designer are the only Data Fabric applications that give separate computation resources to each tenant. Each is a multitenant application that is hosted and runs on AWS or (except for Talend Data Preparation) Azure.
Talend Management Console and Talend Pipeline Designer
Remote Engines are deployed by customers on their own systems and therefore given computation resources that they manage and control.
Cloud Engines are deployed within Talend Data Fabric as separate tenant-specific AWS EC2 or Azure VM instances and never shared with other tenants. Each tenant gets its own Cloud Engine instance on AWS or Azure.
The live preview feature of Talend Pipeline Designer, which allows users to preview the output of processors while designing a pipeline, is executed in a dedicated Remote Engine or Cloud Engine.
Talend Data Preparation
Data Preparation process computations are isolated in separate threads for each tenant. Tenants can choose where the computation results are stored:
1. In an AWS S3 bucket that the tenant manages. AWS S3 credentials are not stored in Talend.
2. In Talend. In this case, results are stored in the bucket/folder specified by the Configuration service, an internal Talend service.
Data storage
Talend works with two general types of data: data that we collect and data that customers use with the software.
Data that we collect
Talend, across its cloud applications, collects only customer information that it needs to provide its services or to manage customer accounts.
All personally identifiable information that we collect (e.g. name, country, and email address) is protected with best security practices: It is encrypted at rest via AES-256 and in transit via TLS 1.3.
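As a client-side illustration of the in-transit requirement, the sketch below builds a Python TLS context that refuses any protocol version older than TLS 1.3. It is a minimal sketch of how a caller could enforce that floor on its own connections, not a description of how Talend configures its endpoints; the function name is hypothetical.

```python
import ssl


def make_tls13_client_context() -> ssl.SSLContext:
    """Return a client context that refuses anything older than TLS 1.3."""
    # create_default_context() enables certificate and hostname verification.
    ctx = ssl.create_default_context()
    # Raise the protocol floor; handshakes offering only TLS 1.2 or older fail.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    return ctx


ctx = make_tls13_client_context()
print(ctx.minimum_version.name)
```

The context can then be passed to any standard-library client that accepts one, for example `urllib.request.urlopen(url, context=ctx)`.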
No payment information is stored in Talend Data Fabric. We rely on third-party vendors to collect and manage payment information.
Data that customers use with Talend Data Fabric
Whether customers use Remote Engines or Cloud Engines, their datasets remain on systems and data repositories that they manage. Metadata, Designs, Talend Jobs, Artifacts, and any other objects that Talend stores to provide services or for security reasons are isolated via tenant-specific schemas and tenant-specific data encryption keys.
Network
To function properly and deliver its services, Talend Data Fabric may need to communicate with external third-party solutions. All communications between Talend Data Fabric and such external solutions need to be authorized and initiated by Talend Data Fabric. No external solution can communicate with Talend Data Fabric unless the communication was initiated by Talend Data Fabric.
Talend networks and systems are protected via network and application firewalling, visibility mechanisms, and micro segmentation strategies.
Data flows
This section gives an overview of the data flows between Talend Data Fabric applications and components.
Data flows between Talend Studio and Talend Data Fabric
The types of data that can be exchanged between Talend Studio and Talend Data Fabric include:
a) Task artifact binaries
b) Task artifact metadata (e.g. context variables and parameters)
c) Talend API Designer definitions
Users’ credentials (e.g. login name and password or API token generated in TMC) are required to authorize the transfer.
Figure 9: Talend data flows when using Cloud Engines (diagram labels: Firewall, Talend Studio, Cloud Engine, Talend Data Fabric, on-premises apps & databases, cloud files, data warehouse, SaaS apps; flows: status & logs (HTTPS), metadata in transit (HTTPS), customer data in transit)
Metadata is transferred to Talend Cloud via the following URLs:
Cloud | Region | Talend Inventory service URL
AWS | US | https://tmc.us.cloud.talend.com/inventory
AWS | Europe | https://ipaas.eu.cloud.talend.com/ipaas-services/services/inventory
AWS | Asia-Pacific | https://ipaas.ap.cloud.talend.com/ipaas-services/services/inventory
Azure | US | https://tmc.us.cloud.talend.com/inventory
API designs are retrieved using the following secured endpoints:
Cloud | Region | API Design service URL
AWS | US | https://api-apid-service.us.cloud.talend.com/external/projects
AWS | US | https://api-apid-service.us.cloud.talend.com/external/projects/{projectId}
AWS | Europe | https://api-apid-service.eu.cloud.talend.com/external/projects
AWS | Europe | https://api-apid-service.eu.cloud.talend.com/external/projects/{projectId}
AWS | Asia-Pacific | https://api-apid-service.ap.cloud.talend.com/external/projects
AWS | Asia-Pacific | https://api-apid-service.ap.cloud.talend.com/external/projects/{projectId}
Azure | US | https://api-apid-service.us.cloud.talend.com/external/projects
Azure | US | https://api-apid-service.us.cloud.talend.com/external/projects/{projectId}
By default, Talend Studio uploads Talend Jobs using the following pre-signed URLs:
Cloud | Region | S3 pre-signed URL
AWS | US | *-talend-com.s3.us-east-1.amazonaws.com
AWS | Europe | *-talend-com.s3.eu-central-1.amazonaws.com
AWS | Asia-Pacific | *-talend-com.s3.ap-northeast-1.amazonaws.com
Azure | US | minio.us-west.cloud.talend.com
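Because the AWS entries are wildcard patterns rather than fixed hostnames, an outbound allowlist has to match them as patterns. The sketch below shows one way to check a hostname against the patterns in the table using shell-style matching from the Python standard library. The function name and the sample bucket hostname are hypothetical, and real proxies or firewalls may interpret `*` differently (for example, as a single DNS label), so treat the matching semantics as an assumption.

```python
from fnmatch import fnmatchcase

# Host patterns copied from the table above, keyed by (cloud, region).
UPLOAD_HOST_PATTERNS = {
    ("AWS", "US"): "*-talend-com.s3.us-east-1.amazonaws.com",
    ("AWS", "Europe"): "*-talend-com.s3.eu-central-1.amazonaws.com",
    ("AWS", "Asia-Pacific"): "*-talend-com.s3.ap-northeast-1.amazonaws.com",
    ("Azure", "US"): "minio.us-west.cloud.talend.com",
}


def is_upload_host_allowed(hostname: str, cloud: str, region: str) -> bool:
    """Return True if hostname matches the upload pattern for the region."""
    pattern = UPLOAD_HOST_PATTERNS[(cloud, region)]
    # Hostnames are case-insensitive, so normalize before matching.
    return fnmatchcase(hostname.lower(), pattern)


# "example-talend-com" is a made-up bucket prefix used only for illustration.
print(is_upload_host_allowed(
    "example-talend-com.s3.us-east-1.amazonaws.com", "AWS", "US"))
```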
Data flows between Talend Studio jobs and Talend Data Fabric
Talend Studio has three components that can communicate with Talend Data Preparation on Talend Data Fabric:
1. tDatasetInput: Calls Talend to retrieve the content of a dataset.
2. tDatasetOutput: Calls Talend to post the content of a dataset.
3. tDataprepRun (DI and Spark):
• DI component: Calls Talend to list preparations, and at runtime to retrieve each row that needs to be prepared.
• Spark component: Calls Talend to list preparations, and at runtime to retrieve the chosen preparation steps, lookup datasets, and semantic types.
Data flows between Remote Engine and Talend Data Fabric
Figure 10: Talend data flows when using Remote Engines (diagram labels: Firewall, Talend Studio, Remote Engine, Talend Data Fabric, on-premises apps & databases, cloud files, data warehouse, SaaS apps; flows: status & logs (HTTPS), metadata in transit (HTTPS), customer data in transit)
As mentioned earlier, Talend never initiates connections with Remote Engines. Remote Engines must initiate all connections to Talend. Once a connection is established, all data is sent encrypted over HTTPS.
Here are the types of data that can be exchanged between Remote Engines and Talend:
a) Status information and metrics
b) Lifecycle commands
c) Task artifact metadata
d) Logs
e) Task artifact binaries
The next sections discuss each data type in the scope of REST service URLs that are being targeted and the corresponding systems behind them. There are URLs for the US, Europe, and Asia-Pacific regions.
MSG service URL
This is the service URL of the primary gateway to Talend’s ActiveMQ cluster. Data of types a) to d) in the list above are sent on this service channel. Remote Engine status information and lifecycle commands are the first data sent over the wire. This path is a control path to schedule flow deployments and capture execution status (success or failure). Other information transferred includes the number of rows successfully processed or rejected, as well as the final success message.
Cloud | Region | MSG service URL
AWS | US | msg.us.cloud.talend.com
AWS | Europe | msg.eu.cloud.talend.com
AWS | Asia-Pacific | msg.ap.cloud.talend.com
Azure | US | msg.us-west.cloud.talend.com
Repository service URL
This is the service URL of the primary access point for Talend Job and action binaries. This REST service provides access to Nexus repositories, which are accessible only via HTTPS and with unique Nexus credentials created during Remote Engine pairing.
Cloud | Region | Repo service URL
AWS | US | repo.us.cloud.talend.com
AWS | Europe | repo.eu.cloud.talend.com
AWS | Asia-Pacific | repo.ap.cloud.talend.com
Azure | US | repo.us-west.cloud.talend.com
Pair service URL
This is the service URL used during initial pairing of the Remote Engine to its account. It is used to send the heartbeat, availability, and status of the engine itself. It is only accessible via HTTPS.
Cloud | Region | Pair service URL
AWS | US | pair.us.cloud.talend.com
AWS | Europe | pair.eu.cloud.talend.com
AWS | Asia-Pacific | pair.ap.cloud.talend.com
Azure | US | pair.us-west.cloud.talend.com
DTS service URL
This is the service URL of the Talend token generation service. It is used to create one-time, time-limited tokens to authorize file uploads from the Remote Engine to Talend. The file uploads are HTTPS POSTs with logs or resource files.
Cloud | Region | DTS service URL
AWS | US | dts.us.cloud.talend.com
AWS | Europe | dts.eu.cloud.talend.com
AWS | Asia-Pacific | dts.ap.cloud.talend.com
Azure | US | dts.us-west.cloud.talend.com
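To make the idea of a one-time, time-limited token concrete, here is a minimal in-memory sketch. It is not Talend's DTS implementation: the class name, the 60-second default lifetime, and the in-memory store are all assumptions chosen for illustration. It shows only the two properties named above, namely that a token is accepted at most once and only before it expires.

```python
import secrets
import time


class UploadTokenIssuer:
    """Hypothetical issuer of single-use tokens that expire after a TTL."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so expiry can be tested
        self._expiry_by_token = {}   # token -> expiry time

    def issue(self) -> str:
        """Create an unguessable token and record when it expires."""
        token = secrets.token_urlsafe(32)
        self._expiry_by_token[token] = self._clock() + self._ttl
        return token

    def redeem(self, token: str) -> bool:
        """Accept a token at most once, and only before it expires."""
        # pop() removes the token, so a second redeem attempt always fails.
        expiry = self._expiry_by_token.pop(token, None)
        return expiry is not None and self._clock() < expiry


issuer = UploadTokenIssuer(ttl_seconds=60.0)
token = issuer.issue()
print(issuer.redeem(token), issuer.redeem(token))
```

In this sketch an upload handler would call `redeem` before accepting a file, so a leaked or replayed token is useless after its first use or after the lifetime elapses.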
Remote Engine service URL
This is the service URL of the Talend token generation service. It is used to create one-time, time-limited tokens to authorize file uploads from the Remote Engine to Talend. The file uploads are HTTPS POSTs with logs or resource files.
Cloud | Region | Remote Engine service URL
AWS | US | engine.us.cloud.talend.com
AWS | Europe | engine.eu.cloud.talend.com
AWS | Asia-Pacific | engine.ap.cloud.talend.com
Azure | US | engine.us-west.cloud.talend.com
Vault gateway service URL
This service is used with Remote Engine Gen2 to decrypt each tenant’s sensitive data.
Cloud | Region | Vault Gateway service URL
AWS | US | vault-gateway.us.cloud.talend.com
AWS | Europe | vault-gateway.eu.cloud.talend.com
AWS | Asia-Pacific | vault-gateway.ap.cloud.talend.com
Azure | US | vault-gateway.us-west.cloud.talend.com
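Since Remote Engines initiate every connection, the hostnames in these tables are what an outbound firewall or proxy allowlist needs to cover. The sketch below derives the per-region hostnames for the MSG, Repository, DTS, Remote Engine, and Vault gateway services from the naming pattern visible in their tables. It is illustrative rather than complete: the function name is hypothetical, and port 443 is an assumption based on the statement that traffic is sent over HTTPS.

```python
# Region suffixes as they appear in the service URL tables above.
REGION_SUFFIX = {
    ("AWS", "US"): "us",
    ("AWS", "Europe"): "eu",
    ("AWS", "Asia-Pacific"): "ap",
    ("Azure", "US"): "us-west",
}

# Service prefixes taken from the MSG, Repository, DTS, Remote Engine,
# and Vault gateway tables.
SERVICE_PREFIXES = ["msg", "repo", "dts", "engine", "vault-gateway"]


def remote_engine_hosts(cloud: str, region: str) -> list:
    """List the service hostnames a Remote Engine contacts in one region."""
    suffix = REGION_SUFFIX[(cloud, region)]
    return [f"{prefix}.{suffix}.cloud.talend.com" for prefix in SERVICE_PREFIXES]


# Port 443 is assumed here because the connections are described as HTTPS.
for host in remote_engine_hosts("AWS", "Europe"):
    print(f"allow outbound tcp/443 to {host}")
```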
Data flows in hybrid deployment between Talend Data Inventory, Talend Data Preparation, Talend Data Stewardship, Talend Data Quality, and Talend Data Fabric
Guiding principle — Talend applications and components initiate HTTPS connections. Talend Data Fabric never initiates any connection to these applications.
Here are the types of data that can be exchanged between these applications and Talend Data Fabric:
a) During user login: The client ID and client secret (as defined in the OIDC specification) of the installed application are used to authorize its communication with Talend Data Fabric.
b) After user login: A JSON Web Token (JWT) that represents the user’s identity, metadata, and claims is transferred back to the application.
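A JWT is three base64url-encoded segments (header, payload, signature) joined by dots, which is why the receiving application can read the identity and claims directly from it. The sketch below decodes the payload of a made-up, unsigned token using only the standard library. The claim names (`sub`, `tenant`) and values are hypothetical and do not describe the claims Talend issues; the code deliberately skips signature verification, which a real application must perform before trusting any claim.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding JWTs strip off."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def b64url_encode(raw: bytes) -> str:
    """Encode bytes as unpadded base64url, the form used inside a JWT."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def read_jwt_claims(token: str) -> dict:
    """Decode the payload of a JWT WITHOUT verifying its signature."""
    _header_b64, payload_b64, _signature_b64 = token.split(".")
    return json.loads(b64url_decode(payload_b64))


# Build a made-up, unsigned token so the example runs offline.
header = b64url_encode(json.dumps({"alg": "none", "typ": "JWT"}).encode())
payload = b64url_encode(
    json.dumps({"sub": "user-123", "tenant": "example-tenant"}).encode())
sample_token = f"{header}.{payload}."

print(read_jwt_claims(sample_token))
```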
Public APIs
In addition to the data flows between Talend applications, Talend exposes public APIs that let developers automate workflows. These APIs are secured with Personal Access Tokens generated with TMC.
Security at Talend
Talend’s security organization consists of a dedicated team of security experts distributed across the company who work closely with the Talend CISO. Their mission is to protect Talend and its clients with security best practices. This team supports all aspects of Talend business, including Talend development and operations. The responsibility of Talend security rolls up to the CISO, who also defines Talend security strategy, architecture, and program.
Physical security
Talend maintains security controls to prevent unauthorized physical access to buildings and data centers and to protect its systems and software, and by extension the Talend environment, from damage, interruption, misuse, or theft.
Authorizations are reviewed regularly, and access is monitored continuously.
Security training
All Talend employees are trained on security best practices. All Talend employees involved in the Talend development lifecycle, from creation to deployment and operation, are guided through trainings, reviews, and drills.
Secure software development
Talend’s security organization is involved throughout the creation of any new application, capability, or feature.
Our security experts conduct architecture, design, and code reviews.
Software composition analysis (SCA) and static application security testing (SAST) scans are integrated in the software development lifecycle.
Talend implements an Open Web Application Security Project (OWASP) Top 10 awareness program during application development, and schedules regular internal and external audits to assess compliance with OWASP best practices.
Cloud workload protection and monitoring
We use a combination of security services from third-party vendors to protect Talend Data Fabric.
Our security experts use external scanning tools to ensure that systems and containers are hardened, configured, and patched according to Talend guidelines and best practices.
Our deployments leverage the built-in segmentation capabilities of AWS EC2 Security groups or Microsoft Azure Network Security groups to restrict inter-resource communication.
We use web application firewalls to inspect north/south and east/west traffic flows to our applications.
Our SOC monitors all security-relevant events captured in our SIEM.
We leverage the built-in threat detection capabilities of AWS GuardDuty and Azure Advanced Threat Protection to detect malicious activity and unauthorized behavior.
Authentication, authorization, and access control
Standard access
Tenant users are authenticated with their own unique credentials: username plus password.
Talend issues X.509 public key certificates, which must be used to secure and encrypt all communications between user systems and Talend Data Fabric. Talend Data Fabric supports HTTPS over TLS.
The authentication process follows the OpenID Connect standard and uses either the authorization code or the implicit flow. Once connected, a session is managed using either cookies or a JWT.
Talend never accesses users’ credentials. Starting in 2020, Talend is progressively introducing a new identity manager based on the Auth0 platform. Auth0 is a third-party service provider that complies with Talend Security standards and certifications. This migration is part of a global security strategy to better enable Talend to concentrate on our core domain by working with trusted third-party security vendors.
Within each operational region, Talend pairs with dedicated Auth0 private instances, ensuring best performance and compliance with local data sovereignty laws.
Administrative access
Talend Data Fabric administrative access requires management review and approval. Elevated privilege access requires the same level of approval by management.
Access to any management console, Talend Data Fabric, AWS, or Azure requires multifactor authentication (credentials plus secret keys).
Access to the AWS console is restricted to select members of the Talend Site Reliability Engineering (SRE) or Information Security teams. New account creation follows a strict approval process. Accounts are reviewed quarterly.
System access is provided via SSH private keys. Public keys are automatically deployed with the Talend configuration management tool.
Password management
Talend maintains a password management policy that all employees must comply with. It ensures the creation of strong passwords, the protection of those passwords, and a reasonable frequency of password change.
All system-level passwords (e.g., root, enable, application administration accounts, etc.) must be changed on at least a quarterly basis.
All production system-level passwords must be part of the Talend IT administered secrets server.
All user-level passwords (e.g., email, web, desktop computer, etc.) must be changed at least every three months.
Key management
Currently, Talend key management differs between AWS and Azure. Keys for Talend on AWS will soon be managed in the same way as keys for Talend on Azure.
On AWS
Talend relies on AWS-managed Customer Master Keys (CMK) for encryption. Talend uses its own AWS CMK to generate unique Data Encryption Keys (DEK).
Most DEKs are tenant-specific and are managed (including rotation) by Talend. DEKs that do not need to be tenant-specific are managed via the AWS Encryption SDK.
Front-end TLS endpoints are managed through the AWS Certificate Manager (ACM). The private key is generated by Talend and the associated certificate signed by Talend’s approved Certificate Authority (CA), GoDaddy. The certificates are then published as part of the Certificate Transparency program and uploaded to the ACM.
On Azure
Talend applications and components deployed on Azure obtain and use tenant-specific Master Keys from HashiCorp Vault to encrypt tenant-related data.
Front-end TLS endpoints are managed through Traefik (Edge Router running as a Kubernetes service) and Kubernetes Secrets. Private keys are generated by Talend and certificates are signed by Talend’s approved Certificate Authority (CA), GoDaddy. The certificates are then published as part of the Certificate Transparency program and uploaded to Traefik configuration as Kubernetes secrets.
Vulnerability management
All applications are tested by Talend’s security experts (dynamic application security testing (DAST) and penetration tests) at least twice a year.
In addition, Talend leverages internal and third-party security services to perform external penetration tests.
Third-party penetration tests are scheduled twice a year and prior to any new Talend Data Fabric release and deployment. The penetration tests cover a wide range of security aspects of the application and address modern web best practices.
All detected vulnerabilities are logged by the Talend Quality Assurance team and analyzed by the Talend Information Security team, which then supports, tracks, and tests their remediation.
Talend follows the Security Content Automation Protocol (SCAP) framework. Vulnerabilities are rated according to the Common Vulnerability Scoring System (CVSS) v3.0 equation. Vulnerabilities are resolved depending on their severity rating and their potential impact on the infrastructure.
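For readers unfamiliar with CVSS ratings, the sketch below maps a CVSS v3 base score to the qualitative severity bands defined in the CVSS v3.0 specification (None, Low, Medium, High, Critical). The finding names and scores are hypothetical, and the code says nothing about how Talend ties a rating to a remediation timeline, which is not stated here.

```python
def cvss_v3_severity(score: float) -> str:
    """Map a CVSS v3 base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("CVSS scores range from 0.0 to 10.0")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"        # 0.1 to 3.9
    if score < 7.0:
        return "Medium"     # 4.0 to 6.9
    if score < 9.0:
        return "High"       # 7.0 to 8.9
    return "Critical"       # 9.0 to 10.0


# Hypothetical findings, sorted so the most severe is listed first.
findings = {"finding-a": 9.8, "finding-b": 5.3, "finding-c": 7.5}
for name, score in sorted(findings.items(), key=lambda kv: kv[1], reverse=True):
    print(name, score, cvss_v3_severity(score))
```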
Third-party penetration test reports are available upon request at Talend’s discretion.
Backups
Talend uses various AWS and Azure data storage services. All data storage services are regularly and automatically backed up and mirrored to a remote site. Most backups occur every hour.
Backup executions are monitored.
Integrity checks are systematically made one week following every new production deployment.
Disaster recovery and business continuity
Talend maintains a disaster recovery/business continuity (DR/BC) policy that is reviewed, updated, and tested annually.
Talend operates in multiple AWS and Azure regions globally. Any Talend instance in any public cloud region can fail over to another region of the same public cloud vendor.
We are in close contact with both vendors and carefully monitor their service levels to make sure that they meet our required service levels.
Our development team spans six geographical locations: one in the US, four in Europe, and one in Asia. Every development function can be fulfilled by at least two developers.
Our operations team spans five geographical locations: two in the US, three in Europe, and one in Asia. Every operations function can be fulfilled by at least two members of the team.
Security certifications
Talend is SOC 2 Type 2 and HIPAA certified.
We use the Cloud Security Alliance (CSA) Security, Trust, Assurance, and Risk (STAR) program to assess our security practices and validate the security posture of our cloud offerings.
Please refer to AWS and Azure websites for more details about their security certifications and compliance information.
About Talend
Talend (NASDAQ: TLND), a leader in data integration and data integrity, enables every company to find clarity amidst the data chaos.
Talend is the only company to bring together in a single platform all the necessary capabilities that ensure enterprise data is complete, clean, compliant, and readily available to everyone who needs it throughout the organization. With Talend, organizations are able to deliver exceptional customer experiences, make smarter decisions in the moment, drive innovation, and improve operations.
To learn more, please visit www.talend.com