
Cloud Data Migration

Product Introduction

Issue 01

Date 2020-04-10

HUAWEI TECHNOLOGIES CO., LTD.


Copyright © Huawei Technologies Co., Ltd. 2020. All rights reserved.

No part of this document may be reproduced or transmitted in any form or by any means without prior written consent of Huawei Technologies Co., Ltd.

Trademarks and Permissions

HUAWEI and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd. All other trademarks and trade names mentioned in this document are the property of their respective holders.

Notice

The purchased products, services and features are stipulated by the contract made between Huawei and the customer. All or part of the products, services and features described in this document may not be within the purchase scope or the usage scope. Unless otherwise specified in the contract, all statements, information, and recommendations in this document are provided "AS IS" without warranties, guarantees or representations of any kind, either express or implied.

The information in this document is subject to change without notice. Every effort has been made in the preparation of this document to ensure accuracy of the contents, but all statements, information, and recommendations in this document do not constitute a warranty of any kind, express or implied.

Issue 01 (2020-04-10) Copyright © Huawei Technologies Co., Ltd. i


Contents

1 What Is CDM?

2 Product Advantages

3 Supported Data Sources

4 Application Scenarios

5 Basic Concepts

6 CDM Migration Principles

7 Related Services

8 Constraints

9 Quotas

10 Permissions Management


1 What Is CDM?

Product Overview

Cloud Data Migration (CDM) implements data mobility by enabling batch data migration among homogeneous and heterogeneous data sources. It supports on-premises and cloud file systems, relational databases, data warehouses, NoSQL, big data, and object storage.

Based on the distributed computing framework and the parallel processing technology, CDM helps you migrate massive sets of data stably and efficiently. You can migrate data online and quickly construct a desired data structure.

Figure 1-1 CDM positioning

Functions

● Table/file/entire DB migration
Tables or files can be migrated in batches. An entire database can be migrated between homogeneous and heterogeneous databases. A job can migrate hundreds of tables.

● Incremental data migration
CDM supports incremental migration of files, relational databases, and HBase/CloudTable. Incremental data can also be selected using WHERE clauses and date and time macro variables.

● Migration in transaction mode
When a CDM job fails, CDM rolls back the data to its state before the job started and automatically deletes the data from the destination table.


● Field conversion
CDM supports field conversion functions, such as anonymization, character string operations, and date operations.

● File encryption
When files are migrated to a file system, CDM can encrypt the files written to the cloud.

● MD5 verification
MD5 verification is supported to check file consistency from end to end and output the verification result.

● Dirty data archiving
CDM can archive to dirty data logs the data that fails to be processed during migration, has been filtered out, or does not comply with conversion or cleaning rules. A threshold for the dirty data ratio can be set to determine whether a task is successful.
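The MD5 verification item above can be illustrated with a minimal sketch. It is not CDM's implementation; it only shows the general technique of hashing a source file and its migrated copy and comparing the digests. The function names `file_md5` and `verify_copy` are hypothetical.

```python
import hashlib


def file_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(source_path, destination_path):
    """Return True when the source and destination files have the same MD5 digest."""
    return file_md5(source_path) == file_md5(destination_path)
```

A migration script could call `verify_copy` after each file transfer and record a mismatch as a failed file. MD5 detects accidental corruption; it is not a defense against deliberate tampering.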


2 Product Advantages

Data migration is involved when you consolidate or back up data, or develop new applications on the cloud. Generally, if you want to migrate data, you may develop data migration scripts to read data from the source and write data to the destination. Compared with this method, CDM has the following advantages:

Table 2-1 CDM advantages

Item: Ease of use
User-Developed Script: You need to prepare server resources, and install and configure software, which is time-consuming. Because the data source types are different, the program uses different access interfaces, such as JDBC and native APIs, to read and write data. In this case, various libraries and SDKs are required when you write data migration scripts, resulting in high development and management costs.
CDM: CDM provides a web-based management console for enabling services on web pages in real time. You can migrate data by configuring data sources and migration jobs on the GUI, and CDM will manage and maintain the data sources and migration jobs for you. In other words, you only need to focus on the data migration logic without worrying about the environment, which greatly reduces development and maintenance costs. CDM also provides RESTful APIs to support third-party system calling and integration.

Item: Real-time monitoring
User-Developed Script: You need to select specific versions to develop as required.
CDM: You can use Cloud Eye to automatically monitor CDM clusters in real time and manage alarms and notifications, so that you can keep track of CDM cluster performance metrics.

Item: O&M free
User-Developed Script: You need to develop and optimize O&M functions, especially alarm and notification functions, to ensure system availability. Otherwise, manual attendance is required.
CDM: With CDM, you do not need to maintain resources such as servers and VMs. CDM has log, monitoring, and alarm functions, which send notifications to related personnel in a timely manner, so 24/7 manual O&M is not required.

Item: High efficiency
User-Developed Script: During data migration, the read and write process is completed in one job. Limited by available resources, the performance is poor and cannot meet the requirements of scenarios where massive sets of data need to be migrated.
CDM: Based on the distributed computing framework, CDM jobs are split into independent sub-jobs and executed concurrently, which drastically improves data migration efficiency. In addition, efficient data import interfaces are provided to import data from Hive, HBase, MySQL databases, and Data Warehouse Service (DWS).

Item: Various data sources
User-Developed Script: Different tasks must be developed for different data sources, generating a number of scripts.
CDM: Various data sources such as databases, Hadoop, NoSQL, data warehouses, and files are supported. For details, see Supported Data Sources.

Item: Different network environments
User-Developed Script: As the cloud computing technology develops, user data may be stored in different environments, such as public clouds, on-premises or hosted Internet data centers (IDCs), and hybrid scenarios. In heterogeneous environments, data migration is subject to various factors, for example, network connectivity, which causes inconvenience for development and maintenance.
CDM: CDM helps you easily cope with various data migration scenarios, including data migration to the cloud, data exchange on the cloud, and data migration to on-premises service systems, regardless of whether the data is stored on on-premises IDCs, cloud services, third-party clouds, or self-built databases or file systems on ECSs.
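The "High efficiency" item above describes jobs being split into independent sub-jobs that run concurrently. The sketch below illustrates that general idea only; it says nothing about how CDM schedules work internally. The names `split_range`, `migrate_slice`, and `run_job` are hypothetical, and an in-memory list stands in for a real data source.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(start, stop, parts):
    """Split the half-open range [start, stop) into at most `parts` contiguous slices."""
    total = stop - start
    if total <= 0 or parts <= 0:
        return []
    size = -(-total // parts)  # ceiling division
    return [(lo, min(lo + size, stop)) for lo in range(start, stop, size)]


def migrate_slice(source_rows, bounds):
    """Hypothetical sub-job: read one slice of the source and return the copied rows."""
    lo, hi = bounds
    return list(source_rows[lo:hi])


def run_job(source_rows, parallelism=4):
    """Run one sub-job per slice concurrently and merge the results in source order."""
    slices = split_range(0, len(source_rows), parallelism)
    with ThreadPoolExecutor(max_workers=max(1, parallelism)) as pool:
        # map() yields results in the order of `slices`, so row order is preserved.
        parts = list(pool.map(lambda bounds: migrate_slice(source_rows, bounds), slices))
    destination = []
    for part in parts:
        destination.extend(part)
    return destination
```

Because each slice is independent, a failed sub-job could be retried without repeating the others, which is one reason range partitioning is a common choice for batch migration.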


3 Supported Data Sources

Cloud Data Migration supports the following migration modes for different data sources:

● Table/file migration: It is applicable to data migration to the cloud, data exchange on the cloud, and data migration to on-premises service systems. For details, see Supported Data Sources in Table/File Migration.

● Entire DB migration: It is applicable to database migration to the cloud. For details, see Supported Data Sources in Entire DB Migration.

Supported Data Sources in Table/File Migration

Table 3-1 describes the supported data sources.

Table 3-1 Supported data sources during table/file migration

Category | Data Source | Export | Import
Data warehouse | Data Warehouse Service (DWS) | Supported | Supported
Data warehouse | Data Lake Insight (DLI) | Not supported | Supported
Data warehouse | FusionInsight LibrA | Supported | Supported
Hadoop | MRS HDFS | Supported | Supported
Hadoop | MRS HBase | Supported | Supported
Hadoop | MRS Hive | Supported | Supported
Hadoop | FusionInsight HDFS | Supported | Supported
Hadoop | Apache HDFS | Supported | Supported
Hadoop | Hadoop release version | Not supported | Not supported
Hadoop | FusionInsight HBase | Supported | Supported
Hadoop | FusionInsight Hive | Supported | Supported
Hadoop | Apache HBase | Supported | Supported
Hadoop | Apache Hive | Supported | Supported
Object storage | Object Storage Service (OBS) | Supported | Supported
Object storage | Alibaba Cloud Object Storage Service (OSS) | Supported | Not supported
Object storage | Qiniu Cloud Object Storage (KODO) | Supported | Not supported
Object storage | Amazon Simple Storage Service (Amazon S3) | Supported | Not supported
Object storage | Tencent Cloud Object Storage (COS) | Supported | Not supported
File system | FTP | Supported | Supported
File system | SFTP | Supported | Supported
File system | HTTP | Supported | Not supported
File system | Network Attached Storage (NAS) | Supported | Supported
File system | Scalable File Service (SFS Turbo) | Supported | Supported
Relational database | RDS for MySQL | Supported | Supported
Relational database | RDS for PostgreSQL | Supported | Supported
Relational database | RDS for SQL Server | Supported | Supported
Relational database | Distributed Database Middleware (DDM) | Supported | Supported
Relational database | MySQL | Supported | Supported
Relational database | PostgreSQL | Supported | Supported
Relational database | Microsoft SQL Server | Supported | Supported
Relational database | Oracle | Supported | Supported
Relational database | IBM Db2 | Supported | Supported
Relational database | Derecho (GaussDB) | Supported | Supported
Relational database | SAP HANA | Supported | Supported
NoSQL | Distributed Cache Service (DCS) | Not supported | Supported
NoSQL | Document Database Service (DDS) | Supported | Supported
NoSQL | CloudTable Service (CloudTable) | Supported | Supported
NoSQL | CloudTable OpenTSDB | Supported | Supported
NoSQL | Redis | Supported | Supported
NoSQL | MongoDB | Supported | Supported
Search | Cloud Search Service (CSS) | Supported | Supported
Search | Elasticsearch | Supported | Supported
Message system | Data Ingestion Service (DIS) | Supported | Supported
Message system | Apache Kafka | Supported | Supported
Message system | DMS Kafka | Supported | Supported

● In the preceding table, the non-cloud data sources, such as MySQL, can be on-premises MySQL, MySQL built on ECSs, or MySQL on a third-party cloud.

● When DIS, Apache Kafka, or DMS Kafka functions as the migration source, CDM can migrate data only to the following types of data sources:

  ● CSS

  ● DIS

  ● Apache Kafka

  ● DMS Kafka
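The restriction on message-system sources can be expressed as a small pre-check before a job is created. This is an illustrative sketch only: the set contents come from the note above, while the function name `check_message_source_rule` is hypothetical and the check covers this one rule, not the full support matrix in Table 3-1.

```python
# Sources and destinations named in the note on message-system migration.
MESSAGE_SOURCES = {"DIS", "Apache Kafka", "DMS Kafka"}
MESSAGE_DESTINATIONS = {"CSS", "DIS", "Apache Kafka", "DMS Kafka"}


def check_message_source_rule(source, destination):
    """Return False only when a message-system source targets a disallowed destination.

    Pairs with any other source are outside this rule and pass unchanged.
    """
    if source in MESSAGE_SOURCES:
        return destination in MESSAGE_DESTINATIONS
    return True
```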

Supported Data Sources in Entire DB Migration

Entire DB migration is applicable to the scenario where an on-premises data center or a database created on an ECS is synchronized to a database service or big data service on the cloud. It is suitable for offline database migration but not online real-time migration.

Table 3-2 lists the data sources supporting entire DB migration using Cloud Data Migration.

Table 3-2 Supported data sources in entire DB migration

Source Data Type | Destination Data Type: RDS for MySQL | MRS (Hive) | DWS | CSS | OBS | CloudTable
MySQL | √ | √ | √ | × | √ | ×
PostgreSQL | √ | √ | √ | × | √ | ×
Microsoft SQL Server | √ | √ | √ | × | × | ×
Oracle | √ | √ | √ | × | √ | ×
Elasticsearch | × | × | × | √ | × | ×
MongoDB | × | × | √ | × | × | ×
HBase | × | × | × | × | × | √
IBM Db2 | √ | √ | √ | × | √ | ×
Derecho (GaussDB) | √ | √ | √ | × | √ | ×
DDM | √ | √ | √ | × | √ | ×
SAP HANA | √ | √ | √ | × | √ | ×
DWS | √ | √ | √ | × | × | ×


4 Application Scenarios

Migrating Local Data to the Cloud

Local data is stored in the IDC that you have built or rent, or on a third-party cloud, including data stored in relational databases, NoSQL databases, OLAP databases, and file systems.

In this scenario, if you want to use the cloud computing and storage resources, you must migrate local data to the cloud in advance. Ensure that the local network can communicate with the cloud network.

Figure 4-1 Migrating local data to the public cloud

Migrating Data Between Cloud Services

In this scenario, you are allowed to exchange data between the following cloud services:

● OBS
● Relational Database Service (RDS)
● MapReduce Service (MRS)
● DWS
● DDS
● DCS
● CSS
● DIS
● CloudTable
● DLI
● DDM
● SFS
● Databases or file systems deployed on the ECSs

Migrating Cloud Data to On-Premises Environments

A local environment is a data storage system in the IDC that you have built or rent, or on a third-party cloud, including relational databases and file systems.

In this scenario, after data is processed using the cloud computing and storage resources, the processed data can be returned to on-premises service systems, specifically relational databases and file systems. Ensure that the local network can communicate with the cloud network.

Figure 4-2 Migrating public cloud data to on-premises environments


5 Basic Concepts

CDM Cluster

A CDM cluster is a CDM instance. It consists of one or more VMs. You can create multiple CDM clusters for different purposes. For example, you can create separate CDM clusters for the financial department and the procurement department to isolate data access permissions.

Local Environment

A local environment is a data storage system in the IDC that you have built or rent, or on a third-party cloud, including relational databases and file systems.

Local Data

Local data is stored in the IDC that you have built or rent, or on a third-party cloud, including data stored in relational databases, NoSQL databases, OLAP databases, and file systems.

Connector

A connector is a built-in object template used for connecting to a data source. Currently, CDM uses connectors to connect to OBS, MRS, and databases. New connectors can be added to CDM as well.

Link

A link is an object set up based on a connector and used to connect to a specific data source.

To create a link, you must specify the link name, connector, data source address, and authentication information. For example, to connect to a MySQL database, you must set the host IP address, port number, username, and password.

After a link is set up, it can be used by multiple jobs as either a source or a destination link.


Job

A job is a data migration task that you have created to migrate data from a specific data source to another. To create a job, you must specify a source link, destination link, and data mapping rules.

Source Job Configuration

During job creation, the source link specifies the data source from which data is extracted. The job parameters of different source links vary. For example, the table or directory from which data is exported is specified in the job configuration at the source end.

Destination Job Configuration

During job creation, the destination link specifies the data source to which data is loaded. The job parameters of different destination links vary. For example, the table or directory to which data is imported is specified in the job configuration of the destination end.

Field Mapping

During job creation, especially for jobs that migrate data between heterogeneous data sources, you must configure the mapping between the source and destination data sources, such as field mapping and field type mapping.
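The relationship between links, jobs, job configurations, and field mapping can be pictured with a small data model. The sketch below is an illustration of the concepts only and does not reflect CDM's API or storage format. All class, field, and method names (`Link`, `Job`, `map_row`, and so on) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class Link:
    """A named connection to one data source, built on a connector type."""
    name: str
    connector: str
    address: str
    port: int
    username: str
    password: str = field(repr=False)  # keep the secret out of printed output


@dataclass
class Job:
    """A migration task: one source link, one destination link, and mapping rules."""
    name: str
    source_link: Link
    destination_link: Link
    source_config: Dict[str, str]       # for example, the table to export from
    destination_config: Dict[str, str]  # for example, the table to import into
    field_mapping: Dict[str, str]       # source field name -> destination field name

    def map_row(self, row):
        """Rename the fields of one source row according to the field mapping."""
        return {dst: row[src] for src, dst in self.field_mapping.items()}
```

Because a `Link` is independent of any job, the same link object can serve as the source of one job and the destination of another, which mirrors the reuse described above.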

Account

The account registered with the cloud owns your cloud resources and has full access permissions for the resources. You can use the account to reset user passwords and assign permissions. Your account receives and pays all bills generated by your IAM users' use of resources. To log in to the management console using an account, choose Account Login.

IAM User

IAM users are created by an account to use cloud services. Each IAM user has their own password and access key to access cloud services using the console or APIs. The users manage cloud resources for the account based on assigned permissions. IAM users do not own resources or make payments. It is the account that controls user permissions and pays the bills. To log in to the management console as an IAM user, choose IAM User Login.


Figure 5-1 Relationship between an account and its IAM users

User Group

User groups facilitate centralized user management and streamlined permissions management. IAM users must be added to a user group to obtain the permissions required for accessing specified resources or cloud services in the account. A user can be added to multiple groups, which allows them to inherit different permissions.

The default user group admin has all of the permissions required to use all of the cloud resources. Users in this group can perform operations on all the resources, including but not limited to creating user groups and users, assigning permissions, and managing resources.

Policy and Authorization

A policy is a set of permissions defined in JSON format. It defines which operations on which cloud resources are allowed. There are two types of policies: system policy and custom policy.

● System policies are pre-defined by IAM and cannot be modified.

● If system policies do not meet your requirements, you can create custom policies for fine-grained access control.

Authorization is the process of granting required permissions for a user to perform a task. After a system or custom policy is assigned to a user group, users in the group inherit the permissions defined by the policy to manage resources.

For example, the content of the CDM Administrator policy defining all permissions of CDM is as follows:

{
    "Version": "1.1",
    "Statement": [
        {
            "Action": [
                "cdm:*:*",
                "ecs:*:*",
                "vpc:*:*",
                "evs:*:*",
                "bss:*:*",
                "CTS:*:*",
                "eps:*:*",
                "obs:*:*",
                "CES:*:*"
            ],
            "Effect": "Allow"
        }
    ]
}
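To make the wildcard actions in this policy concrete, the following sketch checks whether a requested action matches any `Action` pattern in a policy document. It is a simplified illustration, not IAM's evaluation logic: case-insensitive matching and the rule that an explicit Deny overrides an Allow are assumptions of the sketch, and action names such as `cdm:cluster:create` are hypothetical examples.

```python
import json
from fnmatch import fnmatchcase

# Policy content copied from the CDM Administrator example above.
POLICY_TEXT = """
{
    "Version": "1.1",
    "Statement": [
        {
            "Action": ["cdm:*:*", "ecs:*:*", "vpc:*:*", "evs:*:*", "bss:*:*",
                       "CTS:*:*", "eps:*:*", "obs:*:*", "CES:*:*"],
            "Effect": "Allow"
        }
    ]
}
"""


def is_action_allowed(policy, action):
    """Return True if some Allow statement matches `action` and no Deny statement does."""
    wanted = action.lower()
    allowed = False
    for statement in policy.get("Statement", []):
        patterns = [pattern.lower() for pattern in statement.get("Action", [])]
        if any(fnmatchcase(wanted, pattern) for pattern in patterns):
            if statement.get("Effect") == "Deny":
                return False  # assumption: an explicit Deny always wins
            if statement.get("Effect") == "Allow":
                allowed = True
    return allowed


policy = json.loads(POLICY_TEXT)
```

With this policy, `is_action_allowed(policy, "cdm:cluster:create")` evaluates to True because `cdm:*:*` matches, while an action for a service that is not listed, such as the hypothetical `rds:instance:create`, evaluates to False.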

Project

A project corresponds to a service region. Default projects are defined to group and physically isolate resources (including computing, storage, and network resources) across regions.

● Users can be granted permissions in a default project to access all resources in the region associated with the project.

● If you need more refined access control, you can create subprojects under a default project and create resources in subprojects. Then you can assign required permissions for users to access only the resources in specific subprojects.

Figure 5-2 Project isolation model

Identity Credentials

Identity credentials are used for authentication when you or your IAM users access cloud services through the console or APIs. Identity credentials include the password and access keys, which can be managed in IAM.

● Password: A common identity credential for logging in to the management console or calling cloud service APIs.

● Access key: An access key ID/secret access key (AK/SK) pair, which is used only for calling cloud service APIs. Each access key provides a signature for cryptographic authentication to ensure that access requests are secret, complete, and correct.
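The idea of signing a request with a secret access key can be sketched with a generic HMAC-SHA256 signature. This is not the signing algorithm used by the cloud service APIs, whose exact canonical request format is not described in this document; the message layout, the function names `sign_request` and `verify_request`, and the example path are all assumptions made for illustration.

```python
import hashlib
import hmac


def sign_request(secret_key, method, path, body=b""):
    """Return a hex HMAC-SHA256 signature over the method, path, and a body hash."""
    body_hash = hashlib.sha256(body).hexdigest()
    # Assumed canonical form: three newline-separated fields.
    message = "\n".join([method.upper(), path, body_hash]).encode("utf-8")
    return hmac.new(secret_key.encode("utf-8"), message, hashlib.sha256).hexdigest()


def verify_request(secret_key, method, path, body, signature):
    """Recompute the signature on the server side and compare in constant time."""
    expected = sign_request(secret_key, method, path, body)
    return hmac.compare_digest(expected, signature)
```

Only a holder of the secret key can produce a valid signature, and any change to the method, path, or body invalidates it. That is how a signature supports the completeness and correctness of a request; confidentiality of the request content additionally depends on transport encryption.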


6 CDM Migration Principles

Migration Principles

When a tenant uses CDM, the CDM system provisions a fully-managed CDM instance in the tenant's VPC. The instance allows only console and RESTful API access. Therefore, the tenant cannot access the instance through other interfaces (such as SSH). This ensures data isolation between CDM tenants, prevents data leakage, and ensures transmission security during data migration between different cloud services in a VPC. Tenants can also use a VPN to migrate data from the on-premises data center to cloud services to ensure migration security.

CDM works in push-pull mode. CDM pulls data from the migration source and pushes the data to the migration destination. Data access operations are initiated by CDM. SSL will be used if the data source (such as RDS) supports it. During the migration, the usernames and passwords of the migration source and destination are required. Such information is stored in the database of the CDM instance. Protecting such information is critical to ensure CDM security.
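The push-pull pattern, in which the migration component opens both connections itself, can be sketched as follows. This is an illustration only and does not represent CDM's engine: two SQLite connections stand in for the source and destination, the `id` and `name` columns are assumed, and the function names are hypothetical.

```python
import sqlite3


def pull_batches(source, table, batch_size):
    """Pull rows from the source in fixed-size batches (table name must be trusted)."""
    cursor = source.execute(f"SELECT id, name FROM {table} ORDER BY id")
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows


def push_batch(destination, table, rows):
    """Push one batch of rows to the destination table."""
    destination.executemany(f"INSERT INTO {table} (id, name) VALUES (?, ?)", rows)


def migrate(source, destination, table, batch_size=2):
    """Copy every row from source to destination and return the number of rows copied."""
    copied = 0
    # The connection context manager commits on success and rolls back on an exception.
    with destination:
        for rows in pull_batches(source, table, batch_size):
            push_batch(destination, table, rows)
            copied += len(rows)
    return copied
```

In this arrangement neither data store connects to the migrator; the migrator initiates every read and write, which is the property the paragraph above relies on.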

Figure 6-1 Migration principles


Security Boundary and Risk Mitigation

Figure 6-2 Risk mitigation

As shown in Figure 6-2, CDM may face the following threats:

1. Threats from the Internet: Malicious tenants may attack CDM through the CDM console.

2. Threats from the data center: Malicious CDM administrators obtain tenants' data source access information (usernames and passwords).

3. Threats from malicious tenants: Malicious tenants steal data from other tenants.

4. Data exposure to the public network: Data is exposed when it is migrated from the public network.

CDM offers the following mechanisms to prevent potential security risks:

1. Threats from the Internet: Tenants cannot log in to the CDM console through the public network. CDM provides a two-layer security mechanism.

a. On the one hand, the cloud console framework requires user authentication when tenants access management consoles of cloud services.

b. On the other hand, Web Application Firewall (WAF) filters requests from all consoles and blocks requests that contain attack code or content.

2. Threats from the data center: Tenants must provide the usernames and passwords of the migration source and destination to complete data migration. To prevent the CDM administrators from obtaining such information and attacking important data sources of tenants, CDM provides a three-level protection mechanism.

a. CDM stores passwords encrypted by AES-256 in the database of the instance to ensure tenant isolation. The database is run by user Ruby and listens only on 127.0.0.1. Therefore, tenants cannot remotely access the database.

b. After the instance is provisioned, CDM changes the passwords of users root and Ruby to random passwords and does not store them anywhere. This prevents the CDM administrators from accessing tenants' instances and databases containing password information.

c. CDM instances work in push-pull mode. Therefore, the instances do not have any listening port enabled in the VPC, and tenants cannot access the local database or operating system from the VPC.

3. Threats from malicious tenants: CDM runs instances on independent VMs, so that tenants' instances are completely isolated and secure. Malicious tenants cannot access instances of other tenants.

4. Data exposure to the public network: In push-pull mode, even if elastic IP addresses (EIPs) are bound to the CDM clusters, no port is enabled for the EIPs. In this way, attackers cannot access and attack CDM using the EIPs. However, when data is migrated from the public network, tenants' data sources are exposed to the public network and threatened by third-party attacks. Therefore, tenants are advised to use ACLs or firewalls on the data source server for security, for example, to allow only access requests from the EIPs bound to the CDM clusters.
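The recommendation to accept requests only from the EIPs bound to the CDM clusters amounts to an IP allowlist on the data source side. The sketch below shows the check itself using Python's standard `ipaddress` module. In practice this rule would normally live in a firewall or security group rather than application code, and the addresses shown are documentation-range placeholders, not real EIPs.

```python
import ipaddress


def build_allowlist(entries):
    """Parse IP addresses or CIDR blocks (for example, the cluster EIPs) into networks."""
    return [ipaddress.ip_network(entry) for entry in entries]


def is_allowed(client_ip, allowlist):
    """Return True if the client address falls inside any allowed network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in allowlist)


# Hypothetical EIP bound to a CDM cluster (TEST-NET-3 documentation range).
CDM_EIPS = build_allowlist(["203.0.113.10/32"])
```

A single address is written as a `/32` block so that the same function also accepts wider ranges if several clusters share an address pool.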


7 Related Services

IAM

Your registered cloud account has full access to its resources and cloud services. If you need to assign different permissions to employees in your enterprise to access your CDM resources, Identity and Access Management (IAM) is a good choice for fine-grained permissions management. IAM provides identity authentication, permissions management, and access control, helping you secure access to your cloud resources.

VPC

CDM clusters are created in the subnets of a Virtual Private Cloud (VPC). VPCs provide a secure, isolated, and logical network environment for CDM clusters.

MRS

CDM supports data import and export using MRS.

OBS

CDM supports data import and export using OBS, which also stores backup files and logs of CDM clusters.

Cloud Eye

CDM uses Cloud Eye to monitor cluster performance metrics, delivering status information in a concise and efficient manner, as shown in Table 7-1.

Table 7-1 CDM performance metrics

Metric | Description | Value Range | Monitored Object
Bytes In | Measures the network inbound rate of the monitored object. Unit: byte/s | ≥ 0 bytes/s | Cloud Data Migration
Bytes Out | Measures the network outbound rate of the monitored object. Unit: byte/s | ≥ 0 bytes/s | Cloud Data Migration
CPU Usage | Measures the CPU usage of the monitored object. Unit: % | 0% to 100% | Cloud Data Migration
Memory Usage | Measures the memory usage of the monitored object. Unit: % | 0% to 100% | Cloud Data Migration

CTSCDM uses Cloud Trace Service (CTS) to record operations for later query, audit,and backtrack operations. Table 7-2 displays the recorded CDM operations.

Table 7-2 CDM operations recorded by CTS

Operation | Resource Type | Trace Name
Creating a cluster | cluster | createCluster
Deleting a cluster | cluster | deleteCluster
Modifying cluster configurations | cluster | modifyCluster
Starting a cluster | cluster | startCluster
Stopping a cluster | cluster | stopCluster
Restarting a cluster | cluster | restartCluster
Importing a job | cluster | clusterImportJob
Binding an EIP | cluster | bindEip
Unbinding an EIP | cluster | unbindEip
Creating a link | link | createLink
Modifying a link | link | modifyLink
Deleting a link | link | deleteLink
Creating a job | job | createJob
Modifying a job | job | modifyJob
Deleting a job | job | deleteJob
Starting a job | job | startJob
Stopping a job | job | stopJob
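When auditing traces, it can help to have the trace names from Table 7-2 in a lookup structure. The following is a minimal sketch only: the rows are copied from the table, while the helper names (traces_by_resource, describe) are hypothetical and no CTS API is called.

```python
from collections import defaultdict

# (operation, resource type, trace name) rows copied from Table 7-2.
CDM_TRACES = [
    ("Creating a cluster", "cluster", "createCluster"),
    ("Deleting a cluster", "cluster", "deleteCluster"),
    ("Modifying cluster configurations", "cluster", "modifyCluster"),
    ("Starting a cluster", "cluster", "startCluster"),
    ("Stopping a cluster", "cluster", "stopCluster"),
    ("Restarting a cluster", "cluster", "restartCluster"),
    ("Importing a job", "cluster", "clusterImportJob"),
    ("Binding an EIP", "cluster", "bindEip"),
    ("Unbinding an EIP", "cluster", "unbindEip"),
    ("Creating a link", "link", "createLink"),
    ("Modifying a link", "link", "modifyLink"),
    ("Deleting a link", "link", "deleteLink"),
    ("Creating a job", "job", "createJob"),
    ("Modifying a job", "job", "modifyJob"),
    ("Deleting a job", "job", "deleteJob"),
    ("Starting a job", "job", "startJob"),
    ("Stopping a job", "job", "stopJob"),
]


def traces_by_resource(rows=CDM_TRACES):
    """Group trace names by the resource type they are recorded under."""
    grouped = defaultdict(list)
    for _operation, resource, trace in rows:
        grouped[resource].append(trace)
    return dict(grouped)


def describe(trace_name, rows=CDM_TRACES):
    """Return a readable label for a trace name, or None if it is unknown."""
    for operation, resource, trace in rows:
        if trace == trace_name:
            return f"{operation} ({resource})"
    return None
```

Note that clusterImportJob is recorded under the cluster resource type, not job, which is easy to miss when filtering traces by resource.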

DWS

CDM allows you to import data to and export data from DWS.

RDS

CDM allows you to import data to and export data from RDS, including RDS for MySQL, RDS for PostgreSQL, and RDS for SQL Server.

DDS

CDM allows you to export data from DDS, but it does not allow you to import data to DDS.

DCS

CDM allows you to import data to DCS, but it does not allow you to export data from DCS.

CSS

CDM allows you to import data to and export data from CSS.

DIS

CDM allows you to import data to DIS, and export data from DIS to CSS, Apache Kafka, or DMS Kafka.

CloudTable

CDM allows you to import data to and export data from CloudTable.

DLI

CDM allows you to import data to DLI, but it does not allow you to export data from DLI.

DDM

CDM allows you to import data to and export data from DDM.

SFS

CDM allows you to import data to and export data from SFS.

Distributed Message Service (DMS)

CDM allows you to import data to DMS, and export data from DMS to CSS, Apache Kafka, DIS, or DMS Kafka.

Data Lake Factory (DLF)

CDM can be orchestrated and scheduled as a node task of DLF.

8 Constraints

Due to various factors such as technology and cost, CDM has constraints on data migration.

CDM System Constraints

1. You cannot modify the flavor of an existing cluster. If you require a higher flavor, create a cluster with your desired flavor.

2. CDM cannot limit the data migration speed. Therefore, do not perform data migration during peak hours.

3. Currently, the network bandwidth of all CDM instances is 10 Gbit/s. Theoretically, the maximum volume of data transmission per instance per day is 52 TB. If you have specific requirements on the transmission speed, use multiple CDM instances.
   The preceding data volume is the theoretical value. The actual data volume is restricted by the data source type, read and write performance of the source and destination data sources, and bandwidth. The actual data volume can reach about 8 TB per day (large file migration to OBS). It is recommended that you test the speed with a small amount of data before migration.

4. CDM supports incremental file migration (by skipping repeated files), but does not support resumable transfer.
   For example, suppose three files are to be migrated and the second file fails due to a network fault. When the migration task is started again, the first file is skipped. The second file, however, cannot resume from the point where the fault occurred and must be migrated again from the beginning.

5. During file migration, a single task supports millions of files. If there are too many files in the directory to be migrated, you are advised to split the files into different directories and create multiple tasks.

6. The number of tasks executed by a single CDM instance at a time is 30 (cdm.large), 20 (cdm.medium), or 10 (cdm.small). The number of queued jobs (in the pending state) to be executed is 10,000, 5,000, and 2,000 respectively.
   In database migration, a job is equivalent to migrating a table. In file migration, multiple files can be migrated in a job.

7. You can export links and jobs configured on CDM to a local directory. To ensure password security, CDM does not export the link password of the corresponding data source. Therefore, before importing job configurations to CDM, you need to manually enter the password in the exported JSON file.

8. The cluster cannot automatically upgrade to a new version. You need to use the job export and import functions to upgrade the cluster to the new version.

9. CDM does not automatically back up users' job configurations. You need to export and back up configuration data using the export function.

10. If a VPC peering connection is configured, the peer VPC subnet may overlap with the CDM management network. As a result, data sources in the peer VPC cannot be accessed. You are advised to use the public network for cross-VPC data migration, or contact the administrator to add specific routes to the VPC peering connection in the CDM background.
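The throughput and concurrency figures in this list can be turned into a rough planning estimate. The sketch below is illustrative only: it assumes the practical figure of about 8 TB per instance per day quoted above holds for your workload (it may not), and the function names are hypothetical.

```python
import math

# Figures quoted in the system constraints.
PRACTICAL_TB_PER_DAY = 8  # approximate, large-file migration to OBS
CONCURRENT_TASKS = {"cdm.large": 30, "cdm.medium": 20, "cdm.small": 10}


def instances_needed(total_tb, deadline_days, tb_per_day=PRACTICAL_TB_PER_DAY):
    """Estimate how many CDM instances are needed to move total_tb in time."""
    if total_tb < 0 or deadline_days <= 0 or tb_per_day <= 0:
        raise ValueError("sizes and durations must be positive")
    return math.ceil(total_tb / (tb_per_day * deadline_days))


def execution_rounds(job_count, flavor):
    """Estimate how many rounds of concurrent execution job_count jobs take.

    Assumes every job takes about the same time, which is rarely exact.
    """
    return math.ceil(job_count / CONCURRENT_TASKS[flavor])
```

For instance, instances_needed(40, 2) suggests three instances for 40 TB in two days, and execution_rounds(100, "cdm.small") suggests ten rounds for a 100-table database migration. Treat both as starting points and measure the real speed with a small data sample first.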

General Constraints on Database Migration

1. CDM is mainly used for batch migration. It supports only limited incremental migration and does not support real-time incremental migration. You are advised to use Data Replication Service (DRS) to migrate the incremental data of the database to RDS.

2. Entire DB migration in CDM supports only data table migration. Database objects such as stored procedures, triggers, functions, and views are not migrated as objects; views are migrated as tables.
   CDM applies only to scenarios where databases are migrated to the cloud in one go, including homogeneous and heterogeneous database migrations. CDM is not applicable to data synchronization, for example, disaster recovery and real-time synchronization.

3. When CDM fails to migrate the entire database or a data table, the data that has been imported to the target table is not rolled back automatically. If you want to perform migration in transaction mode, configure the Import to Staging Table parameter (https://support.huaweicloud.com/intl/en-us/usermanual-cdm/cdm_01_0106.html) to roll back data when the migration fails.
   In extreme cases, the created stage table or temporary table cannot be automatically deleted. You need to manually clear the table. The name of a stage table ends with _cdm_stage, for example, cdmtet_cdm_stage.

4. If CDM needs to access data sources in the on-premises data center (for example, an on-premises MySQL database), the data sources must support Internet access and the CDM instances must be bound with elastic IP addresses (EIPs). In this case, the recommended security practice is to configure the firewall or security policies to allow only the EIPs of the CDM instances to access the local data sources.

5. Only common data types are supported, including character strings, digits, and dates. Support for object types is limited. If objects are too large, migration cannot be performed.

6. Only the GBK and UTF-8 character sets are supported.
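Because leftover stage tables are identified only by the _cdm_stage suffix, finding them is a simple name filter. The following sketch assumes you have already fetched the list of table names from the destination database by your own means; the function names are hypothetical, and the generated statements use unquoted identifiers that you should adjust to your database's quoting rules and review before running.

```python
STAGE_SUFFIX = "_cdm_stage"


def leftover_stage_tables(table_names):
    """Return table names that look like CDM stage tables, sorted."""
    return sorted(name for name in table_names if name.endswith(STAGE_SUFFIX))


def drop_statements(table_names):
    """Build DROP statements for review; nothing is executed here."""
    return [f"DROP TABLE IF EXISTS {name};" for name in leftover_stage_tables(table_names)]
```

A table whose name merely contains cdm_stage elsewhere (for example, cdm_stage_log) is not matched, since only the suffix is checked. Confirm that no migration job is still running before dropping anything.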

Constraints on MRS

Each CDM cluster supports data import and export of only one MRS data source. To import and export data of different MRS data sources, create multiple CDM clusters.

Constraints on FusionInsight HD and Apache Hadoop

If the FusionInsight HD and Apache Hadoop data sources are deployed in the on-premises data center, CDM must access all nodes in the cluster for reading and writing the Hadoop files. Therefore, network access must be enabled for each node.

You are advised to use Direct Connect to improve the migration speed while ensuring network access.

Constraints on DWS and FusionInsight LibrA

1. If the DWS primary key or table contains only one field, the field type must be a common character string, numeric, or date type. When data is migrated from another database to DWS with automatic table creation selected, the primary key must be of one of the following types. If no primary key is set, at least one field must be of one of the following types. Otherwise, the table cannot be created and the CDM job fails.
   – INTEGER TYPES: TINYINT, SMALLINT, INT, BIGINT, NUMERIC/DECIMAL
   – CHARACTER TYPES: CHAR, BPCHAR, VARCHAR, VARCHAR2, NVARCHAR2, TEXT
   – DATE/TIME TYPES: DATE, TIME, TIMETZ, TIMESTAMP, TIMESTAMPTZ, INTERVAL, SMALLDATETIME

2. In DWS, the empty character string '' is treated as null and cannot be inserted into a field with a non-null constraint. MySQL behaves differently: it does not treat '' as null. Migration from MySQL to DWS may therefore fail.

3. When the Gauss Data Service (GDS) mode is used to quickly import data to DWS, you need to configure a security group or firewall policy to allow DataNodes of DWS or FusionInsight LibrA to access port 25000 of the CDM IP address.

4. When data is imported to DWS in GDS mode, CDM automatically creates a foreign table for data import. The table name ends with a universally unique identifier (UUID), for example, cdmtest_aecf3f8n0z73dsl72d0d1dk4lcir8cd. If a job fails, the foreign table is automatically deleted. In extreme cases, you may need to delete it manually.
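The empty-string difference between MySQL and DWS can be checked before migration. The sketch below assumes the source rows have been sampled into Python dictionaries and that you know which destination columns carry a non-null constraint; both the data shape and the function name are illustrative assumptions, not part of CDM.

```python
def empty_string_violations(rows, not_null_columns):
    """Return (row_index, column) pairs where a NOT NULL column holds ''.

    DWS treats '' as null, so these rows would be rejected on insert.
    """
    violations = []
    for index, row in enumerate(rows):
        for column in not_null_columns:
            if row.get(column) == "":
                violations.append((index, column))
    return violations
```

Rows flagged this way need a replacement value or a relaxed constraint on the destination column before the job is run.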

Constraints on OBS

1. During file migration, the system automatically transfers the files concurrently. In this case, the Concurrent Extractors setting in the task configuration does not take effect.

2. Resumable transfer is not supported. If CDM fails to transfer files, OBS fragments are generated. You need to clear the fragments on the OBS console so that they do not occupy storage space.

3. CDM does not support the versioning function of OBS.

4. During incremental migration, the number of files or objects in the source directory of a single job depends on the CDM cluster flavor. A cdm.large cluster supports a maximum of 300,000 files; a cdm.medium cluster supports a maximum of 200,000 files; and a cdm.small cluster supports a maximum of 100,000 files.
   If the number of files or objects in a single directory exceeds the upper limit, split the files or objects into multiple migration jobs based on subdirectories.
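The per-flavor limits for incremental migration can be checked against a directory listing before jobs are created. The following sketch assumes you already have a file count per subdirectory (how you obtain it is up to you) and plans one job per subdirectory; the function name and the input shape are hypothetical.

```python
# Maximum files or objects in the source directory of one incremental job.
MAX_FILES_PER_JOB = {
    "cdm.large": 300_000,
    "cdm.medium": 200_000,
    "cdm.small": 100_000,
}


def plan_jobs(subdir_counts, flavor):
    """Split subdirectories into those that fit one job and those that do not.

    Returns (ok, oversized): subdirectories within the limit, and
    subdirectories that must be split further before migration.
    """
    limit = MAX_FILES_PER_JOB[flavor]
    ok, oversized = [], []
    for subdir, count in sorted(subdir_counts.items()):
        (oversized if count > limit else ok).append(subdir)
    return ok, oversized
```

A subdirectory holding exactly the limit is treated as acceptable here, since the source states the figures as maximums.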

Constraints on Oracle

Real-time incremental data synchronization is not supported for Oracle databases.

Constraints on DCS and Redis

1. Because DCS restricts the commands for obtaining keys, it cannot serve as the migration source but can be the migration destination. The Redis service of a third-party cloud cannot serve as the migration source. However, Redis set up in the on-premises data center or on an ECS can be both the migration source and destination.

2. Only the hash and string data formats are supported.

Constraints on DDS and MongoDB

When you migrate data from MongoDB to a relational database, CDM reads the first row of the collection as an example of the field list. If the first row of data does not contain all fields of the collection, you need to manually add fields.
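To see which fields would need to be added manually, you can compare the first row against a sample of the rest of the collection. This is a minimal sketch that assumes the sample has already been loaded as a list of Python dictionaries; it does not connect to MongoDB, and the function name is hypothetical.

```python
def fields_missing_from_first(documents):
    """Return fields that appear in later documents but not in the first one.

    Order follows first appearance, so the result is stable across runs.
    """
    if not documents:
        return []
    first_fields = set(documents[0])
    seen = set()
    missing = []
    for document in documents[1:]:
        for field in document:
            if field not in first_fields and field not in seen:
                seen.add(field)
                missing.append(field)
    return missing
```

An empty result for a sample does not prove the whole collection is uniform; it only covers the documents that were sampled.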

Constraints on CSS and Elasticsearch

1. CDM supports automatic creation of indexes and field types. The index and field type names can contain only lowercase letters.

2. You cannot modify the type of a field under an index after the field is created. You can only create another field.
   If you need to modify the field type, you need to create another index, or run the Elasticsearch command on Kibana to delete the existing index and create it again (the data is also deleted).

3. When the field type of the index created by CDM is date, the data format must be yyyy-MM-dd HH:mm:ss.SSS Z. For example, 2018-08-08 08:08:08.888 +08:00.
   During data migration to CSS, if the original data of the date field does not meet the format requirements, you can use the expression conversion function of CDM to convert the data to the preceding format.
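To check sample values against the required date layout outside CDM, the format can be reproduced in a few lines. This sketch follows the example value given above (millisecond precision and an offset written with a colon); the function name is hypothetical and this is not the CDM expression conversion function itself.

```python
from datetime import datetime, timedelta, timezone


def to_css_date(value):
    """Format a timezone-aware datetime as 'yyyy-MM-dd HH:mm:ss.SSS +HH:MM'."""
    if value.tzinfo is None:
        raise ValueError("a timezone-aware datetime is required")
    base = value.strftime("%Y-%m-%d %H:%M:%S")
    millis = value.microsecond // 1000  # truncate to milliseconds
    offset = value.strftime("%z")       # e.g. "+0800"
    return f"{base}.{millis:03d} {offset[:3]}:{offset[3:5]}"


sample = datetime(2018, 8, 8, 8, 8, 8, 888000, tzinfo=timezone(timedelta(hours=8)))
print(to_css_date(sample))  # 2018-08-08 08:08:08.888 +08:00
```

Naive datetimes are rejected on purpose, because the required format always carries an explicit offset.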

Constraints on DIS and Kafka

1. The data in the message body is a record in CSV format that supports multiple delimiters. Messages cannot be parsed in binary or other formats.

2. If a job is set to run for a long time, the job will fail if the DIS system is interrupted.
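The shape of a message body that fits the first constraint can be illustrated with Python's standard csv module. This is only a sketch of what "one CSV record with a configurable delimiter" means; it is not how CDM parses messages internally, and the function name is hypothetical. The delimiter here must be a single character.

```python
import csv


def parse_message(body, delimiter=","):
    """Parse one message body as a single CSV record and return its fields."""
    return next(csv.reader([body], delimiter=delimiter))
```

For example, parse_message("1|alice|2018-08-08", "|") yields three fields, and quoted fields may contain the delimiter itself.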

Constraints on CloudTable and HBase

1. When you migrate data from CloudTable or HBase, CDM reads the first row of the table as an example of the field list. If the first row of data does not contain all fields of the table, you need to manually add fields.

2. Because HBase is schema-less, CDM cannot obtain the data types. If the data is stored in binary format, CDM cannot parse the data.

Constraints on Hive

When Hive serves as the migration destination, if the storage format is TEXTFILE, delimiters must be explicitly specified in the statement for creating Hive tables. The following gives an example:

    CREATE TABLE csv_tbl(
      smallint_value smallint,
      tinyint_value tinyint,
      int_value int,
      bigint_value bigint,
      float_value float,
      double_value double,
      decimal_value decimal(9, 7),
      timestmamp_value timestamp,
      date_value date,
      varchar_value varchar(100),
      string_value string,
      char_value char(20),
      boolean_value boolean,
      binary_value binary,
      varchar_null varchar(100),
      string_null string,
      char_null char(20),
      int_null int
    )
    ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
    WITH SERDEPROPERTIES (
      "separatorChar" = "\t",
      "quoteChar" = "'",
      "escapeChar" = "\\"
    )
    STORED AS TEXTFILE;

Constraints on Incremental Data Migration in MySQL Binlog Mode

● Currently, this mode can be used to migrate data from MySQL to DWS only.

● In the migration from MySQL to DWS, the constraints on the incremental data migration function in MySQL Binlog mode are as follows:

a. A single cluster supports only one incremental migration job in MySQL Binlog mode in the current version.

b. In the current version, you are not allowed to delete or update 10,000 data records at a time.

c. Entire DB migration is not supported.

d. Data Definition Language (DDL) operations are not supported.

e. Event migration is not supported.

f. If you set Migrate Incremental Data to Yes, binlog_format in the source MySQL database must be set to ROW.

g. If you set Migrate Incremental Data to Yes and binlog file ID disorder occurs on the source MySQL instance due to cross-machine migration or rebuilding during incremental data migration, incremental data may be lost.

h. If a primary key exists in the destination table and incremental data is generated during the restart of the CDM cluster or during full migration, duplicate primary key values may occur. As a result, the migration fails.

i. If the destination DWS database is restarted, the migration will fail. In this case, restart the CDM cluster and the migration job.

● The recommended MySQL configuration is as follows:

    # Enable the bin-log function.
    log-bin=mysql-bin
    # ROW mode
    binlog-format=ROW
    # gtid mode. The recommended version is 5.6.10 or later.
    gtid-mode=ON
    enforce_gtid_consistency = ON
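A MySQL option file can be checked against the recommended settings with a small script. The sketch below is an assumption-laden helper, not a CDM tool: it reads plain key=value lines, ignores section headers and comments, and treats hyphens and underscores in option names as equivalent, as MySQL itself does. The function names are hypothetical.

```python
# Settings from the recommended configuration that have a fixed value.
REQUIRED = {
    "binlog_format": "ROW",
    "gtid_mode": "ON",
    "enforce_gtid_consistency": "ON",
}


def parse_options(text):
    """Parse key=value lines, skipping blanks, comments, and [sections]."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";", "[")):
            continue
        key, _, value = line.partition("=")
        options[key.strip().replace("-", "_").lower()] = value.strip()
    return options


def check_binlog_settings(text):
    """Return a list of problems; an empty list means the settings match."""
    options = parse_options(text)
    problems = []
    if "log_bin" not in options:
        problems.append("log-bin is not enabled")
    for key, expected in REQUIRED.items():
        actual = options.get(key)
        if actual is None or actual.upper() != expected:
            problems.append(f"{key} should be {expected}, found {actual}")
    return problems
```

The check covers only the option file text it is given. Settings changed at runtime or supplied on the command line are not visible to it, so confirm the live values on the server as well.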

9 Quotas

CDM uses the following infrastructure resources:

● ECS
● VPC
● EIP
● Simple Message Notification (SMN)
● IAM

For details about how to view and modify the quota, see Quotas.

10 Permissions Management

If you need to assign different permissions to employees in your enterprise to access your CDM resources, IAM is a good choice for fine-grained permissions management. IAM provides identity authentication, permissions management, and access control, helping you secure access to your cloud resources.

With IAM, you can use your cloud account to create IAM users, and assign permissions to the users to control their access to specific resources. For example, some employees in your enterprise need to use CDM resources but should not be allowed to delete CDM clusters or perform any other high-risk operations. In this scenario, you can create IAM users for the employees and grant them only the permissions required for using CDM resources.

If your cloud account does not require individual IAM users for permissions management, skip this section.

IAM can be used free of charge. You pay only for the resources in your account. For more information about IAM, see IAM Service Overview.

CDM Permissions

By default, new IAM users do not have permissions assigned. You need to add a user to one or more groups, and attach permissions policies or roles to these groups. Users inherit permissions from the groups to which they are added and can perform specified operations on cloud services based on the permissions.

CDM is a project-level service deployed and accessed in specific physical regions. Therefore, CDM permissions are assigned to users in specific regions and only take effect for these regions. If you want the permissions to take effect for all regions, you need to assign the permissions to users in each region. When accessing CDM, the users need to switch to a region where they have been authorized to use the CDM service.

You can grant users permissions by using roles and policies.

● Roles: A type of coarse-grained authorization mechanism that defines permissions related to user responsibilities. This mechanism provides only a limited number of service-level roles for authorization. When using roles to grant permissions, you also need to assign the other roles on which the permissions depend for them to take effect. Roles are not an ideal choice for fine-grained authorization and secure access control.

● Policies: A type of fine-grained authorization mechanism that defines permissions required to perform operations on specific cloud resources under certain conditions. This mechanism allows for more flexible policy-based authorization, meeting requirements for secure access control. For example, you can prevent a specific user group from deleting clusters and allow it to perform only basic CDM operations (such as creating and querying jobs).

Table 10-1 lists all the system-defined roles and policies supported by CDM.

Table 10-1 System-defined roles and policies supported by CDM

Role/Policy Name | Description | Type
CDM Administrator | Permissions: Administrator permissions for all operations on CDM resources. Users granted these permissions must also be granted permissions of the Tenant Guest and Server Administrator policies. Users granted permissions of the VPC Administrator policy can create VPCs and subnets. Users granted permissions of the Cloud Eye Administrator policy can view monitoring information of CDM clusters. | System role
CDM FullAccess | Administrator permissions for CDM. Users granted these permissions can perform all operations on CDM resources. | System-defined policy
CDM FullAccessExceptUpdateEIP | Users granted these permissions can perform all operations on CDM resources except binding and unbinding EIPs. | System-defined policy
CDM CommonOperations | Users granted these permissions can operate CDM jobs and links. | System-defined policy
CDM ReadOnlyAccess | Read-only permissions for CDM. Users granted these permissions can only view CDM clusters, links, and jobs. | System-defined policy

Table 10-2 lists the common operations supported by each system-defined policy or role of CDM. Select the policies or roles as required.

Table 10-2 Common operations supported by each system-defined policy or role of CDM

Operation | CDM FullAccess | CDM FullAccessExceptUpdateEIP | CDM CommonOperations | CDM ReadOnlyAccess
Creating clusters | √ | √ | × | ×
Binding or unbinding EIPs | √ | × | × | ×
Querying the cluster list | √ | √ | √ | √
Querying cluster details | √ | √ | √ | √
Restarting clusters | √ | √ | × | ×
Modifying cluster configurations | √ | √ | × | ×
Deleting clusters | √ | √ | × | ×
Creating links | √ | √ | √ | ×
Querying links | √ | √ | √ | √
Modifying links | √ | √ | √ | ×
Deleting links | √ | √ | √ | ×
Creating jobs | √ | √ | √ | ×
Querying jobs | √ | √ | √ | √
Modifying jobs | √ | √ | √ | ×
Starting jobs | √ | √ | √ | ×
Stopping jobs | √ | √ | √ | ×
Querying job status | √ | √ | √ | √
Querying job execution history | √ | √ | √ | √
Deleting jobs | √ | √ | √ | ×
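Table 10-2 can be encoded as a lookup to pick the narrowest system-defined policy that still covers a set of operations. The following is an illustrative sketch: the matrix is copied from the table, the function names are hypothetical, and it does not call IAM. It relies on the observation that, in this table, each policy column allows a subset of what the column to its left allows.

```python
# Policy columns of Table 10-2, ordered from broadest to narrowest.
POLICIES = (
    "CDM FullAccess",
    "CDM FullAccessExceptUpdateEIP",
    "CDM CommonOperations",
    "CDM ReadOnlyAccess",
)

# One flag per policy column, in POLICIES order ("√" allowed, "×" denied).
MATRIX = {
    "Creating clusters": "√√××",
    "Binding or unbinding EIPs": "√×××",
    "Querying the cluster list": "√√√√",
    "Querying cluster details": "√√√√",
    "Restarting clusters": "√√××",
    "Modifying cluster configurations": "√√××",
    "Deleting clusters": "√√××",
    "Creating links": "√√√×",
    "Querying links": "√√√√",
    "Modifying links": "√√√×",
    "Deleting links": "√√√×",
    "Creating jobs": "√√√×",
    "Querying jobs": "√√√√",
    "Modifying jobs": "√√√×",
    "Starting jobs": "√√√×",
    "Stopping jobs": "√√√×",
    "Querying job status": "√√√√",
    "Querying job execution history": "√√√√",
    "Deleting jobs": "√√√×",
}


def policies_allowing(operations):
    """Return every policy that allows all of the given operations."""
    return [
        policy
        for index, policy in enumerate(POLICIES)
        if all(MATRIX[operation][index] == "√" for operation in operations)
    ]


def least_privileged(operations):
    """Return the narrowest policy covering the operations, or None."""
    allowed = policies_allowing(operations)
    return allowed[-1] if allowed else None
```

For example, a user who only creates and starts jobs is covered by CDM CommonOperations, while anyone who needs to bind EIPs requires CDM FullAccess.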
