MapReduce Service User Guide Issue 04 Date 2019-12-31 HUAWEI TECHNOLOGIES CO., LTD.


  • MapReduce Service

    User Guide

    Issue 04

    Date 2019-12-31

    HUAWEI TECHNOLOGIES CO., LTD.

  • Copyright © Huawei Technologies Co., Ltd. 2020. All rights reserved.

    No part of this document may be reproduced or transmitted in any form or by any means without prior written consent of Huawei Technologies Co., Ltd.

    Trademarks and Permissions

    and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd. All other trademarks and trade names mentioned in this document are the property of their respective holders.

    Notice

    The purchased products, services and features are stipulated by the contract made between Huawei and the customer. All or part of the products, services and features described in this document may not be within the purchase scope or the usage scope. Unless otherwise specified in the contract, all statements, information, and recommendations in this document are provided "AS IS" without warranties, guarantees or representations of any kind, either express or implied.

    The information in this document is subject to change without notice. Every effort has been made in the preparation of this document to ensure accuracy of the contents, but all statements, information, and recommendations in this document do not constitute a warranty of any kind, express or implied.

    Issue 04 (2019-12-31) Copyright © Huawei Technologies Co., Ltd. i

  • Contents

    1 IAM Permissions Management
    1.1 Creating a User and Granting Permissions
    1.2 MRS Custom Policies
    1.3 Synchronizing IAM Users to MRS

    2 MRS Quick Start
    2.1 How to Use MRS
    2.2 Creating a Cluster
    2.3 Managing Files
    2.4 Creating a Job
    2.5 Terminating a Cluster

    3 Configuring Clusters
    3.1 Overview
    3.2 Cluster List
    3.3 Methods of Purchasing MRS Clusters
    3.4 Quick Purchase of a Hadoop Analysis Cluster
    3.5 Quick Purchase of an HBase Cluster
    3.6 Quick Purchase of a Kafka Cluster
    3.7 Custom Purchase of a Cluster
    3.8 Creating a Cluster
    3.9 Creating the Smallest Cluster
    3.10 Creating a Cluster (History Versions)
    3.11 Configuring a Cluster with Storage and Compute Separated
    3.12 Managing Cluster Tags
    3.13 Bootstrap Actions
    3.13.1 Introduction to Bootstrap Actions
    3.13.2 Preparing the Bootstrap Action Script
    3.13.3 Adding a Bootstrap Action
    3.13.4 View Execution Records
    3.13.5 Sample Scripts

    4 Managing Active Clusters
    4.1 Viewing and Monitoring Clusters
    4.1.1 Viewing Basic Information About an Active Cluster


  • 4.1.2 Viewing Patch Information About an Active Cluster
    4.1.3 Viewing and Customizing Cluster Monitoring Metrics
    4.1.4 Managing Service and Host Monitoring
    4.2 Manually Scaling Out a Cluster
    4.3 Manually Scaling In a Cluster
    4.4 Using Auto Scaling in a Cluster
    4.5 Configuring Auto Scaling Rules When Creating a Cluster
    4.6 Scaling Up Master Node Specifications
    4.7 Configuring Message Notification
    4.8 O&M
    4.8.1 Authorizing O&M
    4.8.2 Sharing Logs
    4.9 Terminating a Cluster
    4.10 Unsubscribing from a Cluster
    4.11 Deleting a Failed Task
    4.12 Managing a Component
    4.12.1 Introduction
    4.12.2 Querying Configurations
    4.12.3 Managing Services
    4.12.4 Configuring Service Parameters
    4.12.5 Configuring Customized Service Parameters
    4.12.6 Synchronizing Service Configurations
    4.12.7 Managing Role Instances
    4.12.8 Configuring Role Instance Parameters
    4.12.9 Synchronizing Role Instance Configuration
    4.12.10 Decommissioning and Recommissioning Role Instances
    4.12.11 Managing a Host (Node)
    4.12.12 Isolating a Host
    4.12.13 Canceling Isolation of a Host
    4.12.14 Starting and Stopping a Cluster
    4.12.15 Synchronizing Cluster Configurations
    4.12.16 Exporting Configuration Data of a Cluster
    4.12.17 Performing Rolling Restart
    4.13 Managing Jobs
    4.13.1 Introduction to Jobs
    4.13.2 Running a MapReduce Job
    4.13.3 Running a Spark Job
    4.13.4 Running a HiveSql Job
    4.13.5 Running a SparkSql Job
    4.13.6 Running a Flink Job
    4.13.7 Running a Kafka Job
    4.13.8 Viewing Job Configurations and Logs


  • 4.13.9 Stopping Jobs
    4.13.10 Copying Jobs
    4.13.11 Deleting Jobs
    4.13.12 Using Encrypted OBS Data for Job Running
    4.13.13 Configuring Job Notification Rules
    4.14 Managing Data Files
    4.15 Alarm Management
    4.15.1 Viewing the Alarm List
    4.15.2 Viewing and Manually Clearing an Alarm
    4.16 Alarm Reference
    4.16.1 ALM-12001 Audit Log Dump Failure
    4.16.2 ALM-12002 HA Resource Is Abnormal
    4.16.3 ALM-12004 OLdap Resource Is Abnormal
    4.16.4 ALM-12005 OKerberos Resource Is Abnormal
    4.16.5 ALM-12006 Node Fault
    4.16.6 ALM-12007 Process Fault
    4.16.7 ALM-12010 Manager Heartbeat Interruption Between the Active and Standby Nodes
    4.16.8 ALM-12011 Manager Data Synchronization Exception Between the Active and Standby Nodes
    4.16.9 ALM-12012 NTP Service Is Abnormal
    4.16.10 ALM-12016 CPU Usage Exceeds the Threshold
    4.16.11 ALM-12017 Insufficient Disk Capacity
    4.16.12 ALM-12018 Memory Usage Exceeds the Threshold
    4.16.13 ALM-12027 Host PID Usage Exceeds the Threshold
    4.16.14 ALM-12028 Number of Processes in the D State on the Host Exceeds the Threshold
    4.16.15 ALM-12031 User omm or Password Is About to Expire
    4.16.16 ALM-12032 User ommdba or Password Is About to Expire
    4.16.17 ALM-12033 Slow Disk Fault
    4.16.18 ALM-12034 Periodic Backup Failure
    4.16.19 ALM-12035 Unknown Data Status After Recovery Task Failure
    4.16.20 ALM-12037 NTP Server Is Abnormal
    4.16.21 ALM-12038 Monitoring Indicator Dump Failure
    4.16.22 ALM-12039 GaussDB Data Is Not Synchronized
    4.16.23 ALM-12040 Insufficient System Entropy
    4.16.24 ALM-13000 ZooKeeper Service Unavailable
    4.16.25 ALM-13001 Available ZooKeeper Connections Are Insufficient
    4.16.26 ALM-13002 ZooKeeper Heap Memory or Direct Memory Usage Exceeds the Threshold
    4.16.27 ALM-14000 HDFS Service Unavailable
    4.16.28 ALM-14001 HDFS Disk Usage Exceeds the Threshold
    4.16.29 ALM-14002 DataNode Disk Usage Exceeds the Threshold
    4.16.30 ALM-14003 Number of Lost HDFS Blocks Exceeds the Threshold
    4.16.31 ALM-14004 Number of Damaged HDFS Blocks Exceeds the Threshold
    4.16.32 ALM-14006 Number of HDFS Files Exceeds the Threshold


  • 4.16.33 ALM-14007 HDFS NameNode Memory Usage Exceeds the Threshold
    4.16.34 ALM-14008 HDFS DataNode Memory Usage Exceeds the Threshold
    4.16.35 ALM-14009 Number of Dead DataNodes Exceeds the Threshold
    4.16.36 ALM-14010 NameService Service Is Abnormal
    4.16.37 ALM-14011 HDFS DataNode Data Directory Is Not Configured Properly
    4.16.38 ALM-14012 HDFS JournalNode Data Is Not Synchronized
    4.16.39 ALM-16000 Percentage of Sessions Connected to the HiveServer to Maximum Number Allowed Exceeds the Threshold
    4.16.40 ALM-16001 Hive Warehouse Space Usage Exceeds the Threshold
    4.16.41 ALM-16002 Successful Hive SQL Operations Are Lower than the Threshold
    4.16.42 ALM-16004 Hive Service Unavailable
    4.16.43 ALM-18000 Yarn Service Unavailable
    4.16.44 ALM-18002 NodeManager Heartbeat Lost
    4.16.45 ALM-18003 NodeManager Unhealthy
    4.16.46 ALM-18006 MapReduce Job Execution Timeout
    4.16.47 ALM-19000 HBase Service Unavailable
    4.16.48 ALM-19006 HBase Replication Synchronization Failed
    4.16.49 ALM-25000 LdapServer Service Unavailable
    4.16.50 ALM-25004 Abnormal LdapServer Data Synchronization
    4.16.51 ALM-25500 KrbServer Service Unavailable
    4.16.52 ALM-27001 DBService Unavailable
    4.16.53 ALM-27003 DBService Heartbeat Interruption Between the Active and Standby Nodes
    4.16.54 ALM-27004 Data Inconsistency Between Active and Standby DBServices
    4.16.55 ALM-28001 Spark Service Unavailable
    4.16.56 ALM-26051 Storm Service Unavailable
    4.16.57 ALM-26052 Number of Available Supervisors in Storm Is Lower Than the Threshold
    4.16.58 ALM-26053 Slot Usage of Storm Exceeds the Threshold
    4.16.59 ALM-26054 Heap Memory Usage of Storm Nimbus Exceeds the Threshold
    4.16.60 ALM-38000 Kafka Service Unavailable
    4.16.61 ALM-38001 Insufficient Kafka Disk Space
    4.16.62 ALM-38002 Heap Memory Usage of Kafka Exceeds the Threshold
    4.16.63 ALM-24000 Flume Service Unavailable
    4.16.64 ALM-24001 Flume Agent Is Abnormal
    4.16.65 ALM-24003 Flume Client Connection Failure
    4.16.66 ALM-24004 Flume Fails to Read Data
    4.16.67 ALM-24005 Data Transmission by Flume Is Abnormal
    4.16.68 ALM-12041 Permission of Key Files Is Abnormal
    4.16.69 ALM-12042 Key File Configurations Are Abnormal
    4.16.70 ALM-23001 Loader Service Unavailable
    4.16.71 ALM-12357 Failed to Export Audit Logs to the OBS
    4.16.72 ALM-12014 Partition Lost
    4.16.73 ALM-12015 Partition Filesystem Readonly
    4.16.74 ALM-12043 DNS Resolution Duration Exceeds the Threshold


  • 4.16.75 ALM-12045 Network Read Packet Dropped Rate Exceeds the Threshold
    4.16.76 ALM-12046 Network Write Packet Dropped Rate Exceeds the Threshold
    4.16.77 ALM-12047 Network Read Packet Error Rate Exceeds the Threshold
    4.16.78 ALM-12048 Network Write Packet Error Rate Exceeds the Threshold
    4.16.79 ALM-12049 Network Read Throughput Rate Exceeds the Threshold
    4.16.80 ALM-12050 Network Write Throughput Rate Exceeds the Threshold
    4.16.81 ALM-12051 Disk Inode Usage Exceeds the Threshold
    4.16.82 ALM-12052 TCP Temporary Port Usage Exceeds the Threshold
    4.16.83 ALM-12053 File Handle Usage Exceeds the Threshold
    4.16.84 ALM-12054 The Certificate File Is Invalid
    4.16.85 ALM-12055 The Certificate File Is About to Expire
    4.16.86 ALM-18008 Heap Memory Usage of Yarn ResourceManager Exceeds the Threshold
    4.16.87 ALM-18009 Heap Memory Usage of MapReduce JobHistoryServer Exceeds the Threshold
    4.16.88 ALM-20002 Hue Service Unavailable
    4.16.89 ALM-43001 Spark Service Unavailable
    4.16.90 ALM-43006 Heap Memory Usage of the JobHistory Process Exceeds the Threshold
    4.16.91 ALM-43007 Non-Heap Memory Usage of the JobHistory Process Exceeds the Threshold
    4.16.92 ALM-43008 Direct Memory Usage of the JobHistory Process Exceeds the Threshold
    4.16.93 ALM-43009 JobHistory GC Time Exceeds the Threshold
    4.16.94 ALM-43010 Heap Memory Usage of the JDBCServer Process Exceeds the Threshold
    4.16.95 ALM-43011 Non-Heap Memory Usage of the JDBCServer Process Exceeds the Threshold
    4.16.96 ALM-43012 Direct Memory Usage of the JDBCServer Process Exceeds the Threshold
    4.16.97 ALM-43013 JDBCServer GC Time Exceeds the Threshold
    4.17 Patch Operation Guide
    4.17.1 Patch Operation Guide
    4.17.2 Rolling Patches
    4.17.3 Restoring Patches for the Isolated Hosts
    4.18 MRS Patch Description
    4.18.1 MRS 1.8.10.1 Patch Description
    4.18.2 MRS 2.0.1.1 Patch Description
    4.18.3 MRS 2.0.1.2 Patch Description
    4.18.4 MRS 2.0.1.3 Patch Description
    4.18.5 MRS 2.1.0.1 Patch Description
    4.19 Log Management
    4.19.1 Viewing and Exporting Audit Logs
    4.19.2 Exporting Services Logs
    4.19.3 Configuring Audit Log Export Parameters
    4.20 Health Check Management
    4.20.1 Performing a Health Check
    4.20.2 Viewing and Exporting a Check Report
    4.20.3 DBService Health Check
    4.20.4 Flume Health Check


    4.20.5 HBase Health Check
    4.20.6 Host Health Check
    4.20.7 HDFS Health Check
    4.20.8 Hive Health Check
    4.20.9 Kafka Health Check
    4.20.10 KrbServer Health Check
    4.20.11 LdapServer Health Check
    4.20.12 Loader Health Check
    4.20.13 MapReduce Health Check
    4.20.14 OMS Health Check
    4.20.15 Spark Health Check
    4.20.16 Storm Health Check
    4.20.17 Yarn Health Check
    4.20.18 ZooKeeper Health Check
    4.21 Tenant Management
    4.21.1 Introduction
    4.21.2 Creating a Tenant
    4.21.3 Creating a Sub-tenant
    4.21.4 Deleting a Tenant
    4.21.5 Managing a Tenant Directory
    4.21.6 Recovering Tenant Data
    4.21.7 Creating a Resource Pool
    4.21.8 Modifying a Resource Pool
    4.21.9 Deleting a Resource Pool
    4.21.10 Configuring a Queue
    4.21.11 Configuring the Queue Capacity Policy of a Resource Pool
    4.21.12 Clearing the Configuration of a Queue
    4.22 Backup and Restoration
    4.22.1 Introduction
    4.22.2 Backing Up Metadata
    4.22.3 Recovering Metadata
    4.22.4 Modifying a Backup Task
    4.22.5 Viewing Backup and Recovery Tasks
    4.23 Security Management
    4.23.1 Default Users of Clusters with Kerberos Authentication Disabled
    4.23.2 Default Users of Clusters with Kerberos Authentication Enabled
    4.23.3 Changing the Password for an OS User
    4.23.4 Changing the Password for User admin
    4.23.5 Changing the Password for the Kerberos Administrator
    4.23.6 Changing the Password for the LDAP Administrator and the LDAP User (including OMS LDAP)
    4.23.7 Changing the Password for a Component Running User
    4.23.8 Changing the Password for the OMS Database Administrator


    4.23.9 Changing the Password for the Data Access User of the OMS Database
    4.23.10 Changing the Password for a Component Database User
    4.23.11 Replacing HA Certificates
    4.23.12 Updating the Key of a Cluster
    4.24 MRS Multi-User Permission Management
    4.24.1 Users and Permissions of Clusters with Kerberos Authentication Enabled
    4.24.2 Default Users of Clusters with Kerberos Authentication Enabled
    4.24.3 Creating a Role
    4.24.4 Creating a User Group
    4.24.5 Creating a User
    4.24.6 Modifying User Information
    4.24.7 Locking a User
    4.24.8 Unlocking a User
    4.24.9 Deleting a User
    4.24.10 Changing the Password of an Operation User
    4.24.11 Initializing the Password of a System User
    4.24.12 Downloading a User Authentication File
    4.24.13 Modifying a Password Policy
    4.24.14 Configuring Cross-Cluster Mutual Trust Relationships
    4.24.15 Configuring Users to Access Resources of a Trusted Cluster
    4.24.16 Configuring Fine-Grained Permissions for MRS Multi-User Access to OBS

    5 Managing Historical Clusters
    5.1 Viewing Basic Information About a Historical Cluster
    5.2 Viewing Job Configurations in a Historical Cluster

    6 Querying Operation Logs

    7 Managing Data Connections

    8 Connecting to Clusters
    8.1 Logging In to a Master Node
    8.1.1 Overview
    8.1.2 Logging In to an ECS
    8.1.3 Determining Active or Standby Management Nodes of MRS Manager
    8.2 Using an MRS Client
    8.2.1 Using an MRS Client on Nodes Inside a Cluster
    8.2.2 Using an MRS Client on Nodes Outside a Cluster
    8.2.3 Updating a Client
    8.3 Accessing Web Pages of Open Source Components Managed in MRS Clusters
    8.3.1 Web UIs of Open Source Components
    8.3.2 List of Open Source Component Ports
    8.3.3 EIP-based Access
    8.3.4 Access Using a Windows ECS
    8.3.5 Creating an SSH Channel to Connect an MRS Cluster and Configuring the Browser


    9 MRS Manager Operation Guide
    9.1 MRS Manager Introduction
    9.2 Accessing MRS Manager
    9.3 Accessing MRS Manager Supporting Kerberos Authentication
    9.4 Viewing Running Tasks in a Cluster
    9.5 Monitoring Management
    9.5.1 Dashboard
    9.5.2 Managing Service and Host Monitoring
    9.5.3 Managing Resource Distribution
    9.5.4 Configuring Monitoring Metric Dumping
    9.6 Alarm Management
    9.6.1 Viewing and Manually Clearing an Alarm
    9.6.2 Configuring an Alarm Threshold
    9.6.3 Configuring Syslog Northbound Interface
    9.6.4 Configuring SNMP Northbound Interface
    9.7 Alarm Reference
    9.7.1 ALM-12001 Audit Log Dump Failure
    9.7.2 ALM-12002 HA Resource Is Abnormal
    9.7.3 ALM-12004 OLdap Resource Is Abnormal
    9.7.4 ALM-12005 OKerberos Resource Is Abnormal
    9.7.5 ALM-12006 Node Fault
    9.7.6 ALM-12007 Process Fault
    9.7.7 ALM-12010 Manager Heartbeat Interruption Between the Active and Standby Nodes
    9.7.8 ALM-12011 Manager Data Synchronization Exception Between the Active and Standby Nodes
    9.7.9 ALM-12012 NTP Service Is Abnormal
    9.7.10 ALM-12016 CPU Usage Exceeds the Threshold
    9.7.11 ALM-12017 Insufficient Disk Capacity
    9.7.12 ALM-12018 Memory Usage Exceeds the Threshold
    9.7.13 ALM-12027 Host PID Usage Exceeds the Threshold
    9.7.14 ALM-12028 Number of Processes in the D State on the Host Exceeds the Threshold
    9.7.15 ALM-12031 User omm or Password Is About to Expire
    9.7.16 ALM-12032 User ommdba or Password Is About to Expire
    9.7.17 ALM-12033 Slow Disk Fault
    9.7.18 ALM-12034 Periodic Backup Failure
    9.7.19 ALM-12035 Unknown Data Status After Recovery Task Failure
    9.7.20 ALM-12037 NTP Server Is Abnormal
    9.7.21 ALM-12038 Monitoring Indicator Dump Failure
    9.7.22 ALM-12039 GaussDB Data Is Not Synchronized
    9.7.23 ALM-12040 Insufficient System Entropy
    9.7.24 ALM-13000 ZooKeeper Service Unavailable
    9.7.25 ALM-13001 Available ZooKeeper Connections Are Insufficient
    9.7.26 ALM-13002 ZooKeeper Heap Memory or Direct Memory Usage Exceeds the Threshold


    9.7.27 ALM-14000 HDFS Service Unavailable
    9.7.28 ALM-14001 HDFS Disk Usage Exceeds the Threshold
    9.7.29 ALM-14002 DataNode Disk Usage Exceeds the Threshold
    9.7.30 ALM-14003 Number of Lost HDFS Blocks Exceeds the Threshold
    9.7.31 ALM-14004 Number of Damaged HDFS Blocks Exceeds the Threshold
    9.7.32 ALM-14006 Number of HDFS Files Exceeds the Threshold
    9.7.33 ALM-14007 HDFS NameNode Memory Usage Exceeds the Threshold
    9.7.34 ALM-14008 HDFS DataNode Memory Usage Exceeds the Threshold
    9.7.35 ALM-14009 Number of Dead DataNodes Exceeds the Threshold
    9.7.36 ALM-14010 NameService Service Is Abnormal
    9.7.37 ALM-14011 HDFS DataNode Data Directory Is Not Configured Properly
    9.7.38 ALM-14012 HDFS JournalNode Data Is Not Synchronized
    9.7.39 ALM-16000 Percentage of Sessions Connected to the HiveServer to Maximum Number Allowed Exceeds the Threshold
    9.7.40 ALM-16001 Hive Warehouse Space Usage Exceeds the Threshold
    9.7.41 ALM-16002 Successful Hive SQL Operations Are Lower than the Threshold
    9.7.42 ALM-16004 Hive Service Unavailable
    9.7.43 ALM-18000 Yarn Service Unavailable
    9.7.44 ALM-18002 NodeManager Heartbeat Lost
    9.7.45 ALM-18003 NodeManager Unhealthy
    9.7.46 ALM-18006 MapReduce Job Execution Timeout
    9.7.47 ALM-19000 HBase Service Unavailable
    9.7.48 ALM-19006 HBase Replication Synchronization Failed
    9.7.49 ALM-25000 LdapServer Service Unavailable
    9.7.50 ALM-25004 Abnormal LdapServer Data Synchronization
    9.7.51 ALM-25500 KrbServer Service Unavailable
    9.7.52 ALM-27001 DBService Unavailable
    9.7.53 ALM-27003 DBService Heartbeat Interruption Between the Active and Standby Nodes
    9.7.54 ALM-27004 Data Inconsistency Between Active and Standby DBServices
    9.7.55 ALM-28001 Spark Service Unavailable
    9.7.56 ALM-26051 Storm Service Unavailable
    9.7.57 ALM-26052 Number of Available Supervisors in Storm Is Lower Than the Threshold
    9.7.58 ALM-26053 Slot Usage of Storm Exceeds the Threshold
    9.7.59 ALM-26054 Heap Memory Usage of Storm Nimbus Exceeds the Threshold
    9.7.60 ALM-38000 Kafka Service Unavailable
    9.7.61 ALM-38001 Insufficient Kafka Disk Space
    9.7.62 ALM-38002 Heap Memory Usage of Kafka Exceeds the Threshold
    9.7.63 ALM-24000 Flume Service Unavailable
    9.7.64 ALM-24001 Flume Agent Is Abnormal
    9.7.65 ALM-24003 Flume Client Connection Failure
    9.7.66 ALM-24004 Flume Fails to Read Data
    9.7.67 ALM-24005 Data Transmission by Flume Is Abnormal
    9.7.68 ALM-12041 Permission of Key Files Is Abnormal


    9.7.69 ALM-12042 Key File Configurations Are Abnormal
    9.7.70 ALM-23001 Loader Service Unavailable
    9.7.71 ALM-12357 Failed to Export Audit Logs to the OBS
    9.7.72 ALM-12014 Partition Lost
    9.7.73 ALM-12015 Partition Filesystem Readonly
    9.7.74 ALM-12043 DNS Resolution Duration Exceeds the Threshold
    9.7.75 ALM-12045 Network Read Packet Dropped Rate Exceeds the Threshold
    9.7.76 ALM-12046 Network Write Packet Dropped Rate Exceeds the Threshold
    9.7.77 ALM-12047 Network Read Packet Error Rate Exceeds the Threshold
    9.7.78 ALM-12048 Network Write Packet Error Rate Exceeds the Threshold
    9.7.79 ALM-12049 Network Read Throughput Rate Exceeds the Threshold
    9.7.80 ALM-12050 Network Write Throughput Rate Exceeds the Threshold
    9.7.81 ALM-12051 Disk Inode Usage Exceeds the Threshold
    9.7.82 ALM-12052 TCP Temporary Port Usage Exceeds the Threshold
    9.7.83 ALM-12053 File Handle Usage Exceeds the Threshold
    9.7.84 ALM-12054 The Certificate File Is Invalid
    9.7.85 ALM-12055 The Certificate File Is About to Expire
    9.7.86 ALM-18008 Heap Memory Usage of Yarn ResourceManager Exceeds the Threshold
    9.7.87 ALM-18009 Heap Memory Usage of MapReduce JobHistoryServer Exceeds the Threshold
    9.7.88 ALM-20002 Hue Service Unavailable
    9.7.89 ALM-43001 Spark Service Unavailable
    9.7.90 ALM-43006 Heap Memory Usage of the JobHistory Process Exceeds the Threshold
    9.7.91 ALM-43007 Non-Heap Memory Usage of the JobHistory Process Exceeds the Threshold
    9.7.92 ALM-43008 Direct Memory Usage of the JobHistory Process Exceeds the Threshold
    9.7.93 ALM-43009 JobHistory GC Time Exceeds the Threshold
    9.7.94 ALM-43010 Heap Memory Usage of the JDBCServer Process Exceeds the Threshold
    9.7.95 ALM-43011 Non-Heap Memory Usage of the JDBCServer Process Exceeds the Threshold
    9.7.96 ALM-43012 Direct Memory Usage of the JDBCServer Process Exceeds the Threshold
    9.7.97 ALM-43013 JDBCServer GC Time Exceeds the Threshold
    9.8 Object Management
    9.8.1 Introduction
    9.8.2 Querying Configurations
    9.8.3 Managing Services
    9.8.4 Configuring Service Parameters
    9.8.5 Configuring Customized Service Parameters
    9.8.6 Synchronizing Service Configurations
    9.8.7 Managing Role Instances
    9.8.8 Configuring Role Instance Parameters
    9.8.9 Synchronizing Role Instance Configuration
    9.8.10 Decommissioning and Recommissioning Role Instances
    9.8.11 Managing a Host
    9.8.12 Isolating a Host

  • 9.8.13 Canceling Isolation of a Host
    9.8.14 Starting and Stopping a Cluster
    9.8.15 Synchronizing Cluster Configurations
    9.8.16 Exporting Configuration Data of a Cluster
    9.9 Log Management
    9.9.1 Viewing and Exporting Audit Logs
    9.9.2 Exporting Services Logs
    9.9.3 Configuring Audit Log Export Parameters
    9.10 Health Check Management
    9.10.1 Performing a Health Check
    9.10.2 Viewing and Exporting a Check Report
    9.10.3 Configuring the Number of Health Check Reports to Be Reserved
    9.10.4 Managing Health Check Reports
    9.10.5 DBService Health Check
    9.10.6 Flume Health Check
    9.10.7 HBase Health Check
    9.10.8 Host Health Check
    9.10.9 HDFS Health Check
    9.10.10 Hive Health Check
    9.10.11 Kafka Health Check
    9.10.12 KrbServer Health Check
    9.10.13 LdapServer Health Check
    9.10.14 Loader Health Check
    9.10.15 MapReduce Health Check
    9.10.16 OMS Health Check
    9.10.17 Spark Health Check
    9.10.18 Storm Health Check
    9.10.19 Yarn Health Check
    9.10.20 ZooKeeper Health Check
    9.11 Static Service Pool Management
    9.11.1 Viewing the Status of a Static Service Pool
    9.11.2 Configuring a Static Service Pool
    9.12 Tenant Management
    9.12.1 Introduction
    9.12.2 Creating a Tenant
    9.12.3 Creating a Sub-tenant
    9.12.4 Deleting a Tenant
    9.12.5 Managing a Tenant Directory
    9.12.6 Recovering Tenant Data
    9.12.7 Creating a Resource Pool
    9.12.8 Modifying a Resource Pool
    9.12.9 Deleting a Resource Pool

  • 9.12.10 Configuring a Queue
    9.12.11 Configuring the Queue Capacity Policy of a Resource Pool
    9.12.12 Clearing the Configuration of a Queue
    9.13 Backup and Restoration
    9.13.1 Introduction
    9.13.2 Backing Up Metadata
    9.13.3 Recovering Metadata
    9.13.4 Modifying a Backup Task
    9.13.5 Viewing Backup and Recovery Tasks
    9.14 Security Management
    9.14.1 Default Users of Clusters with Kerberos Authentication Disabled
    9.14.2 Changing the Password for an OS User
    9.14.3 Changing the Password for User admin
    9.14.4 Changing the Password for the Kerberos Administrator
    9.14.5 Changing the Password for the LDAP Administrator and the LDAP User (including OMS LDAP)
    9.14.6 Changing the Password for a Component Running User
    9.14.7 Changing the Password for the OMS Database Administrator
    9.14.8 Changing the Password for the Data Access User of the OMS Database
    9.14.9 Changing the Password for a Component Database User
    9.14.10 Replacing HA Certificates
    9.14.11 Updating the Key of a Cluster
    9.15 Permission Management
    9.15.1 Creating a Role
    9.15.2 Creating a User Group
    9.15.3 Creating a User
    9.15.4 Modifying User Information
    9.15.5 Locking a User
    9.15.6 Unlocking a User
    9.15.7 Deleting a User
    9.15.8 Changing the Password of an Operation User
    9.15.9 Initializing the Password of a System User
    9.15.10 Downloading a User Authentication File
    9.15.11 Modifying a Password Policy
    9.16 Patch Operation Guide
    9.16.1 Patch Operation Guide
    9.16.2 Supporting Rolling Patches
    9.17 Restoring Patches for the Isolated Hosts
    9.18 Rolling Restart

    10 Data Migration
    10.1 Making Preparations
    10.2 Exporting Metadata
    10.3 Copying Data

  • 10.4 Restoring Data

    11 Data Backup and Restoration
    11.1 HDFS Data
    11.2 Hive Metadata
    11.3 Hive Data
    11.4 HBase Data
    11.5 Kafka Data

    12 MRS Cluster Component Operation Guide
    12.1 Using Hadoop from Scratch
    12.2 Using Spark SQL from Scratch
    12.3 Using Spark
    12.3.1 Using Spark from Scratch
    12.3.2 Accessing the Spark Web UI
    12.3.3 Interconnecting Spark with OpenTSDB
    12.3.3.1 Creating a Table and Associating It with OpenTSDB
    12.3.3.2 Inserting Data to the OpenTSDB Table
    12.3.3.3 Querying an OpenTSDB Table
    12.3.3.4 Modifying the Default Configuration Data
    12.4 Using Hive
    12.4.1 Using Hive from Scratch
    12.4.2 Configuring Hive Parameters
    12.4.3 Configuring Hive Permissions
    12.5 Using HBase
    12.5.1 Using HBase from Scratch
    12.5.2 Configuring the HBase Replication Function
    12.5.3 Configuring HBase Parameters
    12.5.4 Enabling the Cross-Cluster Copy Function
    12.5.5 Using the ReplicationSyncUp Tool
    12.5.6 Using HIndex
    12.5.6.1 Introduction to HIndex
    12.5.6.2 Loading Index Data in Batches
    12.5.6.3 Using an Index Generation Tool
    12.5.6.4 Migrating Index Data
    12.6 Using Hue
    12.6.1 Accessing the Hue Web UI
    12.6.2 Using HiveQL Editor on the Hue Web UI
    12.6.3 Using the Metadata Browser on the Hue Web UI
    12.6.4 Using File Browser on the Hue Web UI
    12.6.5 Using Job Browser on the Hue Web UI
    12.7 Using Kafka
    12.7.1 Managing Kafka Topics
    12.7.2 Querying Kafka Topics

  • 12.7.3 Managing Kafka User Permission
    12.7.4 Managing Messages in Kafka Topics
    12.7.5 Synchronizing Binlog-based MySQL Data to the MRS Cluster
    12.8 Using Storm
    12.8.1 Submitting Storm Topologies on the Client
    12.8.2 Accessing the Storm Web UI
    12.8.3 Managing Storm Topologies
    12.8.4 Querying Storm Topology Logs
    12.9 Using CarbonData
    12.9.1 Getting Started with CarbonData
    12.9.2 About CarbonData Table
    12.9.3 Creating a CarbonData Table
    12.9.4 Deleting a CarbonData Table
    12.10 Using Flume
    12.10.1 Introduction....................................