Optical Character Recognition FAQs Issue 01 Date 2019-09-05 HUAWEI TECHNOLOGIES CO., LTD.


Optical Character Recognition

FAQs

Issue 01

Date 2019-09-05

HUAWEI TECHNOLOGIES CO., LTD.


Copyright © Huawei Technologies Co., Ltd. 2020. All rights reserved.

No part of this document may be reproduced or transmitted in any form or by any means without prior written consent of Huawei Technologies Co., Ltd.

Trademarks and Permissions

The Huawei logo and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd. All other trademarks and trade names mentioned in this document are the property of their respective holders.

Notice

The purchased products, services and features are stipulated by the contract made between Huawei and the customer. All or part of the products, services and features described in this document may not be within the purchase scope or the usage scope. Unless otherwise specified in the contract, all statements, information, and recommendations in this document are provided "AS IS" without warranties, guarantees or representations of any kind, either express or implied.

The information in this document is subject to change without notice. Every effort has been made in the preparation of this document to ensure accuracy of the contents, but all statements, information, and recommendations in this document do not constitute a warranty of any kind, express or implied.

Huawei Technologies Co., Ltd.

Address: Huawei Industrial Base
Bantian, Longgang
Shenzhen 518129
People's Republic of China

Website: https://www.huawei.com

Email: [email protected]

Issue 01 (2019-09-05) Copyright © Huawei Technologies Co., Ltd. i


Contents

1 Common
1.1 Can the OCR Result Be Converted into Word or TXT Files?
1.2 Can OCR Export Recognition Results Right After Images Are Uploaded?
1.3 Can OCR Recognize Text Files?
1.4 How Do I Disable a Service?
1.5 What Are the Username, Domain Name, and Project Name in the Token Message Body?

2 Billing
2.1 Can a Package Be Refunded After Being Purchased?
2.2 How Does a Member Account Use a Package That Is Subscribed to by an Enterprise Master Account?

3 SDK
3.1 What SDK Versions Does OCR Provide?
3.2 Does the OCR SDK Need to Be Purchased?
3.3 Does the OCR SDK Need Maven to Manage Dependency Packages?

4 API
4.1 Why Is the Actual Number of API Calls Inconsistent with the Record Displayed on the Management Console?
4.2 Why Is the Result of OCR Inaccurate?
4.3 Why Is Status Code 401 Returned After a Token Is Obtained?
4.4 What Should I Do When I Fail to Invoke an OCR API?
4.5 What Do I Do If the Token Fails to Be Obtained When Postman Is Used to Call the OCR API?
4.6 How Do I Obtain the Base64 Code of an Image?
4.7 How Do I Check the Number of OCR Calls?
4.8 Can OCR Recognize Characters in Video Streams in Real Time?
4.9 How Is the Concurrency Capability of OCR?
4.10 How Do I Use OCR APIs?
4.11 What VAT Invoices Can Be Recognized by VAT Invoice OCR?

5 Error Code
5.1 Why Is a Message Stating "APIG.0301" Displayed When the OCR API Is Called?
5.2 Why Is a Message Stating "ModelArts.4603" or "ModelArts.4704" Displayed When the OCR API Is Called?
5.3 Why Is a Message Stating "APIG.0201" Displayed When the OCR API Is Called?
5.4 Why Is a Message Stating "ModelArts.4204" Displayed When the OCR API Is Called?
5.5 Why Is a Message Stating "APIG.0308" Displayed When the OCR API Is Called?

6 Deployment
6.1 Can OCR Be Deployed in Customers' Equipment Rooms?

7 Data Security
7.1 How Does OCR Protect Data Security and Privacy?

8 Regions and AZs
8.1 What Are Regions and AZs?
8.2 How Do I Select a Region for an OCR Package?


1 Common

1.1 Can the OCR Result Be Converted into Word or TXT Files?

OCR returns the recognition result in JSON format. To obtain a Word or TXT file, write code that parses the JSON result and saves the recognized text in the required format.
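The conversion described above can be sketched in Python as follows. This is a minimal illustration, not the service's official tooling: the response layout (a `result` object that contains a `words_block_list` array whose items carry a `words` field) is an assumption made for this example, and the function names are hypothetical. Check the actual response fields in the Optical Character Recognition API Reference.

```python
import json
from pathlib import Path


def extract_text_lines(ocr_result: dict) -> list[str]:
    """Collect recognized text lines from a parsed OCR response.

    Assumed (hypothetical) layout:
    {"result": {"words_block_list": [{"words": "..."}, ...]}}
    """
    blocks = ocr_result.get("result", {}).get("words_block_list", [])
    return [block["words"] for block in blocks if "words" in block]


def save_as_txt(ocr_json: str, path: str) -> int:
    """Parse the JSON string, write one text block per line, return the line count."""
    lines = extract_text_lines(json.loads(ocr_json))
    Path(path).write_text("\n".join(lines), encoding="utf-8")
    return len(lines)
```

Saving to Word format would require an additional document library and is not shown here.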

1.2 Can OCR Export Recognition Results Right After Images Are Uploaded?

After uploading images, you can export recognition results by calling the service API. For details, see the Optical Character Recognition Getting Started.

1.3 Can OCR Recognize Text Files?

OCR detects and recognizes characters in images. It does not recognize Word, PDF, or Excel files.

1.4 How Do I Disable a Service?

OCR cannot be disabled once it has been enabled. By default, an enabled service is billed on a pay-per-use basis. No fees are deducted if you do not call the service.

1.5 What Are the Username, Domain Name, and Project Name in the Token Message Body?

Username indicates the name of the user, and Domain Name indicates the name of the account to which the user belongs. If the token is obtained by an account, the username and domain name are the same. If the token is obtained by an IAM user (multiple IAM users can be created under an account), the username is the actual name of the IAM user and is different from the domain name.


The project name can be set to cn-north-4. For details about how to obtain a project name, see Obtaining the Username, User ID, Project Name, and Project ID.
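To show where the three values go, the following Python sketch assembles a token request body. The nested layout (password authentication scoped to a project) is an assumption based on the common Keystone-v3-style token request and is not quoted from this FAQ; the function name is hypothetical. Confirm the exact body in the Optical Character Recognition API Reference before use.

```python
def build_token_request_body(username, password, domain_name, project_name="cn-north-4"):
    """Return a token request body as a dict (assumed Keystone-v3-style layout)."""
    return {
        "auth": {
            "identity": {
                "methods": ["password"],
                "password": {
                    "user": {
                        "name": username,                 # Username
                        "password": password,
                        "domain": {"name": domain_name},  # Domain Name (account name)
                    }
                },
            },
            "scope": {"project": {"name": project_name}},  # Project Name
        }
    }
```

For a token obtained by the account itself, pass the same value as `username` and `domain_name`. For an IAM user, pass the IAM user's name as `username` and the owning account's name as `domain_name`.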


2 Billing

2.1 Can a Package Be Refunded After Being Purchased?

The package cannot be refunded after being purchased.

2.2 How Does a Member Account Use a Package That Is Subscribed to by an Enterprise Master Account?

The enterprise master account and its member accounts are more accurately called the account and the IAM users. The account pays for and owns the resources and has full access permissions for these resources. IAM users are created by the account and only have the permissions granted by the account. The account can modify or cancel the IAM users' permissions at any time. Fees generated by IAM users are paid by the account. To let a member use a package, use the account to create an IAM user and assign permissions to that IAM user. After logging in to the system, IAM users can view and use the resources authorized by the account.


3 SDK

3.1 What SDK Versions Does OCR Provide?

OCR provides SDKs for several languages, including Java, Python, iOS, Android, and Node.js. You can use other programming languages to call OCR APIs through token authentication. For details about how to call APIs through token authentication, see the Optical Character Recognition API Reference.

3.2 Does the OCR SDK Need to Be Purchased?

OCR SDKs are provided for users to download and use free of charge. OCR is charged based on the number of API calls.

3.3 Does the OCR SDK Need Maven to Manage Dependency Packages?

● Some packages on which the Python SDK depends, such as requests, need to be installed in the local environment. If the local environment can access the Internet, you can run pip install followed by the package name to install the packages.

● The Java SDK does not need Maven to manage local dependency packages and can be used directly.
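Before running the Python SDK, you may want to check which dependencies are missing. The sketch below uses only the Python standard library. The package list passed in is an example (only requests is named in this FAQ), and the helper names are hypothetical.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be imported locally."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def pip_install_command(names):
    """Build the pip command line for the given package names."""
    return "pip install " + " ".join(names)


if __name__ == "__main__":
    missing = missing_packages(["requests"])
    if missing:
        print("Run:", pip_install_command(missing))
    else:
        print("All listed dependencies are installed.")
```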


4 API

4.1 Why Is the Actual Number of API Calls Inconsistent with the Record Displayed on the Management Console?

The OCR console only records the number of successful API calls. The number of failed API calls is not recorded.

To view the number of failed calls, perform the following operations:

1. Log in to the management console.
2. On the Console page, choose Optical Character Recognition. The Optical Character Recognition management console is displayed.
3. Click the target service, for example, Auto Classification OCR. Click View Monitoring Graph to go to the Cloud Eye console and view detailed service usage such as the number of successful or failed API calls.

4.2 Why Is the Result of OCR Inaccurate?

The OCR result may be inaccurate due to the following reasons:

1. The image format is not supported.
   Only images in PNG, JPEG, BMP, or TIFF format can be recognized.
2. The image is too small to recognize.
   For details about constraints on the image size of each service, see section "Constraints."
3. The image quality is poor. For example, the image is too dark to recognize.
4. The image type is incorrect.
   Only special VAT invoices and plain VAT invoices (including electronic invoices) can be recognized. Volume invoices and toll invoices are not supported.
5. If an error code is returned, locate the fault according to the error code.


For details about error codes, see the Optical Character Recognition API Reference.
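The format check in reason 1 can be done locally before an image is submitted. The following Python sketch identifies the four supported formats by their well-known file signatures (leading bytes). It is an illustrative pre-check written for this FAQ, not part of the OCR SDK, and the function name is hypothetical.

```python
# Leading bytes ("magic numbers") of the image formats listed above.
SIGNATURES = {
    "PNG": (b"\x89PNG\r\n\x1a\n",),
    "JPEG": (b"\xff\xd8\xff",),
    "BMP": (b"BM",),
    "TIFF": (b"II*\x00", b"MM\x00*"),  # little-endian and big-endian variants
}


def detect_image_format(data: bytes):
    """Return "PNG", "JPEG", "BMP", or "TIFF", or None for any other content."""
    for name, prefixes in SIGNATURES.items():
        if data.startswith(prefixes):
            return name
    return None
```

A result of None suggests that the file should be converted to a supported format before OCR is called.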

4.3 Why Is Status Code 401 Returned After a Token Is Obtained?

If status code 401 is returned when OCR is called in token mode, the token has expired. The validity period of a token is 24 hours. You are advised to implement a mechanism that refreshes the token before it expires.

A retry mechanism has been configured in the OCR SDK to update the token. If the token is invalid and status code 401 is returned, the OCR SDK sends a request to obtain a token again.

For details about how to use the Python programming language to obtain a token again when the existing one has expired, see the HWOcrClientToken.py code in the SDK (Python) file, as shown in the following figure.
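If you call the API without the SDK, a refresh mechanism along these lines may help. This is a minimal sketch and not the HWOcrClientToken.py code: the class and function names are hypothetical, the 30-minute refresh margin is an assumption, and `fetch_token` and `send_request` are callables that you supply (the first obtains a new token, the second sends one request with a given token and returns a status code and body).

```python
import time

TOKEN_LIFETIME_SECONDS = 24 * 60 * 60  # validity period stated in this FAQ
REFRESH_MARGIN_SECONDS = 30 * 60       # assumption: refresh 30 minutes early


class TokenCache:
    """Cache a token and fetch a new one shortly before it expires."""

    def __init__(self, fetch_token, clock=time.time):
        self._fetch_token = fetch_token  # caller-supplied callable returning a token
        self._clock = clock
        self._token = None
        self._obtained_at = 0.0

    def get(self, force_refresh=False):
        age = self._clock() - self._obtained_at
        too_old = age >= TOKEN_LIFETIME_SECONDS - REFRESH_MARGIN_SECONDS
        if force_refresh or self._token is None or too_old:
            self._token = self._fetch_token()
            self._obtained_at = self._clock()
        return self._token


def call_with_token(cache, send_request):
    """Call send_request(token) -> (status_code, body); retry once on 401."""
    status, body = send_request(cache.get())
    if status == 401:
        status, body = send_request(cache.get(force_refresh=True))
    return status, body
```

The single retry on 401 covers tokens that become invalid earlier than expected, while the age check avoids most 401 responses in the first place.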

4.4 What Should I Do When I Fail to Invoke an OCR API?

Fault Locating

1. Locate the fault according to the returned result or error code.
2. Check whether the API has been subscribed to.
3. Check whether the AK and SK have been successfully obtained.
4. Check whether the token has been correctly entered or expired.
5. Check whether the API has been correctly invoked.

If you cannot determine the cause and rectify the fault, contact us.


4.5 What Do I Do If the Token Fails to Be Obtained When Postman Is Used to Call the OCR API?

Check the following items:

● Check whether the service region in the URI is correct.
● Check whether the service region in the body and the corresponding key value are correct.
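The two checks above amount to confirming that the region in the URI and the region in the body agree. The Python sketch below illustrates one way to compare them. It rests on two assumptions that are not stated in this FAQ: that the host name has the form service.region.domain (so the region is the second label), and that the body stores the region under auth, scope, project, name. The host used in the usage example is a placeholder.

```python
from urllib.parse import urlparse


def region_from_uri(uri):
    """Return the second host label, assumed to be the region, or None."""
    host = urlparse(uri).hostname or ""
    labels = host.split(".")
    return labels[1] if len(labels) > 2 else None


def regions_match(uri, body):
    """Check that the region in the URI equals the project name in the body."""
    body_region = (
        body.get("auth", {}).get("scope", {}).get("project", {}).get("name")
    )
    uri_region = region_from_uri(uri)
    return uri_region is not None and uri_region == body_region
```

For example, `regions_match("https://iam.cn-north-4.example.com/v3/auth/tokens", body)` is true only when the body's project name is also cn-north-4.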

4.6 How Do I Obtain the Base64 Code of an Image?

The input image parameter of OCR is the Base64 code of the image. This section describes how to use the Google Chrome browser to obtain the Base64 code of an image.

1. Open the Google Chrome browser and drag an image file to the browser. The image is displayed in the browser.

2. Press F12, click Sources, and select the image file from the navigation tree on the left. The Base64 code of the image is displayed on the right, as shown in red frame 3 in the following figure.

3. Double-click the Base64 code of the selected image and press Ctrl+C to copy the Base64 code. Do not right-click the image to copy it.

Figure 4-1 Base64-encoded image
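As an alternative to the browser, the Base64 code can be produced with a few lines of Python using only the standard library. The function name and the file name in the usage note are examples.

```python
import base64
from pathlib import Path


def image_to_base64(path):
    """Read an image file and return its content as a Base64 text string."""
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")
```

For example, `image_to_base64("invoice.png")` returns a string that can be used as the input image parameter.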

4.7 How Do I Check the Number of OCR Calls?

Log in to the OCR console. See Figure 4-2.


Figure 4-2 Console page of ID Card OCR

4.8 Can OCR Recognize Characters in Video Streams in Real Time?

OCR is mainly used for image processing. To recognize text in video streams, extract frame images and submit them to OCR, or use VCR to extract text information from videos.

4.9 How Is the Concurrency Capability of OCR?

For details about the concurrency capability of OCR, contact us.

4.10 How Do I Use OCR APIs?

You can send a request based on constructed request messages using any of the following three methods:

● cURL

cURL is a command-line tool used to perform URL operations and transfer files. It serves as an HTTP client that can send HTTP requests to an HTTP server and receive response messages. cURL is suitable for use in API tuning scenarios. For more information about cURL, visit https://curl.haxx.se/.

● Code

You can call APIs through code to assemble, send, and process request messages.

● REST client

Both Mozilla Firefox and Google Chrome provide a graphical browser plug-in, that is, a REST client, to send and process requests. Postman is one such client. To download Postman, visit https://www.getpostman.com/.
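The "Code" method can be sketched in Python with the standard library as follows. The example only assembles the request so that it can be inspected before sending. The `X-Auth-Token` header name and the `image` body field are assumptions for token-based calls, and the endpoint URL must be taken from the Optical Character Recognition API Reference; none of these values are quoted from this FAQ.

```python
import json
import urllib.request


def build_ocr_request(endpoint_url, token, image_base64):
    """Assemble a POST request for an OCR API without sending it.

    Assumptions: the token is passed in an X-Auth-Token header and the
    Base64 image in an "image" field of a JSON body.
    """
    body = json.dumps({"image": image_base64}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        headers={"Content-Type": "application/json", "X-Auth-Token": token},
        method="POST",
    )


# To send the request and read the JSON result:
#     with urllib.request.urlopen(request) as response:
#         result = json.loads(response.read())
```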

4.11 What VAT Invoices Can Be Recognized by VAT Invoice OCR?

Currently, the API supports special and plain VAT invoices (including plain electronic VAT invoices). Volume invoices and toll invoices, as well as fields such as the invoice remarks, supervision seal, special seal, and serial number of the voucher form, will be supported soon.


5 Error Code

5.1 Why Is a Message Stating "APIG.0301" Displayed When the OCR API Is Called?

If an error message and error code are returned when an API is called:

1. "error_msg":"Incorrect IAM authentication information: decrypt token fail","error_code":"APIG.0301": indicates that the token fails to be decrypted. Check whether the token is complete, whether it has expired, whether the region where the token is obtained is the same as the region where the service is invoked, and whether the account is restricted due to arrears.

2. "error_msg":"Incorrect IAM authentication information: verify aksk signature fail","error_code":"APIG.0301": indicates that the AK/SK authentication fails. Check whether the AK/SK is correct and whether the account is restricted due to arrears.
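The two cases above can be told apart in code by inspecting `error_msg`. The following Python sketch returns the matching checklist. It is an illustration only: the function name is hypothetical, and the comparison ignores spaces so that minor spacing differences in the message do not matter.

```python
import json

TOKEN_CHECKS = [
    "Check whether the token is complete.",
    "Check whether the token has expired.",
    "Check whether the token region matches the service region.",
    "Check whether the account is restricted due to arrears.",
]
AKSK_CHECKS = [
    "Check whether the AK/SK is correct.",
    "Check whether the account is restricted due to arrears.",
]


def diagnose_apig_0301(response_text):
    """Return a checklist for an APIG.0301 error body, or [] for other responses."""
    error = json.loads(response_text)
    if error.get("error_code") != "APIG.0301":
        return []
    message = error.get("error_msg", "").replace(" ", "")
    if "decrypttokenfail" in message:
        return TOKEN_CHECKS
    if "verifyaksksignaturefail" in message:
        return AKSK_CHECKS
    return ["Unrecognized APIG.0301 message; see the API Reference."]
```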

5.2 Why Is a Message Stating "ModelArts.4603" or "ModelArts.4704" Displayed When the OCR API Is Called?

If an error message and error code are returned when an API is called:

1. "error_code":"ModelArts.4603","error_msg":"Obtaining the file from the URL failed. connect timed out.": indicates that the image data fails to be obtained from the URL. Ensure that the URL supports the HTTP/HTTPS request protocol.

2. "error_code":"ModelArts.4704","error_msg":"Obtaining the file from the OBS failed. ": indicates that the image data fails to be obtained from OBS. Ensure that the OBS bucket for storing images is a public bucket.
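Before passing an image URL to OCR, you can check locally that it uses the HTTP or HTTPS protocol. The Python sketch below performs only this syntactic check with the standard library. It does not verify that the URL is reachable or that an OBS bucket is public, and the function name is hypothetical.

```python
from urllib.parse import urlparse


def is_http_image_url(url):
    """Return True if the URL has an http or https scheme and a host part."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```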


5.3 Why Is a Message Stating "APIG.0201" Displayed When the OCR API Is Called?

If "error_msg":"Backend timeout.","error_code":"APIG.0201" is returned when an API is called, the request has timed out.

You can try to rectify the fault with the following solution:

Manually call an API using a tool such as Postman. If the API is successfully called:

1. Check whether the original API call requests are too frequent. Check the return value in your code and add a retry mechanism. If a concurrency error occurs, retry the request after a short period of time (for example, 2 to 5 seconds). You can also check the result of the previous request at the backend and send the next request only after the previous request has returned, to avoid sending requests too frequently.

2. Check whether the image is too large or the network delay is too long. If the image is too large, compress the image proportionally while preserving image definition. If the network delay is long, improve the network transmission speed.

If you cannot rectify the fault, contact us.
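The retry mechanism mentioned in step 1 can be sketched in Python as follows. This is a minimal example under stated assumptions: `send_request` is a callable that you supply and that returns the parsed JSON response as a dict, a response containing one of the listed `error_code` values is treated as retryable, and the default of three attempts is arbitrary. The function name is hypothetical.

```python
import random
import time


def call_with_retry(send_request, max_attempts=3, sleep=time.sleep,
                    retryable_codes=("APIG.0201", "APIG.0308")):
    """Call send_request() and retry after 2 to 5 seconds on retryable errors.

    Requests are sent one at a time, so the next request starts only
    after the previous one has returned.
    """
    result = {}
    for attempt in range(1, max_attempts + 1):
        result = send_request()
        if result.get("error_code") not in retryable_codes:
            return result  # success, or an error that a retry will not fix
        if attempt < max_attempts:
            sleep(random.uniform(2.0, 5.0))
    return result  # last retryable error after all attempts are used
```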

5.4 Why Is a Message Stating "ModelArts.4204" Displayed When the OCR API Is Called?

If "ModelArts.4204: Request api error! Have not subscribed this api" is returned when an API is called, the corresponding service is not enabled. You need to enable the service. For details about how to enable the service, see Applying for a Service.

If the service has been enabled, check whether the region (or account) where the service is enabled is the same as the region (or account) where the service is called. If they are the same, check whether the URL of the API is spelled correctly.

5.5 Why Is a Message Stating "APIG.0308" Displayed When the OCR API Is Called?

If "error_msg":"The throttling threshold has been reached: policy user over ratelimit,limit:XX,time:1 minute","error_code":"APIG.0308" is returned when a service is called, the maximum API call concurrency has been reached. Each service has a maximum API call concurrency. For example, the maximum number of concurrent calls of a service is XX calls per minute.

You can use the following methods to solve the problem:

1. Check the return value in your code and add a retry mechanism. If a concurrency error occurs, retry the request after a short period of time (for example, 2 to 5 seconds).


2. Check the result of the previous request at the backend and send the next request only after the previous request has returned, to avoid sending requests too frequently.

If you need higher concurrency, contact us.
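One way to stay under a per-minute limit is to space calls evenly on the client side. The Python sketch below is an illustration, not an SDK feature: the class name is hypothetical, and `calls_per_minute` must be set to the limit that applies to your service (shown as XX in the error message).

```python
import time


class MinIntervalPacer:
    """Space calls so that at most calls_per_minute are started per minute."""

    def __init__(self, calls_per_minute, clock=time.monotonic, sleep=time.sleep):
        self._interval = 60.0 / calls_per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest start time of the next call

    def wait(self):
        """Block until the next call is allowed, then reserve the next slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self._interval
```

Call `pacer.wait()` immediately before each API request. Combined with sending requests one at a time, this keeps the request rate at or below the configured value.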


6 Deployment

6.1 Can OCR Be Deployed in Customers' Equipment Rooms?

Yes, OCR can be deployed in the edge-cloud synergy mode. For details, contact us.


7 Data Security

7.1 How Does OCR Protect Data Security and Privacy?

With trustworthiness as the core quality concept, OCR provides you with leading-edge, future-ready, and trustworthy cloud services by meeting the requirements on security, compliance, privacy, resilience, and transparency. For details about the statement, see the Privacy Statement and Site Terms. For trusted resources, see the White Papers.


8 Regions and AZs

8.1 What Are Regions and AZs?

Concept

A region and availability zone (AZ) identify the location of a data center. You can create resources in a specific region and AZ.

● Regions are divided based on geographical location and network latency. Public services, such as Elastic Cloud Server (ECS), Elastic Volume Service (EVS), Object Storage Service (OBS), Virtual Private Cloud (VPC), Elastic IP (EIP), and Image Management Service (IMS), are shared within the same region. Regions are classified as universal regions and dedicated regions. A universal region provides universal cloud services for common tenants. A dedicated region provides specific services for specific tenants.

● An AZ contains one or more physical data centers. Each AZ has independent cooling, fire extinguishing, moisture-proof, and electricity facilities. Within an AZ, compute, network, storage, and other resources are logically divided into multiple clusters. AZs within a region are interconnected using high-speed optical fibers to allow you to build cross-AZ high-availability systems.

Figure 8-1 shows the relationship between regions and AZs.

Figure 8-1 Regions and AZs


HUAWEI CLOUD provides services in many regions around the world. You can select a region and AZ as needed.

How to Select a Region?

When selecting a region, consider the following factors:

● Location

You are advised to select a region close to you or your target users. This reduces the network latency and improves the access speed. However, Chinese mainland regions provide basically the same infrastructure, BGP network quality, as well as operations and configurations on resources. Therefore, if you or your target users are in the Chinese mainland, you do not need to consider the network latency differences when selecting a region.

Regions outside the Chinese mainland, such as Bangkok and Hong Kong, provide services for users outside the Chinese mainland. If you or your target users are in the Chinese mainland, these regions are not recommended due to high access latency.

– If you or your target users are in Asia Pacific outside the Chinese mainland, select the AP-Hong Kong, AP-Bangkok, or AP-Singapore region.

– If you or your target users are in Africa, select the AF-Johannesburg region.

– If you or your target users are in Europe, select the EU-Paris region.

● Relationship between cloud services

When using multiple cloud services, pay attention to the following restrictions:

– ECSs, RDS instances, and OBS buckets in different regions cannot communicate with each other through an internal network.

– ECSs in different regions cannot be bound to the same load balancer.

● Resource price

Resource prices may vary in different regions. For details, see Product Pricing Details.

How to Select an AZ?

When determining whether to deploy resources in the same AZ, consider your application's requirements on disaster recovery (DR) and network latency.

● For high DR capability, deploy resources in different AZs in the same region.

● For low network latency, deploy resources in the same AZ.

Regions and Endpoints

Before using an API to call resources, specify its region and endpoint. For more details, see Regions and Endpoints.


8.2 How Do I Select a Region for an OCR Package?

Resource packages in different regions are isolated. Select a region according to your business requirements. For details about the regions where services are deployed, see Endpoints.

Determine the service region before purchasing a service package.
