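Tencent Cloud OCR API Tutorial

This tutorial explains how to call the Tencent Cloud OCR API from six programming languages: Java, .NET, C++, Node.js, Python, and Go. Each language section runs from environment setup to a worked case and comes with sample code and a demo project.

You can download the sample code from this repository: https://github.com/ShengjieJin/tencentcloudOCRDemo

Preparation

Before you start using the Tencent Cloud OCR API, complete the following steps:

Activate the OCR service: register a Tencent Cloud account and complete real-name verification, open the OCR console, read the OCR Service Terms (《文字识别服务条款》), tick the agreement box, and click Activate Now (立即开通).
Get your API key: click View Key (查看密钥) to open the API Key Management page of the console, where your keys are listed. New users can click 【新建密钥】 (Create Key) to create one.
Prepare a development environment: install the toolchain for the language you plan to use. Any one of Java, .NET, C++, Node.js, Python, or Go is enough.
Understand the output format: see Tencent Cloud API Explorer for the response fields.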
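
The examples in this tutorial paste the SecretId and SecretKey straight into source code, which is easy to leak when the code is shared. As a minimal sketch of one alternative, the Python helper below reads the key pair from environment variables instead. The function name load_credentials and the variable names TENCENTCLOUD_SECRET_ID and TENCENTCLOUD_SECRET_KEY are assumptions made for this illustration; use whatever names your deployment defines.

```python
import os


def load_credentials(env=None):
    """Return (secret_id, secret_key) read from environment variables.

    The variable names are an assumption of this sketch; rename them to
    match your own setup.
    """
    env = os.environ if env is None else env
    secret_id = env.get("TENCENTCLOUD_SECRET_ID", "")
    secret_key = env.get("TENCENTCLOUD_SECRET_KEY", "")
    if not secret_id or not secret_key:
        raise RuntimeError(
            "Set TENCENTCLOUD_SECRET_ID and TENCENTCLOUD_SECRET_KEY first"
        )
    return secret_id, secret_key


if __name__ == "__main__":
    try:
        secret_id, _ = load_credentials()
        # Print only a short prefix so the key never lands in logs.
        print("Loaded SecretId starting with", secret_id[:4])
    except RuntimeError as exc:
        print(exc)
```

The returned pair can then be passed wherever the later examples use the literal "SecretId" and "SecretKey" strings.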

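Calling the OCR API from Java

Step 1: Environment setup

JDK 7 or later is the stated minimum. The examples below also use java.util.Base64, which requires Java 8.

Install the SDK (choose one of the two methods):

Method 1: install through Maven (recommended):

If you are new to Maven, episodes 1 to 11 of the 黑马程序员 Maven tutorial are enough to get started in about an hour.

Download the Maven package for your system from the Maven website (https://maven.apache.org/), install it, and configure the environment variables.

Create a pom.xml file in the project folder as the Maven project configuration (you can generate it from an IDEA template or use the one provided in \Demo\1 Java Demo), then add this dependency inside the dependencies tag: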

<dependency>
    <groupId>com.tencentcloudapi</groupId>
    <artifactId>tencentcloud-sdk-java-ocr</artifactId>
    <version>3.1.701</version>
</dependency>

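You also need a few plugins so that Maven can compile the project and package it together with its dependencies. Adjust the groupId, artifactId, version, name, and url fields to your own project. The complete pom.xml for this project looks like this: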

<?xml version="1.0" encoding="UTF-8"?>

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>org.example</groupId>
  <artifactId>OCR</artifactId>
  <version>1.0-SNAPSHOT</version>

  <name>OCR</name>
  <!-- FIXME change it to the project's website -->
  <url>http://www.example.com</url>

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <maven.compiler.source>1.7</maven.compiler.source>
    <maven.compiler.target>1.7</maven.compiler.target>
  </properties>

  <dependencies>
    <dependency>
      <groupId>com.tencentcloudapi</groupId>
      <artifactId>tencentcloud-sdk-java-ocr</artifactId>
      <version>3.1.701</version>
    </dependency>
  </dependencies>

  <build>
    <pluginManagement>
      <plugins>
        <plugin>
          <artifactId>maven-clean-plugin</artifactId>
          <version>3.1.0</version>
        </plugin>
        <plugin>
          <artifactId>maven-resources-plugin</artifactId>
          <version>3.0.2</version>
        </plugin>
        <plugin>
          <artifactId>maven-compiler-plugin</artifactId>
          <version>3.8.0</version>
        </plugin>
        <plugin>
          <artifactId>maven-surefire-plugin</artifactId>
          <version>2.22.1</version>
        </plugin>
        <plugin>
          <artifactId>maven-jar-plugin</artifactId>
          <version>3.0.2</version>
        </plugin>
        <plugin>
          <artifactId>maven-install-plugin</artifactId>
          <version>2.5.2</version>
        </plugin>
        <plugin>
          <artifactId>maven-deploy-plugin</artifactId>
          <version>2.8.2</version>
        </plugin>
        <plugin>
          <artifactId>maven-site-plugin</artifactId>
          <version>3.7.1</version>
        </plugin>
        <plugin>
          <artifactId>maven-project-info-reports-plugin</artifactId>
          <version>3.0.0</version>
        </plugin>
      </plugins>
    </pluginManagement>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <configuration>
          <source>8</source>
          <target>8</target>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-jar-plugin</artifactId>
        <version>3.2.0</version>
        <configuration>
          <archive>
            <manifest>
              <addClasspath>true</addClasspath>
              <mainClass>org.example.RecognizeTableOCR</mainClass>
            </manifest>
          </archive>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-assembly-plugin</artifactId>
        <version>3.3.0</version>
        <configuration>
          <archive>
            <manifest>
              <mainClass>org.example.RecognizeTableOCR</mainClass>
            </manifest>
          </archive>
          <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
          </descriptorRefs>
        </configuration>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>single</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>

To speed up downloads, set a mirror: edit Maven's settings.xml (under <Maven install path>\apache-maven-3.x.x\conf) and add the following mirror to the mirrors section:

<mirror>
      <id>tencent</id>
      <name>tencent maven mirror</name>
      <url>https://mirrors.tencent.com/nexus/repository/maven-public/</url>
      <mirrorOf>*</mirrorOf>
</mirror>

Method 2: install from the source package:
- Download the source archive from the SDK's GitHub repository;
- Extract it to a suitable location in your project;
- Put the jar files from the vendor directory on a path where Java can find them.

Step 2: Authentication

Before calling the OCR API from Java, set up authentication by supplying your SecretId and SecretKey in code.

import com.tencentcloudapi.common.Credential;
import com.tencentcloudapi.common.profile.ClientProfile;
import com.tencentcloudapi.common.profile.HttpProfile;
import com.tencentcloudapi.ocr.v20181119.OcrClient;
import com.tencentcloudapi.ocr.v20181119.models.*;

public class RecognizeTableOCR {
    public static void main(String[] args) {
        // Set the SecretId and SecretKey
        Credential cred = new Credential("secretId", "secretKey");

        // Create the HTTP profile
        HttpProfile httpProfile = new HttpProfile();
        httpProfile.setEndpoint("ocr.tencentcloudapi.com");

        // Create the client profile
        ClientProfile clientProfile = new ClientProfile();
        clientProfile.setHttpProfile(httpProfile);

        // Create the client for the OCR product; clientProfile is optional
        OcrClient client = new OcrClient(cred, "ap-guangzhou", clientProfile);
    }
}

Step 3: Base64 encoding

The OCR API expects the image as a Base64-encoded string. In Java, you can convert an image file like this:

import java.io.*;
import java.util.Base64;

public class RecognizeTableOCR {
    public static void main(String[] args) throws IOException {
        // ... authentication code omitted ...

        // Read the image file
        File file = new File("path/to/image.jpg");
        DataInputStream inputStream = new DataInputStream(new FileInputStream(file));
        byte[] buffer = new byte[(int) file.length()];
        inputStream.readFully(buffer); // readFully fills the whole buffer
        inputStream.close();

        // Encode the image bytes as a Base64 string
        String imageBase64 = Base64.getEncoder().encodeToString(buffer);
    }
}

Step 4: Call the API

Once the image is Base64-encoded, call the OCR API to run the recognition.

import com.tencentcloudapi.common.profile.ClientProfile;
import com.tencentcloudapi.common.profile.HttpProfile;
import com.tencentcloudapi.common.exception.TencentCloudSDKException;

public class RecognizeTableOCR {
    public static void main(String [] args) {
        try {
            // ... authentication and imageBase64 code omitted ...
            
            // Create the request object
            RecognizeTableOCRRequest req = new RecognizeTableOCRRequest();
            req.setImageBase64(imageBase64);
            // resp is a RecognizeTableOCRResponse instance matching the request
            RecognizeTableOCRResponse resp = client.RecognizeTableOCR(req);
            // Print the response as a JSON string
            System.out.println(RecognizeTableOCRResponse.toJsonString(resp));
        } catch (TencentCloudSDKException | IOException e) {
            System.out.println(e.toString());
        }
    }
}

Complete example

The following complete example shows how to call the Tencent Cloud OCR service from Java to recognize a table and print the result.

package org.example;

import java.io.*;
import java.util.Base64;
import com.tencentcloudapi.common.Credential;
import com.tencentcloudapi.common.profile.ClientProfile;
import com.tencentcloudapi.common.profile.HttpProfile;
import com.tencentcloudapi.common.exception.TencentCloudSDKException;
import com.tencentcloudapi.ocr.v20181119.OcrClient;
import com.tencentcloudapi.ocr.v20181119.models.*;

public class RecognizeTableOCR {
    public static void main(String [] args) {
        try {
            // Read the image file
            String imagePath = "src/main/resources/images/1.jpg";
            File file = new File(imagePath);
            DataInputStream inputStream = new DataInputStream(new FileInputStream(file));
            byte[] buffer = new byte[(int) file.length()];
            inputStream.readFully(buffer); // readFully fills the whole buffer
            inputStream.close();

            // Encode the image bytes as a Base64 string
            String imageBase64 = Base64.getEncoder().encodeToString(buffer);

            // Create the credential object
            Credential cred = new Credential("secretId", "secretKey");
            // Create the HTTP profile
            HttpProfile httpProfile = new HttpProfile();
            httpProfile.setEndpoint("ocr.tencentcloudapi.com");
            // Create the client profile
            ClientProfile clientProfile = new ClientProfile();
            clientProfile.setHttpProfile(httpProfile);
            // Create the client for the OCR product; clientProfile is optional
            OcrClient client = new OcrClient(cred, "ap-guangzhou", clientProfile);
            // Create the request object
            RecognizeTableOCRRequest req = new RecognizeTableOCRRequest();
            req.setImageBase64(imageBase64);
            // resp is a RecognizeTableOCRResponse instance matching the request
            RecognizeTableOCRResponse resp = client.RecognizeTableOCR(req);
            // Print the response as a JSON string
            System.out.println(RecognizeTableOCRResponse.toJsonString(resp));
        } catch (TencentCloudSDKException | IOException e) {
            System.out.println(e.toString());
        }
    }
}

Worked case

You can download the sample code from this repository: https://github.com/ShengjieJin/tencentcloudOCRDemo

The folder /Demo/1 Java Demo contains a ready-to-run program.

The project layout is as follows:

1 Java Demo
|-- pom.xml
`-- src
    |-- main
    |   |-- java
    |   |   `-- com
    |   |       `-- example
    |   |           `-- RecognizeTableOCR.java
    |   `-- resources
    |       `-- images
    |           |-- 1.jpg
    |           `-- 2.jpg
    `-- test
        |-- java
        |   `-- com
        |       `-- example
        `-- resources

After setting up the SDK and Maven, replace "secretId" and "secretKey" on line 27 of RecognizeTableOCR.java with your own credentials.

Credential cred = new Credential("secretId", "secretKey");

Open a command line in /Demo/1 Java Demo and run the following command to compile the program together with its dependencies:

mvn compile assembly:single

Run the program:

java -jar ./target/OCR-1.0-SNAPSHOT-jar-with-dependencies.jar

The program prints the JSON string returned by the API to the console.

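Calling the OCR API from .NET

Step 1: Environment setup

Install .NET Framework 4.5+ or .NET Core 2.1.
Create a project named ocr and change into it:

dotnet new console -o ocr
cd ./ocr

Install the SDK through NuGet:

dotnet add package TencentCloudSDK.Ocr

Step 2: Authentication

In this step we configure the authentication credentials. Make sure you have obtained your Tencent Cloud API key.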

Credential cred = new Credential
{
    SecretId = "SecretId",
    SecretKey = "SecretKey"
};

In the code above, replace SecretId and SecretKey with your actual API key values.

Step 3: Base64 encoding

Convert the image to a Base64 string and assign it to the request object.

// Base64-encode the image; req is the request object created in step 4
byte[] imgBytes = File.ReadAllBytes("./images/1.jpg");
string imgBase64 = Convert.ToBase64String(imgBytes);
req.ImageBase64 = imgBase64;

Step 4: Call the API

With authentication and Base64 encoding in place, call the API and read the response.

// Create the client for the OCR product; clientProfile is optional
OcrClient client = new OcrClient(cred, "ap-guangzhou", clientProfile);
// Create the request object; every API has a matching request class
RecognizeTableOCRRequest req = new RecognizeTableOCRRequest();
// Base64-encode the image
byte[] imgBytes = File.ReadAllBytes("./images/1.jpg");
string imgBase64 = Convert.ToBase64String(imgBytes);
req.ImageBase64 = imgBase64;
// resp is a RecognizeTableOCRResponse instance matching the request
RecognizeTableOCRResponse resp = client.RecognizeTableOCRSync(req);
// Print the response as a JSON string
Console.WriteLine(AbstractModel.ToJsonString(resp));

Complete example

The following complete example shows how to call the Tencent Cloud OCR service from .NET to recognize a table and print the result.

using System;
using System.IO;
using TencentCloud.Common;
using TencentCloud.Common.Profile;
using TencentCloud.Ocr.V20181119;
using TencentCloud.Ocr.V20181119.Models;

namespace TencentCloudExamples
{
    class RecognizeTableOCR
    {
        static void Main(string[] args)
        {
            try
            {
                // Create the credential object from your Tencent Cloud SecretId and SecretKey; keep the key pair confidential
                Credential cred = new Credential {
                    SecretId = "SecretId",
                    SecretKey = "SecretKey"
                };
                // Create the client profile; optional, skip it if you have no special needs
                ClientProfile clientProfile = new ClientProfile();
                // Create the HTTP profile; optional, skip it if you have no special needs
                HttpProfile httpProfile = new HttpProfile();
                httpProfile.Endpoint = "ocr.tencentcloudapi.com";
                clientProfile.HttpProfile = httpProfile;

                // Create the client for the OCR product; clientProfile is optional
                OcrClient client = new OcrClient(cred, "ap-guangzhou", clientProfile);
                // Create the request object; every API has a matching request class
                RecognizeTableOCRRequest req = new RecognizeTableOCRRequest();
                // Base64-encode the image
                byte[] imgBytes = File.ReadAllBytes("./images/1.jpg");
                string imgBase64 = Convert.ToBase64String(imgBytes);
                req.ImageBase64 = imgBase64;
                // resp is a RecognizeTableOCRResponse instance matching the request
                RecognizeTableOCRResponse resp = client.RecognizeTableOCRSync(req);
                // Print the response as a JSON string
                Console.WriteLine(AbstractModel.ToJsonString(resp));
            }
            catch (Exception e)
            {
                Console.WriteLine(e.ToString());
            }
            Console.Read();
        }
    }
}

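Worked case

You can download the sample code from this repository: https://github.com/ShengjieJin/tencentcloudOCRDemo

The folder /Demo/2 .Net Demo contains the program code, but it cannot be run directly.

After setting up the environment as described in step 1, copy images and Program.cs from /Demo/2 .Net Demo into the ocr project.
Replace "SecretId" and "SecretKey" on lines 18 and 19 of Program.cs with your own credentials.

Credential cred = new Credential {
                    SecretId = "SecretId",
                    SecretKey = "SecretKey"
                };

Run the program with:

dotnet run

The program prints the JSON string returned by the API to the console.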

Calling the OCR API from C++

Step 1: Environment setup

(Linux only)

Run sudo apt-get update to refresh the package lists.
Run sudo apt-get upgrade to upgrade the installed packages.
Run sudo apt install git to install git.
Install the cmake build tool:
- ubuntu: sudo apt-get install cmake
- centos: yum install cmake3
Install the libcurl dependency:
- ubuntu: sudo apt-get install libcurl4-openssl-dev
- centos: yum install libcurl-devel
Install the openssl dependency:
- ubuntu: sudo apt-get install libssl-dev
- centos: yum install openssl-devel
Install the libuuid dependency:
- ubuntu: sudo apt-get install uuid-dev
- centos: yum install libuuid-devel

Build the SDK from source:

git clone https://github.com/TencentCloud/tencentcloud-sdk-cpp
cd tencentcloud-sdk-cpp
mkdir sdk_build
cd sdk_build
# on centos, use cmake3 .. instead
# build only the listed products, separated by semicolons (;)
cmake -DBUILD_MODULES="ocr" ..
make
sudo make install

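Step 2: Authentication

In this step we configure the authentication credentials. Make sure you have obtained your Tencent Cloud API key.

Credential cred = Credential("Your SecretId", "Your SecretKey");

In the code above, replace Your SecretId and Your SecretKey with your actual API key values.

Step 3: Base64 encoding

The OCR API takes Base64-encoded image data as input, so the local image must be encoded first. Here is a simple C++ function, base64_encode, that does this.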

static const std::string base64_chars =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/";

static inline bool is_base64(unsigned char c) {
  return (isalnum(c) || (c == '+') || (c == '/'));
}

std::string base64_encode(unsigned char const* bytes_to_encode, unsigned int in_len) {
  std::string ret;
  int i = 0;
  int j = 0;
  unsigned char char_array_3[3];
  unsigned char char_array_4[4];

  while (in_len--) {
    char_array_3[i++] = *(bytes_to_encode++);
    if (i == 3) {
      char_array_4[0] = (char_array_3[0] & 0xfc) >> 2;
      char_array_4[1] = ((char_array_3[0] & 0x03) << 4) + ((char_array_3[1] & 0xf0) >> 4);
      char_array_4[2] = ((char_array_3[1] & 0x0f) << 2) + ((char_array_3[2] & 0xc0) >> 6);
      char_array_4[3] = char_array_3[2] & 0x3f;

      for (i = 0; (i < 4); i++)
        ret += base64_chars[char_array_4[i]];
      i = 0;
    }
  }

  if (i) {
    for (j = i; j < 3; j++)
      char_array_3[j] = '\0';

    char_array_4[0] = (char_array_3[0] & 0xfc) >> 2;
    char_array_4[1] = ((char_array_3[0] & 0x03) << 4) + ((char_array_3[1] & 0xf0) >> 4);
    char_array_4[2] = ((char_array_3[1] & 0x0f) << 2) + ((char_array_3[2] & 0xc0) >> 6);

    for (j = 0; (j < i + 1); j++)
      ret += base64_chars[char_array_4[j]];

    while ((i++ < 3))
      ret += '=';
  }

  return ret;
}
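
To make the bit manipulation in base64_encode easier to follow, here is a short Python sketch of the same idea: every group of 3 input bytes (24 bits) is split into 4 six-bit indices into the alphabet, and a short final group is padded with =. The constant ALPHABET and this Python function are written for illustration only and are not part of the SDK or the demo; the result is cross-checked against Python's standard base64 module.

```python
import base64

ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def base64_encode(data: bytes) -> str:
    """Encode bytes as Base64 by regrouping 3 bytes into 4 six-bit values."""
    out = []
    for start in range(0, len(data), 3):
        chunk = data[start:start + 3]
        # Pad a short final chunk with zero bytes, as the C++ code does.
        padded = chunk + b"\x00" * (3 - len(chunk))
        n = int.from_bytes(padded, "big")  # the 24-bit group as one integer
        indices = [(n >> shift) & 0x3F for shift in (18, 12, 6, 0)]
        # A chunk of k bytes yields k + 1 significant characters.
        keep = len(chunk) + 1
        out.extend(ALPHABET[i] for i in indices[:keep])
        out.append("=" * (4 - keep))
    return "".join(out)


if __name__ == "__main__":
    for sample in (b"", b"M", b"Ma", b"Man", b"\xff\xd8\xff\xe0"):
        # The hand-written encoder should agree with the standard library.
        assert base64_encode(sample) == base64.b64encode(sample).decode("ascii")
        print(sample, "->", base64_encode(sample))
```

A one-byte input ends in ==, a two-byte input in =, and a three-byte input needs no padding, which matches the trailing while loop in the C++ function.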

Call the function to encode the image:

// Load the local image
string image_path = "./images/1.jpg";
ifstream in(image_path, ios::in | ios::binary);
ostringstream oss;
oss << in.rdbuf();
string image_binary = oss.str();

// Base64-encode the binary image data
string image_base64 = base64_encode(reinterpret_cast<const unsigned char*>(image_binary.data()), image_binary.size());

Step 4: Call the API

Calling the API takes the following steps:

Create the HTTP profile, client profile, client object, and request object. Every API has a matching request class.

// Create the HTTP profile; optional, skip it if you have no special needs
HttpProfile httpProfile = HttpProfile();
httpProfile.SetEndpoint("ocr.tencentcloudapi.com");

// Create the client profile; optional, skip it if you have no special needs
ClientProfile clientProfile = ClientProfile();
clientProfile.SetHttpProfile(httpProfile);

// Create the client for the OCR product; clientProfile is optional
OcrClient client = OcrClient(cred, "ap-guangzhou", clientProfile);

// Create the request object; every API has a matching request class
RecognizeTableOCRRequest req = RecognizeTableOCRRequest();

Fill in the request parameters, such as the image data.

// Assign the Base64-encoded image data to the request object
req.SetImageBase64(image_base64);

Call the client method with the request object. Here we use the RecognizeTableOCR API to recognize the table image.

// outcome holds either the RecognizeTableOCRResponse or an error
auto outcome = client.RecognizeTableOCR(req);

Handle the result.

if (!outcome.IsSuccess())
{
    cout << outcome.GetError().PrintAll() << endl;
    return -1;
}
RecognizeTableOCRResponse resp = outcome.GetResult();
// Print the response as a JSON string
cout << resp.ToJsonString() << endl;

Complete example

The complete program below reads a local table image and calls the Tencent Cloud RecognizeTableOCR API to recognize the table:

#include <tencentcloud/core/Credential.h>
#include <tencentcloud/core/profile/ClientProfile.h>
#include <tencentcloud/core/profile/HttpProfile.h>
#include <tencentcloud/ocr/v20181119/OcrClient.h>
#include <tencentcloud/ocr/v20181119/model/RecognizeTableOCRRequest.h>
#include <tencentcloud/ocr/v20181119/model/RecognizeTableOCRResponse.h>
#include <cctype>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>

using namespace TencentCloud;
using namespace TencentCloud::Ocr::V20181119;
using namespace TencentCloud::Ocr::V20181119::Model;
using namespace std;


static const std::string base64_chars =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/";

static inline bool is_base64(unsigned char c) {
  return (isalnum(c) || (c == '+') || (c == '/'));
}

std::string base64_encode(unsigned char const* bytes_to_encode, unsigned int in_len) {
  std::string ret;
  int i = 0;
  int j = 0;
  unsigned char char_array_3[3];
  unsigned char char_array_4[4];

  while (in_len--) {
    char_array_3[i++] = *(bytes_to_encode++);
    if (i == 3) {
      char_array_4[0] = (char_array_3[0] & 0xfc) >> 2;
      char_array_4[1] = ((char_array_3[0] & 0x03) << 4) + ((char_array_3[1] & 0xf0) >> 4);
      char_array_4[2] = ((char_array_3[1] & 0x0f) << 2) + ((char_array_3[2] & 0xc0) >> 6);
      char_array_4[3] = char_array_3[2] & 0x3f;

      for (i = 0; (i < 4); i++)
        ret += base64_chars[char_array_4[i]];
      i = 0;
    }
  }

  if (i) {
    for (j = i; j < 3; j++)
      char_array_3[j] = '\0';

    char_array_4[0] = (char_array_3[0] & 0xfc) >> 2;
    char_array_4[1] = ((char_array_3[0] & 0x03) << 4) + ((char_array_3[1] & 0xf0) >> 4);
    char_array_4[2] = ((char_array_3[1] & 0x0f) << 2) + ((char_array_3[2] & 0xc0) >> 6);

    for (j = 0; (j < i + 1); j++)
      ret += base64_chars[char_array_4[j]];

    while ((i++ < 3))
      ret += '=';
  }

  return ret;
}



int main() {
        // Create the credential object from your Tencent Cloud SecretId and SecretKey
        Credential cred = Credential("SecretId", "SecretKey");

        // Create the HTTP profile; optional, skip it if you have no special needs
        HttpProfile httpProfile = HttpProfile();
        httpProfile.SetEndpoint("ocr.tencentcloudapi.com");

        // Create the client profile; optional, skip it if you have no special needs
        ClientProfile clientProfile = ClientProfile();
        clientProfile.SetHttpProfile(httpProfile);
        // Create the client for the OCR product; clientProfile is optional
        OcrClient client = OcrClient(cred, "ap-guangzhou", clientProfile);

        // Create the request object; every API has a matching request class
        RecognizeTableOCRRequest req = RecognizeTableOCRRequest();
        
        // Load the local image
        string image_path = "./images/1.jpg";
        ifstream in(image_path, ios::in | ios::binary);
        ostringstream oss;
        oss << in.rdbuf();
        string image_binary = oss.str();
        
        // Base64-encode the binary image data
        string image_base64 = base64_encode(reinterpret_cast<const unsigned char*>(image_binary.data()), image_binary.size());
        
        // Assign the Base64-encoded image data to the request object
        req.SetImageBase64(image_base64);


        // outcome holds either the RecognizeTableOCRResponse or an error
        auto outcome = client.RecognizeTableOCR(req);
        if (!outcome.IsSuccess())
        {
            cout << outcome.GetError().PrintAll() << endl;
            return -1;
        }
        RecognizeTableOCRResponse resp = outcome.GetResult();
        // Print the response as a JSON string
        cout << resp.ToJsonString() << endl;
    
    return 0;
}

Worked case

You can download the sample code from this repository: https://github.com/ShengjieJin/tencentcloudOCRDemo

The folder /Demo/3 C++ Demo contains a ready-to-run program.

After setting up the environment, replace "SecretId" and "SecretKey" on line 71 of main.cpp with your own credentials.

Credential cred = Credential("SecretId", "SecretKey");

Compile and run the program with:

g++ -o main main.cpp -I/usr/local/include/tencentcloud/ocr/v20181119 -L/usr/local/lib -ltencentcloud-sdk-cpp-core -ltencentcloud-sdk-cpp-ocr

./main

The program prints the JSON string returned by the API to the console.

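Calling the OCR API from Node.js

Step 1: Environment setup

Install Node.js. You can download the installer from the Node.js website.
Install the Tencent Cloud SDK through npm:

npm install tencentcloud-sdk-nodejs --save

Step 2: Authentication

Set the following configuration in your code: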

const tencentcloud = require("tencentcloud-sdk-nodejs");
// Import the OCR client
const OcrClient = tencentcloud.ocr.v20181119.Client;

const clientConfig = {
  credential: {
    secretId: "yourSecretId",
    secretKey: "yourSecretKey",
  },
  region: "ap-guangzhou", // request region
  profile: {
    httpProfile: {
      endpoint: "ocr.tencentcloudapi.com",
    },
  },
};

// Create the OCR client
const client = new OcrClient(clientConfig);

In the code above, replace yourSecretId and yourSecretKey with your actual API key values, and set region to the region you want to send requests to. The profile section holds the basic settings of the API request.

Step 3: Base64 encoding

To use the OCR API, encode your image file as Base64. The following code does this:

const fs = require("fs");
const path = require("path");

const imagePath = path.join(__dirname, "images", "1.jpg");

// Read the image file as a Base64 string
let imageBase64;
try {
  imageBase64 = fs.readFileSync(imagePath, "base64");
} catch (err) {
  console.error(`Failed to read image file: ${err.message}`);
  return;
}

In the code above, replace imagePath with the actual path to your image file. If reading the file fails, the error is caught and reported.

Step 4: Call the API

After setting the request parameters, call the OCR API with the following code:

const params = {
  ImageBase64: imageBase64, // image data
};

// Send the OCR request
client.RecognizeTableOCR(params)
  .then((data) => {
    console.log(data);
  })
  .catch((err) => {
    console.error(`OCR request failed: ${err.message}`);
  });

In the code above, the params object holds the request parameters, and client.RecognizeTableOCR(params) makes the actual API call.

Complete example

The following complete example shows how to call the Tencent Cloud OCR service from Node.js to recognize a table and print the result.

const tencentcloud = require("tencentcloud-sdk-nodejs");
const path = require("path");
const fs = require("fs");

// Import the OCR client
const OcrClient = tencentcloud.ocr.v20181119.Client;

// Set the key pair: yourSecretId, yourSecretKey
const clientConfig = {
  credential: {
    secretId: "yourSecretId",
    secretKey: "yourSecretKey",
  },
  region: "ap-guangzhou", // request region
  profile: {
    httpProfile: {
      endpoint: "ocr.tencentcloudapi.com",
    },
  },
};

// Set the image path
const imagePath = path.join(__dirname, "images", "1.jpg");

// Create the OCR client
const client = new OcrClient(clientConfig);

// Read the image file as a Base64 string
let imageBase64;
try {
  imageBase64 = fs.readFileSync(imagePath, "base64");
} catch (err) {
  console.error(`Failed to read image file: ${err.message}`);
  return;
}

// Set the request parameters
const params = {
  ImageBase64: imageBase64, // image data
};

// Send the OCR request
client.RecognizeTableOCR(params)
  .then((data) => {
    console.log(data);
  })
  .catch((err) => {
    console.error(`OCR request failed: ${err.message}`);
  });

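Worked case

You can download the sample code from this repository: https://github.com/ShengjieJin/tencentcloudOCRDemo

Two versions of the example are provided, and both can be run directly:

Basic version: reads the image at ./images/1.jpg and prints the recognition result to the console.
Advanced version: demonstrates front-end and back-end interaction and displays the recognition result on a web page.

Basic version

Path: /Demo/4 Node.js Demo/basic version

Open a terminal in this folder.

Install the required dependency:

npm install tencentcloud-sdk-nodejs --save

Replace yourSecretId and yourSecretKey on lines 11 and 12 of server.js with your own credentials.

credential: {
    secretId: "yourSecretId",
    secretKey: "yourSecretKey",
  },

Run the script:

node server.js

The recognition result is printed to the console. For the meaning of the output fields, see Tencent Cloud API Explorer.

Advanced version

Path: /Demo/4 Node.js Demo/advanced version

Open a terminal in this folder.

Install the required dependencies:

npm install tencentcloud-sdk-nodejs --save
npm install express
npm install multer

Replace yourSecretId and yourSecretKey on lines 23 and 24 of server.js with your own credentials.

credential: {
    secretId: "yourSecretId",
    secretKey: "yourSecretKey",
  },

Run the script:

node server.js

Open a browser and go to http://localhost:3000/

Choose the table image to recognize and click Submit. The recognition result is then displayed on the page.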

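Calling the OCR API from Python

Step 1: Environment setup

Python 3.6 to 3.9.
Install the Tencent Cloud SDK through pip:

pip install -i https://mirrors.tencent.com/pypi/simple/ --upgrade tencentcloud-sdk-python

Step 2: Authentication

In this step we configure the authentication credentials. Make sure you have obtained your Tencent Cloud API key.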

import json
import base64
from PIL import Image
from io import BytesIO
from tencentcloud.common import credential
from tencentcloud.common.profile.client_profile import ClientProfile
from tencentcloud.common.profile.http_profile import HttpProfile
from tencentcloud.common.exception.tencent_cloud_sdk_exception import TencentCloudSDKException
from tencentcloud.ocr.v20181119 import ocr_client, models

cred = credential.Credential("SecretId", "SecretKey")

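Step 3: Base64 encoding

This step shows how to encode a JPEG image into the Base64 format the API expects. The image is read from a local file; adapt this to your own needs. The snippet relies on Pillow (imported as PIL), which must be installed separately, for example with pip install Pillow.

# Read the image and encode it as Base64
with open("./images/1.jpg", "rb") as image_file:
    # Open the file as a PIL Image object
    image = Image.open(image_file)

    # Re-encode the PIL Image as JPEG into a BytesIO buffer
    buffer = BytesIO()
    image.save(buffer, format='JPEG')

    # Base64-encode the bytes held in the buffer
    img_base64 = base64.b64encode(buffer.getvalue()).decode('utf-8')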
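
Decoding the file with Pillow and saving it again re-compresses the JPEG. When no image processing is needed, the file bytes can be encoded directly with the standard library alone. The sketch below shows that variant; the helper name encode_image and its optional max_base64_bytes guard are made up for this illustration, and the limit to pass (if any) should come from the API documentation rather than from this example.

```python
import base64
from pathlib import Path


def encode_image(path, max_base64_bytes=None):
    """Return the file at `path` as a Base64 string.

    `max_base64_bytes` is an optional guard invented for this sketch; pass
    the limit documented for the API you call, or None to skip the check.
    """
    raw = Path(path).read_bytes()
    encoded = base64.b64encode(raw).decode("ascii")
    if max_base64_bytes is not None and len(encoded) > max_base64_bytes:
        raise ValueError(
            f"{path}: {len(encoded)} Base64 bytes exceeds {max_base64_bytes}"
        )
    return encoded


if __name__ == "__main__":
    import tempfile

    # Demonstrate on a throwaway file so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "sample.bin"
        sample.write_bytes(b"\xff\xd8\xff\xe0 not a real JPEG")
        text = encode_image(sample)
        print(len(text), "Base64 characters")
```

The returned string can be used in place of img_base64 in the next step.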

Step 4: Call the API

In this step we use the cred object to call the RecognizeTableOCR API of Tencent Cloud OCR for table recognition and store the response in the variable resp.

try:
    # Create the HTTP profile; optional, skip it if you have no special needs
    httpProfile = HttpProfile()
    httpProfile.endpoint = "ocr.tencentcloudapi.com"

    # Create the client profile; optional, skip it if you have no special needs
    clientProfile = ClientProfile()
    clientProfile.httpProfile = httpProfile

    # Create the client for the OCR product; clientProfile is optional
    client = ocr_client.OcrClient(cred, "ap-guangzhou", clientProfile)

    # Create the request object; every API has a matching request class
    req = models.RecognizeTableOCRRequest()
    params = {
        "ImageBase64": img_base64,  # pass the Base64 image string to the API
    }
    req.from_json_string(json.dumps(params))

    # resp is a RecognizeTableOCRResponse instance matching the request
    resp = client.RecognizeTableOCR(req)

    # Print the response as a JSON string
    print(resp.to_json_string())

except TencentCloudSDKException as err:
    print(err)
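
The printed JSON is hard to read as a table. As a rough sketch of post-processing, the function below rebuilds a row-by-column grid from the response text. It assumes the JSON has a top-level TextDetections list whose items carry RowTl and ColTl (the cell's top-left row and column index) and Text; check these field names against the API Explorer output before relying on them. The sample input is hand-written for the demonstration and is not real API output.

```python
import json


def cells_to_grid(resp_json):
    """Arrange recognized cells into a list of rows.

    Assumes each item of "TextDetections" has "RowTl", "ColTl" and "Text".
    Items with a negative index are skipped; this sketch treats them as
    text that lies outside the table.
    """
    cells = json.loads(resp_json).get("TextDetections", [])
    cells = [c for c in cells if c["RowTl"] >= 0 and c["ColTl"] >= 0]
    if not cells:
        return []
    n_rows = max(c["RowTl"] for c in cells) + 1
    n_cols = max(c["ColTl"] for c in cells) + 1
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for c in cells:
        grid[c["RowTl"]][c["ColTl"]] = c["Text"]
    return grid


if __name__ == "__main__":
    # Hand-written stand-in for resp.to_json_string(); not real API output.
    sample = json.dumps({
        "TextDetections": [
            {"RowTl": 0, "ColTl": 0, "Text": "Name"},
            {"RowTl": 0, "ColTl": 1, "Text": "Qty"},
            {"RowTl": 1, "ColTl": 0, "Text": "Pen"},
            {"RowTl": 1, "ColTl": 1, "Text": "3"},
        ]
    })
    for row in cells_to_grid(sample):
        print(" | ".join(row))
```

Merged cells are placed at their top-left position only, which keeps the sketch short; a fuller version would also use the bottom-right indices.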

Complete example

The following complete example shows how to call the Tencent Cloud OCR service from Python to recognize a table and print the result.

import json
import base64
from PIL import Image
from io import BytesIO
from tencentcloud.common import credential
from tencentcloud.common.profile.client_profile import ClientProfile
from tencentcloud.common.profile.http_profile import HttpProfile
from tencentcloud.common.exception.tencent_cloud_sdk_exception import TencentCloudSDKException
from tencentcloud.ocr.v20181119 import ocr_client, models

# Encode the image as Base64
with open("./images/1.jpg", "rb") as image_file:
    # Open the file as a PIL Image object
    image = Image.open(image_file)

    # Re-encode the PIL Image as JPEG into a BytesIO buffer
    buffer = BytesIO()
    image.save(buffer, format='JPEG')

    # Base64-encode the bytes held in the buffer
    img_base64 = base64.b64encode(buffer.getvalue()).decode('utf-8')

try:
    # Create the credential object from your Tencent Cloud SecretId and SecretKey
    cred = credential.Credential("SecretId", "SecretKey")

    # Create the HTTP profile; optional, skip it if you have no special needs
    httpProfile = HttpProfile()
    httpProfile.endpoint = "ocr.tencentcloudapi.com"

    # Create the client profile; optional, skip it if you have no special needs
    clientProfile = ClientProfile()
    clientProfile.httpProfile = httpProfile

    # Create the client for the OCR product; clientProfile is optional
    client = ocr_client.OcrClient(cred, "ap-guangzhou", clientProfile)

    # Create the request object; every API has a matching request class
    req = models.RecognizeTableOCRRequest()
    params = {
        "ImageBase64": img_base64,  # pass the Base64 image string to the API
    }
    req.from_json_string(json.dumps(params))

    # resp is a RecognizeTableOCRResponse instance matching the request
    resp = client.RecognizeTableOCR(req)

    # Print the response as a JSON string
    print(resp.to_json_string())

except TencentCloudSDKException as err:
    print(err)

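Worked case

You can download the sample code from this repository: https://github.com/ShengjieJin/tencentcloudOCRDemo

The folder /Demo/5 Python Demo contains a ready-to-run program.

After setting up the environment, replace the "SecretId" and "SecretKey" arguments of Credential("SecretId", "SecretKey") on line 25 of main.py with your own credentials, then run the program.

cred = credential.Credential("SecretId", "SecretKey")

Note: turn off your VPN before running the program.

The program prints the JSON string returned by the API to the console.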

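Calling the OCR API from Go

Step 1: Environment setup

Go 1.9 or later. Note that the go mod command used below requires Go 1.11 or later.
Initialize a module named ocr:

go mod init ocr

Optionally, speed up go get downloads with the Tencent Cloud mirror:

Linux or macOS:

export GOPROXY=https://mirrors.tencent.com/go/

Windows:

set GOPROXY=https://mirrors.tencent.com/go/

Install the common base package:

go get -v -u github.com/tencentcloud/tencentcloud-sdk-go/tencentcloud/common

Install the OCR product package:

go get -v -u github.com/tencentcloud/tencentcloud-sdk-go/tencentcloud/ocr

Step 2: Authentication

Before using the API you need a SecretId and SecretKey pair from your Tencent Cloud account for authentication. In code, create a credential object like this:

credential := common.NewCredential(
    "SecretId",
    "SecretKey",
)

Replace "SecretId" and "SecretKey" with your own account values.

Step 3: Base64 encoding

Before calling the API, convert the image file to Base64. The example reads a local image file and encodes it as follows: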

fileData, err := ioutil.ReadFile("./images/1.jpg")
if err != nil {
    panic(err)
}
imageBase64 := base64.StdEncoding.EncodeToString(fileData)

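Step 4: Call the API

The following code creates a client object and a request object, sets the image content as the request parameter, and sends the request: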

cpf := profile.NewClientProfile()
cpf.HttpProfile.Endpoint = "ocr.tencentcloudapi.com"

client, err := ocr.NewClient(credential, "ap-guangzhou", cpf)
if err != nil {
    panic(err)
}

request := ocr.NewRecognizeTableOCRRequest()
request.ImageBase64 = common.StringPtr(imageBase64)

response, err := client.RecognizeTableOCR(request)
if _, ok := err.(*errors.TencentCloudSDKError); ok {
    fmt.Printf("An API error has returned: %s", err)
    return
}
if err != nil {
    panic(err)
}

Complete example

The following complete example shows how to call the Tencent Cloud OCR service from Go to recognize a table and print the result.

package main

import (
	"encoding/base64"
	"fmt"
	"io/ioutil"

	"github.com/tencentcloud/tencentcloud-sdk-go/tencentcloud/common"
	"github.com/tencentcloud/tencentcloud-sdk-go/tencentcloud/common/errors"
	"github.com/tencentcloud/tencentcloud-sdk-go/tencentcloud/common/profile"
	ocr "github.com/tencentcloud/tencentcloud-sdk-go/tencentcloud/ocr/v20181119"
)

func main() {
	// Read the local image file
	fileData, err := ioutil.ReadFile("./images/1.jpg")
	if err != nil {
		panic(err)
	}
	// Base64-encode the image content
	imageBase64 := base64.StdEncoding.EncodeToString(fileData)

	// Create the credential object from your Tencent Cloud SecretId and SecretKey; keep the key pair confidential
	credential := common.NewCredential(
		"SecretId",
		"SecretKey",
	)

	// Create the client profile; optional, skip it if you have no special needs
	cpf := profile.NewClientProfile()
	cpf.HttpProfile.Endpoint = "ocr.tencentcloudapi.com"

	// Create the client for the OCR product; the client profile is optional
	client, err := ocr.NewClient(credential, "ap-guangzhou", cpf)
	if err != nil {
		panic(err)
	}

	// Create the request object; every API has a matching request type
	request := ocr.NewRecognizeTableOCRRequest()

	// Set the image to recognize
	request.ImageBase64 = common.StringPtr(imageBase64)

	// response is a RecognizeTableOCRResponse instance matching the request
	response, err := client.RecognizeTableOCR(request)
	if _, ok := err.(*errors.TencentCloudSDKError); ok {
		fmt.Printf("An API error has returned: %s", err)
		return
	}
	if err != nil {
		panic(err)
	}

	// Print the response as a JSON string
	fmt.Printf("%s", response.ToJsonString())
}

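Worked case

You can download the sample code from this repository: https://github.com/ShengjieJin/tencentcloudOCRDemo

The folder /Demo/6 GO Demo contains a ready-to-run program.

After setting up the environment, replace "SecretId" and "SecretKey" on lines 25 and 26 of main.go with your own credentials.

credential := common.NewCredential(
		"SecretId",
		"SecretKey",
	)

Run the program with:

go run main.go

The program prints the JSON string returned by the API to the console.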


Author

Shengjie Jin

Posted on

February 28, 2023
