
Recognize Printed Text

Reference

Service: Azure AI Services

API Version: 2.1

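Optical Character Recognition (OCR) detects text in an image and extracts the recognized characters into a machine-usable character stream. On success, the OCR results are returned. On failure, an error code and an error message are returned. The error code can be one of InvalidImageUrl, InvalidImageFormat, InvalidImageSize, NotSupportedImage, NotSupportedLanguage, or InternalServerError.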

POST {Endpoint}/vision/v2.1/ocr?detectOrientation={detectOrientation}

With optional parameters:

POST {Endpoint}/vision/v2.1/ocr?detectOrientation={detectOrientation}&language={language}

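URI Parameters

Name	In	Required	Type	Description
Endpoint	path	True	string	Supported Cognitive Services endpoint.
detectOrientation	query	True	boolean	Whether to detect the text orientation in the image. With detectOrientation=true, the OCR service tries to detect the image orientation and correct it before further processing (for example, if the image is upside down).
language	query		OcrLanguages	The BCP-47 language code of the text to be detected in the image. The default value is "unk".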

Request Header

Name	Required	Type	Description
Ocp-Apim-Subscription-Key	True	string

Request Body

Name	Required	Type	Description
url	True	string	Publicly reachable URL of an image.
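
The URI parameters, request header, and request body above can be combined as in the following minimal sketch. It only builds the request and does not send it. The helper name `build_ocr_request`, the placeholder key, the example image URL, and the `Content-Type: application/json` header are assumptions made for illustration; the JSON body shape follows the `url` field in the Request Body table.

```python
import json
import urllib.parse
import urllib.request


def build_ocr_request(endpoint, key, image_url, detect_orientation=True, language=None):
    """Build (but do not send) a POST request for the v2.1 OCR operation.

    Hypothetical helper: the name and argument order are not part of the API.
    """
    # detectOrientation is required; language is optional.
    query = {"detectOrientation": "true" if detect_orientation else "false"}
    if language is not None:
        query["language"] = language
    url = endpoint.rstrip("/") + "/vision/v2.1/ocr?" + urllib.parse.urlencode(query)

    # Body carries the single required field from the Request Body table.
    body = json.dumps({"url": image_url}).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/json",  # assumption: JSON request body
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


request = build_ocr_request(
    "https://westus.api.cognitive.microsoft.com",
    "<your-subscription-key>",         # placeholder, not a real key
    "https://example.com/sample.png",  # placeholder image URL
    language="en",
)
print(request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send the request; that step is left out here so the sketch runs without network access or credentials.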

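Responses

Name	Type	Description
200 OK	OcrResult	The OCR results in the hierarchy of region/line/word. The results include the text and the bounding boxes for regions, lines, and words. The angle, in radians, of the detected text with respect to the closest horizontal or vertical direction. After rotating the input image clockwise by this angle, the recognized text lines become horizontal or vertical. In combination with the orientation property, it can be used to overlay recognition results correctly on the original image, by rotating either the original image or the recognition results by a suitable angle around the center of the original image. If the angle cannot be confidently detected, this property is not present. If the image contains text at different angles, only part of the text will be recognized correctly.
Other Status Codes	ComputerVisionError	Error response.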

Security

Ocp-Apim-Subscription-Key

Type: apiKey
In: header

Examples

Successful RecognizePrintedText request

Sample request

HTTP

POST https://westus.api.cognitive.microsoft.com/vision/v2.1/ocr?detectOrientation=true&language=en


"{url}"

Sample response

Status code: 200

{
  "language": "en",
  "textAngle": -2.0000000000000338,
  "orientation": "Up",
  "regions": [
    {
      "boundingBox": "462,379,497,258",
      "lines": [
        {
          "boundingBox": "462,379,497,74",
          "words": [
            {
              "boundingBox": "462,379,41,73",
              "text": "A"
            },
            {
              "boundingBox": "523,379,153,73",
              "text": "GOAL"
            },
            {
              "boundingBox": "694,379,265,74",
              "text": "WITHOUT"
            }
          ]
        },
        {
          "boundingBox": "565,471,289,74",
          "words": [
            {
              "boundingBox": "565,471,41,73",
              "text": "A"
            },
            {
              "boundingBox": "626,471,150,73",
              "text": "PLAN"
            },
            {
              "boundingBox": "801,472,53,73",
              "text": "IS"
            }
          ]
        },
        {
          "boundingBox": "519,563,375,74",
          "words": [
            {
              "boundingBox": "519,563,149,74",
              "text": "JUST"
            },
            {
              "boundingBox": "683,564,41,72",
              "text": "A"
            },
            {
              "boundingBox": "741,564,153,73",
              "text": "WISH"
            }
          ]
        }
      ]
    }
  ]
}
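
To illustrate the region/line/word hierarchy in the sample response, the sketch below walks a parsed `OcrResult` and joins the words of each line. The function name `extract_lines` is hypothetical, the embedded data is a trimmed copy of the sample response above, and joining words with a single space is an assumption that suits space-delimited languages such as English.

```python
def extract_lines(ocr_result):
    """Return one string per recognized line, in document order."""
    lines = []
    for region in ocr_result.get("regions", []):
        for line in region.get("lines", []):
            words = [word["text"] for word in line.get("words", [])]
            lines.append(" ".join(words))  # assumption: space-separated words
    return lines


# Trimmed copy of the sample response (first two lines of the single region).
sample = {
    "language": "en",
    "orientation": "Up",
    "regions": [
        {
            "boundingBox": "462,379,497,258",
            "lines": [
                {
                    "boundingBox": "462,379,497,74",
                    "words": [
                        {"boundingBox": "462,379,41,73", "text": "A"},
                        {"boundingBox": "523,379,153,73", "text": "GOAL"},
                        {"boundingBox": "694,379,265,74", "text": "WITHOUT"},
                    ],
                },
                {
                    "boundingBox": "565,471,289,74",
                    "words": [
                        {"boundingBox": "565,471,41,73", "text": "A"},
                        {"boundingBox": "626,471,150,73", "text": "PLAN"},
                        {"boundingBox": "801,472,53,73", "text": "IS"},
                    ],
                },
            ],
        }
    ],
}

print(extract_lines(sample))  # ['A GOAL WITHOUT', 'A PLAN IS']
```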

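Definitions

Name	Description
ComputerVisionError	Details about the API request error.
ComputerVisionErrorCodes	The error code.
ImageUrl
OcrLanguages	The BCP-47 language code of the text to be detected in the image. The default value is "unk".
OcrLine	An object describing a single recognized line of text.
OcrRegion	A region consists of multiple lines (for example, a column of text in a multi-column document).
OcrResult
OcrWord	Information on a recognized word.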

ComputerVisionError

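Details about the API request error.

Name	Type	Description
code	ComputerVisionErrorCodes	The error code.
message	string	A message explaining the error reported by the service.
requestId	string	A unique request identifier.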
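
A client might map an error response body onto the three fields above. The following is a minimal sketch under stated assumptions: the class `OcrError` and function `parse_error` are hypothetical names, the example body is made up for illustration, and falling back to the listed `Unspecified` code when `code` is missing is a design choice of this sketch, not documented service behavior.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class OcrError:
    """Hypothetical container mirroring the ComputerVisionError fields."""
    code: str
    message: str
    request_id: Optional[str] = None


def parse_error(body: str) -> OcrError:
    """Parse a JSON error body into an OcrError."""
    data = json.loads(body)
    return OcrError(
        code=data.get("code", "Unspecified"),  # assumption: fallback code
        message=data.get("message", ""),
        request_id=data.get("requestId"),
    )


# Made-up error body, for illustration only.
example_body = json.dumps({
    "code": "InvalidImageUrl",
    "message": "Example message (made up for this sketch).",
    "requestId": "00000000-0000-0000-0000-000000000000",
})
error = parse_error(example_body)
print(error.code)  # InvalidImageUrl
```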

ComputerVisionErrorCodes

The error code.

Name	Type	Description
BadArgument	string
CancelledRequest	string
DetectFaceError	string
FailedToProcess	string
InternalServerError	string
InvalidDetails	string
InvalidImageFormat	string
InvalidImageSize	string
InvalidImageUrl	string
InvalidModel	string
InvalidThumbnailSize	string
NotSupportedFeature	string
NotSupportedImage	string
NotSupportedLanguage	string
NotSupportedVisualFeature	string
StorageException	string
Timeout	string
Unspecified	string
UnsupportedMediaType	string

ImageUrl

Name	Type	Description
url	string	Publicly reachable URL of an image.

OcrLanguages

The BCP-47 language code of the text to be detected in the image. The default value is "unk".

Name	Type	Description
ar	string
cs	string
da	string
de	string
el	string
en	string
es	string
fi	string
fr	string
hu	string
it	string
ja	string
ko	string
nb	string
nl	string
pl	string
pt	string
ro	string
ru	string
sk	string
sr-Cyrl	string
sr-Latn	string
sv	string
tr	string
unk	string
zh-Hans	string
zh-Hant	string
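
Because the `language` parameter accepts only the codes listed above, a client can validate it before sending a request. This is a small sketch: the names `SUPPORTED_OCR_LANGUAGES` and `normalize_language` are hypothetical, and exact, case-sensitive matching is an assumption of this sketch rather than documented service behavior.

```python
# Codes copied from the OcrLanguages table above.
SUPPORTED_OCR_LANGUAGES = frozenset({
    "ar", "cs", "da", "de", "el", "en", "es", "fi", "fr", "hu", "it", "ja",
    "ko", "nb", "nl", "pl", "pt", "ro", "ru", "sk", "sr-Cyrl", "sr-Latn",
    "sv", "tr", "unk", "zh-Hans", "zh-Hant",
})


def normalize_language(code=None):
    """Return a language code to send, defaulting to "unk" when none is given."""
    if code is None:
        return "unk"  # documented default value
    if code not in SUPPORTED_OCR_LANGUAGES:
        raise ValueError(f"unsupported OCR language code: {code!r}")
    return code


print(normalize_language())           # unk
print(normalize_language("zh-Hans"))  # zh-Hans
```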

OcrLine

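An object describing a single recognized line of text.

Name	Type	Description
boundingBox	string	Bounding box of a recognized line. The four integers represent the x-coordinate of the left edge, the y-coordinate of the top edge, the width, and the height of the bounding box, in the coordinate system of the input image after it has been rotated around its center according to the detected text angle (see the textAngle property), with the origin at the top-left corner and the y-axis pointing down.
words	OcrWord[]	An array of objects, where each object represents a recognized word.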

OcrRegion

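A region consists of multiple lines (for example, a column of text in a multi-column document).

Name	Type	Description
boundingBox	string	Bounding box of a recognized region. The four integers represent the x-coordinate of the left edge, the y-coordinate of the top edge, the width, and the height of the bounding box, in the coordinate system of the input image after it has been rotated around its center according to the detected text angle (see the textAngle property), with the origin at the top-left corner and the y-axis pointing down.
lines	OcrLine[]	An array of recognized lines of text.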

OcrResult

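Name	Type	Description
language	string	The BCP-47 language code of the text in the image.
orientation	string	Orientation of the text recognized in the image, if requested. The value (Up, Down, Left, or Right) refers to the direction that the top of the recognized text is facing, after the image has been rotated around its center according to the detected text angle (see the textAngle property). If detection of the orientation was not requested, or no text is detected, the value is "NotDetected".
regions	OcrRegion[]	An array of objects, where each object represents a region of recognized text.
textAngle	number	The angle, in radians, of the detected text with respect to the closest horizontal or vertical direction. After rotating the input image clockwise by this angle, the recognized text lines become horizontal or vertical. In combination with the orientation property, it can be used to overlay recognition results correctly on the original image, by rotating either the original image or the recognition results by a suitable angle around the center of the original image. If the angle cannot be confidently detected, this property is not present. If the image contains text at different angles, only part of the text will be recognized correctly.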
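
The textAngle description refers to rotating around the image center. As a rough sketch of that geometry, the function below rotates a point clockwise (as seen on screen, with the y-axis pointing down) around a given center. The function name `rotate_clockwise` is hypothetical. Whether an overlay needs `textAngle` or its negation depends on which direction is being mapped (original image to corrected image or the reverse), so the sign should be verified against a known test image.

```python
import math


def rotate_clockwise(x, y, cx, cy, angle):
    """Rotate (x, y) around (cx, cy) by `angle` radians.

    In image coordinates (origin top-left, y-axis pointing down) a positive
    angle with this matrix appears as a clockwise rotation on screen.
    """
    dx, dy = x - cx, y - cy
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return (cx + dx * cos_a - dy * sin_a,
            cy + dx * sin_a + dy * cos_a)


# A point to the right of the center moves below it after a quarter turn.
x, y = rotate_clockwise(1.0, 0.0, 0.0, 0.0, math.pi / 2)
print((round(x, 6), round(y, 6)))  # (0.0, 1.0)
```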

OcrWord

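Information on a recognized word.

Name	Type	Description
boundingBox	string	Bounding box of a recognized word. The four integers represent the x-coordinate of the left edge, the y-coordinate of the top edge, the width, and the height of the bounding box, in the coordinate system of the input image after it has been rotated around its center according to the detected text angle (see the textAngle property), with the origin at the top-left corner and the y-axis pointing down.
text	string	String value of a recognized word.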
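
Since `boundingBox` is returned as a comma-separated string of four integers (left, top, width, height), clients usually parse it before use. A minimal sketch follows; the type `BoundingBox` and function `parse_bounding_box` are hypothetical names, and the input string is taken from the sample response.

```python
from typing import NamedTuple


class BoundingBox(NamedTuple):
    """Left/top/width/height in pixels, origin top-left, y-axis pointing down."""
    left: int
    top: int
    width: int
    height: int

    @property
    def right(self):
        return self.left + self.width

    @property
    def bottom(self):
        return self.top + self.height


def parse_bounding_box(value):
    """Parse a string such as "462,379,41,73" into a BoundingBox."""
    parts = value.split(",")
    if len(parts) != 4:
        raise ValueError(f"expected four comma-separated integers, got {value!r}")
    return BoundingBox(*(int(part) for part in parts))


box = parse_bounding_box("462,379,41,73")
print(box.left, box.top, box.right, box.bottom)  # 462 379 503 452
```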
