
Content Analyzers - Analyze Binary

Service: Azure AI Services

API Version: 2025-11-01

Extract content and fields from input.

POST {endpoint}/contentunderstanding/analyzers/{analyzerId}:analyzeBinary?api-version=2025-11-01

With optional parameters:

POST {endpoint}/contentunderstanding/analyzers/{analyzerId}:analyzeBinary?api-version=2025-11-01&stringEncoding={stringEncoding}&processingLocation={processingLocation}&range={range}

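URI Parameters

Name	In	Required	Type	Description
analyzerId	path	True	string minLength: 1 maxLength: 64 pattern: ^[a-zA-Z0-9._-]{1,64}$	The unique identifier of the analyzer.
endpoint	path	True	string (uri)	Content Understanding service endpoint.
api-version	query	True	string minLength: 1	The API version to use for this operation.
processingLocation	query		ProcessingLocation	The location where the data may be processed. Defaults to global.
range	query		string	Range of the input to analyze (e.g. `1-3,5,9-`). Document content uses 1-based page numbers, while audio-visual content uses integer milliseconds.
stringEncoding	query		string	The string encoding format for content spans in the response. Possible values are `codePoint`, `utf16`, and `utf8`. Default is `codePoint`.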
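
The URI template and parameter table above can be sketched as a small URL builder. This is a minimal illustration, not an official client: the function name `build_analyze_binary_url` and its keyword arguments are hypothetical, and only the optional parameters that are actually set are appended to the query string.

```python
from urllib.parse import quote, urlencode

API_VERSION = "2025-11-01"


def build_analyze_binary_url(endpoint, analyzer_id, *, string_encoding=None,
                             processing_location=None, range_=None):
    """Build the analyzeBinary URL (hypothetical helper, not part of any SDK)."""
    base = (
        f"{endpoint.rstrip('/')}/contentunderstanding/analyzers/"
        f"{quote(analyzer_id, safe='')}:analyzeBinary"
    )
    # api-version is required; the other query parameters are optional.
    params = {"api-version": API_VERSION}
    if string_encoding is not None:
        params["stringEncoding"] = string_encoding
    if processing_location is not None:
        params["processingLocation"] = processing_location
    if range_ is not None:
        params["range"] = range_
    # urlencode percent-encodes reserved characters such as the commas in a range.
    return f"{base}?{urlencode(params)}"
```

For example, `build_analyze_binary_url("https://myendpoint.cognitiveservices.azure.com", "myAnalyzer", range_="1-3,5,9-")` yields a URL whose query carries `api-version` and a percent-encoded `range`.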

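Request Header

Media Types: "*/*"

Name	Required	Type	Description
x-ms-client-request-id		string (uuid)	An opaque, globally-unique, client-generated string identifier for the request.

Request Body

Media Types: "*/*"

Name	Type	Description
input	string (binary)	The binary content of the document to analyze.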
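
As a rough sketch of how the binary body might be sent, the snippet below builds (but does not send) a POST request with the standard library. The helper name `build_analyze_request` is hypothetical, the `Content-Type` value is an assumption (the reference only lists `*/*`), and key-based authentication uses the `Ocp-Apim-Subscription-Key` header described under Security.

```python
import urllib.request


def build_analyze_request(url, data, subscription_key):
    """Build a POST request carrying raw bytes (hypothetical helper)."""
    return urllib.request.Request(
        url,
        data=data,  # raw file bytes, not base64 or JSON
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": subscription_key,
            # Assumption: a generic binary media type; the API accepts "*/*".
            "Content-Type": "application/octet-stream",
        },
    )


# Sending is left to the caller, e.g.:
#   with urllib.request.urlopen(build_analyze_request(url, data, key)) as resp:
#       operation_location = resp.headers["Operation-Location"]
```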

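Responses

Name	Type	Description
202 Accepted	ContentAnalyzerAnalyzeOperationStatus	The request has been accepted for processing, but processing has not yet completed. Headers Operation-Location: string x-ms-client-request-id: string
Other Status Codes	Azure.Core.Foundations.ErrorResponse	An unexpected error response. Headers x-ms-error-code: string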

Security

Ocp-Apim-Subscription-Key

Key-based authentication using the access key of the Azure resource.

Type: apiKey
In: header

EntraIdToken

Microsoft Entra ID OAuth2 authentication using an access token.

Type: oauth2
Flow: accessCode
Authorization URL: https://login.microsoftonline.com/common/oauth2/authorize
Token URL: https://login.microsoftonline.com/common/oauth2/token

Scopes

Name	Description
https://cognitiveservices.azure.com/.default

Examples

Analyze File

Sample request

HTTP

POST {endpoint}/contentunderstanding/analyzers/myAnalyzer:analyzeBinary?api-version=2025-11-01

"RXhhbXBsZSBGaWxl"

Sample response

Status code: 202

Operation-Location: https://myendpoint.cognitiveservices.azure.com/contentunderstanding/analyzerResults/3b31320d-8bab-4f88-b19c-2322a7f11034?api-version=2025-11-01

{
  "id": "3b31320d-8bab-4f88-b19c-2322a7f11034",
  "status": "NotStarted"
}
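
The 202 response only reports that the operation was accepted, so a client typically has to check the operation status until it reaches a terminal state. The sketch below assumes the `Operation-Location` URL can be fetched repeatedly and returns a ContentAnalyzerAnalyzeOperationStatus JSON object; `fetch_status` is a hypothetical callable supplied by the caller (for example, an authenticated GET that returns parsed JSON), which keeps the loop free of network code.

```python
import time

# Terminal values of the OperationState enumeration in this reference.
TERMINAL_STATES = {"Succeeded", "Failed", "Canceled"}


def poll_until_done(fetch_status, operation_location,
                    interval_s=1.0, max_polls=60, sleep=time.sleep):
    """Call fetch_status(operation_location) until the status is terminal."""
    for _ in range(max_polls):
        status = fetch_status(operation_location)
        if status.get("status") in TERMINAL_STATES:
            return status
        # NotStarted or Running: wait before asking again.
        sleep(interval_s)
    raise TimeoutError(f"operation still running after {max_polls} polls")
```

On `Succeeded` the returned object is expected to carry `result`; on `Failed` it is expected to carry `error`.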

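Definitions

Name	Description
AnalysisContentKind	Kind of media content.
AnalysisResult	Analyze operation result.
ArrayField	Array field extracted from the content.
AudioVisualContent	Audio-visual content. Ex. audio/wav, video/mp4.
AudioVisualContentSegment	Detected audio/visual content segment.
Azure.Core.Foundations.Error	The error object.
Azure.Core.Foundations.ErrorResponse	A response containing error details.
Azure.Core.Foundations.InnerError	An object containing more specific information about the error. As per Azure REST API guidelines - https://aka.ms/AzureRestApiGuidelines#handling-errors.
BooleanField	Boolean field extracted from the content.
ContentAnalyzerAnalyzeOperationStatus	Provides status details for analyze operations.
ContentFieldType	Semantic data type of the field value.
ContentSpan	Position of the element in markdown, specified as a character offset and length.
DateField	Date field extracted from the content.
DocumentAnnotation	Annotation in a document, such as a strikethrough or a comment.
DocumentAnnotationComment	Comment associated with a document annotation.
DocumentAnnotationKind	Document annotation kind.
DocumentBarcode	Barcode in a document.
DocumentBarcodeKind	Barcode kind.
DocumentCaption	Caption of a table or figure.
DocumentChartFigure	Figure containing a chart, such as a bar chart, line chart, or pie chart.
DocumentContent	Document content. Ex. text/plain, application/pdf, image/jpeg.
DocumentContentSegment	Detected document content segment.
DocumentFootnote	Footnote of a table or figure.
DocumentFormula	Mathematical formula in a document.
DocumentFormulaKind	Formula kind.
DocumentHyperlink	Hyperlink in a document, such as a link to a web page or an email address.
DocumentLine	Line in a document, consisting of a contiguous sequence of words.
DocumentMermaidFigure	Figure containing a diagram, such as a flowchart or network diagram.
DocumentPage	Content from a document page.
DocumentParagraph	Paragraph in a document, generally consisting of a contiguous sequence of lines with common alignment and spacing.
DocumentSection	Section in a document.
DocumentTable	Table in a document, consisting of table cells arranged in a rectangular layout.
DocumentTableCell	Table cell in a document table.
DocumentTableCellKind	Table cell kind.
DocumentWord	Word in a document, consisting of a contiguous sequence of characters. For non-space delimited languages, such as Chinese, Japanese, and Korean, each character is represented as its own word.
IntegerField	Integer field extracted from the content.
JsonField	JSON field extracted from the content.
LengthUnit	Length unit used by the width, height, and source properties.
NumberField	Number field extracted from the content.
ObjectField	Object field extracted from the content.
OperationState	Operation state.
ProcessingLocation	The location where the data may be processed. Defaults to global.
SemanticRole	Semantic role of the paragraph.
StringField	String field extracted from the content.
TimeField	Time field extracted from the content.
TranscriptPhrase	Transcript phrase.
TranscriptWord	Transcript word.
UsageDetails	Usage details.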

AnalysisContentKind

Enumeration

Kind of media content.

Value	Description
document	Document content, such as PDF, image, text, etc.
audioVisual	Audio-visual content, such as mp3, mp4, etc.

AnalysisResult

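Object

Analyze operation result.

Name	Type	Default value	Description
analyzerId	string minLength: 1 maxLength: 64 pattern: ^[a-zA-Z0-9._-]{1,64}$		The unique identifier of the analyzer.
apiVersion	string		The API version used to analyze the document.
contents	AnalysisContent[]: AudioVisualContent[] DocumentContent[]		The extracted content.
createdAt	string (date-time)		The date and time when the result was created.
stringEncoding	string	codePoint	The string encoding format for content spans in the response. Possible values are `codePoint`, `utf16`, and `utf8`. Default is `codePoint`.
warnings	Azure.Core.Foundations.Error[]		Warnings encountered while analyzing the document.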

ArrayField

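Object

Array field extracted from the content.

Name	Type	Description
confidence	number (float) minimum: 0 maximum: 1	Confidence of predicting the field value.
source	string	Encoded source that identifies the position of the field value in the content.
spans	ContentSpan[]	Span(s) associated with the field value in the markdown content.
type	string: array	Semantic data type of the field value.
valueArray	ContentField[]: ArrayField[] BooleanField[] DateField[] IntegerField[] JsonField[] NumberField[] ObjectField[] StringField[] TimeField[]	Array field value.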

AudioVisualContent

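Object

Audio-visual content. Ex. audio/wav, video/mp4.

Name	Type	Description
analyzerId	string minLength: 1 maxLength: 64 pattern: ^[a-zA-Z0-9._-]{1,64}$	The analyzer that generated this content.
cameraShotTimesMs	integer[] (int64)	List of camera shot changes in the video, represented by their timestamps in milliseconds. Only if returnDetails is true.
category	string	Classified content category.
endTimeMs	integer (int64)	End time of the content in milliseconds.
fields	object	Fields extracted from the content.
height	integer (int32)	Height of each video frame in pixels, if applicable.
keyFrameTimesMs	integer[] (int64)	List of key frames in the video, represented by their timestamps in milliseconds. Only if returnDetails is true.
kind	string: audioVisual	Content kind.
markdown	string	Markdown representation of the content.
mimeType	string	Detected MIME type of the content. Ex. application/pdf, image/jpeg, etc.
path	string	The path of the content in the input.
segments	AudioVisualContentSegment[]	List of detected content segments. Only if enableSegment is true.
startTimeMs	integer (int64)	Start time of the content in milliseconds.
transcriptPhrases	TranscriptPhrase[]	List of transcript phrases. Only if returnDetails is true.
width	integer (int32)	Width of each video frame in pixels, if applicable.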

AudioVisualContentSegment

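Object

Detected audio/visual content segment.

Name	Type	Description
category	string	Classified content category.
endTimeMs	integer (int64)	End time of the segment in milliseconds.
segmentId	string	Segment identifier.
span	ContentSpan	Span of the segment in the markdown content.
startTimeMs	integer (int64)	Start time of the segment in milliseconds.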

Azure.Core.Foundations.Error

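Object

The error object.

Name	Type	Description
code	string	One of a server-defined set of error codes.
details	Azure.Core.Foundations.Error[]	An array of details about specific errors that led to this reported error.
innererror	Azure.Core.Foundations.InnerError	An object containing more specific information than the current object about the error.
message	string	A human-readable representation of the error.
target	string	The target of the error.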

Azure.Core.Foundations.ErrorResponse

Object

A response containing error details.

Name	Type	Description
error	Azure.Core.Foundations.Error	The error object.

Azure.Core.Foundations.InnerError

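Object

An object containing more specific information about the error. As per Azure REST API guidelines - https://aka.ms/AzureRestApiGuidelines#handling-errors.

Name	Type	Description
code	string	One of a server-defined set of error codes.
innererror	Azure.Core.Foundations.InnerError	Inner error.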

BooleanField

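Object

Boolean field extracted from the content.

Name	Type	Description
confidence	number (float) minimum: 0 maximum: 1	Confidence of predicting the field value.
source	string	Encoded source that identifies the position of the field value in the content.
spans	ContentSpan[]	Span(s) associated with the field value in the markdown content.
type	string: boolean	Semantic data type of the field value.
valueBoolean	boolean	Boolean field value.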

ContentAnalyzerAnalyzeOperationStatus

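Object

Provides status details for analyze operations.

Name	Type	Description
error	Azure.Core.Foundations.Error	Error object that describes the error when status is "Failed".
id	string	The unique ID of the operation.
result	AnalysisResult	The result of the operation.
status	OperationState	The status of the operation.
usage	UsageDetails	Usage details of the analyze operation.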

ContentFieldType

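Enumeration

Semantic data type of the field value.

Value	Description
string	Plain text.
date	Date, normalized to ISO 8601 (YYYY-MM-DD) format.
time	Time, normalized to ISO 8601 (hh:mm:ss) format.
number	Double precision floating point number.
integer	64-bit signed integer.
boolean	Boolean value.
array	List of subfields of the same type.
object	Named list of subfields.
json	JSON object.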
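
Each field type in this reference stores its value under a type-specific property (`valueString`, `valueDate`, and so on), so reading a field generically means dispatching on `type`. The following is a minimal sketch under two assumptions: fields arrive as plain dictionaries parsed from JSON, and `valueObject` maps subfield names to further field objects. The helper name `field_value` is hypothetical.

```python
# Maps each ContentFieldType value to the property that holds its value,
# as listed in the field definitions of this reference.
VALUE_KEY_BY_TYPE = {
    "string": "valueString",
    "date": "valueDate",
    "time": "valueTime",
    "number": "valueNumber",
    "integer": "valueInteger",
    "boolean": "valueBoolean",
    "array": "valueArray",
    "object": "valueObject",
    "json": "valueJson",
}


def field_value(field):
    """Return the plain Python value of a field, recursing into arrays and objects."""
    value = field.get(VALUE_KEY_BY_TYPE[field["type"]])
    if value is None:
        return None  # the value property may be absent
    if field["type"] == "array":
        return [field_value(item) for item in value]
    if field["type"] == "object":
        # Assumption: valueObject is a mapping of subfield name to field.
        return {name: field_value(sub) for name, sub in value.items()}
    return value
```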

ContentSpan

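Object

Position of the element in markdown, specified as a character offset and length.

Name	Type	Description
length	integer (int32)	Length of the element in markdown, specified in characters.
offset	integer (int32)	Starting position (0-indexed) of the element in markdown, specified in characters.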
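
To recover the text a span refers to, the offset and length have to be interpreted in the same unit as the response's `stringEncoding`. The sketch below assumes that `codePoint` counts Unicode code points, `utf16` counts UTF-16 code units, and `utf8` counts bytes; the helper name `slice_span` is hypothetical.

```python
def slice_span(markdown, span, string_encoding="codePoint"):
    """Return the substring of markdown covered by a ContentSpan-like dict."""
    offset, length = span["offset"], span["length"]
    if string_encoding == "codePoint":
        # Python strings are indexed by code point, so plain slicing works.
        return markdown[offset:offset + length]
    if string_encoding == "utf16":
        # Each UTF-16 code unit is two bytes in the little-endian encoding.
        data = markdown.encode("utf-16-le")
        return data[2 * offset:2 * (offset + length)].decode("utf-16-le")
    if string_encoding == "utf8":
        data = markdown.encode("utf-8")
        return data[offset:offset + length].decode("utf-8")
    raise ValueError(f"unknown string encoding: {string_encoding!r}")
```

The difference matters for characters outside the Basic Multilingual Plane: an emoji is one code point, two UTF-16 code units, and four UTF-8 bytes, so the same text has different offsets under each encoding.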

DateField

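Object

Date field extracted from the content.

Name	Type	Description
confidence	number (float) minimum: 0 maximum: 1	Confidence of predicting the field value.
source	string	Encoded source that identifies the position of the field value in the content.
spans	ContentSpan[]	Span(s) associated with the field value in the markdown content.
type	string: date	Semantic data type of the field value.
valueDate	string (date)	Date field value, in ISO 8601 (YYYY-MM-DD) format.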

DocumentAnnotation

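Object

Annotation in a document, such as a strikethrough or a comment.

Name	Type	Description
author	string	Annotation author.
comments	DocumentAnnotationComment[]	Comments associated with the annotation.
createdAt	string (date-time)	Date and time when the annotation was created.
id	string	Annotation identifier.
kind	DocumentAnnotationKind	Annotation kind.
lastModifiedAt	string (date-time)	Date and time when the annotation was last modified.
source	string	Position of the annotation.
spans	ContentSpan[]	Spans of the content associated with the annotation.
tags	string[]	Tags associated with the annotation.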

DocumentAnnotationComment

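Object

Comment associated with a document annotation.

Name	Type	Description
author	string	Comment author.
createdAt	string (date-time)	Date and time when the comment was created.
lastModifiedAt	string (date-time)	Date and time when the comment was last modified.
message	string	Comment message in Markdown.
tags	string[]	Tags associated with the comment.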

DocumentAnnotationKind

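Enumeration

Document annotation kind.

Value	Description
highlight	Highlight annotation.
strikethrough	Strikethrough annotation.
underline	Underline annotation.
italic	Italic annotation.
bold	Bold annotation.
circle	Circle annotation.
note	Note annotation.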

DocumentBarcode

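Object

Barcode in a document.

Name	Type	Description
confidence	number (float) minimum: 0 maximum: 1	Confidence of predicting the barcode.
kind	DocumentBarcodeKind	Barcode kind.
source	string	Encoded source that identifies the position of the barcode in the content.
span	ContentSpan	Span of the barcode in the markdown content.
value	string	Barcode value.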

DocumentBarcodeKind

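Enumeration

Barcode kind.

Value	Description
QRCode	QR code, as defined in ISO/IEC 18004:2015.
PDF417	PDF417, as defined in ISO 15438.
UPCA	GS1 12-digit Universal Product Code.
UPCE	GS1 6-digit Universal Product Code.
Code39	Code 39 barcode, as defined in ISO/IEC 16388:2007.
Code128	Code 128 barcode, as defined in ISO/IEC 15417:2007.
EAN8	GS1 8-digit International Article Number (European Article Number).
EAN13	GS1 13-digit International Article Number (European Article Number).
DataBar	GS1 DataBar barcode.
Code93	Code 93 barcode, as defined in ANSI/AIM BC5-1995.
Codabar	Codabar barcode, as defined in ANSI/AIM BC3-1995.
DataBarExpanded	GS1 DataBar Expanded barcode.
ITF	Interleaved 2 of 5 barcode, as defined in ANSI/AIM BC2-1995.
MicroQRCode	Micro QR code, as defined in ISO/IEC 23941:2022.
Aztec	Aztec code, as defined in ISO/IEC 24778:2008.
DataMatrix	Data Matrix code, as defined in ISO/IEC 16022:2006.
MaxiCode	MaxiCode, as defined in ISO/IEC 16023:2000.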

DocumentCaption

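Object

Caption of a table or figure.

Name	Type	Description
content	string	Content of the caption.
elements	string[]	Child elements of the caption.
source	string	Encoded source that identifies the position of the caption in the content.
span	ContentSpan	Span of the caption in the markdown content.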

DocumentChartFigure

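Object

Figure containing a chart, such as a bar chart, line chart, or pie chart.

Name	Type	Default value	Description
caption	DocumentCaption		Figure caption.
content			Chart content represented using Chart.js config.
description	string		Description of the figure.
elements	string[]		Child elements of the figure, excluding any caption or footnotes.
footnotes	DocumentFootnote[]		List of figure footnotes.
id	string		Figure identifier.
kind	string: chart	unknown	Figure kind.
role	SemanticRole		Semantic role of the figure.
source	string		Encoded source that identifies the position of the figure in the content.
span	ContentSpan		Span of the figure in the markdown content.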

DocumentContent

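Object

Document content. Ex. text/plain, application/pdf, image/jpeg.

Name	Type	Description
analyzerId	string minLength: 1 maxLength: 64 pattern: ^[a-zA-Z0-9._-]{1,64}$	The analyzer that generated this content.
annotations	DocumentAnnotation[]	List of annotations in the document. Only if enableAnnotations and returnDetails are true.
category	string	Classified content category.
endPageNumber	integer (int32)	End page number (1-indexed) of the content.
fields	object	Fields extracted from the content.
figures	DocumentFigure[]: DocumentChartFigure[] DocumentMermaidFigure[]	List of figures in the document. Only if enableLayout and returnDetails are true.
hyperlinks	DocumentHyperlink[]	List of hyperlinks in the document. Only if returnDetails is true.
kind	string: document	Content kind.
markdown	string	Markdown representation of the content.
mimeType	string	Detected MIME type of the content. Ex. application/pdf, image/jpeg, etc.
pages	DocumentPage[]	List of pages in the document.
paragraphs	DocumentParagraph[]	List of paragraphs in the document. Only if enableOcr and returnDetails are true.
path	string	The path of the content in the input.
sections	DocumentSection[]	List of sections in the document. Only if enableLayout and returnDetails are true.
segments	DocumentContentSegment[]	List of detected content segments. Only if enableSegment is true.
startPageNumber	integer (int32)	Start page number (1-indexed) of the content.
tables	DocumentTable[]	List of tables in the document. Only if enableLayout and returnDetails are true.
unit	LengthUnit	Length unit used by the width, height, and source properties. For image/tiff, the default unit is pixel. For PDF, the default unit is inch.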

DocumentContentSegment

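Object

Detected document content segment.

Name	Type	Description
category	string	Classified content category.
endPageNumber	integer (int32)	End page number (1-indexed) of the segment.
segmentId	string	Segment identifier.
span	ContentSpan	Span of the segment in the markdown content.
startPageNumber	integer (int32)	Start page number (1-indexed) of the segment.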

DocumentFootnote

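Object

Footnote of a table or figure.

Name	Type	Description
content	string	Content of the footnote.
elements	string[]	Child elements of the footnote.
source	string	Encoded source that identifies the position of the footnote in the content.
span	ContentSpan	Span of the footnote in the markdown content.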

DocumentFormula

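Object

Mathematical formula in a document.

Name	Type	Description
confidence	number (float) minimum: 0 maximum: 1	Confidence of predicting the formula.
kind	DocumentFormulaKind	Formula kind.
source	string	Encoded source that identifies the position of the formula in the content.
span	ContentSpan	Span of the formula in the markdown content.
value	string	LaTeX expression describing the formula.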

DocumentFormulaKind

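Enumeration

Formula kind.

Value	Description
inline	A formula embedded within the content of a paragraph.
display	A formula in display mode that takes up an entire line.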

DocumentHyperlink

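Object

Hyperlink in a document, such as a link to a web page or an email address.

Name	Type	Description
content	string	Hyperlinked content.
source	string	Position of the hyperlink.
span	ContentSpan	Span of the hyperlink in the markdown content.
url	string	URL of the hyperlink.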

DocumentLine

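Object

Line in a document, consisting of a contiguous sequence of words.

Name	Type	Description
content	string	Line text.
source	string	Encoded source that identifies the position of the line in the content.
span	ContentSpan	Span of the line in the markdown content.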

DocumentMermaidFigure

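Object

Figure containing a diagram, such as a flowchart or network diagram.

Name	Type	Default value	Description
caption	DocumentCaption		Figure caption.
content	string		Diagram content represented using Mermaid syntax.
description	string		Description of the figure.
elements	string[]		Child elements of the figure, excluding any caption or footnotes.
footnotes	DocumentFootnote[]		List of figure footnotes.
id	string		Figure identifier.
kind	string: mermaid	unknown	Figure kind.
role	SemanticRole		Semantic role of the figure.
source	string		Encoded source that identifies the position of the figure in the content.
span	ContentSpan		Span of the figure in the markdown content.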

DocumentPage

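Object

Content from a document page.

Name	Type	Description
angle	number (float) maximum: 180	The general orientation of the content in the clockwise direction, measured in degrees between -180 and 180. Only if enableOcr is true.
barcodes	DocumentBarcode[]	List of barcodes in the page. Only if enableBarcode and returnDetails are true.
formulas	DocumentFormula[]	List of mathematical formulas in the page. Only if enableFormula and returnDetails are true.
height	number (float)	Height of the page.
lines	DocumentLine[]	List of lines in the page. Only if enableOcr and returnDetails are true.
pageNumber	integer (int32) minimum: 1	Page number (1-based).
spans	ContentSpan[]	Span(s) associated with the page in the markdown content.
width	number (float)	Width of the page.
words	DocumentWord[]	List of words in the page. Only if enableOcr and returnDetails are true.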

DocumentParagraph

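Object

Paragraph in a document, generally consisting of a contiguous sequence of lines with common alignment and spacing.

Name	Type	Description
content	string	Paragraph text.
role	SemanticRole	Semantic role of the paragraph.
source	string	Encoded source that identifies the position of the paragraph in the content.
span	ContentSpan	Span of the paragraph in the markdown content.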

DocumentSection

Object

Section in a document.

Name	Type	Description
elements	string[]	Child elements of the section.
span	ContentSpan	Span of the section in the markdown content.

DocumentTable

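Object

Table in a document, consisting of table cells arranged in a rectangular layout.

Name	Type	Description
caption	DocumentCaption	Table caption.
cells	DocumentTableCell[]	Cells contained within the table.
columnCount	integer (int32) minimum: 1	Number of columns in the table.
footnotes	DocumentFootnote[]	List of table footnotes.
role	SemanticRole	Semantic role of the table.
rowCount	integer (int32) minimum: 1	Number of rows in the table.
source	string	Encoded source that identifies the position of the table in the content.
span	ContentSpan	Span of the table in the markdown content.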

DocumentTableCell

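Object

Table cell in a document table.

Name	Type	Default value	Description
columnIndex	integer (int32)		Column index of the cell.
columnSpan	integer (int32) minimum: 1	1	Number of columns spanned by this cell.
content	string		Content of the table cell.
elements	string[]		Child elements of the table cell.
kind	DocumentTableCellKind	content	Table cell kind.
rowIndex	integer (int32)		Row index of the cell.
rowSpan	integer (int32) minimum: 1	1	Number of rows spanned by this cell.
source	string		Encoded source that identifies the position of the table cell in the content.
span	ContentSpan		Span of the table cell in the markdown content.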
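
Because cells are returned as a flat list with indices and spans, rebuilding the rectangular layout takes a small amount of bookkeeping. The sketch below assumes `rowIndex` and `columnIndex` are 0-based (the reference does not say) and copies a spanning cell's content into every grid position it covers; the helper name `table_to_grid` is hypothetical.

```python
def table_to_grid(table):
    """Expand a DocumentTable-like dict into a rowCount x columnCount grid."""
    grid = [[None] * table["columnCount"] for _ in range(table["rowCount"])]
    for cell in table["cells"]:
        # rowSpan and columnSpan default to 1 per the table above.
        row_span = cell.get("rowSpan", 1)
        col_span = cell.get("columnSpan", 1)
        for r in range(cell["rowIndex"], cell["rowIndex"] + row_span):
            for c in range(cell["columnIndex"], cell["columnIndex"] + col_span):
                grid[r][c] = cell.get("content", "")
    return grid
```

Positions not covered by any cell stay `None`, which makes gaps in the extracted table easy to spot.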

DocumentTableCellKind

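Enumeration

Table cell kind.

Value	Description
content	Main content/data.
rowHeader	Description of the row content.
columnHeader	Description of the column content.
stubHead	Description of the row headers, usually located at the top left corner of a table.
description	Description of the content in (parts of) the table.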

DocumentWord

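Object

Word in a document, consisting of a contiguous sequence of characters. For non-space delimited languages, such as Chinese, Japanese, and Korean, each character is represented as its own word.

Name	Type	Description
confidence	number (float) minimum: 0 maximum: 1	Confidence of predicting the word.
content	string	Word text.
source	string	Encoded source that identifies the position of the word in the content.
span	ContentSpan	Span of the word in the markdown content.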

IntegerField

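Object

Integer field extracted from the content.

Name	Type	Description
confidence	number (float) minimum: 0 maximum: 1	Confidence of predicting the field value.
source	string	Encoded source that identifies the position of the field value in the content.
spans	ContentSpan[]	Span(s) associated with the field value in the markdown content.
type	string: integer	Semantic data type of the field value.
valueInteger	integer (int64)	Integer field value.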

JsonField

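Object

JSON field extracted from the content.

Name	Type	Description
confidence	number (float) minimum: 0 maximum: 1	Confidence of predicting the field value.
source	string	Encoded source that identifies the position of the field value in the content.
spans	ContentSpan[]	Span(s) associated with the field value in the markdown content.
type	string: json	Semantic data type of the field value.
valueJson		JSON field value.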

LengthUnit

Enumeration

Length unit used by the width, height, and source properties.

Value	Description
pixel	Pixel unit.
inch	Inch unit.

NumberField

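Object

Number field extracted from the content.

Name	Type	Description
confidence	number (float) minimum: 0 maximum: 1	Confidence of predicting the field value.
source	string	Encoded source that identifies the position of the field value in the content.
spans	ContentSpan[]	Span(s) associated with the field value in the markdown content.
type	string: number	Semantic data type of the field value.
valueNumber	number (double)	Number field value.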

ObjectField

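Object

Object field extracted from the content.

Name	Type	Description
confidence	number (float) minimum: 0 maximum: 1	Confidence of predicting the field value.
source	string	Encoded source that identifies the position of the field value in the content.
spans	ContentSpan[]	Span(s) associated with the field value in the markdown content.
type	string: object	Semantic data type of the field value.
valueObject	object	Object field value.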

OperationState

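Enumeration

Operation state.

Value	Description
NotStarted	The operation has not started.
Running	The operation is in progress.
Succeeded	The operation has completed successfully.
Failed	The operation has failed.
Canceled	The operation has been canceled by the user.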

ProcessingLocation

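Enumeration

The location where the data may be processed. Defaults to global.

Value	Description
geography	Data may be processed in the same geography as the resource.
dataZone	Data may be processed in the same data zone as the resource.
global	Data may be processed in any Azure data center globally.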

SemanticRole

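Enumeration

Semantic role of the paragraph.

Value	Description
pageHeader	Text near the top of the page.
pageFooter	Text near the bottom of the page.
pageNumber	Page number.
title	Top-level title describing the entire document.
sectionHeading	Subheading describing a section of the document.
footnote	A note usually placed after the main content on a page.
formulaBlock	A block of formulas, often with shared alignment.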

StringField

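Object

String field extracted from the content.

Name	Type	Description
confidence	number (float) minimum: 0 maximum: 1	Confidence of predicting the field value.
source	string	Encoded source that identifies the position of the field value in the content.
spans	ContentSpan[]	Span(s) associated with the field value in the markdown content.
type	string: string	Semantic data type of the field value.
valueString	string	String field value.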

TimeField

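Object

Time field extracted from the content.

Name	Type	Description
confidence	number (float) minimum: 0 maximum: 1	Confidence of predicting the field value.
source	string	Encoded source that identifies the position of the field value in the content.
spans	ContentSpan[]	Span(s) associated with the field value in the markdown content.
type	string: time	Semantic data type of the field value.
valueTime	string (time)	Time field value, in ISO 8601 (hh:mm:ss) format.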

TranscriptPhrase

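Object

Transcript phrase.

Name	Type	Description
confidence	number (float) minimum: 0 maximum: 1	Confidence of predicting the phrase.
endTimeMs	integer (int64)	End time of the phrase in milliseconds.
locale	string	Detected locale of the phrase. Ex. en-US.
span	ContentSpan	Span of the phrase in the markdown content.
speaker	string	Speaker index or name.
startTimeMs	integer (int64)	Start time of the phrase in milliseconds.
text	string	Transcript text.
words	TranscriptWord[]	List of words in the phrase.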

TranscriptWord

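Object

Transcript word.

Name	Type	Description
endTimeMs	integer (int64)	End time of the word in milliseconds.
span	ContentSpan	Span of the word in the markdown content.
startTimeMs	integer (int64)	Start time of the word in milliseconds.
text	string	Transcript text.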

UsageDetails

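Object

Usage details.

Name	Type	Description
audioHours	number (float)	The number of hours of audio processed.
contextualizationTokens	integer (int32)	The number of contextualization tokens consumed for preparing context, generating confidence scores, source grounding, and output formatting.
documentPagesBasic	integer (int32)	The number of document pages processed at the basic level. For documents without explicit pages (ex. txt, html), every 3000 UTF-16 characters is counted as one page.
documentPagesMinimal	integer (int32)	The number of document pages processed at the minimal level. For documents without explicit pages (ex. txt, html), every 3000 UTF-16 characters is counted as one page.
documentPagesStandard	integer (int32)	The number of document pages processed at the standard level. For documents without explicit pages (ex. txt, html), every 3000 UTF-16 characters is counted as one page.
tokens	object	The number of LLM and embedding tokens consumed, grouped by model (ex. GPT 4.1) and type (ex. input, cached input, output).
videoHours	number (float)	The number of hours of video processed.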
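
The page-counting rule for documents without explicit pages can be approximated locally to estimate usage before submitting text. This is only a sketch: the reference states the 3000 UTF-16 character rule but not how a partial page is rounded, so rounding up is an assumption here, and both helper names are hypothetical.

```python
import math


def utf16_length(text):
    """Number of UTF-16 code units in text (two bytes per unit)."""
    return len(text.encode("utf-16-le")) // 2


def estimated_pages(text, chars_per_page=3000):
    """Estimate billed pages for page-less text (assumes partial pages round up)."""
    return math.ceil(utf16_length(text) / chars_per_page)
```

Counting UTF-16 code units rather than Python characters matters for text with characters outside the Basic Multilingual Plane, which occupy two code units each.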
