DocumentAnalysisClient class

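A client for interacting with the Form Recognizer service's analysis features.

Examples:

The Form Recognizer service and clients support two means of authentication: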

Azure Active Directory

import { DocumentAnalysisClient } from "@azure/ai-form-recognizer";
import { DefaultAzureCredential } from "@azure/identity";

const endpoint = "https://<resource name>.cognitiveservices.azure.com";
const credential = new DefaultAzureCredential();

const client = new DocumentAnalysisClient(endpoint, credential);

API Key (Subscription Key)

import { DocumentAnalysisClient, AzureKeyCredential } from "@azure/ai-form-recognizer";

const endpoint = "https://<resource name>.cognitiveservices.azure.com";
const credential = new AzureKeyCredential("<api key>");

const client = new DocumentAnalysisClient(endpoint, credential);
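
The two authentication modes above differ only in which credential object is handed to the constructor. As a minimal sketch, that choice can be isolated in a small helper. The helper name chooseAuth and the environment variable name FORM_RECOGNIZER_API_KEY are assumptions made for illustration; neither is part of the SDK.

```typescript
// Hypothetical helper: decides which authentication mode to use.
// The variable name FORM_RECOGNIZER_API_KEY is an assumption for this sketch.
type AuthChoice = { kind: "apiKey"; key: string } | { kind: "aad" };

function chooseAuth(env: Record<string, string | undefined>): AuthChoice {
  const key = (env["FORM_RECOGNIZER_API_KEY"] ?? "").trim();
  // Fall back to Azure Active Directory when no API key is configured.
  return key.length > 0 ? { kind: "apiKey", key } : { kind: "aad" };
}

// Mapping the choice onto the credentials shown above:
//   kind "apiKey" -> new AzureKeyCredential(choice.key)
//   kind "aad"    -> new DefaultAzureCredential()
const choice = chooseAuth({ FORM_RECOGNIZER_API_KEY: "  <api key>  " });
console.log(choice.kind);
```

Keeping the decision in one place means the rest of the application only ever sees a constructed client, whichever credential type was used.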

Constructors

DocumentAnalysisClient(string, KeyCredential, DocumentAnalysisClientOptions)

Create a DocumentAnalysisClient instance from a resource endpoint and a static API key (KeyCredential).

Example:

import { DocumentAnalysisClient, AzureKeyCredential } from "@azure/ai-form-recognizer";

const endpoint = "https://<resource name>.cognitiveservices.azure.com";
const credential = new AzureKeyCredential("<api key>");

const client = new DocumentAnalysisClient(endpoint, credential);

DocumentAnalysisClient(string, TokenCredential, DocumentAnalysisClientOptions)

Create a DocumentAnalysisClient instance from a resource endpoint and an Azure Identity TokenCredential.

See the @azure/identity package for more information about authenticating with Azure Active Directory.

Example:

import { DocumentAnalysisClient } from "@azure/ai-form-recognizer";
import { DefaultAzureCredential } from "@azure/identity";

const endpoint = "https://<resource name>.cognitiveservices.azure.com";
const credential = new DefaultAzureCredential();

const client = new DocumentAnalysisClient(endpoint, credential);

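Methods

beginAnalyzeDocument(string, FormRecognizerRequestBody, AnalyzeDocumentOptions<AnalyzeResult<AnalyzedDocument>>)

Extract data from an input using a model given by its unique ID.

This operation supports custom as well as prebuilt models. For example, to use the prebuilt invoice model, provide the model ID "prebuilt-invoice"; to use the simpler prebuilt layout model, provide the model ID "prebuilt-layout".

The fields produced in the AnalyzeResult depend on the model used for analysis, and the values in any extracted documents' fields depend on the document types in the model (if any) and their corresponding field schemas.

Example

This method supports streamable request bodies (FormRecognizerRequestBody) such as Node.js ReadableStream objects, browser Blobs, and ArrayBuffers. The contents of the body will be uploaded to the service for analysis.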

import * as fs from "fs";

const file = fs.createReadStream("path/to/receipt.pdf");

// The model that is passed to the following function call determines the type of the eventual result. In the
// example, we will use the prebuilt receipt model, but you could use a custom model ID/name instead.
const poller = await client.beginAnalyzeDocument("prebuilt-receipt", file);

// The result is a long-running operation (poller), which must itself be polled until the operation completes
const {
  pages, // pages extracted from the document, which contain lines and words
  tables, // extracted tables, organized into cells that contain their contents
  styles, // text styles (ex. handwriting) that were observed in the document
  keyValuePairs, // extracted pairs of elements (directed associations from one element in the input to another)
  documents // extracted documents (instances of one of the model's document types and its field schema)
} = await poller.pollUntilDone();

// Extract the fields of the first document. These fields constitute a receipt, because we used the receipt model
const [{ fields: receipt }] = documents;

// The fields correspond to the model's document types and their field schemas. Refer to the Form Recognizer
// documentation for information about the document types and field schemas within a model, or use the `getModel`
// operation to view this information programmatically.
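console.log("The type of this receipt is:", receipt?.["ReceiptType"]?.value);

beginAnalyzeDocument<Result>(DocumentModel<Result>, FormRecognizerRequestBody, AnalyzeDocumentOptions<Result>)

Extract data from an input using a model that has a known, strongly-typed document schema (a DocumentModel).

The fields produced in the AnalyzeResult depend on the model used for analysis. In TypeScript, the type of the result for this method overload is inferred from the type of the input DocumentModel.

Example

This method supports streamable request bodies (FormRecognizerRequestBody) such as Node.js ReadableStream objects, browser Blobs, and ArrayBuffers. The contents of the body will be uploaded to the service for analysis.

If the input provided is a string, it will be treated as a URL to the location of a document to be analyzed. See the beginAnalyzeDocumentFromUrl method for more information. That method is preferred when using URLs; URL support is provided in this method only for backwards compatibility.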
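
Because a string input is interpreted as a URL while binary inputs are uploaded, calling code may want to make that distinction explicit rather than rely on the backwards-compatible behaviour. The following is a minimal sketch only: routeInput and its types are hypothetical, and the input type is deliberately narrower than the SDK's FormRecognizerRequestBody.

```typescript
// Simplified stand-in for the inputs discussed above (assumption for this
// sketch): either a document URL or raw document bytes.
type AnalyzeInput = string | ArrayBuffer;

type Route =
  | { method: "beginAnalyzeDocumentFromUrl"; url: string }
  | { method: "beginAnalyzeDocument"; byteLength: number };

// Hypothetical helper: picks the client method that matches the input kind.
function routeInput(input: AnalyzeInput): Route {
  if (typeof input === "string") {
    // Strings are document URLs, so the URL-specific method is preferred.
    return { method: "beginAnalyzeDocumentFromUrl", url: input };
  }
  // Anything else is treated as document bytes to upload.
  return { method: "beginAnalyzeDocument", byteLength: input.byteLength };
}

console.log(routeInput("https://example.com/receipt.pdf").method);
console.log(routeInput(new ArrayBuffer(4)).method);
```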

import * as fs from "fs";

// See the `prebuilt` folder in the SDK samples (http://aka.ms/azsdk/formrecognizer/js/samples) for examples of
// DocumentModels for known prebuilts.
import { PrebuiltReceiptModel } from "./prebuilt-receipt.ts";

const file = fs.createReadStream("path/to/receipt.pdf");

// The model that is passed to the following function call determines the type of the eventual result. In the
// example, we will use the prebuilt receipt model.
const poller = await client.beginAnalyzeDocument(PrebuiltReceiptModel, file);

// The result is a long-running operation (poller), which must itself be polled until the operation completes
const {
  pages, // pages extracted from the document, which contain lines and words
  tables, // extracted tables, organized into cells that contain their contents
  styles, // text styles (ex. handwriting) that were observed in the document
  keyValuePairs, // extracted pairs of elements  (directed associations from one element in the input to another)

  documents // extracted documents (instances of one of the model's document types and its field schema)
} = await poller.pollUntilDone();

// Extract the fields of the first document. These fields constitute a receipt, because we used the receipt model
const [{ fields: receipt }] = documents;

// Since we used the strongly-typed PrebuiltReceiptModel object instead of the "prebuilt-receipt" model ID
// string, the fields of the receipt are strongly-typed and have camelCase names (as opposed to PascalCase).
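console.log("The type of this receipt is:", receipt.receiptType?.value);

beginAnalyzeDocumentFromUrl(string, string, AnalyzeDocumentOptions<AnalyzeResult<AnalyzedDocument>>)

Extract data from an input using a model given by its unique ID.

This operation supports custom as well as prebuilt models. For example, to use the prebuilt invoice model, provide the model ID "prebuilt-invoice"; to use the simpler prebuilt layout model, provide the model ID "prebuilt-layout".

The fields produced in the AnalyzeResult depend on the model used for analysis, and the values in any extracted documents' fields depend on the document types in the model (if any) and their corresponding field schemas.

Example

This method supports extracting data from a file at a given URL. The Form Recognizer service will attempt to download the file using the submitted URL, so the URL must be accessible from the public internet. For example, a SAS token can be used to grant read access to a blob in Azure Storage, and the service will use the SAS-encoded URL to request the file.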

// the URL must be publicly accessible
const url = "<receipt document url>";

// The model that is passed to the following function call determines the type of the eventual result. In the
// example, we will use the prebuilt receipt model, but you could use a custom model ID/name instead.
const poller = await client.beginAnalyzeDocumentFromUrl("prebuilt-receipt", url);

// The result is a long-running operation (poller), which must itself be polled until the operation completes
const {
  pages, // pages extracted from the document, which contain lines and words
  tables, // extracted tables, organized into cells that contain their contents
  styles, // text styles (ex. handwriting) that were observed in the document
  keyValuePairs, // extracted pairs of elements  (directed associations from one element in the input to another)

  documents // extracted documents (instances of one of the model's document types and its field schema)
} = await poller.pollUntilDone();

// Extract the fields of the first document. These fields constitute a receipt, because we used the receipt model
const [{ fields: receipt }] = documents;

// The fields correspond to the model's document types and their field schemas. Refer to the Form Recognizer
// documentation for information about the document types and field schemas within a model, or use the `getModel`
// operation to view this information programmatically.
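console.log("The type of this receipt is:", receipt?.["ReceiptType"]?.value);

beginAnalyzeDocumentFromUrl<Result>(DocumentModel<Result>, string, AnalyzeDocumentOptions<Result>)

Extract data from an input using a model that has a known, strongly-typed document schema (a DocumentModel).

The fields produced in the AnalyzeResult depend on the model used for analysis. In TypeScript, the type of the result for this method overload is inferred from the type of the input DocumentModel.

Example

This method supports extracting data from a file at a given URL. The Form Recognizer service will attempt to download the file using the submitted URL, so the URL must be accessible from the public internet. For example, a SAS token can be used to grant read access to a blob in Azure Storage, and the service will use the SAS-encoded URL to request the file.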

// See the `prebuilt` folder in the SDK samples (http://aka.ms/azsdk/formrecognizer/js/samples) for examples of
// DocumentModels for known prebuilts.
import { PrebuiltReceiptModel } from "./prebuilt-receipt.ts";

// the URL must be publicly accessible
const url = "<receipt document url>";

// The model that is passed to the following function call determines the type of the eventual result. In the
// example, we will use the prebuilt receipt model.
const poller = await client.beginAnalyzeDocumentFromUrl(PrebuiltReceiptModel, url);

// The result is a long-running operation (poller), which must itself be polled until the operation completes
const {
  pages, // pages extracted from the document, which contain lines and words
  tables, // extracted tables, organized into cells that contain their contents
  styles, // text styles (ex. handwriting) that were observed in the document
  keyValuePairs, // extracted pairs of elements  (directed associations from one element in the input to another)

  documents // extracted documents (instances of one of the model's document types and its field schema)
} = await poller.pollUntilDone();

// Extract the fields of the first document. These fields constitute a receipt, because we used the receipt model
const [{ fields: receipt }] = documents;

// Since we used the strongly-typed PrebuiltReceiptModel object instead of the "prebuilt-receipt" model ID
// string, the fields of the receipt are strongly-typed and have camelCase names (as opposed to PascalCase).
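console.log("The type of this receipt is:", receipt.receiptType?.value);

beginClassifyDocument(string, FormRecognizerRequestBody, ClassifyDocumentOptions)

Classify a document using a custom classifier given by its ID.

This method produces a long-running operation (poller) that will eventually produce an AnalyzeResult. This is the same type returned by beginAnalyzeDocument and beginAnalyzeDocumentFromUrl, but the result will contain only a small subset of its fields. Only the documents field and the pages field will be populated, and only minimal page information will be returned. The documents field will contain information about all the identified documents and the docType each was classified as.

Example

This method supports streamable request bodies (FormRecognizerRequestBody) such as Node.js ReadableStream objects, browser Blobs, and ArrayBuffers. The contents of the body will be uploaded to the service for analysis.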

import * as fs from "fs";

const file = fs.createReadStream("path/to/file.pdf");

const poller = await client.beginClassifyDocument("<classifier ID>", file);

// The result is a long-running operation (poller), which must itself be polled until the operation completes
const {
  pages, // pages extracted from the document, which contain only basic information for classifiers
  documents // extracted documents and their types
} = await poller.pollUntilDone();

// We'll print the documents and their types
for (const { docType } of documents) {
  console.log("The type of this document is:", docType);
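}

beginClassifyDocumentFromUrl(string, string, ClassifyDocumentOptions)

Classify a document from a URL using a custom classifier given by its ID.

This method produces a long-running operation (poller) that will eventually produce an AnalyzeResult. This is the same type returned by beginAnalyzeDocument and beginAnalyzeDocumentFromUrl, but the result will contain only a small subset of its fields. Only the documents field and the pages field will be populated, and only minimal page information will be returned. The documents field will contain information about all the identified documents and the docType each was classified as.

Example

This method supports extracting data from a file at a given URL. The Form Recognizer service will attempt to download the file using the submitted URL, so the URL must be accessible from the public internet. For example, a SAS token can be used to grant read access to a blob in Azure Storage, and the service will use the SAS-encoded URL to request the file.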

// the URL must be publicly accessible
const url = "<file url>";

const poller = await client.beginClassifyDocumentFromUrl("<classifier ID>", url);

// The result is a long-running operation (poller), which must itself be polled until the operation completes
const {
  pages, // pages extracted from the document, which contain only basic information for classifiers
  documents // extracted documents and their types
} = await poller.pollUntilDone();

// We'll print the documents and their types
for (const { docType } of documents) {
  console.log("The type of this document is:", docType);
}

Constructor Details

DocumentAnalysisClient(string, KeyCredential, DocumentAnalysisClientOptions)

Create a DocumentAnalysisClient instance from a resource endpoint and a static API key (KeyCredential).

Example:

import { DocumentAnalysisClient, AzureKeyCredential } from "@azure/ai-form-recognizer";

const endpoint = "https://<resource name>.cognitiveservices.azure.com";
const credential = new AzureKeyCredential("<api key>");

const client = new DocumentAnalysisClient(endpoint, credential);

new DocumentAnalysisClient(endpoint: string, credential: KeyCredential, options?: DocumentAnalysisClientOptions)

Parameters

endpoint
string

The endpoint URL of an Azure Cognitive Services instance

credential
KeyCredential

A KeyCredential containing the Cognitive Services instance subscription key

options
DocumentAnalysisClientOptions

Optional settings for configuring all methods in the client

DocumentAnalysisClient(string, TokenCredential, DocumentAnalysisClientOptions)

Create a DocumentAnalysisClient instance from a resource endpoint and an Azure Identity TokenCredential.

See the @azure/identity package for more information about authenticating with Azure Active Directory.

Example:

import { DocumentAnalysisClient } from "@azure/ai-form-recognizer";
import { DefaultAzureCredential } from "@azure/identity";

const endpoint = "https://<resource name>.cognitiveservices.azure.com";
const credential = new DefaultAzureCredential();

const client = new DocumentAnalysisClient(endpoint, credential);

new DocumentAnalysisClient(endpoint: string, credential: TokenCredential, options?: DocumentAnalysisClientOptions)

Parameters

endpoint
string

The endpoint URL of an Azure Cognitive Services instance

credential
TokenCredential

A TokenCredential instance from the @azure/identity package

options
DocumentAnalysisClientOptions

Optional settings for configuring all methods in the client

Method Details

beginAnalyzeDocument(string, FormRecognizerRequestBody, AnalyzeDocumentOptions<AnalyzeResult<AnalyzedDocument>>)

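Extract data from an input using a model given by its unique ID.

This operation supports custom as well as prebuilt models. For example, to use the prebuilt invoice model, provide the model ID "prebuilt-invoice"; to use the simpler prebuilt layout model, provide the model ID "prebuilt-layout".

The fields produced in the AnalyzeResult depend on the model used for analysis, and the values in any extracted documents' fields depend on the document types in the model (if any) and their corresponding field schemas.

Example

This method supports streamable request bodies (FormRecognizerRequestBody) such as Node.js ReadableStream objects, browser Blobs, and ArrayBuffers. The contents of the body will be uploaded to the service for analysis.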

import * as fs from "fs";

const file = fs.createReadStream("path/to/receipt.pdf");

// The model that is passed to the following function call determines the type of the eventual result. In the
// example, we will use the prebuilt receipt model, but you could use a custom model ID/name instead.
const poller = await client.beginAnalyzeDocument("prebuilt-receipt", file);

// The result is a long-running operation (poller), which must itself be polled until the operation completes
const {
  pages, // pages extracted from the document, which contain lines and words
  tables, // extracted tables, organized into cells that contain their contents
  styles, // text styles (ex. handwriting) that were observed in the document
  keyValuePairs, // extracted pairs of elements (directed associations from one element in the input to another)
  documents // extracted documents (instances of one of the model's document types and its field schema)
} = await poller.pollUntilDone();

// Extract the fields of the first document. These fields constitute a receipt, because we used the receipt model
const [{ fields: receipt }] = documents;

// The fields correspond to the model's document types and their field schemas. Refer to the Form Recognizer
// documentation for information about the document types and field schemas within a model, or use the `getModel`
// operation to view this information programmatically.
console.log("The type of this receipt is:", receipt?.["ReceiptType"]?.value);

function beginAnalyzeDocument(modelId: string, document: FormRecognizerRequestBody, options?: AnalyzeDocumentOptions<AnalyzeResult<AnalyzedDocument>>): Promise<AnalysisPoller<AnalyzeResult<AnalyzedDocument>>>

Parameters

modelId
string

The unique ID (name) of the model within this client's resource

document
FormRecognizerRequestBody

A FormRecognizerRequestBody that will be uploaded with the request

options
AnalyzeDocumentOptions<AnalyzeResult<AnalyzedDocument>>

Optional settings for the analysis operation and poller

Returns

Promise<AnalysisPoller<AnalyzeResult<AnalyzedDocument>>>

A long-running operation (poller) that will eventually produce an AnalyzeResult
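
Once the poller has produced its result, each extracted document exposes its fields by name, and with a model ID string those names are PascalCase. As a rough sketch, a field can be read defensively as shown below. SketchField, SketchDocument, and readField are hypothetical simplifications written for this example; they model only the members used here and are not the SDK's own types.

```typescript
// Simplified stand-ins for the SDK's result types (assumption: only the
// members used below are modelled).
interface SketchField {
  value?: unknown;
  confidence?: number;
}
interface SketchDocument {
  docType: string;
  fields: Record<string, SketchField | undefined>;
}

// Hypothetical helper: returns the field's value only when the field exists
// and its confidence is at least `minConfidence`.
function readField(doc: SketchDocument | undefined, name: string, minConfidence = 0): unknown {
  const field = doc?.fields[name];
  if (field === undefined) return undefined;
  return (field.confidence ?? 0) >= minConfidence ? field.value : undefined;
}

const documents: SketchDocument[] = [
  { docType: "receipt", fields: { ReceiptType: { value: "Itemized", confidence: 0.95 } } },
];
console.log(readField(documents[0], "ReceiptType", 0.5)); // confident enough, value returned
console.log(readField(documents[0], "Total")); // field absent, undefined returned
```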

beginAnalyzeDocument<Result>(DocumentModel<Result>, FormRecognizerRequestBody, AnalyzeDocumentOptions<Result>)

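Extract data from an input using a model that has a known, strongly-typed document schema (a DocumentModel).

The fields produced in the AnalyzeResult depend on the model used for analysis. In TypeScript, the type of the result for this method overload is inferred from the type of the input DocumentModel.

Example

This method supports streamable request bodies (FormRecognizerRequestBody) such as Node.js ReadableStream objects, browser Blobs, and ArrayBuffers. The contents of the body will be uploaded to the service for analysis.

If the input provided is a string, it will be treated as a URL to the location of a document to be analyzed. See the beginAnalyzeDocumentFromUrl method for more information. That method is preferred when using URLs; URL support is provided in this method only for backwards compatibility.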

import * as fs from "fs";

// See the `prebuilt` folder in the SDK samples (http://aka.ms/azsdk/formrecognizer/js/samples) for examples of
// DocumentModels for known prebuilts.
import { PrebuiltReceiptModel } from "./prebuilt-receipt.ts";

const file = fs.createReadStream("path/to/receipt.pdf");

// The model that is passed to the following function call determines the type of the eventual result. In the
// example, we will use the prebuilt receipt model.
const poller = await client.beginAnalyzeDocument(PrebuiltReceiptModel, file);

// The result is a long-running operation (poller), which must itself be polled until the operation completes
const {
  pages, // pages extracted from the document, which contain lines and words
  tables, // extracted tables, organized into cells that contain their contents
  styles, // text styles (ex. handwriting) that were observed in the document
  keyValuePairs, // extracted pairs of elements  (directed associations from one element in the input to another)

  documents // extracted documents (instances of one of the model's document types and its field schema)
} = await poller.pollUntilDone();

// Extract the fields of the first document. These fields constitute a receipt, because we used the receipt model
const [{ fields: receipt }] = documents;

// Since we used the strongly-typed PrebuiltReceiptModel object instead of the "prebuilt-receipt" model ID
// string, the fields of the receipt are strongly-typed and have camelCase names (as opposed to PascalCase).
console.log("The type of this receipt is:", receipt.receiptType?.value);

function beginAnalyzeDocument<Result>(model: DocumentModel<Result>, document: FormRecognizerRequestBody, options?: AnalyzeDocumentOptions<Result>): Promise<AnalysisPoller<Result>>

Parameters

model
DocumentModel<Result>

A DocumentModel representing the model to use for analysis and the expected output type

document
FormRecognizerRequestBody

A FormRecognizerRequestBody that will be uploaded with the request

options
AnalyzeDocumentOptions<Result>

Optional settings for the analysis operation and poller

Returns

Promise<AnalysisPoller<Result>>

A long-running operation (poller) that will eventually produce an AnalyzeResult whose documents have the result type associated with the input model
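
The way a result type can follow from the model argument may be easier to see in isolation. The sketch below is not the SDK's implementation: SketchModel, RawResult, TypedReceipt, and finish are hypothetical simplifications, chosen to show a generic parameter being inferred from the model object and PascalCase field names being mapped to camelCase properties.

```typescript
// Simplified stand-ins (assumptions for this sketch, not the SDK's definitions).
interface RawResult {
  documents: Array<{ fields: Record<string, { value?: string }> }>;
}
interface SketchModel<Result> {
  modelId: string;
  transformResult: (raw: RawResult) => Result;
}

// The return type is inferred from the model argument's type parameter.
function finish<Result>(model: SketchModel<Result>, raw: RawResult): Result {
  return model.transformResult(raw);
}

interface TypedReceipt {
  receiptType: string | undefined;
  merchantName: string | undefined;
}

// A hypothetical strongly-typed model that maps PascalCase field names
// onto camelCase properties.
const SketchReceiptModel: SketchModel<TypedReceipt[]> = {
  modelId: "prebuilt-receipt",
  transformResult: (raw) =>
    raw.documents.map((d) => ({
      receiptType: d.fields["ReceiptType"]?.value,
      merchantName: d.fields["MerchantName"]?.value,
    })),
};

const raw: RawResult = { documents: [{ fields: { ReceiptType: { value: "Itemized" } } }] };
const receipts = finish(SketchReceiptModel, raw); // inferred as TypedReceipt[]
console.log(receipts[0].receiptType);
```

With this shape, a misspelled property such as receipts[0].recieptType is rejected at compile time, which is the practical benefit of the strongly-typed overload.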

beginAnalyzeDocumentFromUrl(string, string, AnalyzeDocumentOptions<AnalyzeResult<AnalyzedDocument>>)

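Extract data from an input using a model given by its unique ID.

This operation supports custom as well as prebuilt models. For example, to use the prebuilt invoice model, provide the model ID "prebuilt-invoice"; to use the simpler prebuilt layout model, provide the model ID "prebuilt-layout".

The fields produced in the AnalyzeResult depend on the model used for analysis, and the values in any extracted documents' fields depend on the document types in the model (if any) and their corresponding field schemas.

Example

This method supports extracting data from a file at a given URL. The Form Recognizer service will attempt to download the file using the submitted URL, so the URL must be accessible from the public internet. For example, a SAS token can be used to grant read access to a blob in Azure Storage, and the service will use the SAS-encoded URL to request the file.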

// the URL must be publicly accessible
const url = "<receipt document url>";

// The model that is passed to the following function call determines the type of the eventual result. In the
// example, we will use the prebuilt receipt model, but you could use a custom model ID/name instead.
const poller = await client.beginAnalyzeDocumentFromUrl("prebuilt-receipt", url);

// The result is a long-running operation (poller), which must itself be polled until the operation completes
const {
  pages, // pages extracted from the document, which contain lines and words
  tables, // extracted tables, organized into cells that contain their contents
  styles, // text styles (ex. handwriting) that were observed in the document
  keyValuePairs, // extracted pairs of elements  (directed associations from one element in the input to another)

  documents // extracted documents (instances of one of the model's document types and its field schema)
} = await poller.pollUntilDone();

// Extract the fields of the first document. These fields constitute a receipt, because we used the receipt model
const [{ fields: receipt }] = documents;

// The fields correspond to the model's document types and their field schemas. Refer to the Form Recognizer
// documentation for information about the document types and field schemas within a model, or use the `getModel`
// operation to view this information programmatically.
console.log("The type of this receipt is:", receipt?.["ReceiptType"]?.value);

function beginAnalyzeDocumentFromUrl(modelId: string, documentUrl: string, options?: AnalyzeDocumentOptions<AnalyzeResult<AnalyzedDocument>>): Promise<AnalysisPoller<AnalyzeResult<AnalyzedDocument>>>

Parameters

modelId
string

The unique ID (name) of the model within this client's resource

documentUrl
string

A URL (string) to an input document accessible from the public internet

options
AnalyzeDocumentOptions<AnalyzeResult<AnalyzedDocument>>

Optional settings for the analysis operation and poller

Returns

Promise<AnalysisPoller<AnalyzeResult<AnalyzedDocument>>>

A long-running operation (poller) that will eventually produce an AnalyzeResult
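
Since the service downloads the document itself, a URL that only resolves on the caller's machine will fail after the request has already been submitted. A small pre-flight check can catch the most obvious cases early. This is a sketch under stated assumptions: checkDocumentUrl and hasSasSignature are hypothetical helpers, the rules they apply are illustrative rather than exhaustive, and the only SAS detail relied on is that a SAS-encoded URL carries a sig query parameter.

```typescript
// Hypothetical pre-flight check; the rules here are assumptions for this sketch.
function checkDocumentUrl(url: string): string[] {
  const problems: string[] = [];
  if (!/^https?:\/\//i.test(url)) {
    problems.push("URL must start with http:// or https://");
  }
  if (/^https?:\/\/(localhost|127\.0\.0\.1)([:\/]|$)/i.test(url)) {
    problems.push("URL points at the local machine, which the service cannot reach");
  }
  return problems;
}

// Hypothetical helper: reports whether the query string carries a SAS signature.
function hasSasSignature(url: string): boolean {
  const query = url.split("?")[1] ?? "";
  return query.split("&").some((pair) => pair.split("=")[0] === "sig");
}

const candidate = "https://account.blob.core.windows.net/docs/receipt.pdf?sv=2022-11-02&sig=abc";
console.log(checkDocumentUrl(candidate).length === 0, hasSasSignature(candidate));
```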

beginAnalyzeDocumentFromUrl<Result>(DocumentModel<Result>, string, AnalyzeDocumentOptions<Result>)

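Extract data from an input using a model that has a known, strongly-typed document schema (a DocumentModel).

The fields produced in the AnalyzeResult depend on the model used for analysis. In TypeScript, the type of the result for this method overload is inferred from the type of the input DocumentModel.

Example

This method supports extracting data from a file at a given URL. The Form Recognizer service will attempt to download the file using the submitted URL, so the URL must be accessible from the public internet. For example, a SAS token can be used to grant read access to a blob in Azure Storage, and the service will use the SAS-encoded URL to request the file.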

// See the `prebuilt` folder in the SDK samples (http://aka.ms/azsdk/formrecognizer/js/samples) for examples of
// DocumentModels for known prebuilts.
import { PrebuiltReceiptModel } from "./prebuilt-receipt.ts";

// the URL must be publicly accessible
const url = "<receipt document url>";

// The model that is passed to the following function call determines the type of the eventual result. In the
// example, we will use the prebuilt receipt model.
const poller = await client.beginAnalyzeDocumentFromUrl(PrebuiltReceiptModel, url);

// The result is a long-running operation (poller), which must itself be polled until the operation completes
const {
  pages, // pages extracted from the document, which contain lines and words
  tables, // extracted tables, organized into cells that contain their contents
  styles, // text styles (ex. handwriting) that were observed in the document
  keyValuePairs, // extracted pairs of elements  (directed associations from one element in the input to another)

  documents // extracted documents (instances of one of the model's document types and its field schema)
} = await poller.pollUntilDone();

// Extract the fields of the first document. These fields constitute a receipt, because we used the receipt model
const [{ fields: receipt }] = documents;

// Since we used the strongly-typed PrebuiltReceiptModel object instead of the "prebuilt-receipt" model ID
// string, the fields of the receipt are strongly-typed and have camelCase names (as opposed to PascalCase).
console.log("The type of this receipt is:", receipt.receiptType?.value);

function beginAnalyzeDocumentFromUrl<Result>(model: DocumentModel<Result>, documentUrl: string, options?: AnalyzeDocumentOptions<Result>): Promise<AnalysisPoller<Result>>

Parameters

model
DocumentModel<Result>

A DocumentModel representing the model to use for analysis and the expected output type

documentUrl
string

A URL (string) to an input document accessible from the public internet

options
AnalyzeDocumentOptions<Result>

Optional settings for the analysis operation and poller

Returns

Promise<AnalysisPoller<Result>>

A long-running operation (poller) that will eventually produce an AnalyzeResult
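
Every method on this client returns a poller rather than a finished result, because the analysis continues on the service after the request is accepted. The sketch below illustrates the general polling pattern only. It is synchronous, it omits the waiting between polls and all network traffic, and OperationState, fakeOperation, and pollUntilDoneSketch are hypothetical names that do not correspond to the SDK's poller types.

```typescript
// Simplified stand-in for an operation state (assumption for this sketch).
interface OperationState<T> {
  status: "running" | "succeeded" | "failed";
  result?: T;
}

// Simulated service: reports "running" for `pollsNeeded` polls, then succeeds.
function fakeOperation<T>(pollsNeeded: number, result: T): () => OperationState<T> {
  let polls = 0;
  return () => {
    polls += 1;
    return polls > pollsNeeded ? { status: "succeeded", result } : { status: "running" };
  };
}

// Polls until the operation leaves the "running" state or the budget is used up.
function pollUntilDoneSketch<T>(getState: () => OperationState<T>, maxPolls: number): T {
  for (let i = 0; i < maxPolls; i++) {
    const state = getState();
    if (state.status === "succeeded") return state.result as T;
    if (state.status === "failed") throw new Error("operation failed");
    // A real poller would wait here before asking the service again.
  }
  throw new Error("operation did not complete within " + maxPolls + " polls");
}

console.log(pollUntilDoneSketch(fakeOperation(2, "analysis complete"), 10));
```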

beginClassifyDocument(string, FormRecognizerRequestBody, ClassifyDocumentOptions)

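Classify a document using a custom classifier given by its ID.

This method produces a long-running operation (poller) that will eventually produce an AnalyzeResult. This is the same type returned by beginAnalyzeDocument and beginAnalyzeDocumentFromUrl, but the result will contain only a small subset of its fields. Only the documents field and the pages field will be populated, and only minimal page information will be returned. The documents field will contain information about all the identified documents and the docType each was classified as.

Example

This method supports streamable request bodies (FormRecognizerRequestBody) such as Node.js ReadableStream objects, browser Blobs, and ArrayBuffers. The contents of the body will be uploaded to the service for analysis.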

import * as fs from "fs";

const file = fs.createReadStream("path/to/file.pdf");

const poller = await client.beginClassifyDocument("<classifier ID>", file);

// The result is a long-running operation (poller), which must itself be polled until the operation completes
const {
  pages, // pages extracted from the document, which contain only basic information for classifiers
  documents // extracted documents and their types
} = await poller.pollUntilDone();

// We'll print the documents and their types
for (const { docType } of documents) {
  console.log("The type of this document is:", docType);
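}

function beginClassifyDocument(classifierId: string, document: FormRecognizerRequestBody, options?: ClassifyDocumentOptions): Promise<AnalysisPoller<AnalyzeResult<AnalyzedDocument>>>

Parameters

classifierId
string

The ID of the custom classifier to use for analysis

document
FormRecognizerRequestBody

The document to classify

options
ClassifyDocumentOptions

Options for the classification operation

Returns

Promise<AnalysisPoller<AnalyzeResult<AnalyzedDocument>>>

A long-running operation (poller) that will eventually produce an AnalyzeResult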
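
A classification result lists one entry per identified document, so a single input file can yield several docType values. One plausible way to summarise such a result is to count documents per type. In the sketch below, ClassifiedDocument and countByDocType are hypothetical, the interface models only the docType member, and the sample type names are invented for illustration.

```typescript
// Simplified stand-in for a classified document (assumption for this sketch).
interface ClassifiedDocument {
  docType: string;
}

// Hypothetical helper: counts how many documents were assigned to each docType.
function countByDocType(docs: ClassifiedDocument[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const { docType } of docs) {
    counts[docType] = (counts[docType] ?? 0) + 1;
  }
  return counts;
}

// Invented sample data standing in for the `documents` field of a result.
const classified: ClassifiedDocument[] = [
  { docType: "invoice" },
  { docType: "receipt" },
  { docType: "invoice" },
];
console.log(countByDocType(classified));
```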

beginClassifyDocumentFromUrl(string, string, ClassifyDocumentOptions)

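Classify a document from a URL using a custom classifier given by its ID.

This method produces a long-running operation (poller) that will eventually produce an AnalyzeResult. This is the same type returned by beginAnalyzeDocument and beginAnalyzeDocumentFromUrl, but the result will contain only a small subset of its fields. Only the documents field and the pages field will be populated, and only minimal page information will be returned. The documents field will contain information about all the identified documents and the docType each was classified as.

Example

This method supports extracting data from a file at a given URL. The Form Recognizer service will attempt to download the file using the submitted URL, so the URL must be accessible from the public internet. For example, a SAS token can be used to grant read access to a blob in Azure Storage, and the service will use the SAS-encoded URL to request the file.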

// the URL must be publicly accessible
const url = "<file url>";

const poller = await client.beginClassifyDocumentFromUrl("<classifier ID>", url);

// The result is a long-running operation (poller), which must itself be polled until the operation completes
const {
  pages, // pages extracted from the document, which contain only basic information for classifiers
  documents // extracted documents and their types
} = await poller.pollUntilDone();

// We'll print the documents and their types
for (const { docType } of documents) {
  console.log("The type of this document is:", docType);
}

function beginClassifyDocumentFromUrl(classifierId: string, documentUrl: string, options?: ClassifyDocumentOptions): Promise<AnalysisPoller<AnalyzeResult<AnalyzedDocument>>>

Parameters

classifierId
string

The ID of the custom classifier to use for analysis

documentUrl
string

The URL of the document to classify

Returns