你当前正在访问 Microsoft Azure Global Edition 技术文档网站。 如果需要访问由世纪互联运营的 Microsoft Azure 中国技术文档网站,请访问 https://docs.azure.cn

文档智能入门

重要

  • Azure 认知服务表单识别器现称为 Azure AI 文档智能。
  • 某些平台仍在等待命名更新。
  • 我们的文档中提及的所有表单识别器或文档智能均指同一项 Azure 服务。

此内容适用于:选中标记 v4.0(预览版)先前版本:蓝色复选标记 v3.1 (GA)蓝色复选标记 v3.0 (GA)

  • Azure AI 文档智能最新预览版 (2024-07-31-preview) 入门。

此内容适用于:选中标记 v3.1 (GA) 先前版本:蓝色复选标记 v3.0蓝色复选标记 v2.1

  • Azure 表单识别器最新 GA 版本 (2023-07-31) 入门。

此内容适用于:选中标记 v3.0 (GA) 更新版本:蓝色复选标记 v3.1 蓝色复选标记 v2.1

  • Azure 表单识别器旧版 GA 版本 (2022-08-31) 入门。
  • Azure AI 文档智能/表单识别器是一款基于云的 Azure AI 服务,它使用机器学习从文档中提取键值对、文本、表和关键数据。

  • 可以使用编程语言 SDK 或调用 REST API 轻松将文档处理模型集成到工作流和应用程序中。

  • 对于此快速入门,我们建议你在学习该技术时使用免费服务。 请记住,每月的免费页数限于 500。

若要详细了解 API 功能和开发选项,请访问我们的概述页。

在本快速入门中,使用以下功能来分析和提取表单和文档中的数据和值:

  • 布局模型 - 分析并提取表、行、字词和选择标记(例如文档中的单选按钮和复选框),而无需训练模型。

  • 预生成模型 - 使用预生成模型分析和提取特定文档类型的公共字段。

先决条件

  • Azure AI 服务或文档智能资源。 获得 Azure 订阅后,在 Azure 门户中创建单服务Azure AI 多服务资源以获取密钥和终结点。

  • 可以使用免费定价层 (F0) 试用该服务,然后再升级到付费层进行生产。

提示

如果计划通过一个终结点/密钥访问多个 Azure AI 服务,请创建 Azure AI 服务资源。 请创建仅供文档智能访问的文档智能资源。 请注意,如果你打算使用 Microsoft Entra 身份验证,则需要单一服务资源。

  • 部署资源后,选择“转到资源”。 需要从创建的资源获取密钥和终结点,以便将应用程序连接到文档智能 API。 稍后需要在本快速入门中将密钥和终结点粘贴到代码中:

    该屏幕截图显示了 Azure 门户中密钥和终结点的位置。

  • Azure AI 服务或表单识别器资源。 获得 Azure 订阅后,在 Azure 门户中创建单服务Azure AI 多服务资源以获取密钥和终结点。

  • 可以使用免费定价层 (F0) 试用该服务,然后再升级到付费层进行生产。

提示

如果计划通过一个终结点/密钥访问多个 Azure AI 服务,请创建 Azure AI 服务资源。 若要仅访问表单识别器,请创建表单识别器资源。 请注意,如果你打算使用 Microsoft Entra 身份验证,则需要单一服务资源。

  • 部署资源后,选择“转到资源”。 需从创建的资源获取密钥和终结点,以便将应用程序连接到表单识别器 API。 稍后需要在本快速入门中将密钥和终结点粘贴到代码中:

    该屏幕截图显示了 Azure 门户中密钥和终结点的位置。

设置

  1. 启动 Visual Studio。

  2. 在“开始”页上,选择“创建新项目”。

    Visual Studio 开始窗口的屏幕截图。

  3. 在“创建新项目”页面上,在搜索框中输入“控制台”。 选择“控制台应用程序”模板,然后选择“下一步”。

    Visual Studio“创建新项目”页面的屏幕截图。

  1. 在“配置新项目”对话框中,在项目名称框中输入 doc_intel_quickstart。 然后选择“下一步” 。
  1. 在“配置新项目”对话框中,在项目名称框中输入 form_recognizer_quickstart。 然后选择“下一步”。
  1. 在“附加信息”对话框窗口中,选择“.NET 8.0 (长期支持)”,然后选择“创建”

    Visual Studio“其他信息”对话框窗口的屏幕截图。

使用 NuGet 安装客户端库

  1. 右键单击“doc_intel_quickstart”项目并选择“管理 NuGet 包...”。

    在 Visual Studio 中选择 NuGet 预发行包窗口的屏幕截图。

  2. 选择“浏览”选项卡,并输入 Azure.AI.FormRecognizer

  3. 选中 Include prerelease 复选框。

    在 Visual Studio 中选择预发布 NuGet 包的屏幕截图。

  4. 从下拉菜单中选择一个版本,然后将包安装到项目中。

  1. 右键单击 form_recognizer_quickstart 项目,然后选择“管理 NuGet 包...”

    在 Visual Studio 中查找 NuGet 包窗口的屏幕截图。

  2. 选择“浏览”选项卡,并输入 Azure.AI.FormRecognizer。 从下拉菜单中选择版本 4.1.0

    在 Visual Studio 中选择 NuGet 表单识别器包的屏幕截图。

  1. 右键单击 form_recognizer_quickstart 项目,然后选择“管理 NuGet 包...”

    Visual Studio 中 NuGet 包窗口的屏幕截图。

  2. 选择“浏览”选项卡,并输入 Azure.AI.FormRecognizer。 从下拉菜单中选择版本 4.0.0

    在 Visual Studio 中选择 NuGet 旧包的屏幕截图。

生成应用程序

若要与文档智能服务交互,需要创建 DocumentIntelligenceClient 类的实例。 为此,你要通过 Azure 门户使用 key 创建一个 AzureKeyCredential,并使用 AzureKeyCredential 和文档智能 endpoint 创建一个 DocumentIntelligenceClient 实例。

若要与表单识别器服务交互,需要创建 DocumentAnalysisClient 类的实例。 为此,你将使用 Azure 门户的 key 创建一个 AzureKeyCredential,使用 AzureKeyCredential 和表单识别器 endpoint 创建一个 DocumentAnalysisClient 实例。

注意

  • 从 .NET 6 开始,使用 console 模板的新项目将生成与以前版本不同的新程序样式。
  • 新的输出使用最新的 C# 功能,这些功能简化了你需要编写的代码。
  • 使用较新版本时,只需编写 Main 方法的主体。 无需包括顶级语句、全局 using 指令或隐式 using 指令。
  • 有关详细信息,请参阅新的 C# 模板生成顶级语句
  1. 打开 Program.cs 文件。

  2. 删除预先存在的代码(包括行 Console.Writeline("Hello World!"))并选择以下代码示例之一,以复制和粘贴到应用程序的 Program.cs 文件中:

重要

完成后,请记住将密钥从代码中删除,并且永远不要公开发布该密钥。 对于生产来说,请使用安全的方式存储和访问凭据,例如 Azure Key Vault。 有关详细信息,请参阅 Azure AI 服务安全性

布局模型

从文档中提取文本、选择标记、文本样式、表结构和边界区域坐标。

  • 对于此示例,需要 URI 中的一个文档文件。 在本快速入门中,可使用示例文档
  • 我们已将文件 URI 值添加到脚本顶部的 Uri fileUri 变量中。
  • 若要从 URI 中的特定文件中提取布局,请使用 StartAnalyzeDocumentFromUri 方法,并传递 prebuilt-layout 作为模型 ID。 返回的值是一个 AnalyzeResult 对象,其中包含来自已提交文档的数据。

将以下代码示例添加到 Program.cs 文件。 请确保使用 Azure 门户中文档智能实例中的值更新密钥和终结点变量:


using Azure;
using Azure.AI.DocumentIntelligence;

//set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal to create your `AzureKeyCredential` and `DocumentIntelligenceClient` instance
string endpoint = "<your-endpoint>";
string key = "<your-key>";
AzureKeyCredential credential = new AzureKeyCredential(key);
DocumentIntelligenceClient client = new DocumentIntelligenceClient(new Uri(endpoint), credential);

//sample document
Uri fileUri = new Uri ("https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf");

AnalyzeDocumentContent content = new AnalyzeDocumentContent()
{
    UrlSource= fileUri
};

Operation<AnalyzeResult> operation = await client.AnalyzeDocumentAsync(WaitUntil.Completed, "prebuilt-layout", content);

AnalyzeResult result = operation.Value;

foreach (DocumentPage page in result.Pages)
{
    Console.WriteLine($"Document Page {page.PageNumber} has {page.Lines.Count} line(s), {page.Words.Count} word(s)," +
        $" and {page.SelectionMarks.Count} selection mark(s).");

    for (int i = 0; i < page.Lines.Count; i++)
    {
        DocumentLine line = page.Lines[i];

        Console.WriteLine($"  Line {i}:");
        Console.WriteLine($"    Content: '{line.Content}'");

        Console.Write("    Bounding polygon, with points ordered clockwise:");
        for (int j = 0; j < line.Polygon.Count; j += 2)
        {
            Console.Write($" ({line.Polygon[j]}, {line.Polygon[j + 1]})");
        }

        Console.WriteLine();
    }

    for (int i = 0; i < page.SelectionMarks.Count; i++)
    {
        DocumentSelectionMark selectionMark = page.SelectionMarks[i];

        Console.WriteLine($"  Selection Mark {i} is {selectionMark.State}.");
        Console.WriteLine($"    State: {selectionMark.State}");

        Console.Write("    Bounding polygon, with points ordered clockwise:");
        for (int j = 0; j < selectionMark.Polygon.Count; j++)
        {
            Console.Write($" ({selectionMark.Polygon[j]}, {selectionMark.Polygon[j + 1]})");
        }

        Console.WriteLine();
    }
}

for (int i = 0; i < result.Paragraphs.Count; i++)
{
    DocumentParagraph paragraph = result.Paragraphs[i];

    Console.WriteLine($"Paragraph {i}:");
    Console.WriteLine($"  Content: {paragraph.Content}");

    if (paragraph.Role != null)
    {
        Console.WriteLine($"  Role: {paragraph.Role}");
    }
}

foreach (DocumentStyle style in result.Styles)
{
    // Check the style and style confidence to see if text is handwritten.
    // Note that value '0.8' is used as an example.

    bool isHandwritten = style.IsHandwritten.HasValue && style.IsHandwritten == true;

    if (isHandwritten && style.Confidence > 0.8)
    {
        Console.WriteLine($"Handwritten content found:");

        foreach (DocumentSpan span in style.Spans)
        {
            var handwrittenContent = result.Content.Substring(span.Offset, span.Length);
            Console.WriteLine($"  {handwrittenContent}");
        }
    }
}

for (int i = 0; i < result.Tables.Count; i++)
{
    DocumentTable table = result.Tables[i];

    Console.WriteLine($"Table {i} has {table.RowCount} rows and {table.ColumnCount} columns.");

    foreach (DocumentTableCell cell in table.Cells)
    {
        Console.WriteLine($"  Cell ({cell.RowIndex}, {cell.ColumnIndex}) is a '{cell.Kind}' with content: {cell.Content}");
    }
}

运行应用程序

将代码示例添加到应用程序后,选择 formRecognizer_quickstart 旁边的绿色“开始”按钮以生成和运行程序,或按 F5

运行 Visual Studio 程序按钮的屏幕截图。

将以下代码示例添加到 Program.cs 文件。 请确保使用 Azure 门户中表单识别器实例中的值更新密钥和终结点变量:

using Azure;
using Azure.AI.FormRecognizer.DocumentAnalysis;

//set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal to create your `AzureKeyCredential` and `DocumentAnalysisClient` instance
string endpoint = "<your-endpoint>";
string key = "<your-key>";
AzureKeyCredential credential = new AzureKeyCredential(key);
DocumentAnalysisClient client = new DocumentAnalysisClient(new Uri(endpoint), credential);

//sample document
Uri fileUri = new Uri ("https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf");

AnalyzeDocumentOperation operation = await client.AnalyzeDocumentFromUriAsync(WaitUntil.Completed, "prebuilt-layout", fileUri);

AnalyzeResult result = operation.Value;

foreach (DocumentPage page in result.Pages)
{
    Console.WriteLine($"Document Page {page.PageNumber} has {page.Lines.Count} line(s), {page.Words.Count} word(s),");
    Console.WriteLine($"and {page.SelectionMarks.Count} selection mark(s).");

    for (int i = 0; i < page.Lines.Count; i++)
    {
        DocumentLine line = page.Lines[i];
        Console.WriteLine($"  Line {i} has content: '{line.Content}'.");

        Console.WriteLine($"    Its bounding box is:");
        Console.WriteLine($"      Upper left => X: {line.BoundingPolygon[0].X}, Y= {line.BoundingPolygon[0].Y}");
        Console.WriteLine($"      Upper right => X: {line.BoundingPolygon[1].X}, Y= {line.BoundingPolygon[1].Y}");
        Console.WriteLine($"      Lower right => X: {line.BoundingPolygon[2].X}, Y= {line.BoundingPolygon[2].Y}");
        Console.WriteLine($"      Lower left => X: {line.BoundingPolygon[3].X}, Y= {line.BoundingPolygon[3].Y}");
    }

    for (int i = 0; i < page.SelectionMarks.Count; i++)
    {
        DocumentSelectionMark selectionMark = page.SelectionMarks[i];

        Console.WriteLine($"  Selection Mark {i} is {selectionMark.State}.");
        Console.WriteLine($"    Its bounding box is:");
        Console.WriteLine($"      Upper left => X: {selectionMark.BoundingPolygon[0].X}, Y= {selectionMark.BoundingPolygon[0].Y}");
        Console.WriteLine($"      Upper right => X: {selectionMark.BoundingPolygon[1].X}, Y= {selectionMark.BoundingPolygon[1].Y}");
        Console.WriteLine($"      Lower right => X: {selectionMark.BoundingPolygon[2].X}, Y= {selectionMark.BoundingPolygon[2].Y}");
        Console.WriteLine($"      Lower left => X: {selectionMark.BoundingPolygon[3].X}, Y= {selectionMark.BoundingPolygon[3].Y}");
    }
}

foreach (DocumentStyle style in result.Styles)
{
    // Check the style and style confidence to see if text is handwritten.
    // Note that value '0.8' is used as an example.

    bool isHandwritten = style.IsHandwritten.HasValue && style.IsHandwritten == true;

    if (isHandwritten && style.Confidence > 0.8)
    {
        Console.WriteLine($"Handwritten content found:");

        foreach (DocumentSpan span in style.Spans)
        {
            Console.WriteLine($"  Content: {result.Content.Substring(span.Index, span.Length)}");
        }
    }
}

Console.WriteLine("The following tables were extracted:");

for (int i = 0; i < result.Tables.Count; i++)
{
    DocumentTable table = result.Tables[i];
    Console.WriteLine($"  Table {i} has {table.RowCount} rows and {table.ColumnCount} columns.");

    foreach (DocumentTableCell cell in table.Cells)
    {
        Console.WriteLine($"    Cell ({cell.RowIndex}, {cell.ColumnIndex}) has kind '{cell.Kind}' and content: '{cell.Content}'.");
    }
}

运行应用程序

将代码示例添加到应用程序后,选择 formRecognizer_quickstart 旁边的绿色“开始”按钮以生成和运行程序,或按 F5

运行 Visual Studio 程序按钮位置的屏幕截图。

布局模型输出

下面是预期输出的代码段:

  Document Page 1 has 69 line(s), 425 word(s), and 15 selection mark(s).
  Line 0 has content: 'UNITED STATES'.
    Its bounding box is:
      Upper left => X: 3.4915, Y= 0.6828
      Upper right => X: 5.0116, Y= 0.6828
      Lower right => X: 5.0116, Y= 0.8265
      Lower left => X: 3.4915, Y= 0.8265
  Line 1 has content: 'SECURITIES AND EXCHANGE COMMISSION'.
    Its bounding box is:
      Upper left => X: 2.1937, Y= 0.9061
      Upper right => X: 6.297, Y= 0.9061
      Lower right => X: 6.297, Y= 1.0498
      Lower left => X: 2.1937, Y= 1.0498

若要查看整个输出,请访问 GitHub 上的 Azure 示例存储库,以查看布局模型输出

将以下代码示例添加到 Program.cs 文件。 请确保使用 Azure 门户中表单识别器实例中的值更新密钥和终结点变量:

using Azure;
using Azure.AI.FormRecognizer.DocumentAnalysis;

//set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal to create your `AzureKeyCredential` and `DocumentAnalysisClient` instance
string endpoint = "<your-endpoint>";
string key = "<your-key>";
AzureKeyCredential credential = new AzureKeyCredential(key);
DocumentAnalysisClient client = new DocumentAnalysisClient(new Uri(endpoint), credential);

//sample document
Uri fileUri = new Uri ("https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf");

AnalyzeDocumentOperation operation = await client.AnalyzeDocumentFromUriAsync(WaitUntil.Completed, "prebuilt-layout", fileUri);

AnalyzeResult result = operation.Value;

foreach (DocumentPage page in result.Pages)
{
    Console.WriteLine($"Document Page {page.PageNumber} has {page.Lines.Count} line(s), {page.Words.Count} word(s),");
    Console.WriteLine($"and {page.SelectionMarks.Count} selection mark(s).");

    for (int i = 0; i < page.Lines.Count; i++)
    {
        DocumentLine line = page.Lines[i];
        Console.WriteLine($"  Line {i} has content: '{line.Content}'.");

        Console.WriteLine($"    Its bounding polygon (points ordered clockwise):");

        for (int j = 0; j < line.BoundingPolygon.Count; j++)
        {
            Console.WriteLine($"      Point {j} => X: {line.BoundingPolygon[j].X}, Y: {line.BoundingPolygon[j].Y}");
        }
    }

    for (int i = 0; i < page.SelectionMarks.Count; i++)
    {
        DocumentSelectionMark selectionMark = page.SelectionMarks[i];

        Console.WriteLine($"  Selection Mark {i} is {selectionMark.State}.");
        Console.WriteLine($"    Its bounding polygon (points ordered clockwise):");

        for (int j = 0; j < selectionMark.BoundingPolygon.Count; j++)
        {
            Console.WriteLine($"      Point {j} => X: {selectionMark.BoundingPolygon[j].X}, Y: {selectionMark.BoundingPolygon[j].Y}");
        }
    }
}

Console.WriteLine("Paragraphs:");

foreach (DocumentParagraph paragraph in result.Paragraphs)
{
    Console.WriteLine($"  Paragraph content: {paragraph.Content}");

    if (paragraph.Role != null)
    {
        Console.WriteLine($"    Role: {paragraph.Role}");
    }
}

foreach (DocumentStyle style in result.Styles)
{
    // Check the style and style confidence to see if text is handwritten.
    // Note that value '0.8' is used as an example.

    bool isHandwritten = style.IsHandwritten.HasValue && style.IsHandwritten == true;

    if (isHandwritten && style.Confidence > 0.8)
    {
        Console.WriteLine($"Handwritten content found:");

        foreach (DocumentSpan span in style.Spans)
        {
            Console.WriteLine($"  Content: {result.Content.Substring(span.Index, span.Length)}");
        }
    }
}

Console.WriteLine("The following tables were extracted:");

for (int i = 0; i < result.Tables.Count; i++)
{
    DocumentTable table = result.Tables[i];
    Console.WriteLine($"  Table {i} has {table.RowCount} rows and {table.ColumnCount} columns.");

    foreach (DocumentTableCell cell in table.Cells)
    {
        Console.WriteLine($"    Cell ({cell.RowIndex}, {cell.ColumnIndex}) has kind '{cell.Kind}' and content: '{cell.Content}'.");
    }
}
Extract the layout of a document from a file stream
To extract the layout from a given file at a file stream, use the AnalyzeDocument method and pass prebuilt-layout as the model ID. The returned value is an AnalyzeResult object containing data about the submitted document.

string filePath = "<filePath>";
using var stream = new FileStream(filePath, FileMode.Open);

AnalyzeDocumentOperation operation = await client.AnalyzeDocumentAsync(WaitUntil.Completed, "prebuilt-layout", stream);
AnalyzeResult result = operation.Value;

foreach (DocumentPage page in result.Pages)
{
    Console.WriteLine($"Document Page {page.PageNumber} has {page.Lines.Count} line(s), {page.Words.Count} word(s),");
    Console.WriteLine($"and {page.SelectionMarks.Count} selection mark(s).");

    for (int i = 0; i < page.Lines.Count; i++)
    {
        DocumentLine line = page.Lines[i];
        Console.WriteLine($"  Line {i} has content: '{line.Content}'.");

        Console.WriteLine($"    Its bounding polygon (points ordered clockwise):");

        for (int j = 0; j < line.BoundingPolygon.Count; j++)
        {
            Console.WriteLine($"      Point {j} => X: {line.BoundingPolygon[j].X}, Y: {line.BoundingPolygon[j].Y}");
        }
    }

    for (int i = 0; i < page.SelectionMarks.Count; i++)
    {
        DocumentSelectionMark selectionMark = page.SelectionMarks[i];

        Console.WriteLine($"  Selection Mark {i} is {selectionMark.State}.");
        Console.WriteLine($"    Its bounding polygon (points ordered clockwise):");

        for (int j = 0; j < selectionMark.BoundingPolygon.Count; j++)
        {
            Console.WriteLine($"      Point {j} => X: {selectionMark.BoundingPolygon[j].X}, Y: {selectionMark.BoundingPolygon[j].Y}");
        }
    }
}

Console.WriteLine("Paragraphs:");

foreach (DocumentParagraph paragraph in result.Paragraphs)
{
    Console.WriteLine($"  Paragraph content: {paragraph.Content}");

    if (paragraph.Role != null)
    {
        Console.WriteLine($"    Role: {paragraph.Role}");
    }
}

foreach (DocumentStyle style in result.Styles)
{
    // Check the style and style confidence to see if text is handwritten.
    // Note that value '0.8' is used as an example.

    bool isHandwritten = style.IsHandwritten.HasValue && style.IsHandwritten == true;

    if (isHandwritten && style.Confidence > 0.8)
    {
        Console.WriteLine($"Handwritten content found:");

        foreach (DocumentSpan span in style.Spans)
        {
            Console.WriteLine($"  Content: {result.Content.Substring(span.Index, span.Length)}");
        }
    }
}

Console.WriteLine("The following tables were extracted:");

for (int i = 0; i < result.Tables.Count; i++)
{
    DocumentTable table = result.Tables[i];
    Console.WriteLine($"  Table {i} has {table.RowCount} rows and {table.ColumnCount} columns.");

    foreach (DocumentTableCell cell in table.Cells)
    {
        Console.WriteLine($"    Cell ({cell.RowIndex}, {cell.ColumnIndex}) has kind '{cell.Kind}' and content: '{cell.Content}'.");
    }
}

运行应用程序

将代码示例添加到应用程序后,选择 formRecognizer_quickstart 旁边的绿色“开始”按钮以生成和运行程序,或按 F5

运行 Visual Studio 程序的屏幕截图。

预生成模型

使用预生成模型分析和提取特定文档类型的公共字段。 在本示例中,我们将使用预生成发票模型分析发票。

提示

不止发票,还有几个预生成模型可供选择,每个模型都有自己的一组受支持的字段。 用于 analyze 操作的模型由要分析的文档类型确定。 请参阅模型数据提取

  • 使用预生成发票模型分析发票。 在本快速入门中,可使用示例发票文档
  • 我们已将文件 URI 值添加到 Program.cs 文件顶部的 Uri invoiceUri 变量。
  • 若要分析 URI 中的特定文件,请使用 StartAnalyzeDocumentFromUri 方法,并传递 prebuilt-invoice 作为模型 ID。 返回的值是一个 AnalyzeResult 对象,其中包含来自已提交文档的数据。
  • 为简洁起见,此处并未显示服务返回的所有键值对。 若要查看所有支持的字段和相应类型的列表,请参阅发票概念页。

将以下代码示例添加到 Program.cs 文件。 请确保使用 Azure 门户中文档智能实例中的值更新密钥和终结点变量:


using Azure;
using Azure.AI.DocumentIntelligence;

//set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal to create your `AzureKeyCredential` and `DocumentIntelligenceClient` instance
string endpoint = "<your-endpoint>";
string key = "<your-key>";
AzureKeyCredential credential = new AzureKeyCredential(key);
DocumentIntelligenceClient client = new DocumentIntelligenceClient(new Uri(endpoint), credential);

//sample invoice document

Uri invoiceUri = new Uri ("https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf");

AnalyzeDocumentContent content = new AnalyzeDocumentContent()
{
    UrlSource = invoiceUri
};

Operation<AnalyzeResult> operation = await client.AnalyzeDocumentAsync(WaitUntil.Completed, "prebuilt-invoice", content);

AnalyzeResult result = operation.Value;

for (int i = 0; i < result.Documents.Count; i++)
{
    Console.WriteLine($"Document {i}:");

    AnalyzedDocument document = result.Documents[i];

    if (document.Fields.TryGetValue("VendorName", out DocumentField vendorNameField)
        && vendorNameField.Type == DocumentFieldType.String)
    {
        string vendorName = vendorNameField.ValueString;
        Console.WriteLine($"Vendor Name: '{vendorName}', with confidence {vendorNameField.Confidence}");
    }

    if (document.Fields.TryGetValue("CustomerName", out DocumentField customerNameField)
        && customerNameField.Type == DocumentFieldType.String)
    {
        string customerName = customerNameField.ValueString;
        Console.WriteLine($"Customer Name: '{customerName}', with confidence {customerNameField.Confidence}");
    }

    if (document.Fields.TryGetValue("Items", out DocumentField itemsField)
        && itemsField.Type == DocumentFieldType.Array)
    {
        foreach (DocumentField itemField in itemsField.ValueArray)
        {
            Console.WriteLine("Item:");

            if (itemField.Type == DocumentFieldType.Object)
            {
                IReadOnlyDictionary<string, DocumentField> itemFields = itemField.ValueObject;

                if (itemFields.TryGetValue("Description", out DocumentField itemDescriptionField)
                    && itemDescriptionField.Type == DocumentFieldType.String)
                {
                    string itemDescription = itemDescriptionField.ValueString;
                    Console.WriteLine($"  Description: '{itemDescription}', with confidence {itemDescriptionField.Confidence}");
                }

                if (itemFields.TryGetValue("Amount", out DocumentField itemAmountField)
                    && itemAmountField.Type == DocumentFieldType.Currency)
                {
                    CurrencyValue itemAmount = itemAmountField.ValueCurrency;
                    Console.WriteLine($"  Amount: '{itemAmount.CurrencySymbol}{itemAmount.Amount}', with confidence {itemAmountField.Confidence}");
                }
            }
        }
    }

    if (document.Fields.TryGetValue("SubTotal", out DocumentField subTotalField)
        && subTotalField.Type == DocumentFieldType.Currency)
    {
        CurrencyValue subTotal = subTotalField.ValueCurrency;
        Console.WriteLine($"Sub Total: '{subTotal.CurrencySymbol}{subTotal.Amount}', with confidence {subTotalField.Confidence}");
    }

    if (document.Fields.TryGetValue("TotalTax", out DocumentField totalTaxField)
        && totalTaxField.Type == DocumentFieldType.Currency)
    {
        CurrencyValue totalTax = totalTaxField.ValueCurrency;
        Console.WriteLine($"Total Tax: '{totalTax.CurrencySymbol}{totalTax.Amount}', with confidence {totalTaxField.Confidence}");
    }

    if (document.Fields.TryGetValue("InvoiceTotal", out DocumentField invoiceTotalField)
        && invoiceTotalField.Type == DocumentFieldType.Currency)
    {
        CurrencyValue invoiceTotal = invoiceTotalField.ValueCurrency;
        Console.WriteLine($"Invoice Total: '{invoiceTotal.CurrencySymbol}{invoiceTotal.Amount}', with confidence {invoiceTotalField.Confidence}");
    }
}

运行应用程序

将代码示例添加到应用程序后,选择 formRecognizer_quickstart 旁边的绿色“开始”按钮以生成和运行程序,或按 F5

运行 Visual Studio 程序按钮的屏幕截图。

将以下代码示例添加到 Program.cs 文件。 请确保使用 Azure 门户中表单识别器实例中的值更新密钥和终结点变量:


using Azure;
using Azure.AI.FormRecognizer.DocumentAnalysis;

//set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal to create your `AzureKeyCredential` and `FormRecognizerClient` instance
string endpoint = "<your-endpoint>";
string key = "<your-key>";
AzureKeyCredential credential = new AzureKeyCredential(key);
DocumentAnalysisClient client = new DocumentAnalysisClient(new Uri(endpoint), credential);

//sample invoice document

Uri invoiceUri = new Uri ("https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf");

Operation operation = await client.AnalyzeDocumentAsync(WaitUntil.Completed, "prebuilt-invoice", invoiceUri);

AnalyzeResult result = operation.Value;

for (int i = 0; i < result.Documents.Count; i++)
{
    Console.WriteLine($"Document {i}:");

    AnalyzedDocument document = result.Documents[i];

    if (document.Fields.TryGetValue("VendorName", out DocumentField vendorNameField))
    {
        if (vendorNameField.FieldType == DocumentFieldType.String)
        {
            string vendorName = vendorNameField.Value.AsString();
            Console.WriteLine($"Vendor Name: '{vendorName}', with confidence {vendorNameField.Confidence}");
        }
    }

    if (document.Fields.TryGetValue("CustomerName", out DocumentField customerNameField))
    {
        if (customerNameField.FieldType == DocumentFieldType.String)
        {
            string customerName = customerNameField.Value.AsString();
            Console.WriteLine($"Customer Name: '{customerName}', with confidence {customerNameField.Confidence}");
        }
    }

    if (document.Fields.TryGetValue("Items", out DocumentField itemsField))
    {
        if (itemsField.FieldType == DocumentFieldType.List)
        {
            foreach (DocumentField itemField in itemsField.Value.AsList())
            {
                Console.WriteLine("Item:");

                if (itemField.FieldType == DocumentFieldType.Dictionary)
                {
                    IReadOnlyDictionary<string, DocumentField> itemFields = itemField.Value.AsDictionary();

                    if (itemFields.TryGetValue("Description", out DocumentField itemDescriptionField))
                    {
                        if (itemDescriptionField.FieldType == DocumentFieldType.String)
                        {
                            string itemDescription = itemDescriptionField.Value.AsString();

                            Console.WriteLine($"  Description: '{itemDescription}', with confidence {itemDescriptionField.Confidence}");
                        }
                    }

                    if (itemFields.TryGetValue("Amount", out DocumentField itemAmountField))
                    {
                        if (itemAmountField.FieldType == DocumentFieldType.Currency)
                        {
                            CurrencyValue itemAmount = itemAmountField.Value.AsCurrency();

                            Console.WriteLine($"  Amount: '{itemAmount.Symbol}{itemAmount.Amount}', with confidence {itemAmountField.Confidence}");
                        }
                    }
                }
            }
        }
    }

    if (document.Fields.TryGetValue("SubTotal", out DocumentField subTotalField))
    {
        if (subTotalField.FieldType == DocumentFieldType.Currency)
        {
            CurrencyValue subTotal = subTotalField.Value.AsCurrency();
            Console.WriteLine($"Sub Total: '{subTotal.Symbol}{subTotal.Amount}', with confidence {subTotalField.Confidence}");
        }
    }

    if (document.Fields.TryGetValue("TotalTax", out DocumentField totalTaxField))
    {
        if (totalTaxField.FieldType == DocumentFieldType.Currency)
        {
            CurrencyValue totalTax = totalTaxField.Value.AsCurrency();
            Console.WriteLine($"Total Tax: '{totalTax.Symbol}{totalTax.Amount}', with confidence {totalTaxField.Confidence}");
        }
    }

    if (document.Fields.TryGetValue("InvoiceTotal", out DocumentField invoiceTotalField))
    {
        if (invoiceTotalField.FieldType == DocumentFieldType.Currency)
        {
            CurrencyValue invoiceTotal = invoiceTotalField.Value.AsCurrency();
            Console.WriteLine($"Invoice Total: '{invoiceTotal.Symbol}{invoiceTotal.Amount}', with confidence {invoiceTotalField.Confidence}");
        }
    }
}

运行应用程序

将代码示例添加到应用程序后,选择 formRecognizer_quickstart 旁边的绿色“开始”按钮以生成和运行程序,或按 F5

运行 Visual Studio 程序按钮位置的屏幕截图。

预生成模型输出

下面是预期输出的代码段:

  Document 0:
  Vendor Name: 'CONTOSO LTD.', with confidence 0.962
  Customer Name: 'MICROSOFT CORPORATION', with confidence 0.951
  Item:
    Description: 'Test for 23 fields', with confidence 0.899
    Amount: '100', with confidence 0.902
  Sub Total: '100', with confidence 0.979

若要查看整个输出,请访问 GitHub 上的 Azure 示例存储库,以查看预生成模型输出

将以下代码示例添加到 Program.cs 文件。 请确保使用 Azure 门户中表单识别器实例中的值更新密钥和终结点变量:


using Azure;
using Azure.AI.FormRecognizer.DocumentAnalysis;

//set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal to create your `AzureKeyCredential` and `FormRecognizerClient` instance
string endpoint = "<your-endpoint>";
string key = "<your-key>";
AzureKeyCredential credential = new AzureKeyCredential(key);
DocumentAnalysisClient client = new DocumentAnalysisClient(new Uri(endpoint), credential);

//sample invoice document

Uri invoiceUri = new Uri ("https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf");

AnalyzeDocumentOperation operation = await client.AnalyzeDocumentFromUriAsync(WaitUntil.Completed, "prebuilt-invoice", invoiceUri);

AnalyzeResult result = operation.Value;

for (int i = 0; i < result.Documents.Count; i++)
{
    Console.WriteLine($"Document {i}:");

    AnalyzedDocument document = result.Documents[i];

    if (document.Fields.TryGetValue("VendorName", out DocumentField vendorNameField))
    {
        if (vendorNameField.FieldType == DocumentFieldType.String)
        {
            string vendorName = vendorNameField.Value.AsString();
            Console.WriteLine($"Vendor Name: '{vendorName}', with confidence {vendorNameField.Confidence}");
        }
    }

    if (document.Fields.TryGetValue("CustomerName", out DocumentField customerNameField))
    {
        if (customerNameField.FieldType == DocumentFieldType.String)
        {
            string customerName = customerNameField.Value.AsString();
            Console.WriteLine($"Customer Name: '{customerName}', with confidence {customerNameField.Confidence}");
        }
    }

    if (document.Fields.TryGetValue("Items", out DocumentField itemsField))
    {
        if (itemsField.FieldType == DocumentFieldType.List)
        {
            foreach (DocumentField itemField in itemsField.Value.AsList())
            {
                Console.WriteLine("Item:");

                if (itemField.FieldType == DocumentFieldType.Dictionary)
                {
                    IReadOnlyDictionary<string, DocumentField> itemFields = itemField.Value.AsDictionary();

                    if (itemFields.TryGetValue("Description", out DocumentField itemDescriptionField))
                    {
                        if (itemDescriptionField.FieldType == DocumentFieldType.String)
                        {
                            string itemDescription = itemDescriptionField.Value.AsString();

                            Console.WriteLine($"  Description: '{itemDescription}', with confidence {itemDescriptionField.Confidence}");
                        }
                    }

                    if (itemFields.TryGetValue("Amount", out DocumentField itemAmountField))
                    {
                        if (itemAmountField.FieldType == DocumentFieldType.Currency)
                        {
                            CurrencyValue itemAmount = itemAmountField.Value.AsCurrency();

                            Console.WriteLine($"  Amount: '{itemAmount.Symbol}{itemAmount.Amount}', with confidence {itemAmountField.Confidence}");
                        }
                    }
                }
            }
        }
    }

    if (document.Fields.TryGetValue("SubTotal", out DocumentField subTotalField))
    {
        if (subTotalField.FieldType == DocumentFieldType.Currency)
        {
            CurrencyValue subTotal = subTotalField.Value.AsCurrency();
            Console.WriteLine($"Sub Total: '{subTotal.Symbol}{subTotal.Amount}', with confidence {subTotalField.Confidence}");
        }
    }

    if (document.Fields.TryGetValue("TotalTax", out DocumentField totalTaxField))
    {
        if (totalTaxField.FieldType == DocumentFieldType.Currency)
        {
            CurrencyValue totalTax = totalTaxField.Value.AsCurrency();
            Console.WriteLine($"Total Tax: '{totalTax.Symbol}{totalTax.Amount}', with confidence {totalTaxField.Confidence}");
        }
    }

    if (document.Fields.TryGetValue("InvoiceTotal", out DocumentField invoiceTotalField))
    {
        if (invoiceTotalField.FieldType == DocumentFieldType.Currency)
        {
            CurrencyValue invoiceTotal = invoiceTotalField.Value.AsCurrency();
            Console.WriteLine($"Invoice Total: '{invoiceTotal.Symbol}{invoiceTotal.Amount}', with confidence {invoiceTotalField.Confidence}");
        }
    }
}

运行应用程序

将代码示例添加到应用程序后,选择 formRecognizer_quickstart 旁边的绿色“开始”按钮以生成和运行程序,或按 F5

运行 Visual Studio 程序的屏幕截图。

在本快速入门中,使用以下功能来分析和提取表单和文档中的数据和值:

  • 布局 - 分析并提取表、行、字词和选择标记(例如文档中的单选按钮和复选框),而无需训练模型。

  • 预生成的发票 - 使用预先训练的模型分析和提取特定文档类型的公共字段。

先决条件

  • Azure 订阅 - 免费创建订阅

  • 最新版本的 Visual Studio Code 或者你首选的 IDE。 请参阅 Visual Studio Code 中的 Java

    提示

    • Visual Studio Code 提供适用于 Windows 和 macOS 的 "Java 编码包"。该编码包是 VS Code、Java 开发工具包 (JDK) 和 Microsoft 建议扩展的集合。 编码包还可用于修复现有开发环境。
    • 如果使用 VS Code 和适用于 Java 的编码包,请安装 Gradle for Java 扩展。
  • 如果不使用 Visual Studio Code,请确保在开发环境中安装了以下内容:

  • Azure AI 服务或文档智能资源。 获得 Azure 订阅后,在 Azure 门户中创建单服务多服务文档智能资源以获取密钥和终结点。 可以使用免费定价层 (F0) 试用该服务,然后再升级到付费层进行生产。

    提示

    如果计划通过一个终结点/密钥访问多个 Azure AI 服务,请创建 Azure AI 服务资源。 请创建仅供文档智能访问的文档智能资源。 请注意,如果你打算使用 Microsoft Entra 身份验证,则需要单一服务资源。

  • 部署资源后,选择“转到资源”。 需要从创建的资源获取密钥和终结点,以便将应用程序连接到文档智能 API。 稍后将密钥和终结点粘贴到代码中:

    该屏幕截图显示了 Azure 门户中密钥和终结点的位置。

设置

创建新的 Gradle 项目

  1. 在控制台窗口(例如 cmd、PowerShell 或 Bash)中,为应用创建名为 doc-intel-app 的新目录,并导航到该目录。

    mkdir doc-intel-app && doc-intel-app
    
    mkdir doc-intel-app; cd doc-intel-app
    
  2. 从工作目录运行 gradle init 命令。 此命令将创建 Gradle 的基本生成文件,其中包括 build.gradle.kts - 在运行时将使用该文件创建并配置应用程序。

    gradle init --type basic
    
  3. 当提示你选择一个 DSL 时,选择 Kotlin

  4. 按 Return 键或 Enter 键接受默认项目名称 (doc-intel-app)。

  1. 在控制台窗口(例如 cmd、PowerShell 或 Bash)中,为应用创建名为 form-recognize-app 的新目录,并导航到该目录

    mkdir form-recognize-app && form-recognize-app
    
    mkdir form-recognize-app; cd form-recognize-app
    
  2. 从工作目录运行 gradle init 命令。 此命令将创建 Gradle 的基本生成文件,其中包括 build.gradle.kts - 在运行时将使用该文件创建并配置应用程序。

    gradle init --type basic
    
  3. 当提示你选择一个 DSL 时,选择 Kotlin

  4. 通过选择 Return 或 Enter 接受默认项目名称 (form-recognize-app)。

安装客户端库

本快速入门使用 Gradle 依赖项管理器。 可以在 Maven 中央存储库中找到客户端库以及其他依赖项管理器的信息。

在 IDE 中打开项目的 build.gradle.kts 文件。 复制粘贴以下代码,将客户端库作为 implementation 语句与所需的插件和设置一起包括在内。

   plugins {
       java
       application
   }
   application {
       mainClass.set("DocIntelligence")
   }
   repositories {
       mavenCentral()
   }
   dependencies {
       implementation group: 'com.azure', name: 'azure-ai-documentintelligence', version: '1.0.0-beta.4'

   }

本快速入门使用 Gradle 依赖项管理器。 可以在 Maven 中央存储库中找到客户端库以及其他依赖项管理器的信息。

在 IDE 中打开项目的 build.gradle.kts 文件。 复制粘贴以下代码,将客户端库作为 implementation 语句与所需的插件和设置一起包括在内。

   plugins {
       java
       application
   }
   application {
       mainClass.set("FormRecognizer")
   }
   repositories {
       mavenCentral()
   }
   dependencies {
       implementation group: 'com.azure', name: 'azure-ai-formrecognizer', version: '4.1.0'

   }

本快速入门使用 Gradle 依赖项管理器。 可以在 Maven 中央存储库中找到客户端库以及其他依赖项管理器的信息。

在 IDE 中打开项目的 build.gradle.kts 文件。 复制粘贴以下代码,将客户端库作为 implementation 语句与所需的插件和设置一起包括在内。

   plugins {
       java
       application
   }
   application {
       mainClass.set("FormRecognizer")
   }
   repositories {
       mavenCentral()
   }
   dependencies {
       implementation group: 'com.azure', name: 'azure-ai-formrecognizer', version: '4.0.0'


   }

创建 Java 应用程序

若要与文档智能服务交互,需要创建 DocumentIntelligenceClient 类的实例。 为此,你要通过 Azure 门户使用 key 创建一个 AzureKeyCredential,并使用 AzureKeyCredential 和文档智能 endpoint 创建一个 DocumentIntelligenceClient 实例。

若要与文档智能服务交互,需要创建 DocumentAnalysisClient 类的实例。 为此,你要通过 Azure 门户使用 key 创建一个 AzureKeyCredential,并使用 AzureKeyCredential 和文档智能 endpoint 创建一个 DocumentAnalysisClient 实例。

  1. 从 doc-intel-app 目录运行以下命令:

    mkdir -p src/main/java
    

    创建以下目录结构:

    Java 目录结构的屏幕截图

  1. 导航到 java 目录,创建一个名为 DocIntelligence.java 的文件。

    提示

    • 可以使用 Powershell 创建新文件。
    • 按住 Shift 键并右键单击该文件夹,在项目目录中打开 Powershell 窗口。
    • 键入以下命令 New-Item DocIntelligence.java。
  2. 打开 DocIntelligence.java 文件。 将以下代码示例之一复制并粘贴到应用程序中:

  1. 导航到 java 目录,创建一个名为 FormRecognizer.java 的文件。

    提示

    • 可以使用 Powershell 创建新文件。
    • 按住 Shift 键并右键单击该文件夹,在项目目录中打开 Powershell 窗口。
    • 键入以下命令:New-Item FormRecognizer.java
  2. 打开 FormRecognizer.java 文件。 将以下代码示例之一复制并粘贴到应用程序中:

重要

完成后,请记住将密钥从代码中删除,并且永远不要公开发布该密钥。 对于生产来说,请使用安全的方式存储和访问凭据,例如 Azure Key Vault。 有关详细信息,请参阅 Azure AI 服务安全性

布局模型

从文档中提取文本、选择标记、文本样式、表结构和边界区域坐标。

  • 对于此示例,需要 URI 中的一个文档文件。 在本快速入门中,可使用示例文档
  • 若要分析 URI 上的给定文件,请使用 beginAnalyzeDocumentFromUrl 方法并传递 prebuilt-layout 作为模型 ID。返回的值是一个 AnalyzeResult 对象,其中包含有关提交的文档的数据。
  • 我们已将文件 URI 值添加到 main 方法的 documentUrl 变量中。

将以下代码示例添加到 DocIntelligence.java 文件中。 请确保使用 Azure 门户中文档智能实例中的值更新密钥和终结点变量:


import com.azure.ai.documentintelligence.models.AnalyzeDocumentRequest;
import com.azure.ai.documentintelligence.models.AnalyzeResult;
import com.azure.ai.documentintelligence.models.AnalyzeResultOperation;
import com.azure.ai.documentintelligence.models.DocumentTable;
import com.azure.core.credential.AzureKeyCredential;
import com.azure.core.util.polling.SyncPoller;

import java.util.List;

public class DocIntelligence {

  // set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
  private static final String endpoint = "<your-endpoint>";
  private static final String key = "<your-key>";

  public static void main(String[] args) {

    // create your `DocumentIntelligenceClient` instance and `AzureKeyCredential` variable
    DocumentIntelligenceClient client = new DocumentIntelligenceClientBuilder()
      .credential(new AzureKeyCredential(key))
      .endpoint(endpoint)
      .buildClient();

    // sample document
    String modelId = "prebuilt-layout";
    String documentUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf";

    SyncPoller <AnalyzeResultOperation, AnalyzeResultOperation> analyzeLayoutPoller =
      client.beginAnalyzeDocument(modelId,
          null,
          null,
          null,
          null,
          null,
          null,
          new AnalyzeDocumentRequest().setUrlSource(documentUrl));

    AnalyzeResult analyzeLayoutResult = analyzeLayoutPoller.getFinalResult().getAnalyzeResult();

    // pages
    analyzeLayoutResult.getPages().forEach(documentPage -> {
      System.out.printf("Page has width: %.2f and height: %.2f, measured with unit: %s%n",
        documentPage.getWidth(),
        documentPage.getHeight(),
        documentPage.getUnit());

      // lines
      documentPage.getLines().forEach(documentLine ->
        System.out.printf("Line '%s' is within a bounding polygon %s.%n",
          documentLine.getContent(),
          documentLine.getPolygon()));

      // words
      documentPage.getWords().forEach(documentWord ->
        System.out.printf("Word '%s' has a confidence score of %.2f.%n",
          documentWord.getContent(),
          documentWord.getConfidence()));

      // selection marks
      documentPage.getSelectionMarks().forEach(documentSelectionMark ->
        System.out.printf("Selection mark is '%s' and is within a bounding polygon %s with confidence %.2f.%n",
          documentSelectionMark.getState().toString(),
          documentSelectionMark.getPolygon(),
          documentSelectionMark.getConfidence()));
    });

    // tables
    List < DocumentTable > tables = analyzeLayoutResult.getTables();
    for (int i = 0; i < tables.size(); i++) {
      DocumentTable documentTable = tables.get(i);
      System.out.printf("Table %d has %d rows and %d columns.%n", i, documentTable.getRowCount(),
        documentTable.getColumnCount());
      documentTable.getCells().forEach(documentTableCell -> {
        System.out.printf("Cell '%s', has row index %d and column index %d.%n", documentTableCell.getContent(),
          documentTableCell.getRowIndex(), documentTableCell.getColumnIndex());
      });
      System.out.println();
    }

    // styles
    analyzeLayoutResult.getStyles().forEach(documentStyle -
      > System.out.printf("Document is handwritten %s.%n", documentStyle.isHandwritten()));
  }
}

生成并运行应用程序

将代码示例添加到应用程序后,导航回主项目目录 -“doc-intel-app”

  1. 使用 build 命令生成应用程序:

    gradle build
    
  2. 使用 run 命令运行应用程序:

    gradle run
    

将以下代码示例添加到 FormRecognizer.java 文件中。 请确保使用 Azure 门户中文档智能实例中的值更新密钥和终结点变量:


import com.azure.ai.formrecognizer.documentanalysis.models.*;
import com.azure.ai.formrecognizer.documentanalysis.DocumentAnalysisClient;
import com.azure.ai.formrecognizer.documentanalysis.DocumentAnalysisClientBuilder;

import com.azure.core.credential.AzureKeyCredential;
import com.azure.core.util.polling.SyncPoller;

import java.io.IOException;
import java.util.List;
import java.util.Arrays;
import java.time.LocalDate;
import java.util.Map;
import java.util.stream.Collectors;

public class FormRecognizer {

  // set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
  private static final String endpoint = "<your-endpoint>";
  private static final String key = "<your-key>";

  public static void main(String[] args) {

    // create your `DocumentAnalysisClient` instance and `AzureKeyCredential` variable
    DocumentAnalysisClient client = new DocumentAnalysisClientBuilder()
      .credential(new AzureKeyCredential(key))
      .endpoint(endpoint)
      .buildClient();

    // sample document
    String documentUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf";
    String modelId = "prebuilt-layout";

    SyncPoller < OperationResult, AnalyzeResult > analyzeLayoutResultPoller =
      client.beginAnalyzeDocumentFromUrl(modelId, documentUrl);

    AnalyzeResult analyzeLayoutResult = analyzeLayoutResultPoller.getFinalResult();

    // pages
    analyzeLayoutResult.getPages().forEach(documentPage -> {
      System.out.printf("Page has width: %.2f and height: %.2f, measured with unit: %s%n",
        documentPage.getWidth(),
        documentPage.getHeight(),
        documentPage.getUnit());

      // lines
      documentPage.getLines().forEach(documentLine ->
        System.out.printf("Line %s is within a bounding polygon %s.%n",
          documentLine.getContent(),
          documentLine.getBoundingPolygon().toString()));

      // words
      documentPage.getWords().forEach(documentWord ->
        System.out.printf("Word '%s' has a confidence score of %.2f%n",
          documentWord.getContent(),
          documentWord.getConfidence()));

      // selection marks
      documentPage.getSelectionMarks().forEach(documentSelectionMark ->
        System.out.printf("Selection mark is %s and is within a bounding polygon %s with confidence %.2f.%n",
          documentSelectionMark.getState().toString(),
          documentSelectionMark.getBoundingPolygon().toString(),
          documentSelectionMark.getConfidence()));
    });

    // tables
    List < DocumentTable > tables = analyzeLayoutResult.getTables();
    for (int i = 0; i < tables.size(); i++) {
      DocumentTable documentTable = tables.get(i);
      System.out.printf("Table %d has %d rows and %d columns.%n", i, documentTable.getRowCount(),
        documentTable.getColumnCount());
      documentTable.getCells().forEach(documentTableCell -> {
        System.out.printf("Cell '%s', has row index %d and column index %d.%n", documentTableCell.getContent(),
          documentTableCell.getRowIndex(), documentTableCell.getColumnIndex());
      });
      System.out.println();
    }
  }
  // Utility function to get the bounding polygon coordinates
  private static String getBoundingCoordinates(List < Point > boundingPolygon) {
    return boundingPolygon.stream().map(point -> String.format("[%.2f, %.2f]", point.getX(),
      point.getY())).collect(Collectors.joining(", "));
  }
}

生成并运行应用程序

将代码示例添加到应用程序后,导航回主项目目录 -“form-recognize-app”

  1. 使用 build 命令生成应用程序:

    gradle build
    
  2. 使用 run 命令运行应用程序:

    gradle run
    

布局模型输出

下面是预期输出的代码段:

  Table 0 has 5 rows and 3 columns.
  Cell 'Title of each class', has row index 0 and column index 0.
  Cell 'Trading Symbol', has row index 0 and column index 1.
  Cell 'Name of exchange on which registered', has row index 0 and column index 2.
  Cell 'Common stock, $0.00000625 par value per share', has row index 1 and column index 0.
  Cell 'MSFT', has row index 1 and column index 1.
  Cell 'NASDAQ', has row index 1 and column index 2.
  Cell '2.125% Notes due 2021', has row index 2 and column index 0.
  Cell 'MSFT', has row index 2 and column index 1.
  Cell 'NASDAQ', has row index 2 and column index 2.
  Cell '3.125% Notes due 2028', has row index 3 and column index 0.
  Cell 'MSFT', has row index 3 and column index 1.
  Cell 'NASDAQ', has row index 3 and column index 2.
  Cell '2.625% Notes due 2033', has row index 4 and column index 0.
  Cell 'MSFT', has row index 4 and column index 1.
  Cell 'NASDAQ', has row index 4 and column index 2.

若要查看整个输出,请访问 GitHub 上的 Azure 示例存储库,以查看布局模型输出

将以下代码示例添加到 FormRecognizer.java 文件中。 请确保使用 Azure 门户中文档智能实例中的值更新密钥和终结点变量:


import com.azure.ai.formrecognizer.documentanalysis.DocumentAnalysisClient;
import com.azure.ai.formrecognizer.documentanalysis.DocumentAnalysisClientBuilder;
import com.azure.ai.formrecognizer.documentanalysis.models.AnalyzeResult;
import com.azure.ai.formrecognizer.documentanalysis.models.OperationResult;
import com.azure.ai.formrecognizer.documentanalysis.models.DocumentTable;
import com.azure.ai.formrecognizer.documentanalysis.models.Point;
import com.azure.core.credential.AzureKeyCredential;
import com.azure.core.util.polling.SyncPoller;

import java.util.List;
import java.util.stream.Collectors;

public class FormRecognizer {

  // set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
  private static final String endpoint = "<your-endpoint>";
  private static final String key = "<your-key>";

  public static void main(String[] args) {

    // create your `DocumentAnalysisClient` instance and `AzureKeyCredential` variable
    DocumentAnalysisClient client = new DocumentAnalysisClientBuilder()
      .credential(new AzureKeyCredential(key))
      .endpoint(endpoint)
      .buildClient();

    // sample document
    String documentUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf";
    String modelId = "prebuilt-layout";

    SyncPoller < OperationResult, AnalyzeResult > analyzeLayoutPoller =
      client.beginAnalyzeDocumentFromUrl(modelId, documentUrl);

    AnalyzeResult analyzeLayoutResult = analyzeLayoutPoller.getFinalResult();

    // pages
    analyzeLayoutResult.getPages().forEach(documentPage -> {
      System.out.printf("Page has width: %.2f and height: %.2f, measured with unit: %s%n",
        documentPage.getWidth(),
        documentPage.getHeight(),
        documentPage.getUnit());

      // lines
      documentPage.getLines().forEach(documentLine ->
        System.out.printf("Line '%s' is within a bounding polygon %s.%n",
          documentLine.getContent(),
          getBoundingCoordinates(documentLine.getBoundingPolygon())));

      // words
      documentPage.getWords().forEach(documentWord ->
        System.out.printf("Word '%s' has a confidence score of %.2f.%n",
          documentWord.getContent(),
          documentWord.getConfidence()));

      // selection marks
      documentPage.getSelectionMarks().forEach(documentSelectionMark ->
        System.out.printf("Selection mark is '%s' and is within a bounding polygon %s with confidence %.2f.%n",
          documentSelectionMark.getSelectionMarkState().toString(),
          getBoundingCoordinates(documentSelectionMark.getBoundingPolygon()),
          documentSelectionMark.getConfidence()));
    });

    // tables
    List < DocumentTable > tables = analyzeLayoutResult.getTables();
    for (int i = 0; i < tables.size(); i++) {
      DocumentTable documentTable = tables.get(i);
      System.out.printf("Table %d has %d rows and %d columns.%n", i, documentTable.getRowCount(),
        documentTable.getColumnCount());
      documentTable.getCells().forEach(documentTableCell -> {
        System.out.printf("Cell '%s', has row index %d and column index %d.%n", documentTableCell.getContent(),
          documentTableCell.getRowIndex(), documentTableCell.getColumnIndex());
      });
      System.out.println();
    }

    // styles
    analyzeLayoutResult.getStyles().forEach(documentStyle -
      > System.out.printf("Document is handwritten %s.%n", documentStyle.isHandwritten()));
  }

  /**
   * Utility function to get the bounding polygon coordinates.
   */
  private static String getBoundingCoordinates(List < Point > boundingPolygon) {
    return boundingPolygon.stream().map(point -> String.format("[%.2f, %.2f]", point.getX(),
      point.getY())).collect(Collectors.joining(", "));
  }
}

生成并运行应用程序

将代码示例添加到应用程序后,导航回主项目目录 -“form-recognize-app”

  1. 使用 build 命令生成应用程序:

    gradle build
    
  2. 使用 run 命令运行应用程序:

    gradle run
    

预生成模型

使用预生成模型分析和提取特定文档类型的公共字段。 在本示例中,我们将使用预生成发票模型分析发票。

提示

不止发票,还有几个预生成模型可供选择,每个模型都有自己的一组受支持的字段。 用于 analyze 操作的模型由要分析的文档类型确定。 请参阅模型数据提取

  • 使用预生成发票模型分析发票。 在本快速入门中,可使用示例发票文档
  • 我们已将文件 URL 值添加到文件顶部的 invoiceUrl 变量中。
  • 若要分析 URI 中的给定文件,请使用 beginAnalyzeDocuments 方法并传递 PrebuiltModels.Invoice 作为模型 ID。返回的值是一个 result 对象,其中包含有关提交的文档的数据。
  • 为简洁起见,此处并未显示服务返回的所有键值对。 若要查看所有支持的字段和相应类型的列表,请参阅发票概念页。

将以下代码示例添加到 DocIntelligence.java 文件中。 请确保使用 Azure 门户中文档智能实例中的值更新密钥和终结点变量:


import com.azure.ai.documentintelligence.models.AnalyzeDocumentRequest;
import com.azure.ai.documentintelligence.models.AnalyzeResult;
import com.azure.ai.documentintelligence.models.AnalyzeResultOperation;
import com.azure.ai.documentintelligence.models.Document;
import com.azure.ai.documentintelligence.models.DocumentField;
import com.azure.ai.documentintelligence.models.DocumentFieldType;
import com.azure.core.credential.AzureKeyCredential;
import com.azure.core.util.polling.SyncPoller;

import java.io.IOException;
import java.time.LocalDate;
import java.util.List;
import java.util.Map;

public class DocIntelligence {

  // set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
  private static final String endpoint = "<your-endpoint>";
  private static final String key = "<your-key>";

  public static void main(String[] args) {

    // sample document
    String modelId = "prebuilt-invoice";
    String invoiceUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf";

    public static void main(final String[] args) throws IOException {

      // Instantiate a client that will be used to call the service.
      DocumentIntelligenceClient client = new DocumentIntelligenceClientBuilder()
        .credential(new AzureKeyCredential(key))
        .endpoint(endpoint)
        .buildClient();

      SyncPoller<AnalyzeResultOperation, AnalyzeResultOperation > analyzeInvoicesPoller =
        client.beginAnalyzeDocument(modelId, 
            null,
            null,
            null,
            null,
            null,
            null,
            new AnalyzeDocumentRequest().setUrlSource(invoiceUrl));

      AnalyzeResult analyzeInvoiceResult = analyzeInvoicesPoller.getFinalResult().getAnalyzeResult();

      for (int i = 0; i < analyzeInvoiceResult.getDocuments().size(); i++) {
        Document analyzedInvoice = analyzeInvoiceResult.getDocuments().get(i);
        Map < String, DocumentField > invoiceFields = analyzedInvoice.getFields();
        System.out.printf("----------- Analyzing invoice  %d -----------%n", i);
        DocumentField vendorNameField = invoiceFields.get("VendorName");
        if (vendorNameField != null) {
          if (DocumentFieldType.STRING == vendorNameField.getType()) {
            String merchantName = vendorNameField.getValueString();
            System.out.printf("Vendor Name: %s, confidence: %.2f%n",
              merchantName, vendorNameField.getConfidence());
          }
        }

        DocumentField vendorAddressField = invoiceFields.get("VendorAddress");
        if (vendorAddressField != null) {
          if (DocumentFieldType.STRING == vendorAddressField.getType()) {
            String merchantAddress = vendorAddressField.getValueString();
            System.out.printf("Vendor address: %s, confidence: %.2f%n",
              merchantAddress, vendorAddressField.getConfidence());
          }
        }

        DocumentField customerNameField = invoiceFields.get("CustomerName");
        if (customerNameField != null) {
          if (DocumentFieldType.STRING == customerNameField.getType()) {
            String merchantAddress = customerNameField.getValueString();
            System.out.printf("Customer Name: %s, confidence: %.2f%n",
              merchantAddress, customerNameField.getConfidence());
          }
        }

        DocumentField customerAddressRecipientField = invoiceFields.get("CustomerAddressRecipient");
        if (customerAddressRecipientField != null) {
          if (DocumentFieldType.STRING == customerAddressRecipientField.getType()) {
            String customerAddr = customerAddressRecipientField.getValueString();
            System.out.printf("Customer Address Recipient: %s, confidence: %.2f%n",
              customerAddr, customerAddressRecipientField.getConfidence());
          }
        }

        DocumentField invoiceIdField = invoiceFields.get("InvoiceId");
        if (invoiceIdField != null) {
          if (DocumentFieldType.STRING == invoiceIdField.getType()) {
            String invoiceId = invoiceIdField.getValueString();
            System.out.printf("Invoice ID: %s, confidence: %.2f%n",
              invoiceId, invoiceIdField.getConfidence());
          }
        }

        DocumentField invoiceDateField = invoiceFields.get("InvoiceDate");
        if (customerNameField != null) {
          if (DocumentFieldType.DATE == invoiceDateField.getType()) {
            LocalDate invoiceDate = invoiceDateField.getValueDate();
            System.out.printf("Invoice Date: %s, confidence: %.2f%n",
              invoiceDate, invoiceDateField.getConfidence());
          }
        }

        DocumentField invoiceTotalField = invoiceFields.get("InvoiceTotal");
        if (customerAddressRecipientField != null) {
          if (DocumentFieldType.NUMBER == invoiceTotalField.getType()) {
            Double invoiceTotal = invoiceTotalField.getValueNumber();
            System.out.printf("Invoice Total: %.2f, confidence: %.2f%n",
              invoiceTotal, invoiceTotalField.getConfidence());
          }
        }

        DocumentField invoiceItemsField = invoiceFields.get("Items");
        if (invoiceItemsField != null) {
          System.out.printf("Invoice Items: %n");
          if (DocumentFieldType.ARRAY == invoiceItemsField.getType()) {
            List < DocumentField > invoiceItems = invoiceItemsField.getValueArray();
            invoiceItems.stream()
              .filter(invoiceItem -> DocumentFieldType.OBJECT == invoiceItem.getType())
              .map(documentField -> documentField.getValueObject())
              .forEach(documentFieldMap -> documentFieldMap.forEach((key, documentField) -> {

                // See a full list of fields found on an invoice here:
                // https://aka.ms/documentintelligence/invoicefields

                if ("Description".equals(key)) {
                  if (DocumentFieldType.STRING == documentField.getType()) {
                    String name = documentField.getValueString();
                    System.out.printf("Description: %s, confidence: %.2fs%n",
                      name, documentField.getConfidence());
                  }
                }
                if ("Quantity".equals(key)) {
                  if (DocumentFieldType.NUMBER == documentField.getType()) {
                    Double quantity = documentField.getValueNumber();
                    System.out.printf("Quantity: %f, confidence: %.2f%n",
                      quantity, documentField.getConfidence());
                  }
                }
                if ("UnitPrice".equals(key)) {
                  if (DocumentFieldType.NUMBER == documentField.getType()) {
                    Double unitPrice = documentField.getValueNumber();
                    System.out.printf("Unit Price: %f, confidence: %.2f%n",
                      unitPrice, documentField.getConfidence());
                  }
                }
                if ("ProductCode".equals(key)) {
                  if (DocumentFieldType.NUMBER == documentField.getType()) {
                    Double productCode = documentField.getValueNumber();
                    System.out.printf("Product Code: %f, confidence: %.2f%n",
                      productCode, documentField.getConfidence());
                  }
                }
              }));
          }
        }
      }
    }
  }
}

生成并运行应用程序

将代码示例添加到应用程序后,导航回主项目目录 -“doc-intel-app”

  1. 使用 build 命令生成应用程序:

    gradle build
    
  2. 使用 run 命令运行应用程序:

    gradle run
    

将以下代码示例添加到 FormRecognizer.java 文件中。 请确保使用 Azure 门户中文档智能实例中的值更新密钥和终结点变量:


import com.azure.ai.formrecognizer.documentanalysis.models.*;
import com.azure.ai.formrecognizer.documentanalysis.DocumentAnalysisClient;
import com.azure.ai.formrecognizer.documentanalysis.DocumentAnalysisClientBuilder;

import com.azure.core.credential.AzureKeyCredential;
import com.azure.core.util.polling.SyncPoller;

import java.io.IOException;
import java.util.List;
import java.util.Arrays;
import java.time.LocalDate;
import java.util.Map;
import java.util.stream.Collectors;

public class FormRecognizer {

  // set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
  private static final String endpoint = "<your-endpoint>";
  private static final String key = "<your-key>";

  public static void main(final String[] args) throws IOException {

    // create your `DocumentAnalysisClient` instance and `AzureKeyCredential` variable
    DocumentAnalysisClient client = new DocumentAnalysisClientBuilder()
      .credential(new AzureKeyCredential(key))
      .endpoint(endpoint)
      .buildClient();

    // sample document
    String modelId = "prebuilt-invoice";
    String invoiceUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf";

    SyncPoller < OperationResult, AnalyzeResult > analyzeInvoicePoller = client.beginAnalyzeDocumentFromUrl(modelId, invoiceUrl);

    AnalyzeResult analyzeInvoiceResult = analyzeInvoicePoller.getFinalResult();

    for (int i = 0; i < analyzeInvoiceResult.getDocuments().size(); i++) {
      AnalyzedDocument analyzedInvoice = analyzeInvoiceResult.getDocuments().get(i);
      Map < String, DocumentField > invoiceFields = analyzedInvoice.getFields();
      System.out.printf("----------- Analyzing invoice  %d -----------%n", i);
      DocumentField vendorNameField = invoiceFields.get("VendorName");
      if (vendorNameField != null) {
        if (DocumentFieldType.STRING == vendorNameField.getType()) {
          String merchantName = vendorNameField.getValueAsString();
          System.out.printf("Vendor Name: %s, confidence: %.2f%n",
            merchantName, vendorNameField.getConfidence());
        }
      }

      DocumentField vendorAddressField = invoiceFields.get("VendorAddress");
      if (vendorAddressField != null) {
        if (DocumentFieldType.STRING == vendorAddressField.getType()) {
          String merchantAddress = vendorAddressField.getValueAsString();
          System.out.printf("Vendor address: %s, confidence: %.2f%n",
            merchantAddress, vendorAddressField.getConfidence());
        }
      }

      DocumentField customerNameField = invoiceFields.get("CustomerName");
      if (customerNameField != null) {
        if (DocumentFieldType.STRING == customerNameField.getType()) {
          String merchantAddress = customerNameField.getValueAsString();
          System.out.printf("Customer Name: %s, confidence: %.2f%n",
            merchantAddress, customerNameField.getConfidence());
        }
      }

      DocumentField customerAddressRecipientField = invoiceFields.get("CustomerAddressRecipient");
      if (customerAddressRecipientField != null) {
        if (DocumentFieldType.STRING == customerAddressRecipientField.getType()) {
          String customerAddr = customerAddressRecipientField.getValueAsString();
          System.out.printf("Customer Address Recipient: %s, confidence: %.2f%n",
            customerAddr, customerAddressRecipientField.getConfidence());
        }
      }

      DocumentField invoiceIdField = invoiceFields.get("InvoiceId");
      if (invoiceIdField != null) {
        if (DocumentFieldType.STRING == invoiceIdField.getType()) {
          String invoiceId = invoiceIdField.getValueAsString();
          System.out.printf("Invoice ID: %s, confidence: %.2f%n",
            invoiceId, invoiceIdField.getConfidence());
        }
      }

      DocumentField invoiceDateField = invoiceFields.get("InvoiceDate");
      if (customerNameField != null) {
        if (DocumentFieldType.DATE == invoiceDateField.getType()) {
          LocalDate invoiceDate = invoiceDateField.getValueAsDate();
          System.out.printf("Invoice Date: %s, confidence: %.2f%n",
            invoiceDate, invoiceDateField.getConfidence());
        }
      }

      DocumentField invoiceTotalField = invoiceFields.get("InvoiceTotal");
      if (customerAddressRecipientField != null) {
        if (DocumentFieldType.DOUBLE == invoiceTotalField.getType()) {
          Double invoiceTotal = invoiceTotalField.getValueAsDouble();
          System.out.printf("Invoice Total: %.2f, confidence: %.2f%n",
            invoiceTotal, invoiceTotalField.getConfidence());
        }
      }

      DocumentField invoiceItemsField = invoiceFields.get("Items");
      if (invoiceItemsField != null) {
        System.out.printf("Invoice Items: %n");
        if (DocumentFieldType.LIST == invoiceItemsField.getType()) {
          List < DocumentField > invoiceItems = invoiceItemsField.getValueAsList();
          invoiceItems.stream()
            .filter(invoiceItem -> DocumentFieldType.MAP == invoiceItem.getType())
            .map(documentField -> documentField.getValueAsMap())
            .forEach(documentFieldMap -> documentFieldMap.forEach((key, documentField) -> {

              // See a full list of fields found on an invoice here:
              // https://aka.ms/formrecognizer/invoicefields

              if ("Description".equals(key)) {
                if (DocumentFieldType.STRING == documentField.getType()) {
                  String name = documentField.getValueAsString();
                  System.out.printf("Description: %s, confidence: %.2fs%n",
                    name, documentField.getConfidence());
                }
              }
              if ("Quantity".equals(key)) {
                if (DocumentFieldType.DOUBLE == documentField.getType()) {
                  Double quantity = documentField.getValueAsDouble();
                  System.out.printf("Quantity: %f, confidence: %.2f%n",
                    quantity, documentField.getConfidence());
                }
              }
              if ("UnitPrice".equals(key)) {
                if (DocumentFieldType.DOUBLE == documentField.getType()) {
                  Double unitPrice = documentField.getValueAsDouble();
                  System.out.printf("Unit Price: %f, confidence: %.2f%n",
                    unitPrice, documentField.getConfidence());
                }
              }
              if ("ProductCode".equals(key)) {
                if (DocumentFieldType.DOUBLE == documentField.getType()) {
                  Double productCode = documentField.getValueAsDouble();
                  System.out.printf("Product Code: %f, confidence: %.2f%n",
                    productCode, documentField.getConfidence());
                }
              }
            }));
        }
      }
    }
  }
}

生成并运行应用程序

将代码示例添加到应用程序后,导航回主项目目录 -“doc-intel-app”

  1. 使用 build 命令生成应用程序:

    gradle build
    
  2. 使用 run 命令运行应用程序:

    gradle run
    

预生成模型输出

下面是预期输出的代码段:

  ----------- Analyzing invoice  0 -----------
  Analyzed document has doc type invoice with confidence : 1.00
  Vendor Name: CONTOSO LTD., confidence: 0.92
  Vendor address: 123 456th St New York, NY, 10001, confidence: 0.91
  Customer Name: MICROSOFT CORPORATION, confidence: 0.84
  Customer Address Recipient: Microsoft Corp, confidence: 0.92
  Invoice ID: INV-100, confidence: 0.97
  Invoice Date: 2019-11-15, confidence: 0.97

若要查看整个输出,请访问 GitHub 上的 Azure 示例存储库,以查看预生成模型输出

将以下代码示例添加到 FormRecognizer.java 文件中。 请确保使用 Azure 门户中文档智能实例中的值更新密钥和终结点变量:


import com.azure.ai.formrecognizer.documentanalysis.DocumentAnalysisClient;
import com.azure.ai.formrecognizer.documentanalysis.DocumentAnalysisClientBuilder;
import com.azure.ai.formrecognizer.documentanalysis.models.AnalyzeResult;
import com.azure.ai.formrecognizer.documentanalysis.models.AnalyzedDocument;
import com.azure.ai.formrecognizer.documentanalysis.models.DocumentField;
import com.azure.ai.formrecognizer.documentanalysis.models.DocumentFieldType;
import com.azure.ai.formrecognizer.documentanalysis.models.OperationResult;
import com.azure.core.credential.AzureKeyCredential;
import com.azure.core.util.polling.SyncPoller;

import java.io.IOException;
import java.time.LocalDate;
import java.util.List;
import java.util.Map;

public class FormRecognizer {

  // set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
  private static final String endpoint = "<your-endpoint>";
  private static final String key = "<your-key>";

  public static void main(String[] args) {

    // create your `DocumentAnalysisClient` instance and `AzureKeyCredential` variable
    DocumentAnalysisClient client = new DocumentAnalysisClientBuilder()
      .credential(new AzureKeyCredential(key))
      .endpoint(endpoint)
      .buildClient();

    // sample document
    String modelId = "prebuilt-invoice";
    String invoiceUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf";

    SyncPoller < OperationResult, AnalyzeResult > analyzeInvoicePoller = client.beginAnalyzeDocumentFromUrl(modelId, invoiceUrl);

    AnalyzeResult analyzeInvoiceResult = analyzeInvoicePoller.getFinalResult();

    for (int i = 0; i < analyzeInvoiceResult.getDocuments().size(); i++) {
      AnalyzedDocument analyzedInvoice = analyzeInvoiceResult.getDocuments().get(i);
      Map < String, DocumentField > invoiceFields = analyzedInvoice.getFields();
      System.out.printf("----------- Analyzing invoice  %d -----------%n", i);
      DocumentField vendorNameField = invoiceFields.get("VendorName");
      if (vendorNameField != null) {
        if (DocumentFieldType.STRING == vendorNameField.getType()) {
          String merchantName = vendorNameField.getValueAsString();
          System.out.printf("Vendor Name: %s, confidence: %.2f%n",
            merchantName, vendorNameField.getConfidence());
        }
      }

      DocumentField vendorAddressField = invoiceFields.get("VendorAddress");
      if (vendorAddressField != null) {
        if (DocumentFieldType.STRING == vendorAddressField.getType()) {
          String merchantAddress = vendorAddressField.getValueAsString();
          System.out.printf("Vendor address: %s, confidence: %.2f%n",
            merchantAddress, vendorAddressField.getConfidence());
        }
      }

      DocumentField customerNameField = invoiceFields.get("CustomerName");
      if (customerNameField != null) {
        if (DocumentFieldType.STRING == customerNameField.getType()) {
          String merchantAddress = customerNameField.getValueAsString();
          System.out.printf("Customer Name: %s, confidence: %.2f%n",
            merchantAddress, customerNameField.getConfidence());
        }
      }

      DocumentField customerAddressRecipientField = invoiceFields.get("CustomerAddressRecipient");
      if (customerAddressRecipientField != null) {
        if (DocumentFieldType.STRING == customerAddressRecipientField.getType()) {
          String customerAddr = customerAddressRecipientField.getValueAsString();
          System.out.printf("Customer Address Recipient: %s, confidence: %.2f%n",
            customerAddr, customerAddressRecipientField.getConfidence());
        }
      }

      DocumentField invoiceIdField = invoiceFields.get("InvoiceId");
      if (invoiceIdField != null) {
        if (DocumentFieldType.STRING == invoiceIdField.getType()) {
          String invoiceId = invoiceIdField.getValueAsString();
          System.out.printf("Invoice ID: %s, confidence: %.2f%n",
            invoiceId, invoiceIdField.getConfidence());
        }
      }

      DocumentField invoiceDateField = invoiceFields.get("InvoiceDate");
      if (customerNameField != null) {
        if (DocumentFieldType.DATE == invoiceDateField.getType()) {
          LocalDate invoiceDate = invoiceDateField.getValueAsDate();
          System.out.printf("Invoice Date: %s, confidence: %.2f%n",
            invoiceDate, invoiceDateField.getConfidence());
        }
      }

      DocumentField invoiceTotalField = invoiceFields.get("InvoiceTotal");
      if (customerAddressRecipientField != null) {
        if (DocumentFieldType.DOUBLE == invoiceTotalField.getType()) {
          Double invoiceTotal = invoiceTotalField.getValueAsDouble();
          System.out.printf("Invoice Total: %.2f, confidence: %.2f%n",
            invoiceTotal, invoiceTotalField.getConfidence());
        }
      }

      DocumentField invoiceItemsField = invoiceFields.get("Items");
      if (invoiceItemsField != null) {
        System.out.printf("Invoice Items: %n");
        if (DocumentFieldType.LIST == invoiceItemsField.getType()) {
          List < DocumentField > invoiceItems = invoiceItemsField.getValueAsList();
          invoiceItems.stream()
            .filter(invoiceItem -> DocumentFieldType.MAP == invoiceItem.getType())
            .map(documentField -> documentField.getValueAsMap())
            .forEach(documentFieldMap -> documentFieldMap.forEach((key, documentField) -> {

              // See a full list of fields found on an invoice here:
              // https://aka.ms/formrecognizer/invoicefields

              if ("Description".equals(key)) {
                if (DocumentFieldType.STRING == documentField.getType()) {
                  String name = documentField.getValueAsString();
                  System.out.printf("Description: %s, confidence: %.2fs%n",
                    name, documentField.getConfidence());
                }
              }
              if ("Quantity".equals(key)) {
                if (DocumentFieldType.DOUBLE == documentField.getType()) {
                  Double quantity = documentField.getValueAsDouble();
                  System.out.printf("Quantity: %f, confidence: %.2f%n",
                    quantity, documentField.getConfidence());
                }
              }
              if ("UnitPrice".equals(key)) {
                if (DocumentFieldType.DOUBLE == documentField.getType()) {
                  Double unitPrice = documentField.getValueAsDouble();
                  System.out.printf("Unit Price: %f, confidence: %.2f%n",
                    unitPrice, documentField.getConfidence());
                }
              }
              if ("ProductCode".equals(key)) {
                if (DocumentFieldType.DOUBLE == documentField.getType()) {
                  Double productCode = documentField.getValueAsDouble();
                  System.out.printf("Product Code: %f, confidence: %.2f%n",
                    productCode, documentField.getConfidence());
                }
              }
            }));
        }
      }
    }
  }
}

生成并运行应用程序

将代码示例添加到应用程序后,导航回主项目目录 -“doc-intel-app”

  1. 使用 build 命令生成应用程序:

    gradle build
    
  2. 使用 run 命令运行应用程序:

    gradle run
    

在本快速入门中,使用以下功能来分析和提取表单和文档中的数据和值:

  • 布局—分析和提取表、行、字词和选择标记(例如文档中的单选按钮和复选框),而无需训练模型。

  • 预生成的发票 - 使用预先训练的发票模型分析和提取特定文档类型的公共字段。

先决条件

  • Azure 订阅 - 免费创建订阅

  • 最新版本的 Visual Studio Code 或者你首选的 IDE。 有关详细信息,请参阅 Visual Studio Code 中的 Node.js

  • Node.js 的最新 LTS 版本。

  • Azure AI 服务或文档智能资源。 获得 Azure 订阅后,在 Azure 门户中创建单服务多服务文档智能资源以获取密钥和终结点。 可以使用免费定价层 (F0) 试用该服务,然后再升级到付费层进行生产。

    提示

    如果计划通过一个终结点/密钥访问多个 Azure AI 服务,请创建 Azure AI 服务资源。 请创建仅供文档智能访问的文档智能资源。 请注意,如果你打算使用 Microsoft Entra 身份验证,则需要单一服务资源。

  • 部署资源后,选择“转到资源”。 需要从创建的资源获取密钥和终结点,以便将应用程序连接到文档智能 API。 稍后需要在本快速入门中将密钥和终结点粘贴到代码中:

    该屏幕截图显示了 Azure 门户中密钥和终结点的位置。

设置

  1. 创建新的 Node.js Express 应用程序:在控制台窗口(例如 cmd、PowerShell 或 Bash)中,为应用创建名为 doc-intel-app 的新目录并导航到该目录。

    mkdir doc-intel-app && cd doc-intel-app
    
  2. 运行 npm init 命令以初始化应用程序并为项目构建基架。

    npm init
    
  3. 使用终端中提供的提示指定项目的属性。

    • 最重要的属性包括名称、版本号和入口点。
    • 建议保留 index.js 作为入口点名称。 描述、测试命令、GitHub 存储库、关键字、作者和许可证信息均为可选属性 —— 在此项目中可跳过。
    • 通过选择“返回”或“Enter”,接受括号中的建议。
    • 完成提示后,doc-intel-app 目录中会创建一个 package.json 文件。
  1. 安装 ai-document-intelligence 客户端库和 azure/identity npm 包:

    npm i @azure-rest/ai-document-intelligence@1.0.0-beta.3 @azure/core-auth
    

    应用的 package.json 文件将使用依赖项进行更新。

  1. 安装 ai-form-recognizer 客户端库和 azure/identity npm 包:

    npm i @azure/ai-form-recognizer@5.0.0 @azure/identity
    
    • 应用的 package.json 文件将使用依赖项进行更新。
  1. 安装 ai-form-recognizer 客户端库和 azure/identity npm 包:

    npm i @azure/ai-form-recognizer@4.0.0 @azure/identity
    
  1. 在应用程序目录中创建一个名为 index.js 的文件。

    提示

    • 可以使用 Powershell 创建新文件。
    • 按住 Shift 键并右键单击该文件夹,在项目目录中打开 Powershell 窗口。
    • 键入以下命令:New-Item index.js

生成应用程序

若要与文档智能服务交互,需要创建 DocumentIntelligenceClient 类的实例。 为此,你要通过 Azure 门户使用 key 创建一个 AzureKeyCredential,并使用 AzureKeyCredential 和文档智能 endpoint 创建一个 DocumentIntelligenceClient 实例。

若要与文档智能服务交互,需要创建 DocumentAnalysisClient 类的实例。 为此,你将使用 Azure 门户的 key 创建一个 AzureKeyCredential,使用 AzureKeyCredential 和表单识别器 endpoint 创建一个 DocumentAnalysisClient 实例。

  1. 在 Visual Studio Code 或你喜欢用的 IDE 中打开 index.js 文件。 将以下代码示例之一复制并粘贴到应用程序中:

重要

完成后,请记住将密钥从代码中删除,并且永远不要公开发布该密钥。 对于生产来说,请使用安全的方式存储和访问凭据,例如 Azure Key Vault。 有关详细信息,请参阅 Azure AI 服务安全性

布局模型

从文档中提取文本、选择标记、文本样式、表结构和边界区域坐标。

  • 对于此示例,需要 URI 中的一个文档文件。 在本快速入门中,可使用示例文档
  • 我们已将文件 URL 值添加到文件顶部附近的 formUrl 变量中。
  • 若要分析 URL 中的特定文件,请使用 beginAnalyzeDocuments 方法,并传入 prebuilt-layout 作为模型 ID。
    const DocumentIntelligence = require("@azure-rest/ai-document-intelligence").default,
  { getLongRunningPoller, isUnexpected } = require("@azure-rest/ai-document-intelligence");

  const { AzureKeyCredential } = require("@azure/core-auth");

    // set `<your-key>` and `<your-endpoint>` variables with the values from the Azure portal.
    const key = "<your-key>";
    const endpoint = "<your-endpoint>";

    // sample document
    const formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf"

   async function main() {
    const client = DocumentIntelligence(endpoint, new AzureKeyCredential(key));


    const initialResponse = await client
      .path("/documentModels/{modelId}:analyze", "prebuilt-layout")
      .post({
        contentType: "application/json",
        body: {
          urlSource: formUrl
        },
       });

       if (isUnexpected(initialResponse)) {
       throw initialResponse.body.error;
     }

    const poller = await getLongRunningPoller(client, initialResponse);
    const analyzeResult = (await poller.pollUntilDone()).body.analyzeResult;

    const documents = analyzeResult?.documents;

    const document = documents && documents[0];
    if (!document) {
    throw new Error("Expected at least one document in the result.");
    }

    console.log(
    "Extracted document:",
    document.docType,
    `(confidence: ${document.confidence || "<undefined>"})`,
    );
    console.log("Fields:", document.fields);
}

main().catch((error) => {
    console.error("An error occurred:", error);
    process.exit(1);
});

运行应用程序

将代码示例添加到应用程序后,运行你的程序:

  1. 导航到你的文档智能应用程序所在的文件夹 (doc-intel-app)。

  2. 在终端中键入以下命令:

    node index.js
    

将以下代码示例添加到 index.js 文件中。 请确保使用 Azure 门户中文档智能实例中的值更新密钥和终结点变量:


 const { AzureKeyCredential, DocumentAnalysisClient } = require("@azure/ai-form-recognizer");

    // set `<your-key>` and `<your-endpoint>` variables with the values from the Azure portal.
    const key = "<your-key>";
    const endpoint = "<your-endpoint>";

    // sample document
  const formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf"

  async function main() {
    const client = new DocumentAnalysisClient(endpoint, new AzureKeyCredential(key));

    const poller = await client.beginAnalyzeDocumentFromUrl("prebuilt-layout", formUrl);

    const {
        pages,
        tables
    } = await poller.pollUntilDone();

    if (pages.length <= 0) {
        console.log("No pages were extracted from the document.");
    } else {
        console.log("Pages:");
        for (const page of pages) {
            console.log("- Page", page.pageNumber, `(unit: ${page.unit})`);
            console.log(`  ${page.width}x${page.height}, angle: ${page.angle}`);
            console.log(`  ${page.lines.length} lines, ${page.words.length} words`);
        }
    }

    if (tables.length <= 0) {
        console.log("No tables were extracted from the document.");
    } else {
        console.log("Tables:");
        for (const table of tables) {
            console.log(
                `- Extracted table: ${table.columnCount} columns, ${table.rowCount} rows (${table.cells.length} cells)`
            );
        }
    }
}

main().catch((error) => {
    console.error("An error occurred:", error);
    process.exit(1);
});

运行应用程序

将代码示例添加到应用程序后,运行你的程序:

  1. 导航到你的文档智能应用程序所在的文件夹 (doc-intel-app)。

  2. 在终端中键入以下命令:

    node index.js
    

布局模型输出

下面是预期输出的代码段:

Pages:
- Page 1 (unit: inch)
  8.5x11, angle: 0
  69 lines, 425 words
Tables:
- Extracted table: 3 columns, 5 rows (15 cells)

若要查看整个输出,请访问 GitHub 上的 Azure 示例存储库,以查看布局模型输出

预生成模型

在本示例中,我们将使用预生成发票模型分析发票。

提示

不止发票,还有几个预生成模型可供选择,每个模型都有自己的一组受支持的字段。 用于 analyze 操作的模型由要分析的文档类型确定。 请参阅模型数据提取

  • 使用预生成发票模型分析发票。 在本快速入门中,可使用示例发票文档
  • 我们已将文件 URL 值添加到文件顶部的 invoiceUrl 变量中。
  • 若要分析 URI 中的给定文件,请使用 beginAnalyzeDocuments 方法并传递 PrebuiltModels.Invoice 作为模型 ID。返回的值是一个 result 对象,其中包含有关提交的文档的数据。
  • 为简洁起见,此处并未显示服务返回的所有键值对。 若要查看所有支持的字段和相应类型的列表,请参阅发票概念页。

const DocumentIntelligence = require("@azure-rest/ai-document-intelligence").default,
  { getLongRunningPoller, isUnexpected } = require("@azure-rest/ai-document-intelligence");

const { AzureKeyCredential } = require("@azure/core-auth");

    // set `<your-key>` and `<your-endpoint>` variables with the values from the Azure portal.
    const key = "<your-key>";
    const endpoint = "<your-endpoint>";

    // sample document
    const invoiceUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf"

async function main() {

    const client = DocumentIntelligence(endpoint, new AzureKeyCredential(key));

    const initialResponse = await client
    .path("/documentModels/{modelId}:analyze", "prebuilt-invoice")
    .post({
      contentType: "application/json",
      body: {
        // The Document Intelligence service will access the URL to the invoice image and extract data from it
        urlSource: invoiceUrl,
      },
    });

    if (isUnexpected(initialResponse)) {
       throw initialResponse.body.error;
     }

    const poller = await getLongRunningPoller(client, initialResponse);

    poller.onProgress((state) => console.log("Operation:", state.result, state.status));
    const analyzeResult = (await poller.pollUntilDone()).body.analyzeResult;

    const documents = analyzeResult?.documents;

    const result = documents && documents[0];
    if (result) {
      console.log(result.fields);
    } else {
      throw new Error("Expected at least one invoice in the result.");
    }

console.log(
    "Extracted invoice:",
    document.docType,
    `(confidence: ${document.confidence || "<undefined>"})`,
  );
  console.log("Fields:", document.fields);
}


main().catch((error) => {
    console.error("An error occurred:", error);
    process.exit(1);
});

运行应用程序

将代码示例添加到应用程序后,运行你的程序:

  1. 导航到你的文档智能应用程序所在的文件夹 (doc-intel-app)。

  2. 在终端中键入以下命令:

    node index.js
    

 const {
    AzureKeyCredential,
    DocumentAnalysisClient
} = require("@azure/ai-form-recognizer");

// set `<your-key>` and `<your-endpoint>` variables with the values from the Azure portal.
const key = "<your-key>";
const endpoint = "<your-endpoint>";
// sample document
invoiceUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf"

async function main() {
    const client = new DocumentAnalysisClient(endpoint, new AzureKeyCredential(key));

    const poller = await client.beginAnalyzeDocumentFromUrl("prebuilt-invoice", invoiceUrl);

    const {
        pages,
        tables
    } = await poller.pollUntilDone();

    if (pages.length <= 0) {
        console.log("No pages were extracted from the document.");
    } else {
        console.log("Pages:");
        for (const page of pages) {
            console.log("- Page", page.pageNumber, `(unit: ${page.unit})`);
            console.log(`  ${page.width}x${page.height}, angle: ${page.angle}`);
            console.log(`  ${page.lines.length} lines, ${page.words.length} words`);

            if (page.lines && page.lines.length > 0) {
                console.log("  Lines:");

                for (const line of page.lines) {
                    console.log(`  - "${line.content}"`);

                    // The words of the line can also be iterated independently. The words are computed based on their
                    // corresponding spans.
                    for (const word of line.words()) {
                        console.log(`    - "${word.content}"`);
                    }
                }
            }
        }
    }

    if (tables.length <= 0) {
        console.log("No tables were extracted from the document.");
    } else {
        console.log("Tables:");
        for (const table of tables) {
            console.log(
                `- Extracted table: ${table.columnCount} columns, ${table.rowCount} rows (${table.cells.length} cells)`
            );
        }
    }
}

main().catch((error) => {
    console.error("An error occurred:", error);
    process.exit(1);
});

运行应用程序

将代码示例添加到应用程序后,运行你的程序:

  1. 导航到你的文档智能应用程序所在的文件夹 (doc-intel-app)。

  2. 在终端中键入以下命令:

    node index.js
    

预生成模型输出

下面是预期输出的代码段:

  Vendor Name: CONTOSO LTD.
  Customer Name: MICROSOFT CORPORATION
  Invoice Date: 2019-11-15T00:00:00.000Z
  Due Date: 2019-12-15T00:00:00.000Z
  Items:
  - <no product code>
    Description: Test for 23 fields
    Quantity: 1
    Date: undefined
    Unit: undefined
    Unit Price: 1
    Tax: undefined
    Amount: 100

若要查看整个输出,请访问 GitHub 上的 Azure 示例存储库,以查看预生成模型输出

const { AzureKeyCredential, DocumentAnalysisClient } = require("@azure/ai-form-recognizer");

  // set `<your-key>` and `<your-endpoint>` variables with the values from the Azure portal.
      const key = "<your-key>";
      const endpoint = "<your-endpoint>";
// sample document
    invoiceUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf"

async function main() {
    const client = new DocumentAnalysisClient(endpoint, new AzureKeyCredential(key));

    const poller = await client.beginAnalyzeDocument("prebuilt-invoice", invoiceUrl);

    const {
    documents: [document],
  } = await poller.pollUntilDone();


  if (document) {
    const {
      vendorName,
      customerName,
      invoiceDate,
      dueDate,
      items,
      subTotal,
      previousUnpaidBalance,
      totalTax,
      amountDue,
    } = document.fields;

    // The invoice model has many fields. For details, *see* [Invoice model field extraction](../../prebuilt/invoice.md#field-extraction)
    console.log("Vendor Name:", vendorName && vendorName.value);
    console.log("Customer Name:", customerName && customerName.value);
    console.log("Invoice Date:", invoiceDate && invoiceDate.value);
    console.log("Due Date:", dueDate && dueDate.value);

    console.log("Items:");
    for (const item of (items && items.values) || []) {
      const { productCode, description, quantity, date, unit, unitPrice, tax, amount } =
        item.properties;

      console.log("-", (productCode && productCode.value) || "<no product code>");
      console.log("  Description:", description && description.value);
      console.log("  Quantity:", quantity && quantity.value);
      console.log("  Date:", date && date.value);
      console.log("  Unit:", unit && unit.value);
      console.log("  Unit Price:", unitPrice && unitPrice.value);
      console.log("  Tax:", tax && tax.value);
      console.log("  Amount:", amount && amount.value);
    }

    console.log("Subtotal:", subTotal && subTotal.value);
    console.log("Previous Unpaid Balance:", previousUnpaidBalance && previousUnpaidBalance.value);
    console.log("Tax:", totalTax && totalTax.value);
    console.log("Amount Due:", amountDue && amountDue.value);
  } else {
    throw new Error("Expected at least one receipt in the result.");
  }
}


main().catch((error) => {
    console.error("An error occurred:", error);
    process.exit(1);
});

运行应用程序

将代码示例添加到应用程序后,运行你的程序:

  1. 导航到你的文档智能应用程序所在的文件夹 (doc-intel-app)。

  2. 在终端中键入以下命令:

    node index.js
    

在本快速入门中,使用以下功能来分析和提取表单和文档中的数据:

  • 布局 - 分析并提取表、行、字词和选择标记(例如单选按钮和复选框,以及键值对),而无需训练模型。

  • 预生成的发票 - 使用预先训练的模型分析和提取特定文档类型的公共字段。

先决条件

  • Azure 订阅 - 免费创建订阅

  • Python 3.7 或更高版本

    • 你的 Python 安装应包含 pip。 可以通过在命令行上运行 pip --version 来检查是否安装了 pip。 通过安装最新版本的 Python 获取 pip。
  • 最新版本的 Visual Studio Code 或者你首选的 IDE。 有关详细信息,请参阅 Visual Studio Code 中的 Python 入门

  • Azure AI 服务或文档智能资源。 获得 Azure 订阅后,在 Azure 门户中创建单服务多服务文档智能资源以获取密钥和终结点。 可以使用免费定价层 (F0) 试用该服务,然后再升级到付费层进行生产。

提示

如果计划通过一个终结点/密钥访问多个 Azure AI 服务,请创建 Azure AI 服务资源。 请创建仅供文档智能访问的文档智能资源。 请注意,如果你打算使用 Microsoft Entra 身份验证,则需要单一服务资源。

  • 部署资源后,选择“转到资源”。 需要从创建的资源获取密钥和终结点,以便将应用程序连接到文档智能 API。 稍后需要在本快速入门中将密钥和终结点粘贴到代码中:

    该屏幕截图显示了 Azure 门户中密钥和终结点的位置。

设置

在本地环境中打开一个终端窗口,并使用 pip 安装适用于 Python 的 Azure 文档智能客户端库:

pip install azure-ai-documentintelligence==1.0.0b4

pip install azure-ai-formrecognizer==3.3.0

pip install azure-ai-formrecognizer==3.2.0b6

创建 Python 应用程序

若要与文档智能服务交互,需要创建 DocumentIntelligenceClient 类的实例。 为此,你要通过 Azure 门户使用 key 创建一个 AzureKeyCredential,并使用 AzureKeyCredential 和文档智能 endpoint 创建一个 DocumentIntelligenceClient 实例。

  1. 在首选编辑器或 IDE 中创建一个名为“doc_intel_quickstart.py”的新 Python 文件。

  2. 打开“doc_intel_quickstart.py”文件,并选择以下代码示例之一,将其复制并粘贴到应用程序中:

若要与文档智能服务交互,需要创建 DocumentAnalysisClient 类的实例。 为此,你要通过 Azure 门户使用 key 创建一个 AzureKeyCredential,并使用 AzureKeyCredential 和文档智能 endpoint 创建一个 DocumentAnalysisClient 实例。

  1. 在首选编辑器或 IDE 中创建一个名为 form_recognizer_quickstart.py 的新 Python 文件。

  2. 打开 form_recognizer_quickstart.py 文件,并选择以下代码示例之一,将其复制并粘贴到应用程序中:

重要

完成后,请记住将密钥从代码中删除,并且永远不要公开发布该密钥。 对于生产来说,请使用安全的方式存储和访问凭据,例如 Azure Key Vault。 有关详细信息,请参阅 Azure AI 服务安全性

布局模型

从文档中提取文本、选择标记、文本样式、表结构和边界区域坐标。

  • 对于此示例,需要 URI 中的一个文档文件。 在本快速入门中,可使用示例文档
  • 我们已将文件 URL 值添加到 analyze_layout 函数的 formUrl 变量中。

将以下代码示例添加到 doc_intel_quickstart.py 应用程序。 请确保使用 Azure 门户中文档智能实例中的值更新密钥和终结点变量:


# import libraries
import os
from azure.core.credentials import AzureKeyCredential
from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.ai.documentintelligence.models import AnalyzeResult
from azure.ai.documentintelligence.models import AnalyzeDocumentRequest

# set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
endpoint = "<your-endpoint>"
key = "<your-key>"

# helper functions

def get_words(page, line):
    result = []
    for word in page.words:
        if _in_span(word, line.spans):
            result.append(word)
    return result


def _in_span(word, spans):
    for span in spans:
        if word.span.offset >= span.offset and (
            word.span.offset + word.span.length
        ) <= (span.offset + span.length):
            return True
    return False


def analyze_layout():
    # sample document
    formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf"

    document_intelligence_client = DocumentIntelligenceClient(
        endpoint=endpoint, credential=AzureKeyCredential(key)
    )

    poller = document_intelligence_client.begin_analyze_document(
        "prebuilt-layout", AnalyzeDocumentRequest(url_source=formUrl
    ))

    result: AnalyzeResult = poller.result()

    if result.styles and any([style.is_handwritten for style in result.styles]):
        print("Document contains handwritten content")
    else:
        print("Document does not contain handwritten content")

    for page in result.pages:
        print(f"----Analyzing layout from page #{page.page_number}----")
        print(
            f"Page has width: {page.width} and height: {page.height}, measured with unit: {page.unit}"
        )

        if page.lines:
            for line_idx, line in enumerate(page.lines):
                words = get_words(page, line)
                print(
                    f"...Line # {line_idx} has word count {len(words)} and text '{line.content}' "
                    f"within bounding polygon '{line.polygon}'"
                )

                for word in words:
                    print(
                        f"......Word '{word.content}' has a confidence of {word.confidence}"
                    )

        if page.selection_marks:
            for selection_mark in page.selection_marks:
                print(
                    f"Selection mark is '{selection_mark.state}' within bounding polygon "
                    f"'{selection_mark.polygon}' and has a confidence of {selection_mark.confidence}"
                )

    if result.tables:
        for table_idx, table in enumerate(result.tables):
            print(
                f"Table # {table_idx} has {table.row_count} rows and "
                f"{table.column_count} columns"
            )
            if table.bounding_regions:
                for region in table.bounding_regions:
                    print(
                        f"Table # {table_idx} location on page: {region.page_number} is {region.polygon}"
                    )
            for cell in table.cells:
                print(
                    f"...Cell[{cell.row_index}][{cell.column_index}] has text '{cell.content}'"
                )
                if cell.bounding_regions:
                    for region in cell.bounding_regions:
                        print(
                            f"...content on page {region.page_number} is within bounding polygon '{region.polygon}'"
                        )

    print("----------------------------------------")


if __name__ == "__main__":
    analyze_layout()

运行应用程序

将代码示例添加到应用程序后,构建并运行程序:

  1. 导航到“doc_intel_quickstart.py”文件所在的文件夹。

  2. 在终端中键入以下命令:

    python doc_intel_quickstart.py
    

若要分析 URL 中的给定文件,请使用 begin_analyze_document_from_url 方法并传递 prebuilt-layout 作为模型 ID。返回的值是一个 result 对象,其中包含有关提交的文档的数据。

将以下代码示例添加到 form_recognizer_quickstart.py 应用程序。 请确保使用 Azure 门户中表单识别器实例中的值更新密钥和终结点变量:


# import libraries
import os
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential

# set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
endpoint = "<your-endpoint>"
key = "<your-key>"

def format_polygon(polygon):
    if not polygon:
        return "N/A"
    return ", ".join(["[{}, {}]".format(p.x, p.y) for p in polygon])

def analyze_layout():
    # sample document
    formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf"

    document_analysis_client = DocumentAnalysisClient(
        endpoint=endpoint, credential=AzureKeyCredential(key)
    )

    poller = document_analysis_client.begin_analyze_document_from_url(
            "prebuilt-layout", formUrl)
    result = poller.result()

    for idx, style in enumerate(result.styles):
        print(
            "Document contains {} content".format(
                "handwritten" if style.is_handwritten else "no handwritten"
            )
        )

    for page in result.pages:
        print("----Analyzing layout from page #{}----".format(page.page_number))
        print(
            "Page has width: {} and height: {}, measured with unit: {}".format(
                page.width, page.height, page.unit
            )
        )

        for line_idx, line in enumerate(page.lines):
            words = line.get_words()
            print(
                "...Line # {} has word count {} and text '{}' within bounding box '{}'".format(
                    line_idx,
                    len(words),
                    line.content,
                    format_polygon(line.polygon),
                )
            )

            for word in words:
                print(
                    "......Word '{}' has a confidence of {}".format(
                        word.content, word.confidence
                    )
                )

        for selection_mark in page.selection_marks:
            print(
                "...Selection mark is '{}' within bounding box '{}' and has a confidence of {}".format(
                    selection_mark.state,
                    format_polygon(selection_mark.polygon),
                    selection_mark.confidence,
                )
            )

    for table_idx, table in enumerate(result.tables):
        print(
            "Table # {} has {} rows and {} columns".format(
                table_idx, table.row_count, table.column_count
            )
        )
        for region in table.bounding_regions:
            print(
                "Table # {} location on page: {} is {}".format(
                    table_idx,
                    region.page_number,
                    format_polygon(region.polygon),
                )
            )
        for cell in table.cells:
            print(
                "...Cell[{}][{}] has content '{}'".format(
                    cell.row_index,
                    cell.column_index,
                    cell.content,
                )
            )
            for region in cell.bounding_regions:
                print(
                    "...content on page {} is within bounding box '{}'".format(
                        region.page_number,
                        format_polygon(region.polygon),
                    )
                )

    print("----------------------------------------")


if __name__ == "__main__":
    analyze_layout()

运行应用程序

将代码示例添加到应用程序后,构建并运行程序:

  1. 导航到 form_recognizer_quickstart.py 文件所在的文件夹。

  2. 在终端中键入以下命令:

    python form_recognizer_quickstart.py
    

布局模型输出

下面是预期输出的代码段:

  ----Analyzing layout from page #1----
  Page has width: 8.5 and height: 11.0, measured with unit: inch
  ...Line # 0 has word count 2 and text 'UNITED STATES' within bounding box '[3.4915, 0.6828], [5.0116, 0.6828], [5.0116, 0.8265], [3.4915, 0.8265]'
  ......Word 'UNITED' has a confidence of 1.0
  ......Word 'STATES' has a confidence of 1.0
  ...Line # 1 has word count 4 and text 'SECURITIES AND EXCHANGE COMMISSION' within bounding box '[2.1937, 0.9061], [6.297, 0.9061], [6.297, 1.0498], [2.1937, 1.0498]'
  ......Word 'SECURITIES' has a confidence of 1.0
  ......Word 'AND' has a confidence of 1.0
  ......Word 'EXCHANGE' has a confidence of 1.0
  ......Word 'COMMISSION' has a confidence of 1.0
  ...Line # 2 has word count 3 and text 'Washington, D.C. 20549' within bounding box '[3.4629, 1.1179], [5.031, 1.1179], [5.031, 1.2483], [3.4629, 1.2483]'
  ......Word 'Washington,' has a confidence of 1.0
  ......Word 'D.C.' has a confidence of 1.0

若要查看整个输出,请访问 GitHub 上的 Azure 示例存储库,以查看布局模型输出

将以下代码示例添加到 form_recognizer_quickstart.py 应用程序。 请确保使用 Azure 门户中表单识别器实例中的值更新密钥和终结点变量:


# import libraries
import os
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential

# set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
endpoint = "<your-endpoint>"
key = "<your-key>"


def analyze_layout():
    # sample document
    formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf"

    document_analysis_client = DocumentAnalysisClient(
        endpoint=endpoint, credential=AzureKeyCredential(key)
    )

    poller = document_analysis_client.begin_analyze_document_from_url(
        "prebuilt-layout", formUrl
    )
    result = poller.result()

    for idx, style in enumerate(result.styles):
        print(
            "Document contains {} content".format(
                "handwritten" if style.is_handwritten else "no handwritten"
            )
        )

    for page in result.pages:
        print("----Analyzing layout from page #{}----".format(page.page_number))
        print(
            "Page has width: {} and height: {}, measured with unit: {}".format(
                page.width, page.height, page.unit
            )
        )

        for line_idx, line in enumerate(page.lines):
            words = line.get_words()
            print(
                "...Line # {} has word count {} and text '{}' within bounding polygon '{}'".format(
                    line_idx,
                    len(words),
                    line.content,
                    format_polygon(line.polygon),
                )
            )

            for word in words:
                print(
                    "......Word '{}' has a confidence of {}".format(
                        word.content, word.confidence
                    )
                )

        for selection_mark in page.selection_marks:
            print(
                "...Selection mark is '{}' within bounding polygon '{}' and has a confidence of {}".format(
                    selection_mark.state,
                    format_polygon(selection_mark.polygon),
                    selection_mark.confidence,
                )
            )

    for table_idx, table in enumerate(result.tables):
        print(
            "Table # {} has {} rows and {} columns".format(
                table_idx, table.row_count, table.column_count
            )
        )
        for region in table.bounding_regions:
            print(
                "Table # {} location on page: {} is {}".format(
                    table_idx,
                    region.page_number,
                    format_polygon(region.polygon),
                )
            )
        for cell in table.cells:
            print(
                "...Cell[{}][{}] has content '{}'".format(
                    cell.row_index,
                    cell.column_index,
                    cell.content,
                )
            )
            for region in cell.bounding_regions:
                print(
                    "...content on page {} is within bounding polygon '{}'".format(
                        region.page_number,
                        format_polygon(region.polygon),
                    )
                )

    print("----------------------------------------")


if __name__ == "__main__":
    analyze_layout()


运行应用程序

将代码示例添加到应用程序后,构建并运行程序:

  1. 导航到 form_recognizer_quickstart.py 文件所在的文件夹。

  2. 在终端中键入以下命令:

    python form_recognizer_quickstart.py
    

预生成模型

使用预生成模型分析和提取特定文档类型的公共字段。 在本示例中,我们将使用预生成发票模型分析发票。

提示

不止发票,还有几个预生成模型可供选择,每个模型都有自己的一组受支持的字段。 用于 analyze 操作的模型由要分析的文档类型确定。 请参阅模型数据提取

  • 使用预生成发票模型分析发票。 在本快速入门中,可使用示例发票文档
  • 我们已将文件 URL 值添加到文件顶部的 invoiceUrl 变量中。
  • 为简洁起见,此处并未显示服务返回的所有键值对。 若要查看所有支持的字段和相应类型的列表,请参阅发票概念页。

将以下代码示例添加到 doc_intel_quickstart.py 应用程序。 请确保使用 Azure 门户中文档智能实例中的值更新密钥和终结点变量:


# import libraries
import os
from azure.core.credentials import AzureKeyCredential
from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.ai.documentintelligence.models import AnalyzeResult
from azure.ai.documentintelligence.models import AnalyzeDocumentRequest



# set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
endpoint = "<your-endpoint>"
key = "<your-key>"

def analyze_invoice():
    # sample document

    invoiceUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf"

    document_intelligence_client = DocumentIntelligenceClient(
        endpoint=endpoint, credential=AzureKeyCredential(key)
    )

    poller = document_intelligence_client.begin_analyze_document(
        "prebuilt-invoice", AnalyzeDocumentRequest(url_source=invoiceUrl)
    )
    invoices = poller.result()

    if invoices.documents:
        for idx, invoice in enumerate(invoices.documents):
            print(f"--------Analyzing invoice #{idx + 1}--------")
            vendor_name = invoice.fields.get("VendorName")
            if vendor_name:
                print(
                    f"Vendor Name: {vendor_name.get('content')} has confidence: {vendor_name.get('confidence')}"
                )
            vendor_address = invoice.fields.get("VendorAddress")
            if vendor_address:
                print(
                    f"Vendor Address: {vendor_address.get('content')} has confidence: {vendor_address.get('confidence')}"
                )
            vendor_address_recipient = invoice.fields.get("VendorAddressRecipient")
            if vendor_address_recipient:
                print(
                    f"Vendor Address Recipient: {vendor_address_recipient.get('content')} has confidence: {vendor_address_recipient.get('confidence')}"
                )
            customer_name = invoice.fields.get("CustomerName")
            if customer_name:
                print(
                    f"Customer Name: {customer_name.get('content')} has confidence: {customer_name.get('confidence')}"
                )
            customer_id = invoice.fields.get("CustomerId")
            if customer_id:
                print(
                    f"Customer Id: {customer_id.get('content')} has confidence: {customer_id.get('confidence')}"
                )
            customer_address = invoice.fields.get("CustomerAddress")
            if customer_address:
                print(
                    f"Customer Address: {customer_address.get('content')} has confidence: {customer_address.get('confidence')}"
                )
            customer_address_recipient = invoice.fields.get("CustomerAddressRecipient")
            if customer_address_recipient:
                print(
                    f"Customer Address Recipient: {customer_address_recipient.get('content')} has confidence: {customer_address_recipient.get('confidence')}"
                )
            invoice_id = invoice.fields.get("InvoiceId")
            if invoice_id:
                print(
                    f"Invoice Id: {invoice_id.get('content')} has confidence: {invoice_id.get('confidence')}"
                )
            invoice_date = invoice.fields.get("InvoiceDate")
            if invoice_date:
                print(
                    f"Invoice Date: {invoice_date.get('content')} has confidence: {invoice_date.get('confidence')}"
                )
            invoice_total = invoice.fields.get("InvoiceTotal")
            if invoice_total:
                print(
                    f"Invoice Total: {invoice_total.get('content')} has confidence: {invoice_total.get('confidence')}"
                )
            due_date = invoice.fields.get("DueDate")
            if due_date:
                print(
                    f"Due Date: {due_date.get('content')} has confidence: {due_date.get('confidence')}"
                )
            purchase_order = invoice.fields.get("PurchaseOrder")
            if purchase_order:
                print(
                    f"Purchase Order: {purchase_order.get('content')} has confidence: {purchase_order.get('confidence')}"
                )
            billing_address = invoice.fields.get("BillingAddress")
            if billing_address:
                print(
                    f"Billing Address: {billing_address.get('content')} has confidence: {billing_address.get('confidence')}"
                )
            billing_address_recipient = invoice.fields.get("BillingAddressRecipient")
            if billing_address_recipient:
                print(
                    f"Billing Address Recipient: {billing_address_recipient.get('content')} has confidence: {billing_address_recipient.get('confidence')}"
                )
            shipping_address = invoice.fields.get("ShippingAddress")
            if shipping_address:
                print(
                    f"Shipping Address: {shipping_address.get('content')} has confidence: {shipping_address.get('confidence')}"
                )
            shipping_address_recipient = invoice.fields.get("ShippingAddressRecipient")
            if shipping_address_recipient:
                print(
                    f"Shipping Address Recipient: {shipping_address_recipient.get('content')} has confidence: {shipping_address_recipient.get('confidence')}"
                )
            print("Invoice items:")
            for idx, item in enumerate(invoice.fields.get("Items").get("valueArray")):
                print(f"...Item #{idx + 1}")
                item_description = item.get("valueObject").get("Description")
                if item_description:
                    print(
                        f"......Description: {item_description.get('content')} has confidence: {item_description.get('confidence')}"
                    )
                item_quantity = item.get("valueObject").get("Quantity")
                if item_quantity:
                    print(
                        f"......Quantity: {item_quantity.get('content')} has confidence: {item_quantity.get('confidence')}"
                    )
                unit = item.get("valueObject").get("Unit")
                if unit:
                    print(
                        f"......Unit: {unit.get('content')} has confidence: {unit.get('confidence')}"
                    )
                unit_price = item.get("valueObject").get("UnitPrice")
                if unit_price:
                    unit_price_code = (
                        unit_price.get("valueCurrency").get("currencyCode")
                        if unit_price.get("valueCurrency").get("currencyCode")
                        else ""
                    )
                    print(
                        f"......Unit Price: {unit_price.get('content')}{unit_price_code} has confidence: {unit_price.get('confidence')}"
                    )
                product_code = item.get("valueObject").get("ProductCode")
                if product_code:
                    print(
                        f"......Product Code: {product_code.get('content')} has confidence: {product_code.get('confidence')}"
                    )
                item_date = item.get("valueObject").get("Date")
                if item_date:
                    print(
                        f"......Date: {item_date.get('content')} has confidence: {item_date.get('confidence')}"
                    )
                tax = item.get("valueObject").get("Tax")
                if tax:
                    print(
                        f"......Tax: {tax.get('content')} has confidence: {tax.get('confidence')}"
                    )
                amount = item.get("valueObject").get("Amount")
                if amount:
                    print(
                        f"......Amount: {amount.get('content')} has confidence: {amount.get('confidence')}"
                    )
            subtotal = invoice.fields.get("SubTotal")
            if subtotal:
                print(
                    f"Subtotal: {subtotal.get('content')} has confidence: {subtotal.get('confidence')}"
                )
            total_tax = invoice.fields.get("TotalTax")
            if total_tax:
                print(
                    f"Total Tax: {total_tax.get('content')} has confidence: {total_tax.get('confidence')}"
                )
            previous_unpaid_balance = invoice.fields.get("PreviousUnpaidBalance")
            if previous_unpaid_balance:
                print(
                    f"Previous Unpaid Balance: {previous_unpaid_balance.get('content')} has confidence: {previous_unpaid_balance.get('confidence')}"
                )
            amount_due = invoice.fields.get("AmountDue")
            if amount_due:
                print(
                    f"Amount Due: {amount_due.get('content')} has confidence: {amount_due.get('confidence')}"
                )
            service_start_date = invoice.fields.get("ServiceStartDate")
            if service_start_date:
                print(
                    f"Service Start Date: {service_start_date.get('content')} has confidence: {service_start_date.get('confidence')}"
                )
            service_end_date = invoice.fields.get("ServiceEndDate")
            if service_end_date:
                print(
                    f"Service End Date: {service_end_date.get('content')} has confidence: {service_end_date.get('confidence')}"
                )
            service_address = invoice.fields.get("ServiceAddress")
            if service_address:
                print(
                    f"Service Address: {service_address.get('content')} has confidence: {service_address.get('confidence')}"
                )
            service_address_recipient = invoice.fields.get("ServiceAddressRecipient")
            if service_address_recipient:
                print(
                    f"Service Address Recipient: {service_address_recipient.get('content')} has confidence: {service_address_recipient.get('confidence')}"
                )
            remittance_address = invoice.fields.get("RemittanceAddress")
            if remittance_address:
                print(
                    f"Remittance Address: {remittance_address.get('content')} has confidence: {remittance_address.get('confidence')}"
                )
            remittance_address_recipient = invoice.fields.get(
                "RemittanceAddressRecipient"
            )
            if remittance_address_recipient:
                print(
                    f"Remittance Address Recipient: {remittance_address_recipient.get('content')} has confidence: {remittance_address_recipient.get('confidence')}"
                )


          print("----------------------------------------")


if __name__ == "__main__":
    analyze_invoice()


运行应用程序

将代码示例添加到应用程序后,构建并运行程序:

  1. 导航到“doc_intel_quickstart.py”文件所在的文件夹。

  2. 在终端中键入以下命令:

    python doc_intel_quickstart.py
    

若要分析 URI 中的给定文件,请使用 begin_analyze_document_from_url 方法并传递 prebuilt-invoice 作为模型 ID。返回的值是一个 result 对象,其中包含有关提交的文档的数据。

将以下代码示例添加到 form_recognizer_quickstart.py 应用程序。 请确保使用 Azure 门户中表单识别器实例中的值更新密钥和终结点变量:

# import libraries
import os
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential

# set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
endpoint = "<your-endpoint>"
key = "<your-key>"


def format_bounding_region(bounding_regions):
    if not bounding_regions:
        return "N/A"
    return ", ".join(
        "Page #{}: {}".format(region.page_number, format_polygon(region.polygon))
        for region in bounding_regions
    )


def format_polygon(polygon):
    if not polygon:
        return "N/A"
    return ", ".join(["[{}, {}]".format(p.x, p.y) for p in polygon])


def analyze_invoice():

    invoiceUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf"

    document_analysis_client = DocumentAnalysisClient(
        endpoint=endpoint, credential=AzureKeyCredential(key)
    )

    poller = document_analysis_client.begin_analyze_document_from_url(
        "prebuilt-invoice", invoiceUrl
    )
    invoices = poller.result()

    for idx, invoice in enumerate(invoices.documents):
        print("--------Recognizing invoice #{}--------".format(idx + 1))
        vendor_name = invoice.fields.get("VendorName")
        if vendor_name:
            print(
                "Vendor Name: {} has confidence: {}".format(
                    vendor_name.value, vendor_name.confidence
                )
            )
        vendor_address = invoice.fields.get("VendorAddress")
        if vendor_address:
            print(
                "Vendor Address: {} has confidence: {}".format(
                    vendor_address.value, vendor_address.confidence
                )
            )
        vendor_address_recipient = invoice.fields.get("VendorAddressRecipient")
        if vendor_address_recipient:
            print(
                "Vendor Address Recipient: {} has confidence: {}".format(
                    vendor_address_recipient.value, vendor_address_recipient.confidence
                )
            )
        customer_name = invoice.fields.get("CustomerName")
        if customer_name:
            print(
                "Customer Name: {} has confidence: {}".format(
                    customer_name.value, customer_name.confidence
                )
            )
        customer_id = invoice.fields.get("CustomerId")
        if customer_id:
            print(
                "Customer Id: {} has confidence: {}".format(
                    customer_id.value, customer_id.confidence
                )
            )
        customer_address = invoice.fields.get("CustomerAddress")
        if customer_address:
            print(
                "Customer Address: {} has confidence: {}".format(
                    customer_address.value, customer_address.confidence
                )
            )
        customer_address_recipient = invoice.fields.get("CustomerAddressRecipient")
        if customer_address_recipient:
            print(
                "Customer Address Recipient: {} has confidence: {}".format(
                    customer_address_recipient.value,
                    customer_address_recipient.confidence,
                )
            )
        invoice_id = invoice.fields.get("InvoiceId")
        if invoice_id:
            print(
                "Invoice Id: {} has confidence: {}".format(
                    invoice_id.value, invoice_id.confidence
                )
            )
        invoice_date = invoice.fields.get("InvoiceDate")
        if invoice_date:
            print(
                "Invoice Date: {} has confidence: {}".format(
                    invoice_date.value, invoice_date.confidence
                )
            )
        invoice_total = invoice.fields.get("InvoiceTotal")
        if invoice_total:
            print(
                "Invoice Total: {} has confidence: {}".format(
                    invoice_total.value, invoice_total.confidence
                )
            )
        due_date = invoice.fields.get("DueDate")
        if due_date:
            print(
                "Due Date: {} has confidence: {}".format(
                    due_date.value, due_date.confidence
                )
            )
        purchase_order = invoice.fields.get("PurchaseOrder")
        if purchase_order:
            print(
                "Purchase Order: {} has confidence: {}".format(
                    purchase_order.value, purchase_order.confidence
                )
            )
        billing_address = invoice.fields.get("BillingAddress")
        if billing_address:
            print(
                "Billing Address: {} has confidence: {}".format(
                    billing_address.value, billing_address.confidence
                )
            )
        billing_address_recipient = invoice.fields.get("BillingAddressRecipient")
        if billing_address_recipient:
            print(
                "Billing Address Recipient: {} has confidence: {}".format(
                    billing_address_recipient.value,
                    billing_address_recipient.confidence,
                )
            )
        shipping_address = invoice.fields.get("ShippingAddress")
        if shipping_address:
            print(
                "Shipping Address: {} has confidence: {}".format(
                    shipping_address.value, shipping_address.confidence
                )
            )
        shipping_address_recipient = invoice.fields.get("ShippingAddressRecipient")
        if shipping_address_recipient:
            print(
                "Shipping Address Recipient: {} has confidence: {}".format(
                    shipping_address_recipient.value,
                    shipping_address_recipient.confidence,
                )
            )
        print("Invoice items:")
        for idx, item in enumerate(invoice.fields.get("Items").value):
            print("...Item #{}".format(idx + 1))
            item_description = item.value.get("Description")
            if item_description:
                print(
                    "......Description: {} has confidence: {}".format(
                        item_description.value, item_description.confidence
                    )
                )
            item_quantity = item.value.get("Quantity")
            if item_quantity:
                print(
                    "......Quantity: {} has confidence: {}".format(
                        item_quantity.value, item_quantity.confidence
                    )
                )
            unit = item.value.get("Unit")
            if unit:
                print(
                    "......Unit: {} has confidence: {}".format(
                        unit.value, unit.confidence
                    )
                )
            unit_price = item.value.get("UnitPrice")
            if unit_price:
                print(
                    "......Unit Price: {} has confidence: {}".format(
                        unit_price.value, unit_price.confidence
                    )
                )
            product_code = item.value.get("ProductCode")
            if product_code:
                print(
                    "......Product Code: {} has confidence: {}".format(
                        product_code.value, product_code.confidence
                    )
                )
            item_date = item.value.get("Date")
            if item_date:
                print(
                    "......Date: {} has confidence: {}".format(
                        item_date.value, item_date.confidence
                    )
                )
            tax = item.value.get("Tax")
            if tax:
                print(
                    "......Tax: {} has confidence: {}".format(tax.value, tax.confidence)
                )
            amount = item.value.get("Amount")
            if amount:
                print(
                    "......Amount: {} has confidence: {}".format(
                        amount.value, amount.confidence
                    )
                )
        subtotal = invoice.fields.get("SubTotal")
        if subtotal:
            print(
                "Subtotal: {} has confidence: {}".format(
                    subtotal.value, subtotal.confidence
                )
            )
        total_tax = invoice.fields.get("TotalTax")
        if total_tax:
            print(
                "Total Tax: {} has confidence: {}".format(
                    total_tax.value, total_tax.confidence
                )
            )
        previous_unpaid_balance = invoice.fields.get("PreviousUnpaidBalance")
        if previous_unpaid_balance:
            print(
                "Previous Unpaid Balance: {} has confidence: {}".format(
                    previous_unpaid_balance.value, previous_unpaid_balance.confidence
                )
            )
        amount_due = invoice.fields.get("AmountDue")
        if amount_due:
            print(
                "Amount Due: {} has confidence: {}".format(
                    amount_due.value, amount_due.confidence
                )
            )
        service_start_date = invoice.fields.get("ServiceStartDate")
        if service_start_date:
            print(
                "Service Start Date: {} has confidence: {}".format(
                    service_start_date.value, service_start_date.confidence
                )
            )
        service_end_date = invoice.fields.get("ServiceEndDate")
        if service_end_date:
            print(
                "Service End Date: {} has confidence: {}".format(
                    service_end_date.value, service_end_date.confidence
                )
            )
        service_address = invoice.fields.get("ServiceAddress")
        if service_address:
            print(
                "Service Address: {} has confidence: {}".format(
                    service_address.value, service_address.confidence
                )
            )
        service_address_recipient = invoice.fields.get("ServiceAddressRecipient")
        if service_address_recipient:
            print(
                "Service Address Recipient: {} has confidence: {}".format(
                    service_address_recipient.value,
                    service_address_recipient.confidence,
                )
            )
        remittance_address = invoice.fields.get("RemittanceAddress")
        if remittance_address:
            print(
                "Remittance Address: {} has confidence: {}".format(
                    remittance_address.value, remittance_address.confidence
                )
            )
        remittance_address_recipient = invoice.fields.get("RemittanceAddressRecipient")
        if remittance_address_recipient:
            print(
                "Remittance Address Recipient: {} has confidence: {}".format(
                    remittance_address_recipient.value,
                    remittance_address_recipient.confidence,
                )
            )

        print("----------------------------------------")

if __name__ == "__main__":
    analyze_invoice()


运行应用程序

将代码示例添加到应用程序后,构建并运行程序:

  1. 导航到 form_recognizer_quickstart.py 文件所在的文件夹。

  2. 在终端中键入以下命令:

    python form_recognizer_quickstart.py
    

预生成模型输出

下面是预期输出的代码段:

  --------Recognizing invoice #1--------
  Vendor Name: CONTOSO LTD. has confidence: 0.919
  Vendor Address: 123 456th St New York, NY, 10001 has confidence: 0.907
  Vendor Address Recipient: Contoso Headquarters has confidence: 0.919
  Customer Name: MICROSOFT CORPORATION has confidence: 0.84
  Customer Id: CID-12345 has confidence: 0.956
  Customer Address: 123 Other St, Redmond WA, 98052 has confidence: 0.909
  Customer Address Recipient: Microsoft Corp has confidence: 0.917
  Invoice Id: INV-100 has confidence: 0.972
  Invoice Date: 2019-11-15 has confidence: 0.971
  Invoice Total: CurrencyValue(amount=110.0, symbol=$) has confidence: 0.97
  Due Date: 2019-12-15 has confidence: 0.973

若要查看整个输出,请访问 GitHub 上的 Azure 示例存储库,以查看预生成模型输出

将以下代码示例添加到 form_recognizer_quickstart.py 应用程序。 请确保使用 Azure 门户中表单识别器实例中的值更新密钥和终结点变量:


# import libraries
import os
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential

# set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
endpoint = "<your-endpoint>"
key = "<your-key>"


def format_polygon(polygon):
    if not polygon:
        return "N/A"
    return ", ".join(["[{}, {}]".format(p.x, p.y) for p in polygon])


def analyze_layout():
    # sample document
    formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf"

    document_analysis_client = DocumentAnalysisClient(
        endpoint=endpoint, credential=AzureKeyCredential(key)
    )

    poller = document_analysis_client.begin_analyze_document_from_url(
        "prebuilt-layout", formUrl
    )
    result = poller.result()

    for idx, style in enumerate(result.styles):
        print(
            "Document contains {} content".format(
                "handwritten" if style.is_handwritten else "no handwritten"
            )
        )


for page in result.pages:
    print("----Analyzing layout from page #{}----".format(page.page_number))
    print(
        "Page has width: {} and height: {}, measured with unit: {}".format(
            page.width, page.height, page.unit
        )
    )

    for line_idx, line in enumerate(page.lines):
        words = line.get_words()
        print(
            "...Line # {} has word count {} and text '{}' within bounding polygon '{}'".format(
                line_idx,
                len(words),
                line.content,
                format_polygon(line.polygon),
            )
        )

        for word in words:
            print(
                "......Word '{}' has a confidence of {}".format(
                    word.content, word.confidence
                )
            )

    for selection_mark in page.selection_marks:
        print(
            "...Selection mark is '{}' within bounding polygon '{}' and has a confidence of {}".format(
                selection_mark.state,
                format_polygon(selection_mark.polygon),
                selection_mark.confidence,
            )
        )

for table_idx, table in enumerate(result.tables):
    print(
        "Table # {} has {} rows and {} columns".format(
            table_idx, table.row_count, table.column_count
        )
    )
    for region in table.bounding_regions:
        print(
            "Table # {} location on page: {} is {}".format(
                table_idx,
                region.page_number,
                format_polygon(region.polygon),
            )
        )
    for cell in table.cells:
        print(
            "...Cell[{}][{}] has content '{}'".format(
                cell.row_index,
                cell.column_index,
                cell.content,
            )
        )
        for region in cell.bounding_regions:
            print(
                "...content on page {} is within bounding polygon '{}'".format(
                    region.page_number,
                    format_polygon(region.polygon),
                )
            )

print("----------------------------------------")


if __name__ == "__main__":
    analyze_layout()


运行应用程序

将代码示例添加到应用程序后,构建并运行程序:

  1. 导航到 form_recognizer_quickstart.py 文件所在的文件夹。

  2. 在终端中键入以下命令:

    python form_recognizer_quickstart.py
    

在本快速入门中,了解如何使用文档智能 REST API 分析和提取文档中的数据和值:

先决条件

  • Azure 订阅 - 免费创建订阅

  • 已安装 curl 命令行工具。

  • PowerShell 7.* 版本(或类似的命令行应用程序。):

  • 若要检查 PowerShell 版本,请键入与操作系统相关的以下命令:

    • Windows: Get-Host | Select-Object Version
    • macOS 或 Linux:$PSVersionTable
  • 文档智能(单服务)或 Azure AI 服务(多服务)资源。 获得 Azure 订阅后,在 Azure 门户中创建单服务多服务文档智能资源以获取密钥和终结点。 可以使用免费定价层 (F0) 试用该服务,然后再升级到付费层进行生产。

提示

如果计划通过一个终结点/密钥访问多个 Azure AI 服务,请创建 Azure AI 服务资源。 请创建仅供文档智能访问的文档智能资源。 请注意,如果你打算使用 Microsoft Entra 身份验证,则需要单一服务资源。

  • 部署资源后,选择“转到资源”。 需要从创建的资源获取密钥和终结点,以便将应用程序连接到文档智能 API。 稍后需要在本快速入门中将密钥和终结点粘贴到代码中:

    该屏幕截图显示了 Azure 门户中密钥和终结点的位置。

分析文档并获取结果

POST 请求用于通过预生成或自定义模型分析文档。 GET 请求用于检索文档分析调用的结果。 modelId 与 POST 一起使用,resultId 与 GET 操作一起使用。

分析文档(POST 请求)

在运行 cURL 命令之前,请对 POST 请求进行以下更改:

  1. {endpoint} 替换为你的 Azure 门户文档智能实例中的终结点值。

  2. {key} 替换为你的 Azure 门户文档智能实例中的密钥值。

  3. 使用下表作为参考,将 {modelID}{your-document-url} 替换为所需的值。

  4. 你需要在 URL 中提供文档文件。 对于本快速入门,你可以使用下表中为每项功能提供的示例表单:

示例文档

功能 {modelID} {your-document-url}
读取 prebuilt-read https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/read.png
布局 预生成布局 https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/layout.png
医疗保险卡 prebuilt-healthInsuranceCard.us https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/insurance-card.png
W-2 prebuilt-tax.us.w2 https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/w2.png
发票 预生成的发票 https://github.com/Azure-Samples/cognitive-services-REST-api-samples/raw/master/curl/form-recognizer/rest-api/invoice.pdf
回执 prebuilt-receipt https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/receipt.png
身份文档 prebuilt-idDocument https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/identity_documents.png

示例文档

功能 {modelID} {your-document-url}
常规文档 预生成文档 https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf
读取 prebuilt-read https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/read.png
布局 预生成布局 https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/layout.png
医疗保险卡 prebuilt-healthInsuranceCard.us https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/insurance-card.png
W-2 prebuilt-tax.us.w2 https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/w2.png
发票 预生成的发票 https://github.com/Azure-Samples/cognitive-services-REST-api-samples/raw/master/curl/form-recognizer/rest-api/invoice.pdf
回执 prebuilt-receipt https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/receipt.png
身份文档 prebuilt-idDocument https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/identity_documents.png
名片 prebuilt-businessCard https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/de5e0d8982ab754823c54de47a47e8e499351523/curl/form-recognizer/rest-api/business_card.jpg

重要

完成后,请记住将密钥从代码中删除,并且永远不要公开发布该密钥。 对于生产来说,请使用安全的方式存储和访问凭据,例如 Azure Key Vault。 有关详细信息,请参阅 Azure AI 服务安全性

POST 请求

curl -v -i POST "{endpoint}/documentintelligence/documentModels/{modelId}:analyze?api-version=2024-07-31-preview" -H "Content-Type: application/json" -H "Ocp-Apim-Subscription-Key: {key}" --data-ascii "{'urlSource': '{your-document-url}'}"
curl -v -i POST "{endpoint}/formrecognizer/documentModels/{modelID}:analyze?api-version=2023-07-31" -H "Content-Type: application/json" -H "Ocp-Apim-Subscription-Key: {key}" --data-ascii "{'urlSource': '{your-document-url}'}"
curl -v -i POST "{endpoint}/formrecognizer/documentModels/{modelId}:analyze?api-version=2022-08-31" -H "Content-Type: application/json" -H "Ocp-Apim-Subscription-Key: {key}" --data-ascii "{'urlSource': '{your-document-url}'}"

POST 响应 (resultID)

你将收到 202 (Success) 响应,其中包含只读“Operation-Location”标头。 此标头的值包含 resultID,可以查询该 ID,从而使用包含你同一资源订阅密钥的 GET 请求来获取异步操作的状态并检索结果:

{alt-text}

获取分析结果(GET 请求)

调用 Analyze document API 后,调用获取分析结果 API 以获取操作的状态和提取的数据。 运行该命令之前,请进行以下更改:

调用 Analyze document API 后,调用获取分析结果 API 以获取操作的状态和提取的数据。 运行该命令之前,请进行以下更改:

调用 Analyze document API 后,调用获取分析结果 API 以获取操作的状态和提取的数据。 运行该命令之前,请进行以下更改:

  1. 替换 POST 响应中的 {resultID} Operation-location 标头。

  2. {key} 替换为 Azure 门户中你的文档智能实例中的密钥值。

GET 请求

curl -v -X GET "{endpoint}/documentintelligence/documentModels/{modelId}/analyzeResults/{resultId}?api-version=2024-07-31-preview" -H "Ocp-Apim-Subscription-Key: {key}"
curl -v -X GET "{endpoint}/formrecognizer/documentModels/{modelId}/analyzeResults/{resultId}?api-version=2023-07-31" -H "Ocp-Apim-Subscription-Key: {key}"

curl -v -X GET "{endpoint}/formrecognizer/documentModels/{modelId}/analyzeResults/{resultId}?api-version=2022-08-31" -H "Ocp-Apim-Subscription-Key: {key}"

检查响应

你将收到包含 JSON 输出的 200 (Success) 响应。 第一个字段 "status" 指示操作的状态。 如果操作未完成,"status" 的值为 "running""notStarted",你应当采用手动方式或通过脚本再次调用该 API。 我们建议两次调用间隔一秒或更长时间。

针对预生成发票的示例响应

{
    "status": "succeeded",
    "createdDateTime": "2024-03-25T19:31:37Z",
    "lastUpdatedDateTime": "2024-03-25T19:31:43Z",
    "analyzeResult": {
        "apiVersion": "2024-07-31-preview",
        "modelId": "prebuilt-invoice",
        "stringIndexType": "textElements"...
    ..."pages": [
            {
                "pageNumber": 1,
                "angle": 0,
                "width": 8.5,
                "height": 11,
                "unit": "inch",
                "words": [
                    {
                        "content": "CONTOSO",
                        "boundingBox": [
                            0.5911,
                            0.6857,
                            1.7451,
                            0.6857,
                            1.7451,
                            0.8664,
                            0.5911,
                            0.8664
                        ],
                        "confidence": 1,
                        "span": {
                            "offset": 0,
                            "length": 7
                                }
                      }],
              }]
      }
}
{
    "status": "succeeded",
    "createdDateTime": "2023-08-25T19:31:37Z",
    "lastUpdatedDateTime": "2023-08-25T19:31:43Z",
    "analyzeResult": {
        "apiVersion": "2023-07-31",
        "modelId": "prebuilt-invoice",
        "stringIndexType": "textElements"...
    ..."pages": [
            {
                "pageNumber": 1,
                "angle": 0,
                "width": 8.5,
                "height": 11,
                "unit": "inch",
                "words": [
                    {
                        "content": "CONTOSO",
                        "boundingBox": [
                            0.5911,
                            0.6857,
                            1.7451,
                            0.6857,
                            1.7451,
                            0.8664,
                            0.5911,
                            0.8664
                        ],
                        "confidence": 1,
                        "span": {
                            "offset": 0,
                            "length": 7
                                }
                      }],
              }]
      }
}
{
    "status": "succeeded",
    "createdDateTime": "2022-09-25T19:31:37Z",
    "lastUpdatedDateTime": "2022-09-25T19:31:43Z",
    "analyzeResult": {
        "apiVersion": "2022-08-31",
        "modelId": "prebuilt-invoice",
        "stringIndexType": "textElements"...
    ..."pages": [
            {
                "pageNumber": 1,
                "angle": 0,
                "width": 8.5,
                "height": 11,
                "unit": "inch",
                "words": [
                    {
                        "content": "CONTOSO",
                        "boundingBox": [
                            0.5911,
                            0.6857,
                            1.7451,
                            0.6857,
                            1.7451,
                            0.8664,
                            0.5911,
                            0.8664
                        ],
                        "confidence": 1,
                        "span": {
                            "offset": 0,
                            "length": 7
                                }
                      }],
              }]
      }
}

支持的文档字段

预生成模型提取预定义的文档字段集。 请参阅模型数据提取来了解提取的字段名称、类型、说明和示例。

完成了,恭喜!

在本快速入门中,你使用了文档智能模型来分析各种表单和文档。 接下来,浏览文档智能工作室和参考文档,深入了解文档智能 API。

后续步骤

此内容适用于:选中标记 v2.1 | 最新版本:蓝色复选标记 v4.0(预览版)

开始通过你选择的编程语言或 REST API 来使用 Azure AI 文档智能。 文档智能是一款基于云的 Azure AI 服务,它使用机器学习从文档中提取键值对、文本和表。 我们建议你在学习该技术时使用免费服务。 请记住,每月的免费页数限于 500。

若要详细了解文档智能功能和开发选项,请访问我们的概述页。

参考文档 | 库源代码 | 包 (NuGet) | 示例

在本快速入门中,你将使用以下 API 提取表单和文档中的结构化数据:

先决条件

  • Azure 订阅 - 免费创建订阅

  • Visual Studio IDE 的当前版本。

  • Azure AI 服务或文档智能资源。 具有 Azure 订阅后,在 Azure 门户中创建单服务多服务文档智能资源以获取密钥和终结点。 可以使用免费定价层 (F0) 试用该服务,然后再升级到付费层进行生产。

    提示

    如果计划通过一个终结点/密钥访问多个 Azure AI 服务,请创建 Azure AI 服务资源。 请创建仅供文档智能访问的文档智能资源。 请注意,如果你打算使用 Microsoft Entra 身份验证,则需要单一服务资源。

  • 部署资源后,选择“转到资源”。 需要从创建的资源获取密钥和终结点,以便将应用程序连接到文档智能 API。 稍后需要在本快速入门中将密钥和终结点粘贴到代码中:

    该屏幕截图显示了 Azure 门户中密钥和终结点的位置。

设置

  1. 启动 Visual Studio 2019。

  2. 在“开始”页上,选择“创建新项目”。

    Visual Studio 开始窗口的屏幕截图。

  3. 在“创建新项目”页面上,在搜索框中输入“控制台”。 选择“控制台应用程序”模板,然后选择“下一步”。

    Visual Studio 的“新建项目”页面的屏幕截图。

  4. 在“配置新项目”对话框中,在项目名称框中输入 formRecognizer_quickstart。 然后选择“下一步”。

    Visual Studio“配置新项目”对话框窗口的屏幕截图。

  5. 在“附加信息”对话框窗口中,选择“.NET 5.0 (当前版)”,然后选择“创建”。

    Visual Studio“其他信息”对话框窗口的屏幕截图。

使用 NuGet 安装客户端库

  1. 右键单击 formRecognizer_quickstart 项目,然后选择“管理 NuGet 包...”。

    显示选择“NuGet 包”窗口的屏幕截图。

  2. 选择“浏览”选项卡,并键入 Azure.AI.FormRecognizer。

    显示“选择文档智能包”下拉菜单的屏幕截图。

  3. 从下拉菜单中选择版本 3.1.1,然后选择安装

生成应用程序

若要与文档智能服务交互,需要创建 FormRecognizerClient 类的实例。 为此,你要使用密钥创建 AzureKeyCredential,并使用 AzureKeyCredential 和文档智能 endpoint 创建 FormRecognizerClient 实例。

注意

  • 从 .NET 6 开始,使用 console 模板的新项目将生成与以前版本不同的新程序样式。
  • 新的输出使用最新的 C# 功能,这些功能简化了你需要编写的代码。
  • 使用较新版本时,只需编写 Main 方法的主体。 无需包括顶级语句、全局 using 指令或隐式 using 指令。
  • 有关详细信息,请参阅新的 C# 模板生成顶级语句
  1. 打开 Program.cs 文件。

  2. 包括以下 using 指令:

using Azure;
using Azure.AI.FormRecognizer;
using Azure.AI.FormRecognizer.Models;
using System.Threading.Tasks;
  1. 设置 endpointkey 环境变量,并创建 AzureKeyCredentialFormRecognizerClient 实例:
private static readonly string endpoint = "your-form-recognizer-endpoint";
private static readonly string key = "your-api-key";
private static readonly AzureKeyCredential credential = new AzureKeyCredential(key);
  1. 删除 Console.Writeline("Hello World!"); 行,并向 Program.cs 文件中添加下列“试用”代码示例之一:

    将示例代码添加到主方法的屏幕截图。

  2. 选择要复制并粘贴到应用程序的 Main 方法中的代码示例:

重要

完成后,请记住将密钥从代码中删除,并且永远不要公开发布该密钥。 对于生产来说,请使用安全的方式存储和访问凭据,例如 Azure Key Vault。 有关详细信息,请参阅 Azure AI 服务安全性一文。

试用:布局模型

从文档中提取文本、选择标记、文本类型和表结构及其边界区域坐标。

  • 对于此示例,需要 URI 中的一个文档文件。 在本快速入门中,可使用示例文档
  • 我们已将文件 URI 值添加到 formUri 变量中。
  • 若要从 URI 中的特定文件中提取布局,请使用 StartRecognizeContentFromUriAsync 方法。

将以下代码添加到布局应用程序 Program.cs 文件:


FormRecognizerClient recognizerClient = AuthenticateClient();

Task recognizeContent = RecognizeContent(recognizerClient);
Task.WaitAll(recognizeContent);

private static FormRecognizerClient AuthenticateClient()
            {
                var credential = new AzureKeyCredential(key);
                var client = new FormRecognizerClient(new Uri(endpoint), credential);
                return client;
            }

            private static async Task RecognizeContent(FormRecognizerClient recognizerClient)
        {
            string formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf";
            FormPageCollection formPages = await recognizerClient
        .StartRecognizeContentFromUri(new Uri(formUrl))
        .WaitForCompletionAsync();

            foreach (FormPage page in formPages)
            {
                Console.WriteLine($"Form Page {page.PageNumber} has {page.Lines.Count} lines.");

                for (int i = 0; i < page.Lines.Count; i++)
                {
                    FormLine line = page.Lines[i];
                    Console.WriteLine($"    Line {i} has {line.Words.Count} word{(line.Words.Count > 1 ? "s" : "")}, and text: '{line.Text}'.");
                }

                for (int i = 0; i < page.Tables.Count; i++)
                {
                    FormTable table = page.Tables[i];
                    Console.WriteLine($"Table {i} has {table.RowCount} rows and {table.ColumnCount} columns.");
                    foreach (FormTableCell cell in table.Cells)
                    {
                        Console.WriteLine($"    Cell ({cell.RowIndex}, {cell.ColumnIndex}) contains text: '{cell.Text}'.");
                    }
                }
            }
        }
    }
}

试用:预生成模型

此示例演示了如何以发票为例,使用预先训练的模型来分析某些类型常见文档中的数据。

  • 对于此示例,我们将使用预生成模型来分析一个发票文档。 在本快速入门中,可使用示例发票文档
  • 我们已将文件 URI 值添加到 Main 方法顶部的 invoiceUri 变量。
  • 若要分析 URI 中的特定文件,请使用 StartRecognizeInvoicesFromUriAsync 方法。
  • 为简洁起见,此处并未显示服务返回的所有字段。 若要查看所有支持的字段和相应类型的列表,请参阅发票概念页。

选择一个预生成模型

不止发票,还有几个预生成模型可供选择,每个模型都有自己的一组受支持的字段。 用于分析操作的模型取决于要分析的文档类型。 下面是文档智能服务目前支持的预生成模型:

  • 发票:从发票中提取文本、选择标记、表、字段和关键信息。
  • 收据:从收据中提取文本和关键信息。
  • 身份文档:从驾照和国际护照中提取文本和关键信息。
  • 名片:从名片中提取文本和关键信息。

将以下代码添加到预生成发票应用程序的 Program.cs 文件方法

FormRecognizerClient recognizerClient = AuthenticateClient();

  Task analyzeinvoice = AnalyzeInvoice(recognizerClient, invoiceUrl);
  Task.WaitAll(analyzeinvoice);

   private static FormRecognizerClient AuthenticateClient() {
     var credential = new AzureKeyCredential(key);
     var client = new FormRecognizerClient(new Uri(endpoint), credential);
     return client;
   }

   static string invoiceUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf";

   private static async Task AnalyzeInvoice(FormRecognizerClient recognizerClient, string invoiceUrl) {
     var options = new RecognizeInvoicesOptions() {
       Locale = "en-US"
     };
     RecognizedFormCollection invoices = await recognizerClient.StartRecognizeInvoicesFromUriAsync(new Uri(invoiceUrl), options).WaitForCompletionAsync();

     RecognizedForm invoice = invoices[0];

     FormField invoiceIdField;
     if (invoice.Fields.TryGetValue("InvoiceId", out invoiceIdField)) {
       if (invoiceIdField.Value.ValueType == FieldValueType.String) {
         string invoiceId = invoiceIdField.Value.AsString();
         Console.WriteLine($"    Invoice Id: '{invoiceId}', with confidence {invoiceIdField.Confidence}");
       }
     }

     FormField invoiceDateField;
     if (invoice.Fields.TryGetValue("InvoiceDate", out invoiceDateField)) {
       if (invoiceDateField.Value.ValueType == FieldValueType.Date) {
         DateTime invoiceDate = invoiceDateField.Value.AsDate();
         Console.WriteLine($"    Invoice Date: '{invoiceDate}', with confidence {invoiceDateField.Confidence}");
       }
     }

     FormField dueDateField;
     if (invoice.Fields.TryGetValue("DueDate", out dueDateField)) {
       if (dueDateField.Value.ValueType == FieldValueType.Date) {
         DateTime dueDate = dueDateField.Value.AsDate();
         Console.WriteLine($"    Due Date: '{dueDate}', with confidence {dueDateField.Confidence}");
       }
     }

     FormField vendorNameField;
     if (invoice.Fields.TryGetValue("VendorName", out vendorNameField)) {
       if (vendorNameField.Value.ValueType == FieldValueType.String) {
         string vendorName = vendorNameField.Value.AsString();
         Console.WriteLine($"    Vendor Name: '{vendorName}', with confidence {vendorNameField.Confidence}");
       }
     }

     FormField vendorAddressField;
     if (invoice.Fields.TryGetValue("VendorAddress", out vendorAddressField)) {
       if (vendorAddressField.Value.ValueType == FieldValueType.String) {
         string vendorAddress = vendorAddressField.Value.AsString();
         Console.WriteLine($"    Vendor Address: '{vendorAddress}', with confidence {vendorAddressField.Confidence}");
       }
     }

     FormField customerNameField;
     if (invoice.Fields.TryGetValue("CustomerName", out customerNameField)) {
       if (customerNameField.Value.ValueType == FieldValueType.String) {
         string customerName = customerNameField.Value.AsString();
         Console.WriteLine($"    Customer Name: '{customerName}', with confidence {customerNameField.Confidence}");
       }
     }

     FormField customerAddressField;
     if (invoice.Fields.TryGetValue("CustomerAddress", out customerAddressField)) {
       if (customerAddressField.Value.ValueType == FieldValueType.String) {
         string customerAddress = customerAddressField.Value.AsString();
         Console.WriteLine($"    Customer Address: '{customerAddress}', with confidence {customerAddressField.Confidence}");
       }
     }

     FormField customerAddressRecipientField;
     if (invoice.Fields.TryGetValue("CustomerAddressRecipient", out customerAddressRecipientField)) {
       if (customerAddressRecipientField.Value.ValueType == FieldValueType.String) {
         string customerAddressRecipient = customerAddressRecipientField.Value.AsString();
         Console.WriteLine($"    Customer address recipient: '{customerAddressRecipient}', with confidence {customerAddressRecipientField.Confidence}");
       }
     }

     FormField invoiceTotalField;
     if (invoice.Fields.TryGetValue("InvoiceTotal", out invoiceTotalField)) {
       if (invoiceTotalField.Value.ValueType == FieldValueType.Float) {
         float invoiceTotal = invoiceTotalField.Value.AsFloat();
         Console.WriteLine($"    Invoice Total: '{invoiceTotal}', with confidence {invoiceTotalField.Confidence}");
       }
     }
   }
 }
}

运行应用程序

选择 formRecognizer_quickstart 旁的绿色“启动”按钮,来生成并运行程序,或者按 F5

运行 Visual Studio 程序的屏幕截图。

参考文档 | 库源代码 | 包 (Maven) | 示例

在本快速入门中,你将使用以下 API 提取表单和文档中的结构化数据:

先决条件

  • Azure 订阅 - 免费创建订阅

  • Java 开发工具包 (JDK) 版本 8 或更高版本。 有关详细信息,请参阅支持的 Java 版本和更新计划

  • Azure AI 服务或文档智能资源。 具有 Azure 订阅后,在 Azure 门户中创建单服务多服务文档智能资源以获取密钥和终结点。 可以使用免费定价层 (F0) 试用该服务,然后再升级到付费层进行生产。

  • 部署资源后,选择“转到资源”。 需要从创建的资源获取密钥和终结点,以便将应用程序连接到文档智能 API。 稍后需要在本快速入门中将密钥和终结点粘贴到代码中:

    该屏幕截图显示了 Azure 门户中密钥和终结点的位置。

设置

创建新的 Gradle 项目

在控制台窗口(例如 cmd、PowerShell 或 Bash)中,为应用创建名为 form-recognizer-app 的新目录,并导航到该目录。

mkdir form-recognizer-app && form-recognizer-app
  1. 从工作目录运行 gradle init 命令。 此命令将创建 Gradle 的基本生成文件,其中包括 build.gradle.kts - 在运行时将使用该文件创建并配置应用程序。

    gradle init --type basic
    
  2. 当提示你选择一个 DSL 时,选择 Kotlin

  3. 接受默认项目名称 (form-recognizer-app)

安装客户端库

本快速入门使用 Gradle 依赖项管理器。 可以在 Maven 中央存储库中找到客户端库以及其他依赖项管理器的信息。

在项目的 build.gradle.kts 文件中,以 implementation 语句的形式包含客户端库及所需的插件和设置。

plugins {
    java
    application
}
application {
    mainClass.set("FormRecognizer")
}
repositories {
    mavenCentral()
}
dependencies {
    implementation(group = "com.azure", name = "azure-ai-formrecognizer", version = "3.1.1")
}

创建 Java 文件

在工作目录中运行以下命令:

mkdir -p src/main/java

创建以下目录结构:

应用程序的 Java 目录结构的屏幕截图。

导航到 Java 目录,创建一个名为 FormRecognizer.java 的文件。 在你偏好的编辑器或 IDE 中打开该文件,并添加以下包声明和 import 语句:

import com.azure.ai.formrecognizer.*;
import com.azure.ai.formrecognizer.models.*;

import java.util.concurrent.atomic.AtomicReference;
import java.util.List;
import java.util.Map;
import java.time.LocalDate;

import com.azure.core.credential.AzureKeyCredential;
import com.azure.core.http.rest.PagedIterable;
import com.azure.core.util.Context;
import com.azure.core.util.polling.SyncPoller;

选择要复制并粘贴到应用程序的 main 方法中的代码示例:

重要

完成后,请记住将密钥从代码中删除,并且永远不要公开发布该密钥。 对于生产来说,请使用安全的方式存储和访问凭据,例如 Azure Key Vault。 有关详细信息,请参阅 Azure AI 服务安全性

试用:布局模型

从文档中提取文本、选择标记、文本类型和表结构及其边界区域坐标。

  • 对于此示例,需要 URI 中的一个文档文件。 在本快速入门中,可使用示例文档
  • 若要分析 URI 中的特定文件,将使用 beginRecognizeContentFromUrl 方法。
  • 我们已将文件 URI 值添加到 main 方法的 formUrl 变量中。

使用以下代码更新应用程序的 FormRecognizer 类(请务必使用 Azure 门户文档智能实例中的值更新密钥和终结点变量):


static final String key = "PASTE_YOUR_FORM_RECOGNIZER_KEY_HERE";
static final String endpoint = "PASTE_YOUR_FORM_RECOGNIZER_ENDPOINT_HERE";

public static void main(String[] args) {FormRecognizerClient recognizerClient = new FormRecognizerClientBuilder()
                .credential(new AzureKeyCredential(key)).endpoint(endpoint).buildClient();

    String formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf";

    System.out.println("Get form content...");
        GetContent(recognizerClient, formUrl);
  }
    private static void GetContent(FormRecognizerClient recognizerClient, String invoiceUri) {
        String analyzeFilePath = invoiceUri;
        SyncPoller<FormRecognizerOperationResult, List<FormPage>> recognizeContentPoller = recognizerClient
                .beginRecognizeContentFromUrl(analyzeFilePath);

        List<FormPage> contentResult = recognizeContentPoller.getFinalResult();
        // </snippet_getcontent_call>
        // <snippet_getcontent_print>
        contentResult.forEach(formPage -> {
            // Table information
            System.out.println("----Recognizing content ----");
            System.out.printf("Has width: %f and height: %f, measured with unit: %s.%n", formPage.getWidth(),
                    formPage.getHeight(), formPage.getUnit());
            formPage.getTables().forEach(formTable -> {
                System.out.printf("Table has %d rows and %d columns.%n", formTable.getRowCount(),
                        formTable.getColumnCount());
                formTable.getCells().forEach(formTableCell -> {
                    System.out.printf("Cell has text %s.%n", formTableCell.getText());
                });
                System.out.println();
            });
        });
    }

试用:预生成模型

此示例演示了如何以发票为例,使用预先训练的模型来分析某些类型常见文档中的数据。

  • 对于此示例,我们将使用预生成模型来分析一个发票文档。 在本快速入门中,可使用示例发票文档
  • 若要分析 URI 中的特定文件,将使用 beginRecognizeInvoicesFromUrl
  • 我们已将文件 URI 值添加到 main 方法的 invoiceUrl 变量中。
  • 为简洁起见,此处并未显示服务返回的所有字段。 若要查看所有支持的字段和相应类型的列表,请参阅发票概念页。

选择一个预生成模型

不止发票,还有几个预生成模型可供选择,每个模型都有自己的一组受支持的字段。 用于分析操作的模型取决于要分析的文档类型。 下面是文档智能服务目前支持的预生成模型:

  • 发票:从发票中提取文本、选择标记、表、字段和关键信息。
  • 收据:从收据中提取文本和关键信息。
  • 身份文档:从驾照和国际护照中提取文本和关键信息。
  • 名片:从名片中提取文本和关键信息。

使用以下代码更新应用程序的 FormRecognizer 类(请务必使用 Azure 门户文档智能实例中的值更新密钥和终结点变量):


static final String key = "PASTE_YOUR_FORM_RECOGNIZER_KEY_HERE";
static final String endpoint = "PASTE_YOUR_FORM_RECOGNIZER_ENDPOINT_HERE";

public static void main(String[] args) {
    FormRecognizerClient recognizerClient = new FormRecognizerClientBuilder().credential(new AzureKeyCredential(key)).endpoint(endpoint).buildClient();

    String invoiceUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf";

    System.out.println("Analyze invoice...");
        AnalyzeInvoice(recognizerClient, invoiceUrl);
  }
    private static void AnalyzeInvoice(FormRecognizerClient recognizerClient, String invoiceUrl) {
      SyncPoller < FormRecognizerOperationResult,
        List < RecognizedForm >> recognizeInvoicesPoller = recognizerClient.beginRecognizeInvoicesFromUrl(invoiceUrl);
      List < RecognizedForm > recognizedInvoices = recognizeInvoicesPoller.getFinalResult();

      for (int i = 0; i < recognizedInvoices.size(); i++) {
        RecognizedForm recognizedInvoice = recognizedInvoices.get(i);
        Map < String,
        FormField > recognizedFields = recognizedInvoice.getFields();
        System.out.printf("----------- Recognized invoice info for page %d -----------%n", i);
        FormField vendorNameField = recognizedFields.get("VendorName");
        if (vendorNameField != null) {
            if (FieldValueType.STRING == vendorNameField.getValue().getValueType()) {
                String merchantName = vendorNameField.getValue().asString();
                System.out.printf("Vendor Name: %s, confidence: %.2f%n", merchantName, vendorNameField.getConfidence());
            }
        }

        FormField vendorAddressField = recognizedFields.get("VendorAddress");
        if (vendorAddressField != null) {
            if (FieldValueType.STRING == vendorAddressField.getValue().getValueType()) {
                String merchantAddress = vendorAddressField.getValue().asString();
                System.out.printf("Vendor address: %s, confidence: %.2f%n", merchantAddress, vendorAddressField.getConfidence());
            }
        }

        FormField customerNameField = recognizedFields.get("CustomerName");
        if (customerNameField != null) {
            if (FieldValueType.STRING == customerNameField.getValue().getValueType()) {
                String merchantAddress = customerNameField.getValue().asString();
                System.out.printf("Customer Name: %s, confidence: %.2f%n", merchantAddress, customerNameField.getConfidence());
            }
        }

        FormField customerAddressRecipientField = recognizedFields.get("CustomerAddressRecipient");
        if (customerAddressRecipientField != null) {
            if (FieldValueType.STRING == customerAddressRecipientField.getValue().getValueType()) {
                String customerAddr = customerAddressRecipientField.getValue().asString();
                System.out.printf("Customer Address Recipient: %s, confidence: %.2f%n", customerAddr, customerAddressRecipientField.getConfidence());
            }
        }

        FormField invoiceIdField = recognizedFields.get("InvoiceId");
        if (invoiceIdField != null) {
            if (FieldValueType.STRING == invoiceIdField.getValue().getValueType()) {
                String invoiceId = invoiceIdField.getValue().asString();
                System.out.printf("Invoice Id: %s, confidence: %.2f%n", invoiceId, invoiceIdField.getConfidence());
            }
        }

        FormField invoiceDateField = recognizedFields.get("InvoiceDate");
        if (customerNameField != null) {
            if (FieldValueType.DATE == invoiceDateField.getValue().getValueType()) {
                LocalDate invoiceDate = invoiceDateField.getValue().asDate();
                System.out.printf("Invoice Date: %s, confidence: %.2f%n", invoiceDate, invoiceDateField.getConfidence());
            }
        }

        FormField invoiceTotalField = recognizedFields.get("InvoiceTotal");
        if (customerAddressRecipientField != null) {
            if (FieldValueType.FLOAT == invoiceTotalField.getValue().getValueType()) {
                Float invoiceTotal = invoiceTotalField.getValue().asFloat();
                System.out.printf("Invoice Total: %.2f, confidence: %.2f%n", invoiceTotal, invoiceTotalField.getConfidence());
            }
        }
    }
}

生成并运行应用程序

导航回到主项目目录 form-recognizer-app

  1. 使用 build 命令生成应用程序:
gradle build
  1. 使用 run 命令运行应用程序:
gradle run

参考文档 | 库源代码 | 包 (npm) | 示例

在本快速入门中,你将使用以下 API 提取表单和文档中的结构化数据:

先决条件

  • Azure 订阅 - 免费创建订阅

  • 最新版本的 Visual Studio Code 或者你首选的 IDE。

  • 最新 LTS 版本的 Node.js

  • Azure AI 服务或文档智能资源。 具有 Azure 订阅后,在 Azure 门户中创建单服务多服务文档智能资源以获取密钥和终结点。 可以使用免费定价层 (F0) 试用该服务,然后再升级到付费层进行生产。

    提示

    如果计划通过一个终结点/密钥访问多个 Azure AI 服务,请创建 Azure AI 服务资源。 请创建仅供文档智能访问的文档智能资源。 请注意,如果你打算使用 Microsoft Entra 身份验证,则需要单一服务资源。

  • 部署资源后,选择“转到资源”。 需要从创建的资源获取密钥和终结点,以便将应用程序连接到文档智能 API。 稍后需要在本快速入门中将密钥和终结点粘贴到代码中:

    该屏幕截图显示了 Azure 门户中密钥和终结点的位置。

设置

  1. 创建新的 Node.js 应用程序。 在控制台窗口(例如 cmd、PowerShell 或 Bash)中,为应用创建一个新目录并导航到该目录。

    mkdir form-recognizer-app && cd form-recognizer-app
    
  2. 运行 npm init 命令以使用 package.json 文件创建一个 node 应用程序。

    npm init
    
  3. 安装 ai-form-recognizer 客户端库 npm 包:

    npm install @azure/ai-form-recognizer
    

    应用的 package.json 文件将使用依赖项进行更新。

  4. 创建一个名为 index.js 的文件,将其打开,并导入以下库:

    const { FormRecognizerClient, AzureKeyCredential } = require("@azure/ai-form-recognizer");
    
  5. 为资源的 Azure 终结点和密钥创建变量:

    const key = "PASTE_YOUR_FORM_RECOGNIZER_KEY_HERE";
    const endpoint = "PASTE_YOUR_FORM_RECOGNIZER_ENDPOINT_HERE";
    
  6. 此时,JavaScript 应用程序应包含以下代码行:

    
    const { FormRecognizerClient, AzureKeyCredential } = require("@azure/ai-form-recognizer");
    
    const endpoint = "PASTE_YOUR_FORM_RECOGNIZER_ENDPOINT_HERE";
    const key = "PASTE_YOUR_FORM_RECOGNIZER_KEY_HERE";
    

选择要复制并粘贴到应用程序中的代码示例:

重要

完成后,请记住将密钥从代码中删除,并且永远不要公开发布该密钥。 对于生产来说,请使用安全的方式存储和访问凭据,例如 Azure Key Vault。 有关详细信息,请参阅 Azure AI 服务安全性

试用:布局模型

  • 对于此示例,需要 URI 中的一个文档文件。 在本快速入门中,可使用示例文档
  • 我们已将文件 URI 值添加到文件顶部附近的 formUrl 变量中。
  • 若要分析 URI 中的特定文件,将使用 beginRecognizeContent 方法。

将以下代码添加到布局应用程序的 key 变量下方的行中

const formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf";

const formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf";

async function recognizeContent() {
    const client = new FormRecognizerClient(endpoint, new AzureKeyCredential(key));
    const poller = await client.beginRecognizeContentFromUrl(formUrl);
    const pages = await poller.pollUntilDone();

    if (!pages || pages.length === 0) {
        throw new Error("Expecting non-empty list of pages!");
    }

    for (const page of pages) {
        console.log(
            `Page ${page.pageNumber}: width ${page.width} and height ${page.height} with unit ${page.unit}`
        );
        for (const table of page.tables) {
            for (const cell of table.cells) {
                console.log(`cell [${cell.rowIndex},${cell.columnIndex}] has text ${cell.text}`);
            }
        }
    }
}

recognizeContent().catch((err) => {
    console.error("The sample encountered an error:", err);
});

试用:预生成模型

此示例演示了如何以发票为例,使用预先训练的模型来分析某些类型常见文档中的数据。 请参阅预先构建的概念页面,获取发票字段的完整列表

  • 对于此示例,我们将使用预生成模型来分析一个发票文档。 在本快速入门中,可使用示例发票文档
  • 我们已将文件 URI 值添加到文件顶部的 invoiceUrl 变量中。
  • 若要分析 URI 中的特定文件,将使用 beginRecognizeInvoices 方法。
  • 为简洁起见,此处并未显示服务返回的所有字段。 若要查看所有支持的字段和相应类型的列表,请参阅发票概念页。

选择一个预生成模型

不止发票,还有几个预生成模型可供选择,每个模型都有自己的一组受支持的字段。 用于分析操作的模型取决于要分析的文档类型。 下面是文档智能服务目前支持的预生成模型:

  • 发票:从发票中提取文本、选择标记、表、字段和关键信息。
  • 收据:从收据中提取文本和关键信息。
  • 身份文档:从驾照和国际护照中提取文本和关键信息。
  • 名片:从名片中提取文本和关键信息。

将以下代码添加到预生成发票应用程序的 key 变量下方


const invoiceUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf";

async function recognizeInvoices() {

    const client = new FormRecognizerClient(endpoint, new AzureKeyCredential(key));

    const poller = await client.beginRecognizeInvoicesFromUrl(invoiceUrl);
    const [invoice] = await poller.pollUntilDone();

    if (invoice === undefined) {
        throw new Error("Failed to extract data from at least one invoice.");
    }

    /**
     * This is a helper function for printing a simple field with an elemental type.
     */
    function fieldToString(field) {
        const {
            name,
            valueType,
            value,
            confidence
        } = field;
        return `${name} (${valueType}): '${value}' with confidence ${confidence}'`;
    }

    console.log("Invoice fields:");

    /**
     * Invoices contain a lot of optional fields, but they are all of elemental types
     * such as strings, numbers, and dates, so we will just enumerate them all.
     */
    for (const [name, field] of Object.entries(invoice.fields)) {
        if (field.valueType !== "array" && field.valueType !== "object") {
            console.log(`- ${name} ${fieldToString(field)}`);
        }
    }

    // Invoices also support nested line items, so we can iterate over them.
    let idx = 0;

    console.log("- Items:");

    const items = invoice.fields["Items"]?.value;
    for (const item of items ?? []) {
        const value = item.value;

        // Each item has several subfields that are nested within the item. We'll
        // map over this list of the subfields and filter out any fields that
        // weren't found. Not all fields will be returned every time, only those
        // that the service identified for the particular document in question.

        const subFields = [
                "Description",
                "Quantity",
                "Unit",
                "UnitPrice",
                "ProductCode",
                "Date",
                "Tax",
                "Amount"
            ]
            .map((fieldName) => value[fieldName])
            .filter((field) => field !== undefined);

        console.log(
            [
                `  - Item #${idx}`,
                // Now we will convert those fields into strings to display
                ...subFields.map((field) => `    - ${fieldToString(field)}`)
            ].join("\n")
        );
    }
}

recognizeInvoices().catch((err) => {
    console.error("The sample encountered an error:", err);
});

参考文档 | 库源代码 | 包 (PyPi) | 示例

在本快速入门中,你将使用以下 API 提取表单和文档中的结构化数据:

先决条件

  • Azure 订阅 - 免费创建订阅

  • Python 3.x

    • 你的 Python 安装应包含 pip。 可以通过在命令行上运行 pip --version 来检查是否安装了 pip。 通过安装最新版本的 Python 获取 pip。
  • Azure AI 服务或文档智能资源。 具有 Azure 订阅后,在 Azure 门户中创建单服务多服务文档智能资源以获取密钥和终结点。 可以使用免费定价层 (F0) 试用该服务,然后再升级到付费层进行生产。

    提示

    如果计划通过一个终结点/密钥访问多个 Azure AI 服务,请创建 Azure AI 服务资源。 请创建仅供文档智能访问的文档智能资源。 请注意,如果你打算使用 Microsoft Entra 身份验证,则需要单一服务资源。

  • 部署资源后,选择“转到资源”。 需要从创建的资源获取密钥和终结点,以便将应用程序连接到文档智能 API。 稍后需要在本快速入门中将密钥和终结点粘贴到代码中:

    该屏幕截图显示了 Azure 门户中密钥和终结点的位置。

设置

在本地环境中打开一个终端窗口,并使用 pip 安装适用于 Python 的 Azure 文档智能客户端库:

pip install azure-ai-formrecognizer

创建新的 Python 应用程序

在首选编辑器或 IDE 中创建一个名为 form_recognizer_quickstart.py 的新 Python 应用程序。 然后导入以下库:

import os
from azure.ai.formrecognizer import FormRecognizerClient
from azure.core.credentials import AzureKeyCredential

为 Azure 资源终结点和密钥创建变量

endpoint = "YOUR_FORM_RECOGNIZER_ENDPOINT"
key = "YOUR_FORM_RECOGNIZER_KEY"

此时,Python 应用程序应包含以下代码行:

import os
from azure.core.exceptions import ResourceNotFoundError
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential

endpoint = "YOUR_FORM_RECOGNIZER_ENDPOINT"
key = "YOUR_FORM_RECOGNIZER_KEY"

选择要复制并粘贴到应用程序中的代码示例:

重要

完成后,请记住将密钥从代码中删除,并且永远不要公开发布该密钥。 对于生产来说,请使用安全的方式存储和访问凭据,例如 Azure Key Vault。 有关详细信息,请参阅 Azure AI 服务安全性

试用:布局模型

  • 对于此示例,需要 URI 中的一个文档文件。 在本快速入门中,可使用示例文档
  • 我们已将文件 URI 值添加到文件顶部附近的 formUrl 变量中。
  • 若要分析 URI 中的特定文件,将使用 begin_recognize_content_from_url 方法。

将以下代码添加到布局应用程序的 key 变量下方的行中


  def format_bounding_box(bounding_box):
    if not bounding_box:
        return "N/A"
    return ", ".join(["[{}, {}]".format(p.x, p.y) for p in bounding_box])

 def recognize_content():
    # sample document
    formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf"

    form_recognizer_client = FormRecognizerClient(
        endpoint=endpoint, credential=AzureKeyCredential(key)
    )

    poller = form_recognizer_client.begin_recognize_content_from_url(formUrl)
    form_pages = poller.result()

    for idx, content in enumerate(form_pages):
        print(
            "Page has width: {} and height: {}, measured with unit: {}".format(
                content.width, content.height, content.unit
            )
        )
        for table_idx, table in enumerate(content.tables):
            print(
                "Table # {} has {} rows and {} columns".format(
                    table_idx, table.row_count, table.column_count
                )
            )
            print(
                "Table # {} location on page: {}".format(
                    table_idx, format_bounding_box(table.bounding_box)
                )
            )
            for cell in table.cells:
                print(
                    "...Cell[{}][{}] has text '{}' within bounding box '{}'".format(
                        cell.row_index,
                        cell.column_index,
                        cell.text,
                        format_bounding_box(cell.bounding_box),
                    )
                )

        for line_idx, line in enumerate(content.lines):
            print(
                "Line # {} has word count '{}' and text '{}' within bounding box '{}'".format(
                    line_idx,
                    len(line.words),
                    line.text,
                    format_bounding_box(line.bounding_box),
                )
            )
            if line.appearance:
                if (
                    line.appearance.style_name == "handwriting"
                    and line.appearance.style_confidence > 0.8
                ):
                    print(
                        "Text line '{}' is handwritten and might be a signature.".format(
                            line.text
                        )
                    )
            for word in line.words:
                print(
                    "...Word '{}' has a confidence of {}".format(
                        word.text, word.confidence
                    )
                )

        for selection_mark in content.selection_marks:
            print(
                "Selection mark is '{}' within bounding box '{}' and has a confidence of {}".format(
                    selection_mark.state,
                    format_bounding_box(selection_mark.bounding_box),
                    selection_mark.confidence,
                )
            )
        print("----------------------------------------")


if __name__ == "__main__":
    recognize_content()

试用:预生成模型

此示例演示了如何以发票为例,使用预先训练的模型来分析某些类型常见文档中的数据。 请参阅预先构建的概念页面,获取发票字段的完整列表

  • 对于此示例,我们将使用预生成模型来分析一个发票文档。 在本快速入门中,可使用示例发票文档
  • 我们已将文件 URI 值添加到文件顶部的“formUrl”变量中。
  • 若要分析 URI 中的给定文件,将使用“begin_recognize_invoices_from_url”方法。
  • 为简洁起见,此处并未显示服务返回的所有字段。 若要查看所有支持的字段和相应类型的列表,请参阅发票概念页。

选择一个预生成模型

不止发票,还有几个预生成模型可供选择,每个模型都有自己的一组受支持的字段。 用于分析操作的模型取决于要分析的文档类型。 下面是文档智能服务目前支持的预生成模型:

  • 发票:从发票中提取文本、选择标记、表、字段和关键信息。
  • 收据:从收据中提取文本和关键信息。
  • 身份文档:从驾照和国际护照中提取文本和关键信息。
  • 名片:从名片中提取文本和关键信息。

将以下代码添加到预生成发票应用程序的 key 变量下方


def recognize_invoice():

    invoiceUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf"

    form_recognizer_client = FormRecognizerClient(
        endpoint=endpoint, credential=AzureKeyCredential(key)
    )

    poller = form_recognizer_client.begin_recognize_invoices_from_url(
        invoiceUrl, locale="en-US"
    )
    invoices = poller.result()

    for idx, invoice in enumerate(invoices):
        vendor_name = invoice.fields.get("VendorName")
        if vendor_name:
            print(
                "Vendor Name: {} has confidence: {}".format(
                    vendor_name.value, vendor_name.confidence
                )
            )
        vendor_address = invoice.fields.get("VendorAddress")
        if vendor_address:
            print(
                "Vendor Address: {} has confidence: {}".format(
                    vendor_address.value, vendor_address.confidence
                )
            )
        vendor_address_recipient = invoice.fields.get("VendorAddressRecipient")
        if vendor_address_recipient:
            print(
                "Vendor Address Recipient: {} has confidence: {}".format(
                    vendor_address_recipient.value, vendor_address_recipient.confidence
                )
            )
        customer_name = invoice.fields.get("CustomerName")
        if customer_name:
            print(
                "Customer Name: {} has confidence: {}".format(
                    customer_name.value, customer_name.confidence
                )
            )
        customer_id = invoice.fields.get("CustomerId")
        if customer_id:
            print(
                "Customer Id: {} has confidence: {}".format(
                    customer_id.value, customer_id.confidence
                )
            )
        customer_address = invoice.fields.get("CustomerAddress")
        if customer_address:
            print(
                "Customer Address: {} has confidence: {}".format(
                    customer_address.value, customer_address.confidence
                )
            )
        customer_address_recipient = invoice.fields.get("CustomerAddressRecipient")
        if customer_address_recipient:
            print(
                "Customer Address Recipient: {} has confidence: {}".format(
                    customer_address_recipient.value,
                    customer_address_recipient.confidence,
                )
            )
        invoice_id = invoice.fields.get("InvoiceId")
        if invoice_id:
            print(
                "Invoice Id: {} has confidence: {}".format(
                    invoice_id.value, invoice_id.confidence
                )
            )
        invoice_date = invoice.fields.get("InvoiceDate")
        if invoice_date:
            print(
                "Invoice Date: {} has confidence: {}".format(
                    invoice_date.value, invoice_date.confidence
                )
            )
        invoice_total = invoice.fields.get("InvoiceTotal")
        if invoice_total:
            print(
                "Invoice Total: {} has confidence: {}".format(
                    invoice_total.value, invoice_total.confidence
                )
            )
        due_date = invoice.fields.get("DueDate")
        if due_date:
            print(
                "Due Date: {} has confidence: {}".format(
                    due_date.value, due_date.confidence
                )
            )
        purchase_order = invoice.fields.get("PurchaseOrder")
        if purchase_order:
            print(
                "Purchase Order: {} has confidence: {}".format(
                    purchase_order.value, purchase_order.confidence
                )
            )
        billing_address = invoice.fields.get("BillingAddress")
        if billing_address:
            print(
                "Billing Address: {} has confidence: {}".format(
                    billing_address.value, billing_address.confidence
                )
            )
        billing_address_recipient = invoice.fields.get("BillingAddressRecipient")
        if billing_address_recipient:
            print(
                "Billing Address Recipient: {} has confidence: {}".format(
                    billing_address_recipient.value,
                    billing_address_recipient.confidence,
                )
            )
        shipping_address = invoice.fields.get("ShippingAddress")
        if shipping_address:
            print(
                "Shipping Address: {} has confidence: {}".format(
                    shipping_address.value, shipping_address.confidence
                )
            )
        shipping_address_recipient = invoice.fields.get("ShippingAddressRecipient")
        if shipping_address_recipient:
            print(
                "Shipping Address Recipient: {} has confidence: {}".format(
                    shipping_address_recipient.value,
                    shipping_address_recipient.confidence,
                )
            )
        print("Invoice items:")
        for idx, item in enumerate(invoice.fields.get("Items").value):
            item_description = item.value.get("Description")
            if item_description:
                print(
                    "......Description: {} has confidence: {}".format(
                        item_description.value, item_description.confidence
                    )
                )
            item_quantity = item.value.get("Quantity")
            if item_quantity:
                print(
                    "......Quantity: {} has confidence: {}".format(
                        item_quantity.value, item_quantity.confidence
                    )
                )
            unit = item.value.get("Unit")
            if unit:
                print(
                    "......Unit: {} has confidence: {}".format(
                        unit.value, unit.confidence
                    )
                )
            unit_price = item.value.get("UnitPrice")
            if unit_price:
                print(
                    "......Unit Price: {} has confidence: {}".format(
                        unit_price.value, unit_price.confidence
                    )
                )
            product_code = item.value.get("ProductCode")
            if product_code:
                print(
                    "......Product Code: {} has confidence: {}".format(
                        product_code.value, product_code.confidence
                    )
                )
            item_date = item.value.get("Date")
            if item_date:
                print(
                    "......Date: {} has confidence: {}".format(
                        item_date.value, item_date.confidence
                    )
                )
            tax = item.value.get("Tax")
            if tax:
                print(
                    "......Tax: {} has confidence: {}".format(tax.value, tax.confidence)
                )
            amount = item.value.get("Amount")
            if amount:
                print(
                    "......Amount: {} has confidence: {}".format(
                        amount.value, amount.confidence
                    )
                )
        subtotal = invoice.fields.get("SubTotal")
        if subtotal:
            print(
                "Subtotal: {} has confidence: {}".format(
                    subtotal.value, subtotal.confidence
                )
            )
        total_tax = invoice.fields.get("TotalTax")
        if total_tax:
            print(
                "Total Tax: {} has confidence: {}".format(
                    total_tax.value, total_tax.confidence
                )
            )
        previous_unpaid_balance = invoice.fields.get("PreviousUnpaidBalance")
        if previous_unpaid_balance:
            print(
                "Previous Unpaid Balance: {} has confidence: {}".format(
                    previous_unpaid_balance.value, previous_unpaid_balance.confidence
                )
            )
        amount_due = invoice.fields.get("AmountDue")
        if amount_due:
            print(
                "Amount Due: {} has confidence: {}".format(
                    amount_due.value, amount_due.confidence
                )
            )
        service_start_date = invoice.fields.get("ServiceStartDate")
        if service_start_date:
            print(
                "Service Start Date: {} has confidence: {}".format(
                    service_start_date.value, service_start_date.confidence
                )
            )
        service_end_date = invoice.fields.get("ServiceEndDate")
        if service_end_date:
            print(
                "Service End Date: {} has confidence: {}".format(
                    service_end_date.value, service_end_date.confidence
                )
            )
        service_address = invoice.fields.get("ServiceAddress")
        if service_address:
            print(
                "Service Address: {} has confidence: {}".format(
                    service_address.value, service_address.confidence
                )
            )
        service_address_recipient = invoice.fields.get("ServiceAddressRecipient")
        if service_address_recipient:
            print(
                "Service Address Recipient: {} has confidence: {}".format(
                    service_address_recipient.value,
                    service_address_recipient.confidence,
                )
            )
        remittance_address = invoice.fields.get("RemittanceAddress")
        if remittance_address:
            print(
                "Remittance Address: {} has confidence: {}".format(
                    remittance_address.value, remittance_address.confidence
                )
            )
        remittance_address_recipient = invoice.fields.get("RemittanceAddressRecipient")
        if remittance_address_recipient:
            print(
                "Remittance Address Recipient: {} has confidence: {}".format(
                    remittance_address_recipient.value,
                    remittance_address_recipient.confidence,
                )
            )


if __name__ == "__main__":
    recognize_invoice()

运行应用程序

  1. 导航到 form_recognizer_quickstart.py 文件所在的文件夹。

  2. 在终端中键入以下命令:

python form_recognizer_quickstart.py

| 文档智能 REST API | Azure REST API 参考 |

在本快速入门中,你将使用以下 API 提取表单和文档中的结构化数据:

先决条件

  • Azure 订阅 - 免费创建订阅

  • 已安装 cURL

  • PowerShell 6.0 及以上版本,或类似的命令行应用程序。

  • Azure AI 服务或文档智能资源。 具有 Azure 订阅后,在 Azure 门户中创建单服务多服务文档智能资源以获取密钥和终结点。 可以使用免费定价层 (F0) 试用该服务,然后再升级到付费层进行生产。

    提示

    如果计划通过一个终结点/密钥访问多个 Azure AI 服务,请创建 Azure AI 服务资源。 请创建仅供文档智能访问的文档智能资源。 请注意,如果你打算使用 Microsoft Entra 身份验证,则需要单一服务资源。

  • 部署资源后,选择“转到资源”。 需要从创建的资源获取密钥和终结点,以便将应用程序连接到文档智能 API。 稍后需要在本快速入门中将密钥和终结点粘贴到代码中:

    该屏幕截图显示了 Azure 门户中密钥和终结点的位置。

选择要复制并粘贴到应用程序中的代码示例:

重要

完成后,请记住将密钥从代码中删除,并且永远不要公开发布该密钥。 对于生产来说,请使用安全的方式存储和访问凭据,例如 Azure Key Vault。 有关详细信息,请参阅 Azure AI 服务安全性

试用:布局模型

  • 对于此示例,需要 URI 中的一个文档文件。 在本快速入门中,可使用示例文档
  1. {endpoint} 替换为你通过文档智能订阅获取的终结点。
  2. {key} 替换为从上一步复制的密钥。
  3. \"{your-document-url} 替换为示例文档 URL:
https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf

请求

curl -v -i POST "https://{endpoint}/formrecognizer/v2.1/layout/analyze" -H "Content-Type: application/json" -H "Ocp-Apim-Subscription-Key: {key}" --data-ascii "{​​​​​​​'urlSource': '{your-document-url}'}​​​​​​​​"

Operation-Location

你将收到 202 (Success) 响应,其中包含“Operation-Location”标头。 此头的值包含一个可用于查询异步操作状态和获取结果的结果 ID:

https://cognitiveservice/formrecognizer/v2.1/layout/analyzeResults/{resultId}

在以下示例中,analyzeResults/ 后面作为 URL 一部分的字符串就是结果 ID。

https://cognitiveservice/formrecognizer/v2/layout/analyzeResults/54f0b076-4e38-43e5-81bd-b85b8835fdfb

获取布局结果

调用分析布局 API 后,可调用获取分析布局结果 API,来获取操作状态和已提取的数据 。 运行该命令之前,请进行以下更改:

  1. {endpoint} 替换为你通过文档智能订阅获取的终结点。
  2. {key} 替换为从上一步复制的密钥。
  3. {resultId} 替换为上一步中的结果 ID。

请求

curl -v -X GET "https://{endpoint}/formrecognizer/v2.1/layout/analyzeResults/{resultId}" -H "Ocp-Apim-Subscription-Key: {key}"

检查结果

你将收到包含 JSON 内容的 200 (success) 响应。

请查看以下发票图像及其相应的 JSON 输出。

  • "readResults" 节点包含每一行文本,以及其各自在页面上的边界框位置。
  • selectionMarks 节点显示每个选择标记(复选框、单选按钮)以及其状态是 selected 还是 unselected
  • "pageResults" 部分包含提取的表。 对于每个表,将会提取文本、行和列索引、行和列跨距、边界框等。

包含表的 Contoso 项目声明文档。

响应正文

可以在 GitHub 上查看完整示例输出

试用:预生成模型

  • 对于此示例,我们将使用预生成模型来分析一个发票文档。 在本快速入门中,可使用示例发票文档

选择一个预生成模型

不止发票,还有几个预生成模型可供选择,每个模型都有自己的一组受支持的字段。 用于分析操作的模型取决于要分析的文档类型。 下面是文档智能服务目前支持的预生成模型:

  • 发票:从发票中提取文本、选择标记、表、字段和关键信息。
  • 收据:从收据中提取文本和关键信息。
  • 身份文档:从驾照和国际护照中提取文本和关键信息。
  • 名片:从名片中提取文本和关键信息。

运行该命令之前,请进行以下更改:

  1. {endpoint} 替换为你通过文档智能订阅获取的终结点。

  2. {key} 替换为从上一步复制的密钥。

  3. \"{your-document-url} 替换为示例发票 URL:

    https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf
    

请求

curl -v -i POST https://{endpoint}/formrecognizer/v2.1/prebuilt/invoice/analyze" -H "Content-Type: application/json" -H "Ocp-Apim-Subscription-Key:  {key}" --data-ascii "{​​​​​​​'urlSource': '{your invoice URL}'}​​​​​​​​"

Operation-Location

你将收到 202 (Success) 响应,其中包含“Operation-Location”标头。 此头的值包含一个可用于查询异步操作状态和获取结果的结果 ID:

https://cognitiveservice/formrecognizer/v2.1/prebuilt/receipt/analyzeResults/{resultId}

在以下示例中,analyzeResults/ 后面作为 URL 一部分的字符串就是结果 ID:

https://cognitiveservice/formrecognizer/v2.1/prebuilt/invoice/analyzeResults/54f0b076-4e38-43e5-81bd-b85b8835fdfb

获取账单结果

调用分析账单 API 后,可调用获取分析账单结果 API,来获取操作状态和已提取的数据 。 运行该命令之前,请进行以下更改:

  1. {endpoint} 替换为你通过文档智能密钥获取的终结点。 可以在文档智能资源“概述”选项卡中找到它。
  2. {resultId} 替换为上一步中的结果 ID。
  3. {key} 替换为你的密钥。

请求

curl -v -X GET "https://{endpoint}/formrecognizer/v2.1/prebuilt/invoice/analyzeResults/{resultId}" -H "Ocp-Apim-Subscription-Key: {key}"

检查响应

你将收到包含 JSON 输出的 200 (Success) 响应。

  • "readResults" 字段包含从发票中提取的每行文本。
  • "pageResults" 包含从发票中提取的表和选择标记。
  • "documentResults" 字段包含发票中最相关部分的键/值信息。

请参阅示例发票文档。

响应正文

请参阅 GitHub 上的完整示例输出

就是这样,做得很棒!

后续步骤

  • 若要获得增强的体验和高级模型质量,请尝试文档智能工作室

    • 工作室支持使用 v2.1 标记的数据训练的任何模型。

    • 更改日志提供有关从 v3.1 迁移到 v4.0 的详细信息。