Create a custom analyzer via REST API

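Content Understanding analyzers define how to process your content and extract insights from it. They ensure uniform processing and output structure across all your content, so you get reliable and predictable results. Prebuilt analyzers are available for common use cases. This guide shows how to customize these analyzers to better fit your needs.

This guide uses the cURL command-line tool. If it isn't installed, download the appropriate version for your development environment.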

Prerequisites

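To get started, make sure you have the following resources and permissions:

- An Azure subscription. If you don't have an Azure subscription, create a free account.
- A Microsoft Foundry resource, created in the Azure portal once you have your Azure subscription. Be sure to create it in a supported region.
  - This resource is listed under Foundry > Foundry in the portal.
- Default model deployments for your Content Understanding resource. Setting the defaults links Content Understanding to the Foundry models that serve your requests. Choose one of the following methods:

Portal

1. Go to the Content Understanding settings page.
2. Select the "+ Add resource" button in the upper left.
3. Select the Foundry resource that you want to use, select "Next", and then save.
  - Make sure that "Enable autodeployment for required models if no defaults are available" is checked. This ensures that your resource is fully set up with the required GPT-4.1, GPT-4.1-mini, and text-embedding-3-large models. Different prebuilt analyzers require different models.

REST API

1. Create Foundry model deployments for GPT-4.1, GPT-4.1-mini, and text-embedding-3-large in your Foundry resource. For details, see "Create model deployments in the Microsoft Foundry portal". Different prebuilt analyzers require different models, so deploy all three.
2. Define the default model deployments at the resource level.

  Before you run the following cURL command, make these changes to the HTTP request:
  - Replace {endpoint} and {key} with the corresponding values from your Foundry resource in the Azure portal.
  - Replace {myGPT41Deployment}, {myGPT41MiniDeployment}, and {myEmbeddingDeployment} with the actual model deployment names in your Foundry resource.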
```
curl -i -X PATCH "{endpoint}/contentunderstanding/defaults?api-version=2025-11-01" \
  -H "Ocp-Apim-Subscription-Key: {key}" \
  -H "Content-Type: application/json" \
  -d '{
        "modelDeployments": {
          "gpt-4.1": "{myGPT41Deployment}",
          "gpt-4.1-mini": "{myGPT41MiniDeployment}",
          "text-embedding-3-large": "{myEmbeddingDeployment}"
        }
      }'
```
With either method, you've now linked Content Understanding to the Foundry models in your Foundry resource.
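The same PATCH request can be prepared from Python. The following is a minimal sketch that uses only the standard library. The function name build_defaults_request, the example endpoint, and the deployment names are illustrative assumptions, not part of the service or any SDK. It builds the request without sending it, so you can inspect it first.

```python
import json
import urllib.request

API_VERSION = "2025-11-01"


def build_defaults_request(endpoint: str, key: str, deployments: dict) -> urllib.request.Request:
    """Build (but do not send) the PATCH request that sets default model deployments."""
    url = f"{endpoint.rstrip('/')}/contentunderstanding/defaults?api-version={API_VERSION}"
    body = json.dumps({"modelDeployments": deployments}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
    )


# Placeholder values (assumptions): replace with your endpoint, key, and deployment names.
request = build_defaults_request(
    "https://example-resource.example.com",
    "<your-key>",
    {
        "gpt-4.1": "myGPT41Deployment",
        "gpt-4.1-mini": "myGPT41MiniDeployment",
        "text-embedding-3-large": "myEmbeddingDeployment",
    },
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

Keeping the request construction separate from the network call makes it easy to log or review the payload before it changes resource-level settings.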

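Define an analyzer schema

To create a custom analyzer, define a field schema that describes the structured data you want to extract. In the following example, we create an analyzer based on the prebuilt document analyzer to process receipts.

Create a JSON file named receipt.json with the following content: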

{
  "description": "Sample receipt analyzer",
  "baseAnalyzerId": "prebuilt-document",
  "models": {
    "completion": "gpt-4.1",
    "embedding": "text-embedding-3-large"
  },
  "config": {
    "returnDetails": true,
    "enableFormula": false,
    "disableContentFiltering": false,
    "estimateFieldSourceAndConfidence": true,
    "tableFormat": "html"
  },
 "fieldSchema": {
    "fields": {
      "VendorName": {
        "type": "string",
        "method": "extract",
        "description": "Vendor issuing the receipt"
      },
      "Items": {
        "type": "array",
        "method": "extract",
        "items": {
          "type": "object",
          "properties": {
            "Description": {
              "type": "string",
              "method": "extract",
              "description": "Description of the item"
            },
            "Amount": {
              "type": "number",
              "method": "extract",
              "description": "Amount of the item"
            }
          }
        }
      }
    }
  }
}
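Before sending a schema, it can help to list every field it declares, including fields nested inside arrays and objects. The helper below is a small sketch for that purpose; list_fields and the path notation (`Items[].Amount`) are illustrative choices, not service conventions, and the inline schema is a trimmed copy of the receipt example.

```python
def list_fields(fields: dict, prefix: str = "") -> list:
    """Return (path, type, method) for every field, descending into arrays and objects."""
    rows = []
    for name, spec in fields.items():
        path = f"{prefix}{name}"
        rows.append((path, spec.get("type", "?"), spec.get("method", "(default)")))
        if spec.get("type") == "array":
            # Array items may themselves be objects with named properties.
            item = spec.get("items", {})
            rows.extend(list_fields(item.get("properties", {}), f"{path}[]."))
        elif spec.get("type") == "object":
            rows.extend(list_fields(spec.get("properties", {}), f"{path}."))
    return rows


schema = {
    "fields": {
        "VendorName": {"type": "string", "method": "extract"},
        "Items": {
            "type": "array",
            "method": "extract",
            "items": {
                "type": "object",
                "properties": {
                    "Description": {"type": "string", "method": "extract"},
                    "Amount": {"type": "number", "method": "extract"},
                },
            },
        },
    }
}

for row in list_fields(schema["fields"]):
    print(row)
```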

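If you have a variety of documents to process but want to categorize and analyze only the receipts, you can create an analyzer that categorizes each document first and then routes it to the receipt analyzer you created above. The following schema defines that categorization analyzer.

Create a JSON file named categorize.json with the following content. The // comments are explanatory only; standard JSON doesn't allow comments, so remove them if your tooling or the service rejects the file.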

{
  "baseAnalyzerId": "prebuilt-document",
  // Use the base analyzer to invoke the document specific capabilities.

  // Specify the models the analyzer should use: one of the supported completion models and one of the supported embedding models. The specific deployment used during analysis is set on the resource or provided in the analyze request.
  "models": {
    "completion": "gpt-4.1",
    "embedding": "text-embedding-3-large"
  },
  "config": {
    // Enable splitting of the input into segments. Set this property to false if you only expect a single document within the input file. When specified and enableSegment=false, the whole content will be classified into one of the categories.
    "enableSegment": false,

    "contentCategories": {
      // Category name.
      "receipt": {
        // Description to help with classification and splitting.
        "description": "Any images or documents of receipts",

        // Define the analyzer that any content classified as a receipt should be routed to
        "analyzerId": "receipt"
      },

      "invoice": {
        "description": "Any images or documents of invoice",
        "analyzerId": "prebuilt-invoice"
      },
      "policeReport": {
        "description": "A police or law enforcement report detailing the events that lead to the loss."
        // Don't perform analysis for this category.
      }

    },

    // Omit original content object and only return content objects from additional analysis.
    "omitContent": true
  }

  //You can use fieldSchema here to define fields that are needed from the entire input content.

}
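The categorize.json example is annotated with // comments, which strict JSON parsers reject. One way to keep the annotated file for readers and still produce strict JSON is to strip the comments before use. The sketch below is an illustrative helper (strip_line_comments is not part of any SDK); it removes // comments only when they appear outside string literals, so URLs inside values are preserved.

```python
import json


def strip_line_comments(text: str) -> str:
    """Remove // comments that appear outside JSON string literals."""
    out = []
    in_string = False
    escaped = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif text.startswith("//", i):
            # Skip to the end of the line; the newline itself is kept.
            newline = text.find("\n", i)
            i = len(text) if newline == -1 else newline
            continue
        else:
            out.append(ch)
        i += 1
    return "".join(out)


commented = """{
  // Route receipts to the custom analyzer.
  "analyzerId": "receipt",
  "description": "See https://example.com/docs" // trailing comment
}"""

definition = json.loads(strip_line_comments(commented))
print(definition)
```

To apply it to the file, read categorize.json as text, pass it through strip_line_comments, and write the result to the file you send with cURL.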

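To create a custom image analyzer, define a field schema that describes the structured data you want to extract. In the following example, we create an analyzer based on the prebuilt image analyzer to process images of charts and graphs.

Create a JSON file named request_body.json with the following content: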

{
  "description": "Sample image analyzer for charts and graphs",
  "baseAnalyzerId": "prebuilt-image",
  "models": {
      "completion": "gpt-4.1"
    },
  "config": {
    "disableContentFiltering": false
 },
 "fieldSchema": {
    "fields": {
      "Title": {
        "type": "string"
      },
      "ChartType": {
        "type": "string",
        "method": "classify",
        "enum": [ "bar", "line", "pie" ]
      }
    }
  }
}

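To create a custom audio analyzer, define a field schema that describes the structured data you want to extract. In the following example, we create an analyzer based on the prebuilt audio analyzer to process customer support call recordings.

Create a JSON file named request_body.json with the following content: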

{
  "description": "Sample customer support call analyzer",
  "baseAnalyzerId": "prebuilt-audio",
  "config": {
    "locales": ["en-US", "fr-FR"],
    "returnDetails": true,
    "disableContentFiltering": false
  },
  "fieldSchema": {
    "fields": {
      "Summary": {
        "type": "string",
        "method": "generate"
      },
      "Sentiment": {
        "type": "string",
        "method": "classify",
        "enum": ["Positive", "Neutral", "Negative"]
      },
      "People": {
        "type": "array",
        "description": "List of people mentioned",
        "items": {
          "type": "object",
          "properties": {
            "Name": { "type": "string" },
            "Role": { "type": "string" }
          }
        }
      }
    }
  }
}

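To create a custom video analyzer, define a field schema that describes the structured data you want to extract. In the following example, we create an analyzer based on the prebuilt video analyzer to process product demos and reviews.

Create a JSON file named request_body.json with the following content: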

{
  "description": "Sample product demo video analyzer",
  "baseAnalyzerId": "prebuilt-video",
  "models": {
      "completion": "gpt-4.1"
    },
  "config": {
    "locales": ["en-US", "fr-FR"],
    "returnDetails": true,
    "enableFace": false,
    "disableFaceBlurring": false,
    "personDirectoryId": null,
    "segmentationMode": "auto",
    "disableContentFiltering": false
  },
   "fieldSchema": {
    "fields": {
      "Segments": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "SegmentId": {
              "type": "string"
            },
            "Description": {
              "type": "string",
              "method": "generate",
              "description": "Detailed summary of the video segment, focusing on product characteristics, lighting, and color palette."
            },
            "Sentiment": {
              "type": "string",
              "method": "classify",
              "enum": ["Positive", "Neutral", "Negative"]
            }
          }
        }
      }
    }
  }
}

Create the analyzer

PUT request

Create the receipt analyzer first, and then create the categorization analyzer. Because categorize.json routes receipts to "analyzerId": "receipt", use receipt as the {analyzerId} in the first request.

curl -i -X PUT "{endpoint}/contentunderstanding/analyzers/{analyzerId}?api-version=2025-11-01" \
  -H "Ocp-Apim-Subscription-Key: {key}" \
  -H "Content-Type: application/json" \
  -d @receipt.json

curl -i -X PUT "{endpoint}/contentunderstanding/analyzers/{analyzerId}?api-version=2025-11-01" \
  -H "Ocp-Apim-Subscription-Key: {key}" \
  -H "Content-Type: application/json" \
  -d @categorize.json

For the image, audio, and video analyzers, send the request_body.json file that you created for that content type:

curl -i -X PUT "{endpoint}/contentunderstanding/analyzers/{analyzerId}?api-version=2025-11-01" \
  -H "Ocp-Apim-Subscription-Key: {key}" \
  -H "Content-Type: application/json" \
  -d @request_body.json

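PUT response

The 201 Created response includes an Operation-Location header with a URL that you can use to track the status of this asynchronous analyzer creation operation.

201 Created
Operation-Location: {endpoint}/contentunderstanding/analyzers/{analyzerId}/operations/{operationId}?api-version=2025-11-01

When the operation finishes, an HTTP GET on the operation location URL returns "status": "succeeded".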

curl -i -X GET "{endpoint}/contentunderstanding/analyzers/{analyzerId}/operations/{operationId}?api-version=2025-11-01" \
  -H "Ocp-Apim-Subscription-Key: {key}"
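Because the commands in this guide use `curl -i`, the response headers are printed ahead of the body. If you capture that output in a script, a small parser can pull out the status code and the Operation-Location URL. The following is a sketch under that assumption; parse_http_head is a hypothetical helper, and the sample text is made up for illustration.

```python
def parse_http_head(raw: str) -> tuple:
    """Split the status code and headers out of captured `curl -i` output."""
    head = raw.replace("\r\n", "\n").split("\n\n", 1)[0]
    lines = head.split("\n")
    status_code = int(lines[0].split()[1])  # e.g. "HTTP/1.1 201 Created"
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only; header values such as URLs contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers


# Illustrative output only; a real response carries more headers.
sample = (
    "HTTP/1.1 201 Created\r\n"
    "Operation-Location: https://example-resource.example.com/contentunderstanding"
    "/analyzers/receipt/operations/1234?api-version=2025-11-01\r\n"
    "Content-Type: application/json\r\n"
    "\r\n"
    "{}"
)

status, headers = parse_http_head(sample)
print(status, headers["operation-location"])
```

Header names are lowercased because HTTP header names are case-insensitive and servers may vary the casing.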

Analyze a file

Send a file

You can now use the custom analyzer you created to process files and extract the fields you defined in the schema.

Before you run the cURL command, make the following changes to the HTTP request:

- Replace {endpoint} and {key} with the endpoint and key values from your Microsoft Foundry resource in the Azure portal.
- Replace {analyzerId} with the name of the custom analyzer that you created earlier. For the receipt example, use the analyzer that you created with the categorize.json file.
- Replace {fileUrl} with a publicly accessible URL of the file to analyze, such as the path to an Azure Storage blob with a shared access signature (SAS), or use one of the sample URLs:
  - Document: https://github.com/Azure-Samples/azure-ai-content-understanding-python/raw/refs/heads/main/data/receipt.png
  - Image: https://github.com/Azure-Samples/azure-ai-content-understanding-python/raw/refs/heads/main/data/pieChart.jpg
  - Audio: https://github.com/Azure-Samples/azure-ai-content-understanding-python/raw/refs/heads/main/data/audio.wav
  - Video: https://github.com/Azure-Samples/azure-ai-content-understanding-python/raw/refs/heads/main/data/FlightSimulator.mp4

POST request

This example uses the custom analyzer that you created with the categorize.json file to analyze a receipt.

curl -i -X POST "{endpoint}/contentunderstanding/analyzers/{analyzerId}:analyze?api-version=2025-11-01" \
  -H "Ocp-Apim-Subscription-Key: {key}" \
  -H "Content-Type: application/json" \
  -d '{
        "inputs":[
          {
            "url": "https://github.com/Azure-Samples/azure-ai-content-understanding-python/raw/refs/heads/main/data/receipt.png"
          }          
        ]
      }'

This example uses your custom analyzer to analyze an image of a chart or graph.

curl -i -X POST "{endpoint}/contentunderstanding/analyzers/{analyzerId}:analyze?api-version=2025-11-01" \
  -H "Ocp-Apim-Subscription-Key: {key}" \
  -H "Content-Type: application/json" \
  -d '{
        "inputs":[
          {
            "url": "https://github.com/Azure-Samples/azure-ai-content-understanding-python/raw/refs/heads/main/data/pieChart.jpg"
          }          
        ]
      }'

This example uses the custom analyzer that you created to analyze a customer support call recording.

curl -i -X POST "{endpoint}/contentunderstanding/analyzers/{analyzerId}:analyze?api-version=2025-11-01" \
  -H "Ocp-Apim-Subscription-Key: {key}" \
  -H "Content-Type: application/json" \
  -d '{
        "inputs":[
          {
            "url": "https://github.com/Azure-Samples/azure-ai-content-understanding-python/raw/refs/heads/main/data/audio.wav"
          }          
        ]
      }'

This example uses your custom analyzer to analyze a product demo video.

curl -i -X POST "{endpoint}/contentunderstanding/analyzers/{analyzerId}:analyze?api-version=2025-11-01" \
  -H "Ocp-Apim-Subscription-Key: {key}" \
  -H "Content-Type: application/json" \
  -d '{
        "inputs":[
          {
            "url": "https://github.com/Azure-Samples/azure-ai-content-understanding-python/raw/refs/heads/main/data/FlightSimulator.mp4"
          }          
        ]
      }'
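The four POST requests differ only in the analyzer ID and the file URL, so a script can build them from one function. The sketch below uses the Python standard library; build_analyze_request, the example endpoint, and the analyzer name receiptRouter are assumptions made for illustration. The request is built but not sent.

```python
import json
import urllib.request


def build_analyze_request(endpoint: str, key: str, analyzer_id: str, file_url: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that starts an analysis."""
    url = (
        f"{endpoint.rstrip('/')}/contentunderstanding/analyzers/"
        f"{analyzer_id}:analyze?api-version=2025-11-01"
    )
    # Same body shape as the cURL examples: one input that points at a URL.
    body = json.dumps({"inputs": [{"url": file_url}]}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
    )


# Placeholder endpoint, key, and analyzer name (assumptions).
analyze_request = build_analyze_request(
    "https://example-resource.example.com",
    "<your-key>",
    "receiptRouter",
    "https://github.com/Azure-Samples/azure-ai-content-understanding-python/raw/refs/heads/main/data/receipt.png",
)
print(analyze_request.get_method(), analyze_request.full_url)
# To actually send it: urllib.request.urlopen(analyze_request)
```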

POST response

The 202 Accepted response includes a {resultId} that you can use to track the status of this asynchronous operation.

{
  "id": {resultId},
  "status": "Running",
  "result": {
    "analyzerId": {analyzerId},
    "apiVersion": "2025-11-01",
    "createdAt": "YYYY-MM-DDTHH:MM:SSZ",
    "warnings": [],
    "contents": []
  }
}

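Get the analysis result

Use the Operation-Location from the POST response (or the {resultId} in the response body) to retrieve the result of the analysis.

GET request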

curl -i -X GET "{endpoint}/contentunderstanding/analyzerResults/{resultId}?api-version=2025-11-01" \
  -H "Ocp-Apim-Subscription-Key: {key}"

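GET response

The 200 OK response includes a status field that shows the progress of the operation.

- If the operation completed successfully, status is Succeeded.
- If status is Running or NotStarted, call the API again, either manually or with a script. Wait at least one second between calls.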
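The polling rule above can be sketched as a small loop. In this example the HTTP call is passed in as a function, so the loop can be tried without a live resource; wait_for_result and its parameters are illustrative names, and the one-second floor mirrors the guidance in this section. Status values are compared case-insensitively because the casing can differ between responses.

```python
import time


def wait_for_result(fetch, interval_s: float = 1.0, max_attempts: int = 60, sleep=time.sleep) -> dict:
    """Call fetch() until the returned status is Succeeded, waiting between calls."""
    delay = max(interval_s, 1.0)  # never poll more often than once per second
    for _ in range(max_attempts):
        response = fetch()
        status = str(response.get("status", "")).lower()
        if status == "succeeded":
            return response
        if status not in ("running", "notstarted"):
            # Any other status is treated as terminal and reported to the caller.
            raise RuntimeError(f"Unexpected status: {response.get('status')!r}")
        sleep(delay)
    raise TimeoutError(f"No result after {max_attempts} attempts")


# Simulated responses stand in for real GET calls to the analyzerResults URL.
responses = iter([
    {"status": "NotStarted"},
    {"status": "Running"},
    {"status": "Succeeded", "result": {"contents": []}},
])
waits = []
final = wait_for_result(lambda: next(responses), sleep=waits.append)
print(final["status"], waits)
```

In a real script, fetch would issue the GET request shown earlier and return the parsed JSON body.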

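Sample responses

The following sample responses show, in order, the results for the receipt (document), chart image, call recording (audio), and product demo video examples.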

{
  "id": {resultId},
  "status": "Succeeded",
  "result": {
    "analyzerId": {analyzerId},
    "apiVersion": "2025-11-01",
    "createdAt": "YYYY-MM-DDTHH:MM:SSZ",
    "warnings": [],
    "contents": [
      {
        "path": "input1/segment1",
        "category": "receipt",
        "markdown": "Contoso\n\n123 Main Street\nRedmond, WA 98052\n\n987-654-3210\n\n6/10/2019 13:59\nSales Associate: Paul\n\n\n<table>\n<tr>\n<td>2 Surface Pro 6</td>\n<td>$1,998.00</td>\n</tr>\n<tr>\n<td>3 Surface Pen</td>\n<td>$299.97</td>\n</tr>\n</table> ...",
        "fields": {
          "VendorName": {
            "type": "string",
            "valueString": "Contoso",
            "spans": [{"offset": 0,"length": 7}],
            "confidence": 0.996,
            "source": "D(1,774.0000,72.0000,974.0000,70.0000,974.0000,111.0000,774.0000,113.0000)"
          },
          "Items": {
            "type": "array",
            "valueArray": [
              {
                "type": "object",
                "valueObject": {
                  "Description": {
                    "type": "string",
                    "valueString": "2 Surface Pro 6",
                    "spans": [ { "offset": 115, "length": 15}],
                    "confidence": 0.423,
                    "source": "D(1,704.0000,482.0000,875.0000,482.0000,875.0000,508.0000,704.0000,508.0000)"
                  },
                  "Amount": {
                    "type": "number",
                    "valueNumber": 1998,
                    "spans": [{ "offset": 140,"length": 9}
                    ],
                    "confidence": 0.957,
                    "source": "D(1,952.0000,482.0000,1048.0000,482.0000,1048.0000,508.0000,952.0000,509.0000)"
                  }
                }
              }, ...
            ]
          }
        },
        "kind": "document",
        "startPageNumber": 1,
        "endPageNumber": 1,
        "unit": "pixel",
        "pages": [
          {
            "pageNumber": 1,
            "angle": -0.0944,
            "width": 1743,
            "height": 878
          }
        ],
        "analyzerId": "{analyzerId}",
        "mimeType": "image/png"
      }
    ]
  },
  "usage": {
    "documentPages": 1,
    "tokens": {
      "contextualization": 1000
    }
  }
}
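Each extracted field in the response wraps its value in a typed property (valueString, valueNumber, valueArray, valueObject) next to metadata such as confidence. The sketch below flattens that structure into plain Python values and lists the fields whose confidence falls below a threshold. The helper names are illustrative, and the rule that scalar values live under "value" plus the capitalized type name is an assumption generalized from the string and number fields shown in the sample.

```python
def simplify(field: dict):
    """Convert a typed field object into a plain Python value."""
    kind = field.get("type")
    if kind == "array":
        return [simplify(item) for item in field.get("valueArray", [])]
    if kind == "object":
        return {name: simplify(sub) for name, sub in field.get("valueObject", {}).items()}
    # Assumption: scalars are stored under value<Type>, e.g. valueString, valueNumber.
    return field.get("value" + kind.capitalize()) if kind else None


def low_confidence(fields: dict, threshold: float, prefix: str = "") -> list:
    """Return (path, confidence) for every field scored below the threshold."""
    found = []
    for name, field in fields.items():
        path = prefix + name
        confidence = field.get("confidence")
        if confidence is not None and confidence < threshold:
            found.append((path, confidence))
        if field.get("type") == "object":
            found.extend(low_confidence(field.get("valueObject", {}), threshold, path + "."))
        elif field.get("type") == "array":
            for index, item in enumerate(field.get("valueArray", [])):
                found.extend(low_confidence({f"[{index}]": item}, threshold, path))
    return found


# Trimmed copy of the "fields" object from the receipt sample response.
fields = {
    "VendorName": {"type": "string", "valueString": "Contoso", "confidence": 0.996},
    "Items": {
        "type": "array",
        "valueArray": [
            {
                "type": "object",
                "valueObject": {
                    "Description": {"type": "string", "valueString": "2 Surface Pro 6", "confidence": 0.423},
                    "Amount": {"type": "number", "valueNumber": 1998, "confidence": 0.957},
                },
            }
        ],
    },
}

values = {name: simplify(field) for name, field in fields.items()}
print(values)
print(low_confidence(fields, 0.5))
```

A threshold check like this is one way to decide which extracted values to send for human review; the right cutoff depends on your data.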

{
  "id": {resultId},
  "status": "Succeeded",
  "result": {
    "analyzerId": {analyzerId},
    "apiVersion": "2025-11-01",
    "createdAt": "YYYY-MM-DDTHH:MM:SSZ",
    "warnings": [],
    "contents": [
      {
        "markdown": "![image](image)\n",
        "fields": {
          "Title": {
            "type": "string",
            "valueString": "Weekly Work Hours Distribution"
          },
          "ChartType": {
            "type": "string",
            "valueString": "pie"
          }
        },
       "kind": "document",
        "startPageNumber": 1,
        "endPageNumber": 1,
        "unit": "pixel",
        "pages": [
          {
            "pageNumber": 1
          }
        ],
        "analyzerId": "{analyzerId}",
        "mimeType": "image/jpeg"
      }
    ]
  },
  "usage": {
    "tokens": {
      "contextualization": 1000
    }
  }
}

{
  "id": {resultId},
  "status": "Succeeded",
  "result": {
    "analyzerId": {analyzerId},
    "apiVersion": "2025-11-01",
    "createdAt": "YYYY-MM-DDTHH:MM:SSZ",
    "warnings": [],
    "contents": [
      {
        "markdown": "# Audio: 00:00.000 => 01:54.670\nTranscript\n```\n<v Agent>Thank you for calling Woodgrove Travel...\n<v Customer>Hi Isabella, my name is John Smith...\n<v Agent>Could you provide flight details?\n<v Customer>Contoso Airways, flight CA123...\n<v Agent>Sorry to hear that...\n<v Customer>Flight delay made me miss meeting...\n<v Agent>We’ll offer a partial refund...\n<v Customer>Thanks, appreciate your help!\n```",
        "fields": {
          "Summary": {
            "type": "string",
            "valueString": "John Smith contacted Woodgrove Travel to report a negative experience with a flight on Contoso Airways ..."
          },
          "Sentiment": {
            "type": "string",
            "valueString": "Positive"
          },
          "People": {
            "type": "array",
            "valueArray": [
              {
                "type": "object",
                "valueObject": {
                  "Name": {
                    "type": "string",
                    "valueString": "Isabella Taylor"
                  },
                  "Role": {
                    "type": "string",
                    "valueString": "Agent"
                  }
                }
              }, ...
            ]
          }
        },
        "kind": "audioVisual",
        "startTimeMs": 0,
        "endTimeMs": 114670,
        "transcriptPhrases": [
          {
            "speaker": "Agent",
            "startTimeMs": 80,
            "endTimeMs": 2160,
            "text": "Thank you for calling Woodgrove Travel.",
            "words": []
          }, ...

        ]
      }
    ]
  },
  "usage": {
    "audioHours": 0.032,
    "tokens": {
      "contextualization": 3194.445
    }
  }
}

{
  "id": {resultId},
  "status": "Succeeded",
  "result": {
    "analyzerId": {analyzerId},
    "apiVersion": "2025-11-01",
    "createdAt": "YYYY-MM-DDTHH:MM:SSZ",
    "warnings": [],
    "contents": [
      {
        "markdown": "# Video: 00:00 => 00:43\n## Segment 1: Island view\nTranscript\n```\n00:01 --> 00:06\n<Speaker 1>Good data improves TTS.\n```\nKey Frames: ![](keyFrame.726.jpg) ## Segment 2: Data center\nTranscript\n```\n00:07 --> 00:13\n<Speaker 2>We trained on 3,000 hours.\n```\nKey Frames: ![](keyFrame.2046.jpg) ![](keyFrame.4884.jpg)",
        "fields": {
          "Segments": {
            "type": "array",
            "valueArray": [
              {
                "type": "object",
                "valueObject": {
                  
                  "SegmentId": {
                    "type": "string",
                    "valueString": "00:00:00.000-00:00:01.467"
                  },
                  "Description": {
                    "type": "string",
                    "valueString": "The video opens with a dramatic aerial shot of a small airplane flying over a tropical island surrounded by turquoise waters. The logos for 'Flight Simulator' and 'Microsoft Azure AI' are prominently displayed, indicating a collaboration or feature integration between the two."
                  },
                  "Sentiment": {
                    "type": "string",
                    "valueString": "Positive"
                  }
                }
              }, ...
            ]
          }
        },
        "kind": "audioVisual",
        "startTimeMs": 0,
        "endTimeMs": 43866,
        "width": 1080,
        "height": 608,
        "keyFrameTimesMs": [733, ..., 43233],
        "transcriptPhrases": [
          {
            "speaker": "Speaker 1",
            "startTimeMs": 1360,
            "endTimeMs": 6640,
            "text": "When it comes to the neural TTS, in order to get a good voice, it's better to have good data.",
            "words": []
          }, ...
        ],
        "cameraShotTimesMs": [1467, ..., 42033],
        "segments": [
          {
            "startTimeMs": 0,
            "endTimeMs": 1467,
            "description": "The video begins with a scenic aerial view of an island, showcasing the collaboration between Flight Simulator and Microsoft Azure AI.",
            "segmentId": "1"
          }, ...
        ]
      }
    ]
  },
  "usage": {
    "videoHours": 0.013,
    "tokens": {
      "contextualization": 12222.223
    }
  }
}
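Audio and video results report times in milliseconds (startTimeMs, endTimeMs), while the generated markdown uses clock-style timestamps. The following sketch converts between the two and prints one line per entry in the segments array of a video result. format_ms and summarize_segments are illustrative helper names, and the sample content is a shortened stand-in for the response above.

```python
def format_ms(ms: int) -> str:
    """Format a millisecond offset as MM:SS.mmm."""
    minutes, remainder = divmod(ms, 60_000)
    seconds, millis = divmod(remainder, 1000)
    return f"{minutes:02d}:{seconds:02d}.{millis:03d}"


def summarize_segments(content: dict) -> list:
    """Return one summary line per entry in the content's segments array."""
    return [
        f"{segment['segmentId']}: {format_ms(segment['startTimeMs'])} => "
        f"{format_ms(segment['endTimeMs'])} {segment['description']}"
        for segment in content.get("segments", [])
    ]


# Shortened stand-in for one item of result.contents in a video response.
content = {
    "kind": "audioVisual",
    "segments": [
        {
            "startTimeMs": 0,
            "endTimeMs": 1467,
            "description": "Aerial view of an island.",
            "segmentId": "1",
        }
    ],
}

lines = summarize_segments(content)
for line in lines:
    print(line)
```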

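Next steps

- Review the sample code: visual document search.
- Review the sample code: analyzer templates.
- Try processing your document content with Content Understanding in Foundry.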

Last updated on 2025-11-24

共用方式為

透過 REST API 建立自定義分析器

先決條件

定義分析器架構

建立分析器

PUT 要求

PUT 回應

分析檔案

傳送檔案

POST 要求

POST 回應

取得分析結果

GET 要求

GET 回應

範例回應

後續步驟

意見反應

其他資源