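Create a custom analyzer

Content Understanding analyzers define how to process content and extract insights from it. They apply uniform processing and a uniform output structure across all your content, so results are reliable and predictable. For common use cases, you can use the prebuilt analyzers. This guide shows how to customize those analyzers to better fit your needs.

The steps in this guide use the Content Understanding REST API to create a custom analyzer that extracts structured data from your content.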

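Prerequisites

An active Azure subscription. If you don't have an Azure account, create one for free.
A Microsoft Foundry resource created in a supported region.
- The portal lists this resource under Foundry > Foundry.
Default model deployments set up for your Content Understanding resource. Setting the defaults creates a connection to the Microsoft Foundry models that you use for Content Understanding requests. Choose one of the following methods:
- Portal
- REST API
To use the portal:
1. Go to the Content Understanding settings page.
2. Select the + Add resource button in the upper left.
3. Select the Foundry resource that you want to use, and then select Next > Save.
  
  Make sure the Enable autodeployment for required models if no defaults are available checkbox is selected. This selection fully sets up your resource with the required GPT-4.1, GPT-4.1-mini, and text-embedding-3-large models. Different prebuilt analyzers require different models.
To use the REST API, follow these steps to set up the connection between Content Understanding and the Foundry models in your Foundry resource:
1. Create Foundry model deployments for the GPT-4.1, GPT-4.1-mini, and text-embedding-3-large models in your Foundry resource. For details on deploying these models, see Create model deployments in the Microsoft Foundry portal. Different prebuilt analyzers require different models, so deploy all three.
2. Define the default model deployments at the resource level. Before running the following cURL command, make these changes to the HTTP request:
  1. Replace {endpoint} and {key} with the corresponding values from your Foundry instance in the Azure portal.
  2. Replace {myGPT41Deployment}, {myGPT41MiniDeployment}, and {myEmbeddingDeployment} with the actual model deployment names in your Foundry resource.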
```
curl -i -X PATCH "{endpoint}/contentunderstanding/defaults?api-version=2025-11-01" \
  -H "Ocp-Apim-Subscription-Key: {key}" \
  -H "Content-Type: application/json" \
  -d '{
        "modelDeployments": {
          "gpt-4.1": "{myGPT41Deployment}",
          "gpt-4.1-mini": "{myGPT41MiniDeployment}",
          "text-embedding-3-large": "{myEmbeddingDeployment}"
        }
      }'
```
cURL installed in your development environment.
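
The default-deployment request above can also be assembled from Python. The following is a minimal sketch that only builds the request (URL, headers, and JSON body) with the standard library and does not send it. The helper name build_defaults_request, the placeholder endpoint, and the sample deployment names are assumptions for illustration, not part of the service.

```python
import json
import urllib.request

API_VERSION = "2025-11-01"  # same api-version as the cURL example above


def build_defaults_request(endpoint, key, deployments):
    """Build (but do not send) the PATCH request that sets default deployments.

    `deployments` maps a model name (for example "gpt-4.1") to the name of
    your own deployment of that model. Hypothetical helper for illustration.
    """
    url = f"{endpoint.rstrip('/')}/contentunderstanding/defaults?api-version={API_VERSION}"
    body = json.dumps({"modelDeployments": deployments}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
    )


request = build_defaults_request(
    "https://example.invalid",  # placeholder endpoint, replace with your own
    "placeholder-key",          # placeholder key, replace with your own
    {
        "gpt-4.1": "my-gpt41",
        "gpt-4.1-mini": "my-gpt41-mini",
        "text-embedding-3-large": "my-embedding",
    },
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

Keeping the build step separate from the send step makes it easy to inspect the payload before anything reaches your resource.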

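Define an analyzer schema

To create a custom analyzer, define a field schema that describes the structured data you want to extract. The following example creates an analyzer based on the prebuilt document analyzer for processing receipts.

Create a JSON file named receipt.json with the following content.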

{
  "description": "Sample receipt analyzer",
  "baseAnalyzerId": "prebuilt-document",
  "models": {
    "completion": "gpt-4.1",
    "embedding": "text-embedding-3-large"
  },
  "config": {
    "returnDetails": true,
    "enableFormula": false,
    "estimateFieldSourceAndConfidence": true,
    "tableFormat": "html"
  },
 "fieldSchema": {
    "fields": {
      "VendorName": {
        "type": "string",
        "method": "extract",
        "description": "Vendor issuing the receipt"
      },
      "Items": {
        "type": "array",
        "method": "extract",
        "items": {
          "type": "object",
          "properties": {
            "Description": {
              "type": "string",
              "method": "extract",
              "description": "Description of the item"
            },
            "Amount": {
              "type": "number",
              "method": "extract",
              "description": "Amount of the item"
            }
          }
        }
      }
    }
  }
}
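
Before uploading a schema, a quick local check can catch structural slips such as an array field without an items definition. The sketch below is a local sanity check derived only from the shape of the example above. The function name check_fields and its rules are assumptions for illustration and are not the service's own validation logic.

```python
def check_fields(fields, path="fieldSchema.fields"):
    """Return a list of problems found in a fields mapping (empty list = none found)."""
    problems = []
    for name, definition in fields.items():
        where = f"{path}.{name}"
        field_type = definition.get("type")
        if field_type is None:
            problems.append(f"{where}: missing 'type'")
        elif field_type == "array":
            items = definition.get("items")
            if not isinstance(items, dict):
                problems.append(f"{where}: array field needs an 'items' definition")
            elif items.get("type") == "object":
                # Recurse into the properties of each array element.
                problems += check_fields(items.get("properties", {}), f"{where}.items.properties")
        elif field_type == "object":
            problems += check_fields(definition.get("properties", {}), f"{where}.properties")
    return problems


# Same structure as the fieldSchema.fields section of receipt.json.
receipt_fields = {
    "VendorName": {"type": "string", "method": "extract"},
    "Items": {
        "type": "array",
        "method": "extract",
        "items": {
            "type": "object",
            "properties": {
                "Description": {"type": "string", "method": "extract"},
                "Amount": {"type": "number", "method": "extract"},
            },
        },
    },
}

print(check_fields(receipt_fields))  # []
print(check_fields({"Total": {"type": "array"}}))
```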

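If you need to process several types of documents but want to classify and analyze only the receipts, first create an analyzer that classifies the documents. Then use the following schema to route the receipts to the analyzer you created earlier.

Create a JSON file named categorize.json with the following content.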

{
  "baseAnalyzerId": "prebuilt-document",
  // Use the base analyzer to invoke the document specific capabilities.

  //Specify the model the analyzer should use. This is one of the supported completion models and one of the supported embeddings model. The specific deployment used during analyze is set on the resource or provided in the analyze request.
  "models": {
      "completion": "gpt-4.1"
    },
  "config": {
    // Enable splitting of the input into segments. Set this property to false if you only expect a single document within the input file. When specified and enableSegment=false, the whole content will be classified into one of the categories.
    "enableSegment": false,

    "contentCategories": {
      // Category name.
      "receipt": {
        // Description to help with classification and splitting.
        "description": "Any images or documents of receipts",

        // Define the analyzer that any content classified as a receipt should be routed to
        "analyzerId": "receipt"
      },

      "invoice": {
        "description": "Any images or documents of invoice",
        "analyzerId": "prebuilt-invoice"
      },
      "policeReport": {
        "description": "A police or law enforcement report detailing the events that lead to the loss."
        // Don't perform analysis for this category.
      }

    },

    // Omit original content object and only return content objects from additional analysis.
    "omitContent": true
  }

  //You can use fieldSchema here to define fields that are needed from the entire input content.

}
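
The annotated schema above uses // comments, which standard JSON parsers reject. If you keep the comments in your working copy, one option is to strip them before sending the file. The sketch below is a minimal comment stripper that assumes only //-style line comments. The name strip_line_comments is a hypothetical helper, and this guide doesn't state whether the service itself tolerates comments.

```python
import json


def strip_line_comments(text):
    """Remove // comments that appear outside of JSON string literals."""
    out = []
    in_string = False
    escaped = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif text.startswith("//", i):
            # Skip to the end of the line; the newline itself is kept.
            while i < len(text) and text[i] != "\n":
                i += 1
            continue
        else:
            out.append(ch)
        i += 1
    return "".join(out)


# Made-up fragment: the URL shows that // inside a string is left alone.
annotated = '''{
  // Category name.
  "receipt": {
    "description": "See https://example.invalid/receipts", // routed below
    "analyzerId": "receipt"
  }
}'''

schema = json.loads(strip_line_comments(annotated))
print(schema["receipt"]["analyzerId"])  # receipt
```

Tracking whether the scanner is inside a string literal is what keeps URLs and other values containing // intact.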

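To create a custom analyzer, define a field schema that describes the structured data you want to extract. The following example creates an analyzer based on the prebuilt image analyzer for processing images of charts and graphs.

Create a JSON file named request_body.json with the following content.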

{
  "description": "Sample image analyzer for charts and graphs",
  "baseAnalyzerId": "prebuilt-image",
  "models": {
      "completion": "gpt-4.1"
    },
 "fieldSchema": {
    "fields": {
      "Title": {
        "type": "string"
      },
      "ChartType": {
        "type": "string",
        "method": "classify",
        "enum": [ "bar", "line", "pie" ]
      }
    }
  }
}

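To create a custom analyzer, define a field schema that describes the structured data you want to extract. The following example creates an analyzer based on the prebuilt audio analyzer (prebuilt-audio) for processing customer support call recordings.

Create a JSON file named request_body.json with the following content.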

{
  "description": "Sample customer support call analyzer",
  "baseAnalyzerId": "prebuilt-audio",
  "config": {
    "locales": ["en-US", "fr-FR"],
    "returnDetails": true
  },
  "fieldSchema": {
    "fields": {
      "Summary": {
        "type": "string",
        "method": "generate"
      },
      "Sentiment": {
        "type": "string",
        "method": "classify",
        "enum": ["Positive", "Neutral", "Negative"]
      },
      "People": {
        "type": "array",
        "description": "List of people mentioned",
        "items": {
          "type": "object",
          "properties": {
            "Name": { "type": "string" },
            "Role": { "type": "string" }
          }
        }
      }
    }
  }
}

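To create a custom analyzer, define a field schema that describes the structured data you want to extract. The following example creates an analyzer based on the prebuilt video analyzer for processing product demos and reviews.

Create a JSON file named request_body.json with the following content.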

{
  "description": "Sample product demo video analyzer",
  "baseAnalyzerId": "prebuilt-video",
  "models": {
      "completion": "gpt-4.1"
    },
  "config": {
    "locales": ["en-US", "fr-FR"],
    "returnDetails": true,
    "disableFaceBlurring": false
  },
   "fieldSchema": {
    "fields": {
      "Segments": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "SegmentId": {
              "type": "string"
            },
            "Description": {
              "type": "string",
              "method": "generate",
              "description": "Detailed summary of the video segment, focusing on product characteristics, lighting, and color palette."
            },
            "Sentiment": {
              "type": "string",
              "method": "classify",
              "enum": ["Positive", "Neutral", "Negative"]
            }
          }
        }
      }
    }
  }
}

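Create the analyzer

PUT request

Create the receipt analyzer first, and then the categorization analyzer. Because categorize.json routes receipts to the analyzer ID receipt, use receipt as {analyzerId} in the first request. Then send the same request with -d @categorize.json and a different {analyzerId}.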

curl -i -X PUT "{endpoint}/contentunderstanding/analyzers/{analyzerId}?api-version=2025-11-01" \
  -H "Ocp-Apim-Subscription-Key: {key}" \
  -H "Content-Type: application/json" \
  -d @receipt.json

curl -i -X PUT "{endpoint}/contentunderstanding/analyzers/{analyzerId}?api-version=2025-11-01" \
  -H "Ocp-Apim-Subscription-Key: {key}" \
  -H "Content-Type: application/json" \
  -d @request_body.json

curl -i -X PUT "{endpoint}/contentunderstanding/analyzers/{analyzerId}?api-version=2025-11-01" \
  -H "Ocp-Apim-Subscription-Key: {key}" \
  -H "Content-Type: application/json" \
  -d @request_body.json

curl -i -X PUT "{endpoint}/contentunderstanding/analyzers/{analyzerId}?api-version=2025-11-01" \
  -H "Ocp-Apim-Subscription-Key: {key}" \
  -H "Content-Type: application/json" \
  -d @request_body.json

PUT response

The 201 Created response includes an Operation-Location header with a URL that you can use to track the status of this asynchronous analyzer creation operation.

201 Created
Operation-Location: {endpoint}/contentunderstanding/analyzers/{analyzerId}/operations/{operationId}?api-version=2025-11-01

When the operation finishes, an HTTP GET on the operation location URL returns "status": "succeeded".

curl -i -X GET "{endpoint}/contentunderstanding/analyzers/{analyzerId}/operations/{operationId}?api-version=2025-11-01" \
  -H "Ocp-Apim-Subscription-Key: {key}"

Analyze a file

Submit the file

You can now use the custom analyzer you created to process files and extract the fields you defined in the schema.

Before running the cURL command, make the following changes to the HTTP request:

Replace {endpoint} and {key} with the endpoint and key values from your Foundry instance in the Azure portal.
Replace {analyzerId} with the name of the custom analyzer that you created with the categorize.json file.
Replace {fileUrl} with a publicly accessible URL of the file to analyze (for example, a path to an Azure Storage Blob with a shared access signature (SAS), or the sample URL https://github.com/Azure-Samples/azure-ai-content-understanding-python/raw/refs/heads/main/data/receipt.png).

Replace {endpoint} and {key} with the endpoint and key values from your Foundry instance in the Azure portal.
Replace {analyzerId} with the name of your custom analyzer.
Replace {fileUrl} with a publicly accessible URL of the file to analyze (for example, a path to an Azure Storage Blob with a shared access signature (SAS), or the sample URL https://github.com/Azure-Samples/azure-ai-content-understanding-python/raw/refs/heads/main/data/pieChart.jpg).

Replace {endpoint} and {key} with the endpoint and key values from your Foundry instance in the Azure portal.
Replace {analyzerId} with the name of your custom analyzer.
Replace {fileUrl} with a publicly accessible URL of the file to analyze (for example, a path to an Azure Storage Blob with a shared access signature (SAS), or the sample URL https://github.com/Azure-Samples/azure-ai-content-understanding-python/raw/refs/heads/main/data/audio.wav).

Replace {endpoint} and {key} with the endpoint and key values from your Foundry instance in the Azure portal.
Replace {analyzerId} with the name of your custom analyzer.
Replace {fileUrl} with a publicly accessible URL of the file to analyze (for example, a path to an Azure Storage Blob with a shared access signature (SAS), or the sample URL https://github.com/Azure-Samples/azure-ai-content-understanding-python/raw/refs/heads/main/data/FlightSimulator.mp4).

POST request

This example uses the custom analyzer that you created with the categorize.json file to analyze a receipt.

curl -i -X POST "{endpoint}/contentunderstanding/analyzers/{analyzerId}:analyze?api-version=2025-11-01" \
  -H "Ocp-Apim-Subscription-Key: {key}" \
  -H "Content-Type: application/json" \
  -d '{
        "inputs":[
          {
            "url": "https://github.com/Azure-Samples/azure-ai-content-understanding-python/raw/refs/heads/main/data/receipt.png"
          }          
        ]
      }'

This example uses the custom analyzer you created to analyze an image of a chart or graph.

curl -i -X POST "{endpoint}/contentunderstanding/analyzers/{analyzerId}:analyze?api-version=2025-11-01" \
  -H "Ocp-Apim-Subscription-Key: {key}" \
  -H "Content-Type: application/json" \
  -d '{
        "inputs":[
          {
            "url": "https://github.com/Azure-Samples/azure-ai-content-understanding-python/raw/refs/heads/main/data/pieChart.jpg"
          }          
        ]
      }'

This example uses the custom analyzer you created to analyze a customer support call recording.

curl -i -X POST "{endpoint}/contentunderstanding/analyzers/{analyzerId}:analyze?api-version=2025-11-01" \
  -H "Ocp-Apim-Subscription-Key: {key}" \
  -H "Content-Type: application/json" \
  -d '{
        "inputs":[
          {
            "url": "https://github.com/Azure-Samples/azure-ai-content-understanding-python/raw/refs/heads/main/data/audio.wav"
          }          
        ]
      }'

This example uses the custom analyzer you created to analyze a product demo video.

curl -i -X POST "{endpoint}/contentunderstanding/analyzers/{analyzerId}:analyze?api-version=2025-11-01" \
  -H "Ocp-Apim-Subscription-Key: {key}" \
  -H "Content-Type: application/json" \
  -d '{
        "inputs":[
          {
            "url": "https://github.com/Azure-Samples/azure-ai-content-understanding-python/raw/refs/heads/main/data/FlightSimulator.mp4"
          }          
        ]
      }'

POST response

The 202 Accepted response includes the {resultId} that you can use to track the status of this asynchronous operation.

{
  "id": {resultId},
  "status": "Running",
  "result": {
    "analyzerId": {analyzerId},
    "apiVersion": "2025-11-01",
    "createdAt": "YYYY-MM-DDTHH:MM:SSZ",
    "warnings": [],
    "contents": []
  }
}

Get the analysis result

Use the Operation-Location from the POST response to retrieve the result of the analysis.

GET request

curl -i -X GET "{endpoint}/contentunderstanding/analyzerResults/{resultId}?api-version=2025-11-01" \
  -H "Ocp-Apim-Subscription-Key: {key}"
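
If you script this step, the {resultId} can be read from the Operation-Location header instead of copied by hand. The sketch below assumes the header URL has the same shape as the GET request above (a path segment analyzerResults followed by the result ID). The function name and the sample URL are made up for illustration.

```python
from urllib.parse import urlsplit


def result_id_from_operation_location(operation_location):
    """Return the path segment that follows 'analyzerResults' in the URL."""
    segments = [s for s in urlsplit(operation_location).path.split("/") if s]
    try:
        return segments[segments.index("analyzerResults") + 1]
    except (ValueError, IndexError):
        raise ValueError(f"no result ID found in {operation_location!r}") from None


# Made-up header value with the assumed shape.
header = "https://example.invalid/contentunderstanding/analyzerResults/abc-123?api-version=2025-11-01"
print(result_id_from_operation_location(header))  # abc-123
```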

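GET response

The 200 OK response includes a status field that shows the progress of the operation.

status is Succeeded when the operation completes successfully.
If status is running or notStarted, call the API again, either manually or with a script. Wait at least one second between requests.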
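
The retry guidance above can be sketched as a small polling loop. This is a minimal sketch, not a definitive client: wait_for_result is a hypothetical helper, fetch stands for whatever function performs the GET request and returns the parsed JSON body, and the status comparison is case-insensitive because the samples in this guide show both capitalized and lowercase status values.

```python
import time


def wait_for_result(fetch, interval_seconds=1.0, max_attempts=60, sleep=time.sleep):
    """Call `fetch()` until the returned body reports a non-pending status.

    `fetch` is any zero-argument callable that returns the parsed JSON body
    of the GET request (a dict with a "status" key).
    """
    pending = {"running", "notstarted"}
    for attempt in range(max_attempts):
        body = fetch()
        status = str(body.get("status", "")).lower()
        if status not in pending:
            return body  # Succeeded, or any other non-pending status
        if attempt < max_attempts - 1:
            sleep(interval_seconds)  # wait at least one second between requests
    raise TimeoutError(f"still pending after {max_attempts} attempts")


# Simulated responses stand in for real HTTP calls in this example.
responses = iter([{"status": "NotStarted"}, {"status": "Running"}, {"status": "Succeeded"}])
final = wait_for_result(lambda: next(responses), sleep=lambda seconds: None)
print(final["status"])  # Succeeded
```

Passing fetch and sleep as parameters keeps the loop independent of any HTTP library and lets you test it without network access.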

Sample response

{
  "id": {resultId},
  "status": "Succeeded",
  "result": {
    "analyzerId": {analyzerId},
    "apiVersion": "2025-11-01",
    "createdAt": "YYYY-MM-DDTHH:MM:SSZ",
    "warnings": [],
    "contents": [
      {
        "path": "input1/segment1",
        "category": "receipt",
        "markdown": "Contoso\n\n123 Main Street\nRedmond, WA 98052\n\n987-654-3210\n\n6/10/2019 13:59\nSales Associate: Paul\n\n\n<table>\n<tr>\n<td>2 Surface Pro 6</td>\n<td>$1,998.00</td>\n</tr>\n<tr>\n<td>3 Surface Pen</td>\n<td>$299.97</td>\n</tr>\n</table> ...",
        "fields": {
          "VendorName": {
            "type": "string",
            "valueString": "Contoso",
            "spans": [{"offset": 0,"length": 7}],
            "confidence": 0.996,
            "source": "D(1,774.0000,72.0000,974.0000,70.0000,974.0000,111.0000,774.0000,113.0000)"
          },
          "Items": {
            "type": "array",
            "valueArray": [
              {
                "type": "object",
                "valueObject": {
                  "Description": {
                    "type": "string",
                    "valueString": "2 Surface Pro 6",
                    "spans": [ { "offset": 115, "length": 15}],
                    "confidence": 0.423,
                    "source": "D(1,704.0000,482.0000,875.0000,482.0000,875.0000,508.0000,704.0000,508.0000)"
                  },
                  "Amount": {
                    "type": "number",
                    "valueNumber": 1998,
                    "spans": [{ "offset": 140,"length": 9}
                    ],
                    "confidence": 0.957,
                    "source": "D(1,952.0000,482.0000,1048.0000,482.0000,1048.0000,508.0000,952.0000,509.0000)"
                  }
                }
              }, ...
            ]
          }
        },
        "kind": "document",
        "startPageNumber": 1,
        "endPageNumber": 1,
        "unit": "pixel",
        "pages": [
          {
            "pageNumber": 1,
            "angle": -0.0944,
            "width": 1743,
            "height": 878
          }
        ],
        "analyzerId": "{analyzerId}",
        "mimeType": "image/png"
      }
    ]
  },
  "usage": {
    "documentPages": 1,
    "tokens": {
      "contextualization": 1000
    }
  }
}
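
The source values in the document response above pack a page number and a bounding polygon into one string. The sketch below shows one way to unpack it, assuming the layout is D(page,x1,y1,x2,y2,...) as the sample values suggest. The helper name parse_document_source is hypothetical, and the format reading is an inference from the sample, not a documented contract.

```python
import re


def parse_document_source(source):
    """Split a 'D(page,x1,y1,...)' string into a page number and (x, y) points."""
    match = re.fullmatch(r"D\((.+)\)", source.strip())
    if match is None:
        raise ValueError(f"unrecognized source format: {source!r}")
    numbers = match.group(1).split(",")
    page = int(numbers[0])
    coords = [float(value) for value in numbers[1:]]
    if len(coords) % 2 != 0:
        raise ValueError("expected an even number of coordinates")
    # Pair up alternating x and y values.
    points = list(zip(coords[0::2], coords[1::2]))
    return page, points


page, points = parse_document_source(
    "D(1,774.0000,72.0000,974.0000,70.0000,974.0000,111.0000,774.0000,113.0000)"
)
print(page, points[0], len(points))  # 1 (774.0, 72.0) 4
```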

{
  "id": {resultId},
  "status": "Succeeded",
  "result": {
    "analyzerId": {analyzerId},
    "apiVersion": "2025-11-01",
    "createdAt": "YYYY-MM-DDTHH:MM:SSZ",
    "warnings": [],
    "contents": [
      {
        "markdown": "![image](image)\n",
        "fields": {
          "Title": {
            "type": "string",
            "valueString": "Weekly Work Hours Distribution"
          },
          "ChartType": {
            "type": "string",
            "valueString": "pie"
          }
        },
       "kind": "document",
        "startPageNumber": 1,
        "endPageNumber": 1,
        "unit": "pixel",
        "pages": [
          {
            "pageNumber": 1
          }
        ],
        "analyzerId": "{analyzerId}",
        "mimeType": "image/jpeg"
      }
    ]
  },
  "usage": {
    "tokens": {
      "contextualization": 1000
    }
  }
}

{
  "id": {resultId},
  "status": "Succeeded",
  "result": {
    "analyzerId": {analyzerId},
    "apiVersion": "2025-11-01",
    "createdAt": "YYYY-MM-DDTHH:MM:SSZ",
    "warnings": [],
    "contents": [
      {
        "markdown": "# Audio: 00:00.000 => 01:54.670\nTranscript\n```\n<v Agent>Thank you for calling Woodgrove Travel...\n<v Customer>Hi Isabella, my name is John Smith...\n<v Agent>Could you provide flight details?\n<v Customer>Contoso Airways, flight CA123...\n<v Agent>Sorry to hear that...\n<v Customer>Flight delay made me miss meeting...\n<v Agent>We'll offer a partial refund...\n<v Customer>Thanks, appreciate your help!\n```",
        "fields": {
          "Summary": {
            "type": "string",
            "valueString": "John Smith contacted Woodgrove Travel to report a negative experience with a flight on Contoso Airways ..."
          },
          "Sentiment": {
            "type": "string",
            "valueString": "Positive"
          },
          "People": {
            "type": "array",
            "valueArray": [
              {
                "type": "object",
                "valueObject": {
                  "Name": {
                    "type": "string",
                    "valueString": "Isabella Taylor"
                  },
                  "Role": {
                    "type": "string",
                    "valueString": "Agent"
                  }
                }
              }, ...
            ]
          }
        },
        "kind": "audioVisual",
        "startTimeMs": 0,
        "endTimeMs": 114670,
        "transcriptPhrases": [
          {
            "speaker": "Agent",
            "startTimeMs": 80,
            "endTimeMs": 2160,
            "text": "Thank you for calling Woodgrove Travel.",
            "words": []
          }, ...

        ]
      }
    ]
  },
  "usage": {
    "audioHours": 0.032,
    "tokens": {
      "contextualization": 3194.445
    }
  }
}
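
The transcriptPhrases array in the audio response above carries a speaker and start and end times for each phrase, which is enough to compute simple call metrics. The sketch below sums talk time per speaker. The helper name talk_time_ms is hypothetical, and the second phrase is made-up data added so the example has two speakers.

```python
from collections import defaultdict


def talk_time_ms(transcript_phrases):
    """Sum (endTimeMs - startTimeMs) for each speaker."""
    totals = defaultdict(int)
    for phrase in transcript_phrases:
        totals[phrase["speaker"]] += phrase["endTimeMs"] - phrase["startTimeMs"]
    return dict(totals)


phrases = [
    {"speaker": "Agent", "startTimeMs": 80, "endTimeMs": 2160,
     "text": "Thank you for calling Woodgrove Travel."},
    # Made-up phrase for illustration; not part of the sample response.
    {"speaker": "Customer", "startTimeMs": 2500, "endTimeMs": 5000,
     "text": "Hi, my name is John Smith."},
]
print(talk_time_ms(phrases))  # {'Agent': 2080, 'Customer': 2500}
```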

{
  "id": {resultId},
  "status": "Succeeded",
  "result": {
    "analyzerId": {analyzerId},
    "apiVersion": "2025-11-01",
    "createdAt": "YYYY-MM-DDTHH:MM:SS",
    "warnings": [],
    "contents": [
      {
        "markdown": "# Video: 00:00 => 00:43\n## Segment 1: Island view\nTranscript\n```\n00:01 --> 00:06\n<Speaker 1>Good data improves TTS.\n```\nKey Frames: ![](keyFrame.726.jpg) ## Segment 2: Data center\nTranscript\n```\n00:07 --> 00:13\n<Speaker 2>We trained on 3,000 hours.\n```\nKey Frames: ![](keyFrame.2046.jpg) ![](keyFrame.4884.jpg)",
        "fields": {
          "Segments": {
            "type": "array",
            "valueArray": [
              {
                "type": "object",
                "valueObject": {
                  
                  "SegmentId": {
                    "type": "string",
                    "valueString": "00:00:00.000-00:00:01.467"
                  },
                  "Description": {
                    "type": "string",
                    "valueString": "The video opens with a dramatic aerial shot of a small airplane flying over a tropical island surrounded by turquoise waters. The logos for 'Flight Simulator' and 'Microsoft Azure AI' are prominently displayed, indicating a collaboration or feature integration between the two."
                  },
                  "Sentiment": {
                    "type": "string",
                    "valueString": "Positive"
                  }
                }
              }, ...
            ]
          }
        },
        "kind": "audioVisual",
        "startTimeMs": 0,
        "endTimeMs": 43866,
        "width": 1080,
        "height": 608,
        "KeyFrameTimesMs": [733, ... , 43233],
        "transcriptPhrases": [
          {
            "speaker": "Speaker 1",
            "startTimeMs": 1360,
            "endTimeMs": 6640,
            "text": "When it comes to the neural TTS, in order to get a good voice, it's better to have good data.",
            "words": []
          }, ...
        ],
        "cameraShotTimesMs": [1467, ...  42033],
        "segments": [
          {
            "startTimeMs": 0,
            "endTimeMs": 1467,
            "description": "The video begins with a scenic aerial view of an island, showcasing the collaboration between Flight Simulator and Microsoft Azure AI.",
            "segmentId": "1"
          }, ...
        ]
      }
    ]
  },
  "usage": {
    "videoHours": 0.013,
    "tokens": {
      "contextualization": 12222.223
    }
  }
}
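
Across all four sample responses, each entry under fields wraps its value in a typed key such as valueString, valueNumber, valueArray, or valueObject. The sketch below flattens that structure into plain Python values. It assumes the key naming seen in the samples above, and field_value is a hypothetical helper rather than part of any SDK.

```python
def field_value(field):
    """Convert one field object from the response into a plain Python value."""
    field_type = field.get("type")
    if field_type == "array":
        return [field_value(item) for item in field.get("valueArray", [])]
    if field_type == "object":
        return {name: field_value(child)
                for name, child in field.get("valueObject", {}).items()}
    # Scalars: return the first "value*" key, such as valueString or valueNumber.
    for key, value in field.items():
        if key.startswith("value"):
            return value
    return None  # the field is present but carries no value


# Trimmed copy of the fields section of the receipt sample response.
fields = {
    "VendorName": {"type": "string", "valueString": "Contoso", "confidence": 0.996},
    "Items": {
        "type": "array",
        "valueArray": [
            {
                "type": "object",
                "valueObject": {
                    "Description": {"type": "string", "valueString": "2 Surface Pro 6"},
                    "Amount": {"type": "number", "valueNumber": 1998},
                },
            }
        ],
    },
}

flat = {name: field_value(field) for name, field in fields.items()}
print(flat)
# {'VendorName': 'Contoso', 'Items': [{'Description': '2 Surface Pro 6', 'Amount': 1998}]}
```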

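Client library | Samples | SDK source

This guide shows you how to use the Content Understanding Python SDK to create a custom analyzer that extracts structured data from your content. Custom analyzers support document, image, audio, and video content types.

Prerequisites

An active Azure subscription. If you don't have an Azure account, create one for free.
A Microsoft Foundry resource created in a supported region.
Your resource endpoint and API key (found under Keys and Endpoint in the Azure portal).
Default model deployments configured for your resource. For setup instructions, see Models and deployments or this one-time configuration script.
Python 3.9 or later.

Setup

Install the Content Understanding client library for Python with pip: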
```
pip install azure-ai-contentunderstanding
```
Optionally, install the Azure Identity library for Microsoft Entra authentication:
```
pip install azure-identity
```

Set environment variables

To authenticate with the Content Understanding service, set the following environment variables with your own values before running the samples:

CONTENTUNDERSTANDING_ENDPOINT - the endpoint of your Content Understanding resource.
CONTENTUNDERSTANDING_KEY - your Content Understanding API key (optional if you use Microsoft Entra ID with DefaultAzureCredential).

Windows

setx CONTENTUNDERSTANDING_ENDPOINT "your-endpoint"
setx CONTENTUNDERSTANDING_KEY "your-key"

Linux/macOS

export CONTENTUNDERSTANDING_ENDPOINT="your-endpoint"
export CONTENTUNDERSTANDING_KEY="your-key"

Create the client

Import the required libraries and models, and then create the client with your resource endpoint and credentials.

import os
import time
from azure.ai.contentunderstanding import ContentUnderstandingClient
from azure.core.credentials import AzureKeyCredential

endpoint = os.environ["CONTENTUNDERSTANDING_ENDPOINT"]
key = os.environ["CONTENTUNDERSTANDING_KEY"]

client = ContentUnderstandingClient(
    endpoint=endpoint,
    credential=AzureKeyCredential(key),
)
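
Because the API key is optional when you use Microsoft Entra ID, one option is to pick the credential from the environment at startup. The sketch below is one way to do that under stated assumptions: auth_mode and make_client are hypothetical helpers, the SDK imports are kept inside make_client so the decision logic can run without the packages installed, and it assumes the client accepts a DefaultAzureCredential in place of the key credential, as the environment variable description indicates.

```python
import os


def auth_mode(env):
    """Return "key" when an API key is configured, otherwise "entra"."""
    return "key" if env.get("CONTENTUNDERSTANDING_KEY") else "entra"


def make_client(env=os.environ):
    """Create a client with whichever credential the environment supports."""
    # Local imports: this sketch loads even when the SDK isn't installed.
    from azure.ai.contentunderstanding import ContentUnderstandingClient

    if auth_mode(env) == "key":
        from azure.core.credentials import AzureKeyCredential
        credential = AzureKeyCredential(env["CONTENTUNDERSTANDING_KEY"])
    else:
        from azure.identity import DefaultAzureCredential  # pip install azure-identity
        credential = DefaultAzureCredential()
    return ContentUnderstandingClient(
        endpoint=env["CONTENTUNDERSTANDING_ENDPOINT"],
        credential=credential,
    )


print(auth_mode({"CONTENTUNDERSTANDING_KEY": "placeholder"}))  # key
print(auth_mode({}))  # entra
```

With this in place, client = make_client() can stand in for the explicit construction above.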

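Create a custom analyzer

The following example creates a custom document analyzer based on the prebuilt document base analyzer. It defines fields with three extraction methods: extract for literal text, generate for AI-generated fields or interpretations, and classify for categorization.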

from azure.ai.contentunderstanding.models import (
    ContentAnalyzer,
    ContentAnalyzerConfig,
    ContentFieldSchema,
    ContentFieldDefinition,
    ContentFieldType,
    GenerationMethod,
)

# Generate a unique analyzer ID
analyzer_id = f"my_document_analyzer_{int(time.time())}"

# Define field schema with custom fields
field_schema = ContentFieldSchema(
    name="company_schema",
    description="Schema for extracting company information",
    fields={
        "company_name": ContentFieldDefinition(
            type=ContentFieldType.STRING,
            method=GenerationMethod.EXTRACT,
            description="Name of the company",
            estimate_source_and_confidence=True,
        ),
        "total_amount": ContentFieldDefinition(
            type=ContentFieldType.NUMBER,
            method=GenerationMethod.EXTRACT,
            description="Total amount on the document",
            estimate_source_and_confidence=True,
        ),
        "document_summary": ContentFieldDefinition(
            type=ContentFieldType.STRING,
            method=GenerationMethod.GENERATE,
            description=(
                "A brief summary of the document content"
            ),
        ),
        "document_type": ContentFieldDefinition(
            type=ContentFieldType.STRING,
            method=GenerationMethod.CLASSIFY,
            description="Type of document",
            enum=[
                "invoice", "receipt", "contract",
                "report", "other",
            ],
        ),
    },
)

# Create analyzer configuration
config = ContentAnalyzerConfig(
    enable_formula=True,
    enable_layout=True,
    enable_ocr=True,
    estimate_field_source_and_confidence=True,
    return_details=True,
)

# Create the analyzer with field schema
analyzer = ContentAnalyzer(
    base_analyzer_id="prebuilt-document",
    description=(
        "Custom analyzer for extracting company information"
    ),
    config=config,
    field_schema=field_schema,
    models={
        "completion": "gpt-4.1",
        "embedding": "text-embedding-3-large",
    }, # Required when using field_schema and prebuilt-document base analyzer
)

# Create the analyzer
poller = client.begin_create_analyzer(
    analyzer_id=analyzer_id,
    resource=analyzer,
)
result = poller.result() # Wait for creation to complete

# Get the full analyzer details after creation
result = client.get_analyzer(analyzer_id=analyzer_id)
print(f"Analyzer '{analyzer_id}' created successfully!")

if result.description:
    print(f"  Description: {result.description}")

if result.field_schema and result.field_schema.fields:
    print(f"  Fields ({len(result.field_schema.fields)}):")
    for field_name, field_def in result.field_schema.fields.items():
        method = field_def.method if field_def.method else "auto"
        field_type = field_def.type if field_def.type else "unknown"
        print(f"    - {field_name}: {field_type} ({method})")

The output looks similar to the following example:

Analyzer 'my_document_analyzer_ID' created successfully!
  Description: Custom analyzer for extracting company information
  Fields (4):
    - company_name: ContentFieldType.STRING (GenerationMethod.EXTRACT)
    - total_amount: ContentFieldType.NUMBER (GenerationMethod.EXTRACT)
    - document_summary: ContentFieldType.STRING (GenerationMethod.GENERATE)
    - document_type: ContentFieldType.STRING (GenerationMethod.CLASSIFY)

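Tip

This code is based on the create analyzer sample in the SDK repository.

Optionally, you can create a classifier analyzer to categorize documents, and then use its results to route documents to prebuilt or custom analyzers. The following example creates a custom analyzer for a classification workflow.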

import time
from azure.ai.contentunderstanding.models import (
    ContentAnalyzer,
    ContentAnalyzerConfig,
    ContentCategoryDefinition,
)

# Generate a unique analyzer ID
analyzer_id = f"my_classifier_{int(time.time())}"

print(f"Creating classifier '{analyzer_id}'...")

# Define content categories for classification
categories = {
    "Loan_Application": ContentCategoryDefinition(
        description="Documents submitted by individuals or businesses to request funding, "
        "typically including personal or business details, financial history, "
        "loan amount, purpose, and supporting documentation."
    ),
    "Invoice": ContentCategoryDefinition(
        description="Billing documents issued by sellers or service providers to request "
        "payment for goods or services, detailing items, prices, taxes, totals, "
        "and payment terms."
    ),
    "Bank_Statement": ContentCategoryDefinition(
        description="Official statements issued by banks that summarize account activity "
        "over a period, including deposits, withdrawals, fees, and balances."
    ),
}

# Create analyzer configuration
config = ContentAnalyzerConfig(
    return_details=True,
    enable_segment=True,  # Enable automatic segmentation by category
    content_categories=categories,
)

# Create the classifier analyzer
classifier = ContentAnalyzer(
    base_analyzer_id="prebuilt-document",
    description="Custom classifier for financial document categorization",
    config=config,
    models={"completion": "gpt-4.1"},
)

# Create the classifier
poller = client.begin_create_analyzer(
    analyzer_id=analyzer_id,
    resource=classifier,
)
result = poller.result()  # Wait for creation to complete

# Get the full analyzer details after creation
result = client.get_analyzer(analyzer_id=analyzer_id)

print(f"Classifier '{analyzer_id}' created successfully!")
if result.description:
    print(f"  Description: {result.description}")

Tip

This code is based on the create classifier sample in the SDK repository.

The following example creates a custom image analyzer based on the prebuilt image base analyzer for processing charts and graphs.

from azure.ai.contentunderstanding.models import (
    ContentAnalyzer,
    ContentFieldSchema,
    ContentFieldDefinition,
    ContentFieldType,
    GenerationMethod,
)

# Generate a unique analyzer ID
analyzer_id = f"my_image_analyzer_{int(time.time())}"

# Define field schema with custom fields
field_schema = ContentFieldSchema(
    name="chart_schema",
    description=(
        "Schema for extracting chart information"
    ),
    fields={
        "Title": ContentFieldDefinition(
            type=ContentFieldType.STRING,
            description="Title of the chart",
        ),
        "ChartType": ContentFieldDefinition(
            type=ContentFieldType.STRING,
            method=GenerationMethod.CLASSIFY,
            description="Type of chart",
            enum=["bar", "line", "pie"],
        ),
    },
)

# Create the analyzer with field schema
analyzer = ContentAnalyzer(
    base_analyzer_id="prebuilt-image",
    description=(
        "Custom analyzer for charts and graphs"
    ),
    field_schema=field_schema,
    models={
        "completion": "gpt-4.1",
    }, 
)

# Create the analyzer
poller = client.begin_create_analyzer(
    analyzer_id=analyzer_id,
    resource=analyzer,
)
result = poller.result() # Wait for creation to complete

# Get the full analyzer details after creation
result = client.get_analyzer(analyzer_id=analyzer_id)
print(f"Analyzer '{analyzer_id}' created successfully!")

if result.description:
    print(f"  Description: {result.description}")

if result.field_schema and result.field_schema.fields:
    print(f"  Fields ({len(result.field_schema.fields)}):")
    for field_name, field_def in result.field_schema.fields.items():
        method = field_def.method if field_def.method else "auto"
        field_type = field_def.type if field_def.type else "unknown"
        print(f"    - {field_name}: {field_type} ({method})")

The output looks similar to the following example:

Analyzer 'my_image_analyzer_ID' created successfully!
  Description: Custom analyzer for charts and graphs
  Fields (2):
    - Title: ContentFieldType.STRING (auto)
    - ChartType: ContentFieldType.STRING (GenerationMethod.CLASSIFY)

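Tip

This code adapts the pattern of the create analyzer sample to image content.

The following example creates a custom audio analyzer based on the prebuilt audio base analyzer for processing customer support call recordings.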

from azure.ai.contentunderstanding.models import (
    ContentAnalyzer,
    ContentAnalyzerConfig,
    ContentFieldSchema,
    ContentFieldDefinition,
    ContentFieldType,
    GenerationMethod,
)
# Generate a unique analyzer ID
analyzer_id = f"my_audio_analyzer_{int(time.time())}"

# Define field schema with custom fields
field_schema = ContentFieldSchema(
    name="call_center_schema",
    description=(
        "Schema for analyzing customer support calls"
    ),
    fields={
        "Summary": ContentFieldDefinition(
            type=ContentFieldType.STRING,
            method=GenerationMethod.GENERATE,
            description="Summary of the call",
        ),
        "Sentiment": ContentFieldDefinition(
            type=ContentFieldType.STRING,
            method=GenerationMethod.CLASSIFY,
            description="Overall sentiment of the call",
            enum=["Positive", "Neutral", "Negative"],
        ),
        "People": ContentFieldDefinition(
            type=ContentFieldType.ARRAY,
            description="List of people mentioned",
            item_definition=ContentFieldDefinition(
                type=ContentFieldType.OBJECT,
                properties={
                    "Name": ContentFieldDefinition(
                        type=ContentFieldType.STRING,
                    ),
                    "Role": ContentFieldDefinition(
                        type=ContentFieldType.STRING,
                    ),
                },
            ),
        ),
    },
)

# Create analyzer configuration
config = ContentAnalyzerConfig(
    locales=["en-US", "fr-FR"],
    return_details=True,
)

# Create the analyzer with field schema
analyzer = ContentAnalyzer(
    base_analyzer_id="prebuilt-audio",
    description=(
        "Custom analyzer for customer support calls"
    ),
    config=config,
    field_schema=field_schema,
    models={
        "completion": "gpt-4.1",
    },
)
# Create the analyzer
poller = client.begin_create_analyzer(
    analyzer_id=analyzer_id,
    resource=analyzer,
)
result = poller.result() # Wait for creation to complete

# Get the full analyzer details after creation
result = client.get_analyzer(analyzer_id=analyzer_id)
print(f"Analyzer '{analyzer_id}' created successfully!")

if result.description:
    print(f"  Description: {result.description}")

if result.field_schema and result.field_schema.fields:
    print(f"  Fields ({len(result.field_schema.fields)}):")
    for field_name, field_def in result.field_schema.fields.items():
        method = field_def.method if field_def.method else "auto"
        field_type = field_def.type if field_def.type else "unknown"
        print(f"    - {field_name}: {field_type} ({method})")

The example output looks like this:

Analyzer 'my_audio_analyzer_ID' created successfully!
  Description: Custom analyzer for customer support calls
  Fields (3):
    - Summary: ContentFieldType.STRING (GenerationMethod.GENERATE)
    - Sentiment: ContentFieldType.STRING (GenerationMethod.CLASSIFY)
    - People: ContentFieldType.ARRAY (auto)

Tip

This code adapts the pattern from the create-analyzer sample to audio content.
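The printed summary above lists only top-level fields, so the Name and Role properties nested under People never appear. The following is a minimal sketch of a recursive walk, written against plain dictionaries rather than the SDK model objects. The `items` and `properties` keys and the `describe_fields` helper are assumptions made for illustration.

```python
# Plain-dict stand-in for a field schema. The "items" and "properties" keys
# are assumed here for illustration; they are not SDK model attributes.
schema_fields = {
    "Summary": {"type": "string", "method": "generate"},
    "Sentiment": {"type": "string", "method": "classify"},
    "People": {
        "type": "array",
        "items": {
            "type": "object",
            "properties": {
                "Name": {"type": "string"},
                "Role": {"type": "string"},
            },
        },
    },
}


def describe_fields(fields, indent=0):
    """Return one line per field, descending into arrays and objects."""
    lines = []
    for name, definition in fields.items():
        method = definition.get("method", "auto")
        lines.append(f"{' ' * indent}- {name}: {definition['type']} ({method})")
        # Arrays keep their element definition under "items".
        target = definition.get("items", definition)
        nested = target.get("properties")
        if nested:
            lines.extend(describe_fields(nested, indent + 2))
    return lines


for line in describe_fields(schema_fields):
    print(line)
```

With the SDK objects, the same walk would read the item definition and properties attributes instead of dictionary keys.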

The following example creates a custom video analyzer, based on the prebuilt video analyzer, for processing product demos and reviews.

import time

from azure.ai.contentunderstanding.models import (
    ContentAnalyzer,
    ContentAnalyzerConfig,
    ContentFieldSchema,
    ContentFieldDefinition,
    ContentFieldType,
    GenerationMethod,
)

# Generate a unique analyzer ID
analyzer_id = f"my_video_analyzer_{int(time.time())}"

# Define field schema with custom fields
field_schema = ContentFieldSchema(
    name="video_schema",
    description=(
        "Schema for analyzing product demo videos"
    ),
    fields={
        "Segments": ContentFieldDefinition(
            type=ContentFieldType.ARRAY,
            item_definition=ContentFieldDefinition(
                type=ContentFieldType.OBJECT,
                properties={
                    "SegmentId": ContentFieldDefinition(
                        type=ContentFieldType.STRING,
                    ),
                    "Description": ContentFieldDefinition(
                        type=ContentFieldType.STRING,
                        method=GenerationMethod.GENERATE,
                        description=(
                            "Detailed summary of the "
                            "video segment"
                        ),
                    ),
                    "Sentiment": ContentFieldDefinition(
                        type=ContentFieldType.STRING,
                        method=GenerationMethod.CLASSIFY,
                        enum=[
                            "Positive", "Neutral",
                            "Negative",
                        ],
                    ),
                },
            ),
        ),
    },
)

# Create analyzer configuration
config = ContentAnalyzerConfig(
    locales=["en-US", "fr-FR"],
    return_details=True,
)

# Create the analyzer with field schema
analyzer = ContentAnalyzer(
    base_analyzer_id="prebuilt-video",
    description=(
        "Custom analyzer for product demo videos"
    ),
    config=config,
    field_schema=field_schema,
    models={
        "completion": "gpt-4.1",
    },
)

# Create the analyzer
poller = client.begin_create_analyzer(
    analyzer_id=analyzer_id,
    resource=analyzer,
)
result = poller.result()  # Wait for creation to complete

# Get the full analyzer details after creation
result = client.get_analyzer(analyzer_id=analyzer_id)
print(f"Analyzer '{analyzer_id}' created successfully!")

if result.description:
    print(f"  Description: {result.description}")

if result.field_schema and result.field_schema.fields:
    print(f"  Fields ({len(result.field_schema.fields)}):")
    for field_name, field_def in result.field_schema.fields.items():
        method = field_def.method if field_def.method else "auto"
        field_type = field_def.type if field_def.type else "unknown"
        print(f"    - {field_name}: {field_type} ({method})")

The example output looks like this:

Analyzer 'my_video_analyzer_ID' created successfully!
  Description: Custom analyzer for product demo videos
  Fields (1):
    - Segments: ContentFieldType.ARRAY (auto)

Tip

This code adapts the pattern from the create-analyzer sample to video content.
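The samples derive the analyzer ID from the current Unix time in seconds, so two analyzers created within the same second would get the same ID. One possible workaround, sketched below, appends a short random suffix. The `make_analyzer_id` helper is hypothetical, and the sketch assumes the service accepts IDs of this shape and length.

```python
import time
import uuid


def make_analyzer_id(prefix):
    """Build an ID from a prefix, the Unix time in seconds, and a random suffix."""
    suffix = uuid.uuid4().hex[:8]  # 8 hexadecimal characters
    return f"{prefix}_{int(time.time())}_{suffix}"


first_id = make_analyzer_id("my_video_analyzer")
second_id = make_analyzer_id("my_video_analyzer")
print(first_id)
print(second_id)
```

The timestamp keeps IDs roughly sortable by creation time, while the suffix separates IDs generated in the same second.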

Use the custom analyzer

After you create the analyzer, use it to analyze a document and extract the custom fields. Delete the analyzer when you no longer need it.

# --- Use the custom document analyzer ---
from azure.ai.contentunderstanding.models import AnalysisInput

print("\nAnalyzing document...")
document_url = (
    "https://raw.githubusercontent.com/"
    "Azure-Samples/"
    "azure-ai-content-understanding-assets/"
    "main/document/invoice.pdf"
)

poller = client.begin_analyze(
    analyzer_id=analyzer_id,
    inputs=[AnalysisInput(url=document_url)],
)
result = poller.result()

if result.contents and len(result.contents) > 0:
    content = result.contents[0]
    if content.fields:
        company = content.fields.get("company_name")
        if company:
            print(f"Company Name: {company.value}")
            if company.confidence is not None:
                print(
                    f"  Confidence:"
                    f" {company.confidence:.2f}"
                )

        total = content.fields.get("total_amount")
        if total:
            print(f"Total Amount: {total.value}")

        summary = content.fields.get(
            "document_summary"
        )
        if summary:
            print(f"Summary: {summary.value}")

        doc_type = content.fields.get("document_type")
        if doc_type:
            print(f"Document Type: {doc_type.value}")
else:
    print("No content returned from analysis.")

# --- Clean up ---
print(f"\nCleaning up: deleting analyzer '{analyzer_id}'...")
client.delete_analyzer(analyzer_id=analyzer_id)
print(f"Analyzer '{analyzer_id}' deleted successfully.")

The example output looks like this:

Analyzing document...
Company Name: CONTOSO LTD.
  Confidence: 0.81
Total Amount: 610.0
Summary: This document is an invoice from CONTOSO LTD. to Microsoft Corporation for consulting, document, and printing services provided during the service period. It details line items, subtotal, sales tax, total, previous unpaid balance, and the final amount due.
Document Type: invoice

Cleaning up: deleting analyzer 'my_document_analyzer_ID'...
Analyzer 'my_document_analyzer_ID' deleted successfully.

Tip

See the SDK samples for more examples of running analyzers.
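The document sample prints a confidence score next to the company name. One way to act on such scores is to flag low-confidence fields for manual review, as in the rough sketch below. The field objects are hypothetical stand-ins built with `SimpleNamespace`, and the threshold value and `needs_review` helper are illustrative assumptions, not service recommendations.

```python
from types import SimpleNamespace

# Hypothetical stand-ins for analyzed fields. In the sample above, the real
# objects come from result.contents[0].fields.
sample_fields = {
    "company_name": SimpleNamespace(value="CONTOSO LTD.", confidence=0.81),
    "total_amount": SimpleNamespace(value=610.0, confidence=0.42),
    "document_summary": SimpleNamespace(value="An invoice.", confidence=None),
}

REVIEW_THRESHOLD = 0.6  # Illustrative cutoff only.


def needs_review(fields, threshold):
    """Return names of fields whose reported confidence is below the threshold."""
    return [
        name
        for name, field in fields.items()
        if field.confidence is not None and field.confidence < threshold
    ]


print(needs_review(sample_fields, REVIEW_THRESHOLD))
```

Fields without a reported confidence are skipped rather than flagged, because a missing score does not say anything about quality.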

After you create the analyzer, use it to analyze an image and extract the custom fields. Delete the analyzer when you no longer need it.

from azure.ai.contentunderstanding.models import AnalysisInput

# --- Use the custom image analyzer ---
print("\nAnalyzing image...")
image_url = (
    "https://raw.githubusercontent.com/"
    "Azure-Samples/"
    "azure-ai-content-understanding-assets/"
    "main/image/pieChart.jpg"
)

poller = client.begin_analyze(
    analyzer_id=analyzer_id,
    inputs=[AnalysisInput(url=image_url)],
)
result = poller.result()

if result.contents and len(result.contents) > 0:
    content = result.contents[0]
    if content.fields:
        title = content.fields.get("Title")
        if title:
            print(f"Title: {title.value}")

        chart_type = content.fields.get("ChartType")
        if chart_type:
            print(f"Chart Type: {chart_type.value}")
else:
    print("No content returned from analysis.")

# --- Clean up ---
print(f"\nCleaning up: deleting analyzer '{analyzer_id}'...")
client.delete_analyzer(analyzer_id=analyzer_id)
print(f"Analyzer '{analyzer_id}' deleted successfully.")

The example output looks like this:

Analyzing image...
Title: Distribution of Weekly Working Hours
Chart Type: pie

Cleaning up: deleting analyzer 'my_image_analyzer_ID'...
Analyzer 'my_image_analyzer_ID' deleted successfully.

Tip

See the SDK samples for more examples of running analyzers.
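In the samples above, the cleanup step runs only if the analysis succeeds. If the analyzer is temporary, wrapping the analysis in `try`/`finally` is one way to make sure it is deleted even when analysis raises. The sketch below uses a hypothetical `FakeClient` that simulates a failure, so it runs without contacting the service. Only the `begin_analyze` and `delete_analyzer` method names are taken from the samples.

```python
class FakeClient:
    """Hypothetical stand-in that records calls instead of calling the service."""

    def __init__(self):
        self.deleted = []

    def begin_analyze(self, analyzer_id, inputs):
        # Simulate a failed analysis request.
        raise RuntimeError("simulated analysis failure")

    def delete_analyzer(self, analyzer_id):
        self.deleted.append(analyzer_id)


def analyze_then_cleanup(client, analyzer_id, inputs):
    """Run an analysis and always delete the analyzer afterwards."""
    try:
        poller = client.begin_analyze(analyzer_id=analyzer_id, inputs=inputs)
        return poller.result()
    finally:
        client.delete_analyzer(analyzer_id=analyzer_id)


fake_client = FakeClient()
try:
    analyze_then_cleanup(fake_client, "my_image_analyzer_ID", inputs=[])
except RuntimeError as error:
    print(f"Analysis failed: {error}")
print(f"Deleted analyzers: {fake_client.deleted}")
```

Skip this pattern when you intend to reuse the analyzer for later requests.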

After you create the analyzer, use it to analyze an audio file and extract the custom fields. Delete the analyzer when you no longer need it.

from azure.ai.contentunderstanding.models import AnalysisInput

print("\nAnalyzing audio...")
audio_url = (
    "https://raw.githubusercontent.com/"
    "Azure-Samples/"
    "azure-ai-content-understanding-assets/"
    "main/audio/callCenterRecording.mp3"
)

poller = client.begin_analyze(
    analyzer_id=analyzer_id,
    inputs=[AnalysisInput(url=audio_url)],
)
result = poller.result()

if result.contents and len(result.contents) > 0:
    content = result.contents[0]
    if content.fields:
        summary = content.fields.get("Summary")
        if summary:
            print(f"Summary: {summary.value}")

        sentiment = content.fields.get("Sentiment")
        if sentiment:
            print(f"Sentiment: {sentiment.value}")
else:
    print("No content returned from analysis.")

# --- Clean up ---
print(f"\nCleaning up: deleting analyzer '{analyzer_id}'...")
client.delete_analyzer(analyzer_id=analyzer_id)
print(f"Analyzer '{analyzer_id}' deleted successfully.")

The example output looks like this:

Analyzing audio...
Summary: Maria Smith contacted Contoso to inquire about her current point balance. John Doe, the representative, verified her identity by requesting her date of birth and informed her that her balance is 599 points. Maria confirmed she needed no further information and ended the call.
Sentiment: Positive

Cleaning up: deleting analyzer 'my_audio_analyzer_ID'...
Analyzer 'my_audio_analyzer_ID' deleted successfully.

Tip

See the SDK samples for more examples of running analyzers.
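Each sample repeats the same lookup-then-check pattern for every field. A small accessor can collapse that into one call, as in the sketch below. The `field_value` helper and the `SimpleNamespace` stand-ins are hypothetical and assume only that a field exposes a `value` attribute, as the samples do.

```python
from types import SimpleNamespace


def field_value(fields, name, default="(not found)"):
    """Return the value of a named field, or a default when it is absent."""
    field = (fields or {}).get(name)
    if field is None or field.value is None:
        return default
    return field.value


# Hypothetical stand-in for content.fields from the audio sample.
sample_fields = {
    "Summary": SimpleNamespace(value="Caller asked about a point balance."),
    "Sentiment": SimpleNamespace(value="Positive"),
}

for name in ("Summary", "Sentiment", "People"):
    print(f"{name}: {field_value(sample_fields, name)}")
```

Because the check is `is None` rather than truthiness, legitimate values such as `0` or an empty string are returned instead of being replaced by the default.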

After you create the analyzer, use it to analyze a video and extract the custom fields. Delete the analyzer when you no longer need it.

from azure.ai.contentunderstanding.models import AnalysisInput

print("\nAnalyzing video...")
video_url = (
    "https://raw.githubusercontent.com/"
    "Azure-Samples/"
    "azure-ai-content-understanding-assets/"
    "main/videos/sdk_samples/FlightSimulator.mp4"
)

poller = client.begin_analyze(
    analyzer_id=analyzer_id,
    inputs=[AnalysisInput(url=video_url)],
)
result = poller.result()

if result.contents and len(result.contents) > 0:
    content = result.contents[0]
    if content.fields:
        segments = content.fields.get("Segments")
        if segments and segments.value:
            print(f"Found {len(segments.value)} segments")
            for i, segment in enumerate(
                segments.value
            ):
                if segment.value:
                    seg_id = segment.value.get(
                        "SegmentId"
                    )
                    desc = segment.value.get(
                        "Description"
                    )
                    print(f"Segment {i + 1}:")
                    if seg_id:
                        print(
                            f"  ID: {seg_id.value}"
                        )
                    if desc:
                        print(
                            f"  Desc: {desc.value}"
                        )
else:
    print("No content returned from analysis.")

# --- Clean up ---
print(f"\nCleaning up: deleting analyzer '{analyzer_id}'...")
client.delete_analyzer(analyzer_id=analyzer_id)
print(f"Analyzer '{analyzer_id}' deleted successfully.")

The example output looks like this:

Analyzing video...
Found 16 segments
Segment 1:
  ID: 00:00:00.000-00:00:01.467
  Desc: The video opens with a scenic aerial view of an island, featuring a small airplane flying over the landscape. The screen displays the logos for 'Flight Simulator' and 'Microsoft Azure AI,' indicating a collaboration or integration between the two products.
Segment 2:
  ID: 00:00:01.467-00:00:03.233
  Desc: A man is shown sitting in a modern office setting, likely preparing to speak or introduce the topic. The background features geometric wall decorations and a plant, giving a professional and contemporary atmosphere.
Segment 3:
  ID: 00:00:03.233-00:00:07.367
  Desc: The segment displays a close-up of audio waveforms on a screen, visually representing sound data. This is accompanied by narration about the importance of good data for neural TTS (Text-to-Speech) and the process of building a universal TTS model using 3,000 hours of data.
Segment 4:
  ID: 00:00:07.367-00:00:08.200
  Desc: Another man appears in a similar office environment, possibly continuing the explanation or providing additional insights about the TTS model.
Segment 5:
  ID: 00:00:08.200-00:00:11.367
  Desc: The video transitions to an outdoor scene showing a large facility surrounded by fields, likely representing a data center or server farm. This visual supports the narration about accumulating large amounts of data for the universal TTS model.
Segment 6:
  ID: 00:00:11.367-00:00:13.567
  Desc: Inside a data center, rows of servers are shown, emphasizing the technological infrastructure required for processing and storing vast amounts of audio data.
Segment 7:
  ID: 00:00:13.567-00:00:16.100
  Desc: The first man returns, continuing his explanation in the office setting. The narration discusses how the universal model captures audio nuances to generate more natural voices.
Segment 8:
  ID: 00:00:16.100-00:00:19.433
  Desc: A biplane is seen flying over a picturesque landscape, reinforcing the connection to Flight Simulator and showcasing the realism enabled by advanced AI voice technology.
Segment 9:
  ID: 00:00:19.433-00:00:23.967
  Desc: A plane flies past a castle surrounded by lush greenery and mountains, further highlighting the immersive environments possible in Flight Simulator. The narration continues to emphasize the natural quality of AI-generated voices.
Segment 10:
  ID: 00:00:23.967-00:00:30.033
  Desc: A bald man is interviewed in a modern office space, discussing the high fidelity and human-like quality of voices produced by cognitive services offerings. The background features glass walls and plants, maintaining a professional tone.
Segment 11:
  ID: 00:00:30.033-00:00:33.200
  Desc: The interview continues with the bald man, focusing on the benefits of the AI voice technology. The setting remains consistent, reinforcing the credibility and expertise of the speaker.
Segment 12:
  ID: 00:00:33.200-00:00:35.267
  Desc: The video shifts to a top-down view of an airplane on a runway, preparing for movement. This visual ties back to the Flight Simulator theme and the realism of the simulation.
Segment 13:
  ID: 00:00:35.267-00:00:37.700
  Desc: A ground crew member directs an Airbus aircraft, with pilots visible in the cockpit. The scene demonstrates realistic airport operations, likely enhanced by AI-driven voice interactions.
Segment 14:
  ID: 00:00:37.700-00:00:39.200
  Desc: Two ground crew members walk near an airplane on the tarmac, with airport buildings in the background. The visuals continue to showcase the detailed simulation environment.
Segment 15:
  ID: 00:00:39.200-00:00:42.033
  Desc: A close-up of an Airbus aircraft at the gate, with sunlight illuminating the scene. The realism of the simulation is highlighted, possibly referencing the natural-sounding AI voices used in communications.
Segment 16:
  ID: 00:00:42.033-00:00:43.866
  Desc: The video concludes with the Microsoft logo and branding, signaling the end of the product demo and reinforcing the association with Microsoft Azure AI.

Cleaning up: deleting analyzer 'my_video_analyzer_ID'...
Analyzer 'my_video_analyzer_ID' deleted successfully.

Tip

See the SDK samples for more examples of running analyzers.
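The segment IDs in the example output look like `start-end` timestamp pairs. If you need durations, a parser along the following lines could convert them to milliseconds. This is a sketch that assumes every ID follows the `HH:MM:SS.mmm-HH:MM:SS.mmm` shape shown above, which the model is not guaranteed to produce. The helper names are hypothetical.

```python
def parse_timestamp_ms(timestamp):
    """Convert an 'HH:MM:SS.mmm' string into integer milliseconds."""
    hours, minutes, seconds = timestamp.split(":")
    whole, _, fraction = seconds.partition(".")
    # Pad or trim the fractional part to exactly three digits.
    millis = int(fraction.ljust(3, "0")[:3]) if fraction else 0
    total_seconds = (int(hours) * 60 + int(minutes)) * 60 + int(whole)
    return total_seconds * 1000 + millis


def segment_bounds_ms(segment_id):
    """Split a 'start-end' segment ID into (start_ms, end_ms)."""
    start, end = segment_id.split("-")
    return parse_timestamp_ms(start), parse_timestamp_ms(end)


start_ms, end_ms = segment_bounds_ms("00:00:01.467-00:00:03.233")
print(f"Start: {start_ms} ms, end: {end_ms} ms, duration: {end_ms - start_ms} ms")
```

Working in integer milliseconds avoids the rounding noise that floating-point seconds would introduce when subtracting timestamps.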

Client library | Samples | SDK source

This guide shows how to use the Content Understanding .NET SDK to create a custom analyzer that extracts structured data from your content. Custom analyzers support document, image, audio, and video content types.

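Prerequisites

An active Azure subscription. If you don't have an Azure account, create a free account.
A Microsoft Foundry resource created in a supported region.
The resource endpoint and API key (found under Keys and Endpoint in the Azure portal).
Model deployment defaults configured for the resource. For setup instructions, see Models and deployments or this one-time configuration script.
A current version of .NET.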

Setup

Create a new .NET console application:

dotnet new console -n CustomAnalyzerTutorial
cd CustomAnalyzerTutorial

Install the Content Understanding client library for .NET:
```
dotnet add package Azure.AI.ContentUnderstanding
```
Optionally, install the Azure Identity library for Microsoft Entra authentication:
```
dotnet add package Azure.Identity
```

Set environment variables

To authenticate with the Content Understanding service, set the following environment variables to your own values before you run the samples:

CONTENTUNDERSTANDING_ENDPOINT - the endpoint of your Content Understanding resource.
CONTENTUNDERSTANDING_KEY - your Content Understanding API key. You can omit it if you authenticate with Microsoft Entra ID through DefaultAzureCredential.

Windows

setx CONTENTUNDERSTANDING_ENDPOINT "your-endpoint"
setx CONTENTUNDERSTANDING_KEY "your-key"

Linux/macOS

export CONTENTUNDERSTANDING_ENDPOINT="your-endpoint"
export CONTENTUNDERSTANDING_KEY="your-key"

Create the client

using Azure;
using Azure.AI.ContentUnderstanding;

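string endpoint = Environment.GetEnvironmentVariable(
    "CONTENTUNDERSTANDING_ENDPOINT")
    ?? throw new InvalidOperationException(
        "Set CONTENTUNDERSTANDING_ENDPOINT.");
string key = Environment.GetEnvironmentVariable(
    "CONTENTUNDERSTANDING_KEY")
    ?? throw new InvalidOperationException(
        "Set CONTENTUNDERSTANDING_KEY.");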

var client = new ContentUnderstandingClient(
    new Uri(endpoint),
    new AzureKeyCredential(key)
);

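Create a custom analyzer

The following example creates a custom document analyzer based on the prebuilt document analyzer. It defines fields using three extraction methods: extract for literal text, generate for AI-generated summaries, and classify for categorization.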

string analyzerId =
    $"my_document_analyzer_{DateTimeOffset.UtcNow.ToUnixTimeSeconds()}";

var fieldSchema = new ContentFieldSchema(
    new Dictionary<string, ContentFieldDefinition>
    {
        ["company_name"] = new ContentFieldDefinition
        {
            Type = ContentFieldType.String,
            Method = GenerationMethod.Extract,
            Description = "Name of the company"
        },
        ["total_amount"] = new ContentFieldDefinition
        {
            Type = ContentFieldType.Number,
            Method = GenerationMethod.Extract,
            Description =
                "Total amount on the document"
        },
        ["document_summary"] = new ContentFieldDefinition
        {
            Type = ContentFieldType.String,
            Method = GenerationMethod.Generate,
            Description =
                "A brief summary of the document content"
        },
        ["document_type"] = new ContentFieldDefinition
        {
            Type = ContentFieldType.String,
            Method = GenerationMethod.Classify,
            Description = "Type of document"
        }
    })
{
    Name = "company_schema",
    Description =
        "Schema for extracting company information"
};

fieldSchema.Fields["document_type"].Enum.Add("invoice");
fieldSchema.Fields["document_type"].Enum.Add("receipt");
fieldSchema.Fields["document_type"].Enum.Add("contract");
fieldSchema.Fields["document_type"].Enum.Add("report");
fieldSchema.Fields["document_type"].Enum.Add("other");

var config = new ContentAnalyzerConfig
{
    EnableFormula = true,
    EnableLayout = true,
    EnableOcr = true,
    EstimateFieldSourceAndConfidence = true,
    ShouldReturnDetails = true
};

var customAnalyzer = new ContentAnalyzer
{
    BaseAnalyzerId = "prebuilt-document",
    Description =
        "Custom analyzer for extracting"
        + " company information",
    Config = config,
    FieldSchema = fieldSchema
};

customAnalyzer.Models["completion"] = "gpt-4.1";
customAnalyzer.Models["embedding"] =
    "text-embedding-3-large"; // Required when using field_schema and prebuilt-document base analyzer

var operation = await client.CreateAnalyzerAsync(
    WaitUntil.Completed,
    analyzerId,
    customAnalyzer);

ContentAnalyzer result = operation.Value;
Console.WriteLine(
    $"Analyzer '{analyzerId}'"
    + " created successfully!");

// Get the full analyzer details after creation
var analyzerDetails =
    await client.GetAnalyzerAsync(analyzerId);
result = analyzerDetails.Value;

if (result.Description != null)
{
    Console.WriteLine(
        $"  Description: {result.Description}");
}

if (result.FieldSchema?.Fields != null)
{
    Console.WriteLine(
        $"  Fields"
        + $" ({result.FieldSchema.Fields.Count}):");
    foreach (var kvp
        in result.FieldSchema.Fields)
    {
        var method =
            kvp.Value.Method?.ToString()
            ?? "auto";
        var fieldType =
            kvp.Value.Type?.ToString()
            ?? "unknown";
        Console.WriteLine(
            $"    - {kvp.Key}:"
            + $" {fieldType} ({method})");
    }
}

The example output looks like this:

Analyzer 'my_document_analyzer_ID' created successfully!
  Description: Custom analyzer for extracting company information
  Fields (4):
    - company_name: string (extract)
    - total_amount: number (extract)
    - document_summary: string (generate)
    - document_type: string (classify)

Tip

This code is based on the create-analyzer sample in the SDK repository.

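The following example creates a classifier analyzer, based on the prebuilt document analyzer, that sorts financial documents into loan applications, invoices, and bank statements.

// Generate a unique classifier ID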
string classifierId =
    $"my_classifier_{DateTimeOffset.UtcNow.ToUnixTimeSeconds()}";

Console.WriteLine(
    $"Creating classifier '{classifierId}'...");

// Define content categories for classification
var classifierConfig = new ContentAnalyzerConfig
{
    ShouldReturnDetails = true,
    EnableSegment = true
};

classifierConfig.ContentCategories
    .Add("Loan_Application",
        new ContentCategoryDefinition
        {
            Description =
                "Documents submitted by individuals"
                + " or businesses to request"
                + " funding, typically including"
                + " personal or business details,"
                + " financial history, loan amount,"
                + " purpose, and supporting"
                + " documentation."
        });

classifierConfig.ContentCategories
    .Add("Invoice",
        new ContentCategoryDefinition
        {
            Description =
                "Billing documents issued by"
                + " sellers or service providers"
                + " to request payment for goods"
                + " or services, detailing items,"
                + " prices, taxes, totals, and"
                + " payment terms."
        });

classifierConfig.ContentCategories
    .Add("Bank_Statement",
        new ContentCategoryDefinition
        {
            Description =
                "Official statements issued by"
                + " banks that summarize account"
                + " activity over a period,"
                + " including deposits,"
                + " withdrawals, fees,"
                + " and balances."
        });

// Create the classifier analyzer
var classifierAnalyzer = new ContentAnalyzer
{
    BaseAnalyzerId = "prebuilt-document",
    Description =
        "Custom classifier for financial"
        + " document categorization",
    Config = classifierConfig
};

classifierAnalyzer.Models["completion"] =
    "gpt-4.1";

var classifierOp =
    await client.CreateAnalyzerAsync(
        WaitUntil.Completed,
        classifierId,
        classifierAnalyzer);

// Get the full classifier details
var classifierDetails =
    await client.GetAnalyzerAsync(classifierId);
var classifierResult =
    classifierDetails.Value;

Console.WriteLine(
    $"Classifier '{classifierId}'"
    + " created successfully!");

if (classifierResult.Description != null)
{
    Console.WriteLine(
        $"  Description:"
        + $" {classifierResult.Description}");
}

Tip

This code is based on the create-classifier sample for classification workflows.

The following example creates a custom image analyzer, based on the prebuilt image analyzer, for processing charts and graphs.

string analyzerId =
    $"my_image_analyzer_{DateTimeOffset.UtcNow.ToUnixTimeSeconds()}";

var fieldSchema = new ContentFieldSchema(
    new Dictionary<string, ContentFieldDefinition>
    {
        ["Title"] = new ContentFieldDefinition
        {
            Type = ContentFieldType.String,
            Description = "Title of the chart"
        },
        ["ChartType"] = new ContentFieldDefinition
        {
            Type = ContentFieldType.String,
            Method = GenerationMethod.Classify,
            Description = "Type of chart"
        }
    })
{
    Name = "chart_schema",
    Description =
        "Schema for extracting chart information"
};

fieldSchema.Fields["ChartType"].Enum.Add("bar");
fieldSchema.Fields["ChartType"].Enum.Add("line");
fieldSchema.Fields["ChartType"].Enum.Add("pie");

var customAnalyzer = new ContentAnalyzer
{
    BaseAnalyzerId = "prebuilt-image",
    Description =
        "Custom analyzer for charts and graphs",
    FieldSchema = fieldSchema
};

customAnalyzer.Models["completion"] = "gpt-4.1";

var operation = await client.CreateAnalyzerAsync(
    WaitUntil.Completed,
    analyzerId,
    customAnalyzer);

ContentAnalyzer result = operation.Value;
Console.WriteLine(
    $"Analyzer '{analyzerId}'"
    + " created successfully!");

// Get the full analyzer details after creation
var analyzerDetails =
    await client.GetAnalyzerAsync(analyzerId);
result = analyzerDetails.Value;

if (result.Description != null)
{
    Console.WriteLine(
        $"  Description: {result.Description}");
}

if (result.FieldSchema?.Fields != null)
{
    Console.WriteLine(
        $"  Fields"
        + $" ({result.FieldSchema.Fields.Count}):");
    foreach (var kvp
        in result.FieldSchema.Fields)
    {
        var method =
            kvp.Value.Method?.ToString()
            ?? "auto";
        var fieldType =
            kvp.Value.Type?.ToString()
            ?? "unknown";
        Console.WriteLine(
            $"    - {kvp.Key}:"
            + $" {fieldType} ({method})");
    }
}

The example output looks like this:

Analyzer 'my_image_analyzer_ID' created successfully!
  Description: Custom analyzer for charts and graphs
  Fields (2):
    - Title: string (auto)
    - ChartType: string (classify)

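Tip

This code adapts the pattern from the create-analyzer sample to image content.

The following example creates a custom audio analyzer, based on the prebuilt audio analyzer, for processing customer support call recordings.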

string analyzerId =
    $"my_audio_analyzer_{DateTimeOffset.UtcNow.ToUnixTimeSeconds()}";

var fieldSchema = new ContentFieldSchema(
    new Dictionary<string, ContentFieldDefinition>
    {
        ["Summary"] = new ContentFieldDefinition
        {
            Type = ContentFieldType.String,
            Method = GenerationMethod.Generate,
            Description = "Summary of the call"
        },
        ["Sentiment"] = new ContentFieldDefinition
        {
            Type = ContentFieldType.String,
            Method = GenerationMethod.Classify,
            Description =
                "Overall sentiment of the call"
        },
    })
{
    Name = "call_center_schema",
    Description =
        "Schema for analyzing customer"
        + " support calls"
};

fieldSchema.Fields["Sentiment"]
    .Enum.Add("Positive");
fieldSchema.Fields["Sentiment"]
    .Enum.Add("Neutral");
fieldSchema.Fields["Sentiment"]
    .Enum.Add("Negative");

var config = new ContentAnalyzerConfig
{
    ShouldReturnDetails = true
};

config.Locales.Add("en-US");
config.Locales.Add("fr-FR");

var customAnalyzer = new ContentAnalyzer
{
    BaseAnalyzerId = "prebuilt-audio",
    Description =
        "Custom analyzer for customer"
        + " support calls",
    Config = config,
    FieldSchema = fieldSchema
};

customAnalyzer.Models["completion"] = "gpt-4.1";

var operation = await client.CreateAnalyzerAsync(
    WaitUntil.Completed,
    analyzerId,
    customAnalyzer);

ContentAnalyzer result = operation.Value;
Console.WriteLine(
    $"Analyzer '{analyzerId}'"
    + " created successfully!");

// Get the full analyzer details after creation
var analyzerDetails =
    await client.GetAnalyzerAsync(analyzerId);
result = analyzerDetails.Value;

if (result.Description != null)
{
    Console.WriteLine(
        $"  Description: {result.Description}");
}

if (result.FieldSchema?.Fields != null)
{
    Console.WriteLine(
        $"  Fields"
        + $" ({result.FieldSchema.Fields.Count}):");
    foreach (var kvp
        in result.FieldSchema.Fields)
    {
        var method =
            kvp.Value.Method?.ToString()
            ?? "auto";
        var fieldType =
            kvp.Value.Type?.ToString()
            ?? "unknown";
        Console.WriteLine(
            $"    - {kvp.Key}:"
            + $" {fieldType} ({method})");
    }
}

The example output looks like this:

Analyzer 'my_audio_analyzer_ID' created successfully!
  Description: Custom analyzer for customer support calls
  Fields (2):
    - Summary: string (generate)
    - Sentiment: string (classify)

Tip

This code adapts the pattern from the create-analyzer sample to audio content.

The following example creates a custom video analyzer, based on the prebuilt video analyzer, for processing product demos and reviews.

string analyzerId =
    $"my_video_analyzer_{DateTimeOffset.UtcNow.ToUnixTimeSeconds()}";

var segmentItemDef = new ContentFieldDefinition
{
    Type = ContentFieldType.Object
};
segmentItemDef.Properties.Add("SegmentId",
    new ContentFieldDefinition
    {
        Type = ContentFieldType.String
    });
segmentItemDef.Properties.Add("Description",
    new ContentFieldDefinition
    {
        Type = ContentFieldType.String,
        Method = GenerationMethod.Generate,
        Description =
            "Detailed summary of the "
            + "video segment"
    });
segmentItemDef.Properties.Add("Sentiment",
    new ContentFieldDefinition
    {
        Type = ContentFieldType.String,
        Method = GenerationMethod.Classify
    });

var segmentsDef = new ContentFieldDefinition
{
    Type = ContentFieldType.Array
};
segmentsDef.ItemDefinition = segmentItemDef;

var fieldSchema = new ContentFieldSchema(
    new Dictionary<string, ContentFieldDefinition>
    {
        ["Segments"] = segmentsDef
    })
{
    Name = "video_schema",
    Description =
        "Schema for analyzing product"
        + " demo videos"
};

var sentimentDef =
    fieldSchema.Fields["Segments"]
        .ItemDefinition.Properties["Sentiment"];
sentimentDef.Enum.Add("Positive");
sentimentDef.Enum.Add("Neutral");
sentimentDef.Enum.Add("Negative");

var config = new ContentAnalyzerConfig
{
    ShouldReturnDetails = true
};

config.Locales.Add("en-US");
config.Locales.Add("fr-FR");

var customAnalyzer = new ContentAnalyzer
{
    BaseAnalyzerId = "prebuilt-video",
    Description =
        "Custom analyzer for product"
        + " demo videos",
    Config = config,
    FieldSchema = fieldSchema
};

customAnalyzer.Models["completion"] = "gpt-4.1";

var operation = await client.CreateAnalyzerAsync(
    WaitUntil.Completed,
    analyzerId,
    customAnalyzer);

ContentAnalyzer result = operation.Value;
Console.WriteLine(
    $"Analyzer '{analyzerId}'"
    + " created successfully!");

// Get the full analyzer details after creation
var analyzerDetails =
    await client.GetAnalyzerAsync(analyzerId);
result = analyzerDetails.Value;

if (result.Description != null)
{
    Console.WriteLine(
        $"  Description: {result.Description}");
}

if (result.FieldSchema?.Fields != null)
{
    Console.WriteLine(
        $"  Fields"
        + $" ({result.FieldSchema.Fields.Count}):");
    foreach (var kvp
        in result.FieldSchema.Fields)
    {
        var method =
            kvp.Value.Method?.ToString()
            ?? "auto";
        var fieldType =
            kvp.Value.Type?.ToString()
            ?? "unknown";
        Console.WriteLine(
            $"    - {kvp.Key}:"
            + $" {fieldType} ({method})");
    }
}

The example output looks like this:

Analyzer 'my_video_analyzer_ID' created successfully!
  Description: Custom analyzer for product demo videos
  Fields (1):
    - Segments: Array (auto)

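Tip

This code adapts the pattern from the create-analyzer sample to video content.

Use the custom analyzer

After you create the analyzer, use it to analyze a document and extract the custom fields. Delete the analyzer when you no longer need it.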

var documentUrl = new Uri(
    "https://raw.githubusercontent.com/"
    + "Azure-Samples/"
    + "azure-ai-content-understanding-assets/"
    + "main/document/invoice.pdf"
);

var analyzeOperation = await client.AnalyzeAsync(
    WaitUntil.Completed,
    analyzerId,
    inputs: new[] {
        new AnalysisInput { Uri = documentUrl }
    });

var analyzeResult = analyzeOperation.Value;

if (analyzeResult.Contents?.FirstOrDefault()
    is DocumentContent content)
{
    if (content.Fields.TryGetValue(
        "company_name", out var companyField))
    {
        var name =
            companyField is ContentStringField sf
                ? sf.Value : null;
        Console.WriteLine(
            $"Company Name: "
            + $"{name ?? "(not found)"}");
        Console.WriteLine(
            "  Confidence: "
            + (companyField.Confidence?
                .ToString("F2") ?? "N/A"));
    }

    if (content.Fields.TryGetValue(
        "total_amount", out var totalField))
    {
        var total =
            totalField is ContentNumberField nf
                ? nf.Value : null;
        Console.WriteLine(
            $"Total Amount: {total}");
    }

    if (content.Fields.TryGetValue(
        "document_summary", out var summaryField))
    {
        var summary =
            summaryField is ContentStringField sf
                ? sf.Value : null;
        Console.WriteLine(
            $"Summary: "
            + $"{summary ?? "(not found)"}");
    }

    if (content.Fields.TryGetValue(
        "document_type", out var typeField))
    {
        var docType =
            typeField is ContentStringField sf
                ? sf.Value : null;
        Console.WriteLine(
            $"Document Type: "
            + $"{docType ?? "(not found)"}");
    }
}

// --- Clean up ---
Console.WriteLine(
    $"\nCleaning up: deleting analyzer"
    + $" '{analyzerId}'...");
await client.DeleteAnalyzerAsync(analyzerId);
Console.WriteLine(
    $"Analyzer '{analyzerId}'"
    + " deleted successfully.");

Example output:

Company Name: CONTOSO LTD.
  Confidence: 0.88
Total Amount: 610
Summary: This document is an invoice from CONTOSO LTD. to MICROSOFT CORPORATION for consulting services, document fees, and printing fees, detailing service periods, billing and shipping addresses, itemized charges, and the total amount due.
Document Type: invoice

Cleaning up: deleting analyzer 'my_document_analyzer_ID'...
Analyzer 'my_document_analyzer_ID' deleted successfully.

Tip

Check out more examples of running analyzers in the .NET SDK samples.

After you create the analyzer, use it to analyze an image and extract the custom fields. Delete the analyzer when you no longer need it.

var imageUrl = new Uri(
    "https://raw.githubusercontent.com/"
    + "Azure-Samples/"
    + "azure-ai-content-understanding-assets/"
    + "main/image/pieChart.jpg"
);

var analyzeOperation = await client.AnalyzeAsync(
    WaitUntil.Completed,
    analyzerId,
    inputs: new[] {
        new AnalysisInput { Uri = imageUrl }
    });

var analyzeResult = analyzeOperation.Value;

if (analyzeResult.Contents?.FirstOrDefault()
    is DocumentContent content)
{
    if (content.Fields.TryGetValue(
        "Title", out var titleField))
    {
        var title =
            titleField is ContentStringField sf
                ? sf.Value : null;
        Console.WriteLine(
            $"Title: {title ?? "(not found)"}");
    }

    if (content.Fields.TryGetValue(
        "ChartType", out var chartField))
    {
        var chartType =
            chartField is ContentStringField sf
                ? sf.Value : null;
        Console.WriteLine(
            $"Chart Type: "
            + $"{chartType ?? "(not found)"}");
    }
}

// --- Clean up ---
Console.WriteLine(
    $"\nCleaning up: deleting analyzer"
    + $" '{analyzerId}'...");
await client.DeleteAnalyzerAsync(analyzerId);
Console.WriteLine(
    $"Analyzer '{analyzerId}'"
    + " deleted successfully.");

Example output:

Title: Distribution of Weekly Working Hours
Chart Type: pie

Cleaning up: deleting analyzer 'my_image_analyzer_ID'...
Analyzer 'my_image_analyzer_ID' deleted successfully.

Tip

Check out more examples of running analyzers in the .NET SDK samples.

After you create the analyzer, use it to analyze an audio file and extract the custom fields. Delete the analyzer when you no longer need it.

var audioUrl = new Uri(
    "https://raw.githubusercontent.com/"
    + "Azure-Samples/"
    + "azure-ai-content-understanding-assets/"
    + "main/audio/callCenterRecording.mp3"
);

var analyzeOperation = await client.AnalyzeAsync(
    WaitUntil.Completed,
    analyzerId,
    inputs: new[] {
        new AnalysisInput { Uri = audioUrl }
    });

var analyzeResult = analyzeOperation.Value;

if (analyzeResult.Contents?.Count > 0)
{
    var content = analyzeResult.Contents[0];
    if (content.Fields != null)
    {
        if (content.Fields.TryGetValue(
            "Summary", out var summaryField))
        {
            var summary =
                summaryField
                    is ContentStringField sf
                    ? sf.Value : null;
            Console.WriteLine(
                $"Summary: "
                + $"{summary ?? "(not found)"}");
        }

        if (content.Fields.TryGetValue(
            "Sentiment", out var sentField))
        {
            var sentiment =
                sentField
                    is ContentStringField sf
                    ? sf.Value : null;
            Console.WriteLine(
                $"Sentiment: "
                + $"{sentiment ?? "(not found)"}");
        }
    }
}

// --- Clean up ---
Console.WriteLine(
    $"\nCleaning up: deleting analyzer"
    + $" '{analyzerId}'...");
await client.DeleteAnalyzerAsync(analyzerId);
Console.WriteLine(
    $"Analyzer '{analyzerId}'"
    + " deleted successfully.");

Example output:

Summary: Maria Smith contacted Contoso to inquire about her current point balance. John Doe, the representative, verified her identity by requesting her date of birth and informed her that her balance is 599 points. Maria confirmed she needed no further information and ended the call.
Sentiment: Positive

Cleaning up: deleting analyzer 'my_audio_analyzer_ID'...
Analyzer 'my_audio_analyzer_ID' deleted successfully.

Tip

Check out more examples of running analyzers in the .NET SDK samples.

After you create the analyzer, use it to analyze a video and extract the custom fields. Delete the analyzer when you no longer need it.

var videoUrl = new Uri(
    "https://raw.githubusercontent.com/"
    + "Azure-Samples/"
    + "azure-ai-content-understanding-assets/"
    + "main/videos/sdk_samples/"
    + "FlightSimulator.mp4"
);

var analyzeOperation = await client.AnalyzeAsync(
    WaitUntil.Completed,
    analyzerId,
    inputs: new[] {
        new AnalysisInput { Uri = videoUrl }
    });

var analyzeResult = analyzeOperation.Value;

if (analyzeResult.Contents?.Count > 0)
{
    var content = analyzeResult.Contents[0];
    if (content.Fields != null
        && content.Fields.TryGetValue(
            "Segments", out var segmentsField)
        && segmentsField
            is ContentArrayField segmentsArr)
    {
        Console.WriteLine(
            $"Segments ({segmentsArr.Count}):");
        for (int i = 0;
            i < segmentsArr.Count; i++)
        {
            if (segmentsArr[i]
                is ContentObjectField segObj
                && segObj.Value != null)
            {
                Console.WriteLine(
                    $"  Segment {i + 1}:");
                if (segObj.Value.TryGetValue(
                    "Description",
                    out var descField))
                {
                    var desc =
                        descField
                            is ContentStringField sf
                            ? sf.Value : null;
                    Console.WriteLine(
                        $"    Description: "
                        + $"{desc ?? "(none)"}");
                }
                if (segObj.Value.TryGetValue(
                    "Sentiment",
                    out var sentField))
                {
                    var sent =
                        sentField
                            is ContentStringField sf
                            ? sf.Value : null;
                    Console.WriteLine(
                        $"    Sentiment: "
                        + $"{sent ?? "(none)"}");
                }
            }
        }
    }
}

// --- Clean up ---
Console.WriteLine(
    $"\nCleaning up: deleting analyzer"
    + $" '{analyzerId}'...");
await client.DeleteAnalyzerAsync(analyzerId);
Console.WriteLine(
    $"Analyzer '{analyzerId}'"
    + " deleted successfully.");

Example output:

Segments (16):
  Segment 1:
    Description: The video opens with a scenic aerial view of an island, featuring a small airplane flying over the landscape. The screen displays the logos for 'Flight Simulator' and 'Microsoft Azure AI,' indicating a collaboration or integration between the two.
    Sentiment: Positive
  Segment 2:
    Description: A man is shown sitting in a modern office environment, likely preparing to speak or introduce the topic. The background features geometric wall lights and a plant, giving a professional and contemporary feel.
    Sentiment: Neutral
  Segment 3:
    Description: The segment displays a close-up of audio waveforms on a screen, visually representing sound data. The accompanying audio discusses the importance of good data for neural TTS (Text-to-Speech) to achieve a high-quality voice.
    Sentiment: Neutral
  Segment 4:
    Description: Another man appears in a similar office setting, possibly continuing the explanation or providing additional commentary about the TTS model.
    Sentiment: Neutral
  Segment 5:
    Description: The video transitions to an outdoor scene showing a large facility surrounded by fields under a clear sky. This likely represents the data centers or infrastructure used for building the universal TTS model.
    Sentiment: Neutral
  Segment 6:
    Description: The segment moves inside a data center, showing rows of servers and high-tech equipment. This visual emphasizes the scale and technological sophistication behind the TTS model's development.
    Sentiment: Neutral
  Segment 7:
    Description: The first man returns, continuing his explanation in the office setting. The transcript mentions accumulating large amounts of data to capture audio nuances and generate natural voices.
    Sentiment: Positive
  Segment 8:
    Description: A biplane is shown flying over a picturesque landscape, highlighting the realism and immersive experience of the Flight Simulator. This visual connects the product's capabilities to the natural-sounding voices enabled by Azure AI.
    Sentiment: Positive
  Segment 9:
    Description: The segment features a plane flying near a castle surrounded by lush greenery and mountains. The visuals reinforce the immersive environments possible in Flight Simulator, enhanced by advanced AI voice technology.
    Sentiment: Positive
  Segment 10:
    Description: A bald man is interviewed in a modern office space, likely discussing the benefits of cognitive services offerings, such as higher fidelity and more human-like voices, as mentioned in the transcript.
    Sentiment: Positive
  Segment 11:
    Description: The interview continues with the bald man, focusing on the advantages of Azure AI's TTS technology. The transcript notes that the voices sound much more like actual human voices.
    Sentiment: Positive
  Segment 12:
    Description: The video shifts to an overhead view of an airplane on the runway, possibly preparing for pushback. This visual ties into the transcript mentioning 'Orlando ground 9555 requesting the end of pushback.'
    Sentiment: Neutral
  Segment 13:
    Description: A ground crew member directs an Airbus aircraft, with pilots visible in the cockpit. The transcript includes communication about pushback, demonstrating realistic voice interactions in the simulator.
    Sentiment: Neutral
  Segment 14:
    Description: Ground crew members are seen walking near airplanes on the tarmac, reinforcing the realism and operational detail in the Flight Simulator environment.
    Sentiment: Neutral
  Segment 15:
    Description: A close-up of an Airbus aircraft at the gate, with the transcript confirming the end of pushback. This segment highlights the simulator's attention to detail and realistic voice communications.
    Sentiment: Neutral
  Segment 16:
    Description: The video concludes with the Microsoft logo and branding, signaling the end of the product demo and reinforcing the partnership between Flight Simulator and Microsoft Azure AI.
    Sentiment: Positive

Cleaning up: deleting analyzer 'my_video_analyzer_ID'...
Analyzer 'my_video_analyzer_ID' deleted successfully.

Tip

Check out more examples of running analyzers in the .NET SDK samples.

Client library | Samples | SDK source

This guide shows you how to use the Content Understanding Java SDK to create a custom analyzer that extracts structured data from your content. Custom analyzers support document, image, audio, and video content types.

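Prerequisites

An active Azure subscription. If you don't have an Azure account, create a free account.
A Microsoft Foundry resource created in a supported region.
Your resource endpoint and API key (found under Keys and Endpoint in the Azure portal).
Model deployment defaults configured for your resource. For setup instructions, see Models and deployments or this one-time configuration script.
Java Development Kit (JDK) version 8 or later.
Apache Maven.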

Set up

Create a new Maven project:

mvn archetype:generate -DgroupId=com.example \
    -DartifactId=custom-analyzer-tutorial \
    -DarchetypeArtifactId=maven-archetype-quickstart \
    -DinteractiveMode=false
cd custom-analyzer-tutorial

Add the Content Understanding dependency to the <dependencies> section of your pom.xml file:

<dependency>
    <groupId>com.azure</groupId>
    <artifactId>azure-ai-contentunderstanding</artifactId>
    <version>1.0.0</version>
</dependency>

Optionally, add the Azure Identity library for Microsoft Entra authentication:

<dependency>
    <groupId>com.azure</groupId>
    <artifactId>azure-identity</artifactId>
    <version>1.14.2</version>
</dependency>

Set up environment variables

To authenticate with the Content Understanding service, set the following environment variables with your own values before you run the sample:

CONTENTUNDERSTANDING_ENDPOINT - The endpoint of your Content Understanding resource.
CONTENTUNDERSTANDING_KEY - Your Content Understanding API key. You can omit it if you use Microsoft Entra ID with DefaultAzureCredential.

Windows

setx CONTENTUNDERSTANDING_ENDPOINT "your-endpoint"
setx CONTENTUNDERSTANDING_KEY "your-key"

Linux/macOS

export CONTENTUNDERSTANDING_ENDPOINT="your-endpoint"
export CONTENTUNDERSTANDING_KEY="your-key"
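
Because the endpoint is required while the key is optional, it can help to check both values before building the client. The following is a minimal sketch using only the Java standard library. The `EnvConfig` class and its `require` and `optional` helpers are hypothetical names introduced for illustration; they aren't part of the SDK.

```java
import java.util.Map;

/** Hypothetical helper (not an SDK type) that reads settings from an environment-style map. */
final class EnvConfig {
    private EnvConfig() {
    }

    /** Returns the trimmed value, or throws if the variable is missing or blank. */
    static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException(
                "Environment variable " + name + " is not set.");
        }
        return value.trim();
    }

    /** Returns the trimmed value, or null when the variable is missing or blank. */
    static String optional(Map<String, String> env, String name) {
        String value = env.get(name);
        return (value == null || value.trim().isEmpty())
            ? null : value.trim();
    }

    public static void main(String[] args) {
        Map<String, String> env = System.getenv();
        try {
            String endpoint =
                require(env, "CONTENTUNDERSTANDING_ENDPOINT");
            String key =
                optional(env, "CONTENTUNDERSTANDING_KEY");
            System.out.println("Endpoint: " + endpoint);
            System.out.println(key != null
                ? "API key found."
                : "No API key set; use Microsoft Entra ID instead.");
        } catch (IllegalStateException e) {
            // Report the missing variable instead of failing later
            // with a less specific error.
            System.err.println(e.getMessage());
        }
    }
}
```

Passing the environment in as a map keeps the helpers easy to exercise without changing real environment variables.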

Create the client

import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import com.azure.core.credential.AzureKeyCredential;
import com.azure.core.util.polling.SyncPoller;
import com.azure.ai.contentunderstanding
    .ContentUnderstandingClient;
import com.azure.ai.contentunderstanding
    .ContentUnderstandingClientBuilder;
import com.azure.ai.contentunderstanding.models.*;

String endpoint =
    System.getenv("CONTENTUNDERSTANDING_ENDPOINT");
String key =
    System.getenv("CONTENTUNDERSTANDING_KEY");

ContentUnderstandingClient client =
    new ContentUnderstandingClientBuilder()
        .endpoint(endpoint)
        .credential(new AzureKeyCredential(key))
        .buildClient();

Create a custom analyzer

String analyzerId =
    "my_document_analyzer_"
    + System.currentTimeMillis();

Map<String, ContentFieldDefinition> fields =
    new HashMap<>();

ContentFieldDefinition companyNameDef =
    new ContentFieldDefinition();
companyNameDef.setType(ContentFieldType.STRING);
companyNameDef.setMethod(
    GenerationMethod.EXTRACT);
companyNameDef.setDescription(
    "Name of the company");
fields.put("company_name", companyNameDef);

ContentFieldDefinition totalAmountDef =
    new ContentFieldDefinition();
totalAmountDef.setType(ContentFieldType.NUMBER);
totalAmountDef.setMethod(
    GenerationMethod.EXTRACT);
totalAmountDef.setDescription(
    "Total amount on the document");
fields.put("total_amount", totalAmountDef);

ContentFieldDefinition summaryDef =
    new ContentFieldDefinition();
summaryDef.setType(ContentFieldType.STRING);
summaryDef.setMethod(
    GenerationMethod.GENERATE);
summaryDef.setDescription(
    "A brief summary of the document content");
fields.put("document_summary", summaryDef);

ContentFieldDefinition documentTypeDef =
    new ContentFieldDefinition();
documentTypeDef.setType(ContentFieldType.STRING);
documentTypeDef.setMethod(
    GenerationMethod.CLASSIFY);
documentTypeDef.setDescription(
    "Type of document");
documentTypeDef.setEnumProperty(
    Arrays.asList(
        "invoice", "receipt", "contract",
        "report", "other"
    ));
fields.put("document_type", documentTypeDef);

ContentFieldSchema fieldSchema =
    new ContentFieldSchema();
fieldSchema.setName("company_schema");
fieldSchema.setDescription(
    "Schema for extracting company information");
fieldSchema.setFields(fields);

Map<String, String> models = new HashMap<>();
models.put("completion", "gpt-4.1");
models.put("embedding", "text-embedding-3-large"); // Required when using field_schema and prebuilt-document base analyzer

ContentAnalyzer customAnalyzer =
    new ContentAnalyzer()
        .setBaseAnalyzerId("prebuilt-document")
        .setDescription(
            "Custom analyzer for extracting"
            + " company information")
        .setConfig(new ContentAnalyzerConfig()
            .setOcrEnabled(true)
            .setLayoutEnabled(true)
            .setFormulaEnabled(true)
            .setEstimateFieldSourceAndConfidence(
                true)
            .setReturnDetails(true))
        .setFieldSchema(fieldSchema)
        .setModels(models);

SyncPoller<ContentAnalyzerOperationStatus,
    ContentAnalyzer> operation =
    client.beginCreateAnalyzer(
        analyzerId, customAnalyzer, true);

ContentAnalyzer result =
    operation.getFinalResult();
System.out.println(
    "Analyzer '" + analyzerId
    + "' created successfully!");

if (result.getDescription() != null) {
    System.out.println(
        "  Description: "
        + result.getDescription());
}

if (result.getFieldSchema() != null
    && result.getFieldSchema()
        .getFields() != null) {
    System.out.println(
        "  Fields ("
        + result.getFieldSchema()
            .getFields().size() + "):");
    result.getFieldSchema().getFields()
        .forEach((fieldName, fieldDef) -> {
            String method =
                fieldDef.getMethod() != null
                    ? fieldDef.getMethod()
                        .toString()
                    : "auto";
            String type =
                fieldDef.getType() != null
                    ? fieldDef.getType()
                        .toString()
                    : "unknown";
            System.out.println(
                "    - " + fieldName
                + ": " + type
                + " (" + method + ")");
        });
}

Example output:

Analyzer 'my_document_analyzer_ID' created successfully!
  Description: Custom analyzer for extracting company information
  Fields (4):
    - total_amount: number (extract)
    - company_name: string (extract)
    - document_summary: string (generate)
    - document_type: string (classify)

Tip

This code is based on the create analyzer sample in the SDK repository.
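
The example output above lists the fields in a different order than the code defines them, which is expected when fields are held in a map without a guaranteed iteration order. If you want stable output, for example in logs or tests, one option is to sort by field name before printing. The following sketch uses only the Java standard library; `FieldSummary` and `sortedLines` are hypothetical names, and the string values stand in for the type and method text that the sample prints.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper (not an SDK type) for printing fields in a stable order. */
final class FieldSummary {
    private FieldSummary() {
    }

    /** Returns one output line per field, sorted by field name. */
    static List<String> sortedLines(Map<String, String> summaries) {
        List<String> lines = new ArrayList<>();
        // TreeMap iterates its keys in natural (alphabetical) order.
        for (Map.Entry<String, String> entry
                : new TreeMap<>(summaries).entrySet()) {
            lines.add("    - " + entry.getKey()
                + ": " + entry.getValue());
        }
        return lines;
    }

    public static void main(String[] args) {
        Map<String, String> summaries = new HashMap<>();
        summaries.put("total_amount", "number (extract)");
        summaries.put("company_name", "string (extract)");
        summaries.put("document_summary", "string (generate)");
        summaries.put("document_type", "string (classify)");
        for (String line : sortedLines(summaries)) {
            System.out.println(line);
        }
    }
}
```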

// Generate a unique classifier ID
String classifierId =
    "my_classifier_" + System.currentTimeMillis();

System.out.println(
    "Creating classifier '"
    + classifierId + "'...");

// Define content categories for classification
Map<String, ContentCategoryDefinition>
    categories = new HashMap<>();

categories.put("Loan_Application",
    new ContentCategoryDefinition()
        .setDescription(
            "Documents submitted by individuals"
            + " or businesses to request funding,"
            + " typically including personal or"
            + " business details, financial"
            + " history, loan amount, purpose,"
            + " and supporting documentation."));

categories.put("Invoice",
    new ContentCategoryDefinition()
        .setDescription(
            "Billing documents issued by sellers"
            + " or service providers to request"
            + " payment for goods or services,"
            + " detailing items, prices, taxes,"
            + " totals, and payment terms."));

categories.put("Bank_Statement",
    new ContentCategoryDefinition()
        .setDescription(
            "Official statements issued by banks"
            + " that summarize account activity"
            + " over a period, including deposits,"
            + " withdrawals, fees,"
            + " and balances."));

// Create the classifier
Map<String, String> classifierModels =
    new HashMap<>();
classifierModels.put("completion", "gpt-4.1");

ContentAnalyzer classifier =
    new ContentAnalyzer()
        .setBaseAnalyzerId("prebuilt-document")
        .setDescription(
            "Custom classifier for financial"
            + " document categorization")
        .setConfig(new ContentAnalyzerConfig()
            .setReturnDetails(true)
            .setSegmentEnabled(true)
            .setContentCategories(categories))
        .setModels(classifierModels);

SyncPoller<ContentAnalyzerOperationStatus,
    ContentAnalyzer> classifierOp =
    client.beginCreateAnalyzer(
        classifierId, classifier, true);
classifierOp.getFinalResult();

// Get the full classifier details
ContentAnalyzer classifierResult =
    client.getAnalyzer(classifierId);

System.out.println(
    "Classifier '" + classifierId
    + "' created successfully!");

if (classifierResult.getDescription() != null) {
    System.out.println(
        "  Description: "
        + classifierResult.getDescription());
}

Tip

This code is based on the create classifier sample for classification workflows.

The following example creates a custom image analyzer, based on the prebuilt image analyzer, for processing charts and graphs.

String analyzerId =
    "my_image_analyzer_"
    + System.currentTimeMillis();

Map<String, ContentFieldDefinition> fields =
    new HashMap<>();

ContentFieldDefinition titleDef =
    new ContentFieldDefinition();
titleDef.setType(ContentFieldType.STRING);
titleDef.setDescription("Title of the chart");
fields.put("Title", titleDef);

ContentFieldDefinition chartTypeDef =
    new ContentFieldDefinition();
chartTypeDef.setType(ContentFieldType.STRING);
chartTypeDef.setMethod(
    GenerationMethod.CLASSIFY);
chartTypeDef.setDescription("Type of chart");
chartTypeDef.setEnumProperty(
    Arrays.asList("bar", "line", "pie"));
fields.put("ChartType", chartTypeDef);

ContentFieldSchema fieldSchema =
    new ContentFieldSchema();
fieldSchema.setName("chart_schema");
fieldSchema.setDescription(
    "Schema for extracting chart information");
fieldSchema.setFields(fields);

Map<String, String> models = new HashMap<>();
models.put("completion", "gpt-4.1");

ContentAnalyzer customAnalyzer =
    new ContentAnalyzer()
        .setBaseAnalyzerId("prebuilt-image")
        .setDescription(
            "Custom analyzer for charts"
            + " and graphs")
        .setFieldSchema(fieldSchema)
        .setModels(models);

SyncPoller<ContentAnalyzerOperationStatus,
    ContentAnalyzer> operation =
    client.beginCreateAnalyzer(
        analyzerId, customAnalyzer, true);

ContentAnalyzer result =
    operation.getFinalResult();
System.out.println(
    "Analyzer '" + analyzerId
    + "' created successfully!");

if (result.getDescription() != null) {
    System.out.println(
        "  Description: "
        + result.getDescription());
}

if (result.getFieldSchema() != null
    && result.getFieldSchema()
        .getFields() != null) {
    System.out.println(
        "  Fields ("
        + result.getFieldSchema()
            .getFields().size() + "):");
    result.getFieldSchema().getFields()
        .forEach((fieldName, fieldDef) -> {
            String method =
                fieldDef.getMethod() != null
                    ? fieldDef.getMethod()
                        .toString()
                    : "auto";
            String type =
                fieldDef.getType() != null
                    ? fieldDef.getType()
                        .toString()
                    : "unknown";
            System.out.println(
                "    - " + fieldName
                + ": " + type
                + " (" + method + ")");
        });
}

Example output:

Analyzer 'my_image_analyzer_ID' created successfully!
  Description: Custom analyzer for charts and graphs
  Fields (2):
    - Title: string (auto)
    - ChartType: string (classify)

Tip

This code adapts the create analyzer sample pattern for image content.

The following example creates a custom audio analyzer, based on the prebuilt audio analyzer, for processing customer support call recordings.

String analyzerId =
    "my_audio_analyzer_"
    + System.currentTimeMillis();

Map<String, ContentFieldDefinition> fields =
    new HashMap<>();

ContentFieldDefinition summaryDef =
    new ContentFieldDefinition();
summaryDef.setType(ContentFieldType.STRING);
summaryDef.setMethod(
    GenerationMethod.GENERATE);
summaryDef.setDescription("Summary of the call");
fields.put("Summary", summaryDef);

ContentFieldDefinition sentimentDef =
    new ContentFieldDefinition();
sentimentDef.setType(ContentFieldType.STRING);
sentimentDef.setMethod(
    GenerationMethod.CLASSIFY);
sentimentDef.setDescription(
    "Overall sentiment of the call");
sentimentDef.setEnumProperty(
    Arrays.asList(
        "Positive", "Neutral", "Negative"));
fields.put("Sentiment", sentimentDef);

// Define "People" as an array of objects
Map<String, ContentFieldDefinition> personProps =
    new HashMap<>();
ContentFieldDefinition nameDef =
    new ContentFieldDefinition();
nameDef.setType(ContentFieldType.STRING);
personProps.put("Name", nameDef);
ContentFieldDefinition roleDef =
    new ContentFieldDefinition();
roleDef.setType(ContentFieldType.STRING);
personProps.put("Role", roleDef);

ContentFieldDefinition personItemDef =
    new ContentFieldDefinition();
personItemDef.setType(ContentFieldType.OBJECT);
personItemDef.setProperties(personProps);

ContentFieldDefinition peopleDef =
    new ContentFieldDefinition();
peopleDef.setType(ContentFieldType.ARRAY);
peopleDef.setDescription(
    "List of people mentioned");
peopleDef.setItemDefinition(personItemDef);
fields.put("People", peopleDef);

ContentFieldSchema fieldSchema =
    new ContentFieldSchema();
fieldSchema.setName("call_center_schema");
fieldSchema.setDescription(
    "Schema for analyzing customer"
    + " support calls");
fieldSchema.setFields(fields);

Map<String, String> models = new HashMap<>();
models.put("completion", "gpt-4.1");

ContentAnalyzer customAnalyzer =
    new ContentAnalyzer()
        .setBaseAnalyzerId("prebuilt-audio")
        .setDescription(
            "Custom analyzer for customer"
            + " support calls")
        .setConfig(new ContentAnalyzerConfig()
            .setLocales(
                Arrays.asList("en-US", "fr-FR"))
            .setReturnDetails(true))
        .setFieldSchema(fieldSchema)
        .setModels(models);

SyncPoller<ContentAnalyzerOperationStatus,
    ContentAnalyzer> operation =
    client.beginCreateAnalyzer(
        analyzerId, customAnalyzer, true);

ContentAnalyzer result =
    operation.getFinalResult();
System.out.println(
    "Analyzer '" + analyzerId
    + "' created successfully!");

if (result.getDescription() != null) {
    System.out.println(
        "  Description: "
        + result.getDescription());
}

if (result.getFieldSchema() != null
    && result.getFieldSchema()
        .getFields() != null) {
    System.out.println(
        "  Fields ("
        + result.getFieldSchema()
            .getFields().size() + "):");
    result.getFieldSchema().getFields()
        .forEach((fieldName, fieldDef) -> {
            String method =
                fieldDef.getMethod() != null
                    ? fieldDef.getMethod()
                        .toString()
                    : "auto";
            String type =
                fieldDef.getType() != null
                    ? fieldDef.getType()
                        .toString()
                    : "unknown";
            System.out.println(
                "    - " + fieldName
                + ": " + type
                + " (" + method + ")");
        });
}

Example output:

Analyzer 'my_audio_analyzer_ID' created successfully!
  Description: Custom analyzer for customer support calls
  Fields (3):
    - People: array (auto)
    - Summary: string (generate)
    - Sentiment: string (classify)

Tip

This code adapts the create analyzer sample pattern for audio content.
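
The Sentiment field above is a classify field restricted to three labels, so downstream code may prefer a typed value over a raw string. The following sketch, using only the Java standard library, maps a returned label to an enum and falls back to an `UNKNOWN` constant for missing or unexpected values. `CallSentiment`, `fromLabel`, and the `UNKNOWN` fallback are hypothetical and not part of the SDK.

```java
import java.util.Locale;

/** Hypothetical typed view (not an SDK type) of the Sentiment labels defined above. */
enum CallSentiment {
    POSITIVE, NEUTRAL, NEGATIVE, UNKNOWN;

    /** Maps a label such as "Positive" to a constant, ignoring case and padding. */
    static CallSentiment fromLabel(String label) {
        if (label == null) {
            return UNKNOWN;
        }
        switch (label.trim().toLowerCase(Locale.ROOT)) {
            case "positive":
                return POSITIVE;
            case "neutral":
                return NEUTRAL;
            case "negative":
                return NEGATIVE;
            default:
                // Any label outside the three defined values.
                return UNKNOWN;
        }
    }

    public static void main(String[] args) {
        System.out.println(fromLabel("Positive"));
        System.out.println(fromLabel(null));
    }
}
```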

The following example creates a custom video analyzer, based on the prebuilt video analyzer, for processing product demos and reviews.

String analyzerId =
    "my_video_analyzer_"
    + System.currentTimeMillis();

// Define segment properties
Map<String, ContentFieldDefinition> segProps =
    new HashMap<>();
ContentFieldDefinition segIdDef =
    new ContentFieldDefinition();
segIdDef.setType(ContentFieldType.STRING);
segProps.put("SegmentId", segIdDef);

ContentFieldDefinition descDef =
    new ContentFieldDefinition();
descDef.setType(ContentFieldType.STRING);
descDef.setMethod(GenerationMethod.GENERATE);
descDef.setDescription(
    "Detailed summary of the video segment");
segProps.put("Description", descDef);

ContentFieldDefinition sentDef =
    new ContentFieldDefinition();
sentDef.setType(ContentFieldType.STRING);
sentDef.setMethod(GenerationMethod.CLASSIFY);
sentDef.setEnumProperty(
    Arrays.asList(
        "Positive", "Neutral", "Negative"));
segProps.put("Sentiment", sentDef);

ContentFieldDefinition segItemDef =
    new ContentFieldDefinition();
segItemDef.setType(ContentFieldType.OBJECT);
segItemDef.setProperties(segProps);

Map<String, ContentFieldDefinition> fields =
    new HashMap<>();
ContentFieldDefinition segmentsDef =
    new ContentFieldDefinition();
segmentsDef.setType(ContentFieldType.ARRAY);
segmentsDef.setItemDefinition(segItemDef);
fields.put("Segments", segmentsDef);

ContentFieldSchema fieldSchema =
    new ContentFieldSchema();
fieldSchema.setName("video_schema");
fieldSchema.setDescription(
    "Schema for analyzing product demo videos");
fieldSchema.setFields(fields);

Map<String, String> models = new HashMap<>();
models.put("completion", "gpt-4.1");

ContentAnalyzer customAnalyzer =
    new ContentAnalyzer()
        .setBaseAnalyzerId("prebuilt-video")
        .setDescription(
            "Custom analyzer for product"
            + " demo videos")
        .setConfig(new ContentAnalyzerConfig()
            .setLocales(
                Arrays.asList("en-US", "fr-FR"))
            .setReturnDetails(true))
        .setFieldSchema(fieldSchema)
        .setModels(models);

SyncPoller<ContentAnalyzerOperationStatus,
    ContentAnalyzer> operation =
    client.beginCreateAnalyzer(
        analyzerId, customAnalyzer, true);

ContentAnalyzer result =
    operation.getFinalResult();
System.out.println(
    "Analyzer '" + analyzerId
    + "' created successfully!");

if (result.getDescription() != null) {
    System.out.println(
        "  Description: "
        + result.getDescription());
}

if (result.getFieldSchema() != null
    && result.getFieldSchema()
        .getFields() != null) {
    System.out.println(
        "  Fields ("
        + result.getFieldSchema()
            .getFields().size() + "):");
    result.getFieldSchema().getFields()
        .forEach((fieldName, fieldDef) -> {
            String method =
                fieldDef.getMethod() != null
                    ? fieldDef.getMethod()
                        .toString()
                    : "auto";
            String type =
                fieldDef.getType() != null
                    ? fieldDef.getType()
                        .toString()
                    : "unknown";
            System.out.println(
                "    - " + fieldName
                + ": " + type
                + " (" + method + ")");
        });
}

Example output:

Analyzer 'my_video_analyzer_ID' created successfully!
  Description: Custom analyzer for product demo videos
  Fields (1):
    - Segments: array (auto)

Tip

This code adapts the create analyzer sample pattern for video content.
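
The example output above reports a single field because the printing loop only visits top-level fields; the SegmentId, Description, and Sentiment properties sit inside the array's item definition. To show how a recursive walk would surface them, here is a sketch that uses only the Java standard library. `SchemaNode` is a hypothetical stand-in for a field definition, not an SDK type, and the `[items]` label is an arbitrary choice for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a field definition; this is not an SDK type. */
final class SchemaNode {
    final String type;
    final String method; // null when no method is set, printed as "auto"
    final Map<String, SchemaNode> properties = new LinkedHashMap<>();
    SchemaNode items; // element definition when type is "array"

    SchemaNode(String type, String method) {
        this.type = type;
        this.method = method;
    }

    /** Returns one indented line per field, descending into items and properties. */
    static List<String> describe(Map<String, SchemaNode> fields, int depth) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, SchemaNode> entry : fields.entrySet()) {
            append(lines, entry.getKey(), entry.getValue(), depth);
        }
        return lines;
    }

    private static void append(
            List<String> lines, String name, SchemaNode node, int depth) {
        StringBuilder indent = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            indent.append("  ");
        }
        String method = node.method != null ? node.method : "auto";
        lines.add(indent + "- " + name + ": "
            + node.type + " (" + method + ")");
        if (node.items != null) {
            append(lines, "[items]", node.items, depth + 1);
        }
        lines.addAll(describe(node.properties, depth + 1));
    }

    /** Mirrors the shape of the video schema defined above. */
    static Map<String, SchemaNode> videoSchema() {
        SchemaNode item = new SchemaNode("object", null);
        item.properties.put("SegmentId", new SchemaNode("string", null));
        item.properties.put(
            "Description", new SchemaNode("string", "generate"));
        item.properties.put(
            "Sentiment", new SchemaNode("string", "classify"));

        SchemaNode segments = new SchemaNode("array", null);
        segments.items = item;

        Map<String, SchemaNode> fields = new LinkedHashMap<>();
        fields.put("Segments", segments);
        return fields;
    }

    public static void main(String[] args) {
        for (String line : describe(videoSchema(), 0)) {
            System.out.println(line);
        }
    }
}
```

The same walk could be applied to the SDK's own field definitions, provided they expose the item definition and properties that the create code sets.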

Use the custom analyzer

After you create the analyzer, use it to analyze a document and extract the custom fields. Delete the analyzer when you no longer need it.

String documentUrl =
    "https://raw.githubusercontent.com/"
    + "Azure-Samples/"
    + "azure-ai-content-understanding-assets/"
    + "main/document/invoice.pdf";

AnalysisInput input = new AnalysisInput();
input.setUrl(documentUrl);

SyncPoller<ContentAnalyzerAnalyzeOperationStatus,
    AnalysisResult> analyzeOperation =
    client.beginAnalyze(
        analyzerId, Arrays.asList(input));

AnalysisResult analyzeResult =
    analyzeOperation.getFinalResult();

if (analyzeResult.getContents() != null
    && !analyzeResult.getContents().isEmpty()
    && analyzeResult.getContents().get(0)
        instanceof DocumentContent) {
    DocumentContent content =
        (DocumentContent) analyzeResult
            .getContents().get(0);

    ContentField companyField =
        content.getFields() != null
            ? content.getFields()
                .get("company_name") : null;
    if (companyField
        instanceof ContentStringField) {
        ContentStringField sf =
            (ContentStringField) companyField;
        System.out.println(
            "Company Name: " + sf.getValue());
        System.out.println(
            "  Confidence: "
            + companyField.getConfidence());
    }

    ContentField totalField =
        content.getFields() != null
            ? content.getFields()
                .get("total_amount") : null;
    if (totalField != null) {
        System.out.println(
            "Total Amount: "
            + totalField.getValue());
    }

    ContentField summaryField =
        content.getFields() != null
            ? content.getFields()
                .get("document_summary") : null;
    if (summaryField
        instanceof ContentStringField) {
        ContentStringField sf =
            (ContentStringField) summaryField;
        System.out.println(
            "Summary: " + sf.getValue());
    }

    ContentField typeField =
        content.getFields() != null
            ? content.getFields()
                .get("document_type") : null;
    if (typeField
        instanceof ContentStringField) {
        ContentStringField sf =
            (ContentStringField) typeField;
        System.out.println(
            "Document Type: " + sf.getValue());
    }
}

// --- Clean up ---
System.out.println(
    "\nCleaning up: deleting analyzer '"
    + analyzerId + "'...");
client.deleteAnalyzer(analyzerId);
System.out.println(
    "Analyzer '" + analyzerId
    + "' deleted successfully.");

The example output looks like this:

Company Name: CONTOSO LTD.
  Confidence: 0.781
Total Amount: 610.0
Summary: This document is an invoice from CONTOSO LTD. to Microsoft Corporation for consulting services, document fees, and printing fees, detailing service dates, itemized charges, taxes, and the total amount due.
Document Type: invoice

Cleaning up: deleting analyzer 'my_document_analyzer_ID'...
Analyzer 'my_document_analyzer_ID' deleted successfully.

Tip

See the Java SDK samples for more examples of running analyzers.

After you create the analyzer, use it to analyze an image and extract the custom fields. Delete the analyzer when you no longer need it.

String imageUrl =
    "https://raw.githubusercontent.com/"
    + "Azure-Samples/"
    + "azure-ai-content-understanding-assets/"
    + "main/image/pieChart.jpg";

AnalysisInput input = new AnalysisInput();
input.setUrl(imageUrl);

SyncPoller<ContentAnalyzerAnalyzeOperationStatus,
    AnalysisResult> analyzeOperation =
    client.beginAnalyze(
        analyzerId, Arrays.asList(input));

AnalysisResult analyzeResult =
    analyzeOperation.getFinalResult();

if (analyzeResult.getContents() != null
    && !analyzeResult.getContents().isEmpty()) {
    var content =
        analyzeResult.getContents().get(0);

    if (content.getFields() != null) {
        ContentField titleField =
            content.getFields().get("Title");
        if (titleField
            instanceof ContentStringField) {
            System.out.println(
                "Title: "
                + ((ContentStringField) titleField)
                    .getValue());
        }

        ContentField chartField =
            content.getFields().get("ChartType");
        if (chartField
            instanceof ContentStringField) {
            System.out.println(
                "Chart Type: "
                + ((ContentStringField) chartField)
                    .getValue());
        }
    }
}

// --- Clean up ---
System.out.println(
    "\nCleaning up: deleting analyzer '"
    + analyzerId + "'...");
client.deleteAnalyzer(analyzerId);
System.out.println(
    "Analyzer '" + analyzerId
    + "' deleted successfully.");

The example output looks like this:

Title: Weekly Working Hours Distribution
Chart Type: pie

Cleaning up: deleting analyzer 'my_image_analyzer_ID'...
Analyzer 'my_image_analyzer_ID' deleted successfully.

Tip

See the Java SDK samples for more examples of running analyzers.

After you create the analyzer, use it to analyze an audio file and extract the custom fields. Delete the analyzer when you no longer need it.

String audioUrl =
    "https://raw.githubusercontent.com/"
    + "Azure-Samples/"
    + "azure-ai-content-understanding-assets/"
    + "main/audio/callCenterRecording.mp3";

AnalysisInput input = new AnalysisInput();
input.setUrl(audioUrl);

SyncPoller<ContentAnalyzerAnalyzeOperationStatus,
    AnalysisResult> analyzeOperation =
    client.beginAnalyze(
        analyzerId, Arrays.asList(input));

AnalysisResult analyzeResult =
    analyzeOperation.getFinalResult();

if (analyzeResult.getContents() != null
    && !analyzeResult.getContents().isEmpty()) {
    var content =
        analyzeResult.getContents().get(0);

    if (content.getFields() != null) {
        ContentField summaryField =
            content.getFields().get("Summary");
        if (summaryField
            instanceof ContentStringField) {
            System.out.println(
                "Summary: "
                + ((ContentStringField)
                    summaryField)
                    .getValue());
        }

        ContentField sentField =
            content.getFields().get("Sentiment");
        if (sentField
            instanceof ContentStringField) {
            System.out.println(
                "Sentiment: "
                + ((ContentStringField) sentField)
                    .getValue());
        }
    }
}

// --- Clean up ---
System.out.println(
    "\nCleaning up: deleting analyzer '"
    + analyzerId + "'...");
client.deleteAnalyzer(analyzerId);
System.out.println(
    "Analyzer '" + analyzerId
    + "' deleted successfully.");

The example output looks like this:

Summary: Maria Smith contacted Contoso to inquire about her current point balance. John Doe, the customer service representative, confirmed her identity by requesting her date of birth and provided her point balance. The conversation ended politely with no further requests.
Sentiment: Positive

Cleaning up: deleting analyzer 'my_audio_analyzer_ID'...
Analyzer 'my_audio_analyzer_ID' deleted successfully.

Tip

See the Java SDK samples for more examples of running analyzers.

After you create the analyzer, use it to analyze a video and extract the custom fields. Delete the analyzer when you no longer need it.

String videoUrl =
    "https://raw.githubusercontent.com/"
    + "Azure-Samples/"
    + "azure-ai-content-understanding-assets/"
    + "main/videos/sdk_samples/"
    + "FlightSimulator.mp4";

AnalysisInput input = new AnalysisInput();
input.setUrl(videoUrl);

SyncPoller<ContentAnalyzerAnalyzeOperationStatus,
    AnalysisResult> analyzeOperation =
    client.beginAnalyze(
        analyzerId, Arrays.asList(input));

AnalysisResult analyzeResult =
    analyzeOperation.getFinalResult();

if (analyzeResult.getContents() != null
    && !analyzeResult.getContents().isEmpty()) {
    var content =
        analyzeResult.getContents().get(0);
    System.out.println(
        "Content kind: " + content.getKind());
    if (content.getFields() != null) {
        ContentField segmentsField =
            content.getFields().get("Segments");
        if (segmentsField
            instanceof ContentArrayField) {
            ContentArrayField segments =
                (ContentArrayField) segmentsField;
            System.out.println(
                "Segments (" + segments.size()
                + "):");
            for (int i = 0;
                i < segments.size(); i++) {
                ContentField seg = segments.get(i);
                if (seg instanceof
                    ContentObjectField) {
                    ContentObjectField obj =
                        (ContentObjectField) seg;
                    ContentField idField =
                        obj.getFieldOrDefault(
                            "SegmentId");
                    ContentField descField =
                        obj.getFieldOrDefault(
                            "Description");
                    ContentField sentField =
                        obj.getFieldOrDefault(
                            "Sentiment");
                    String segId = idField != null
                        ? String.valueOf(
                            idField.getValue())
                        : "N/A";
                    String desc = descField != null
                        ? String.valueOf(
                            descField.getValue())
                        : "N/A";
                    String sent = sentField != null
                        ? String.valueOf(
                            sentField.getValue())
                        : "N/A";
                    System.out.println(
                        "  Segment " + segId
                        + ": " + desc
                        + " (Sentiment: "
                        + sent + ")");
                }
            }
        }
    }
}

// --- Clean up ---
System.out.println(
    "\nCleaning up: deleting analyzer '"
    + analyzerId + "'...");
client.deleteAnalyzer(analyzerId);
System.out.println(
    "Analyzer '" + analyzerId
    + "' deleted successfully.");

The example output looks like this:

Content kind: audioVisual
Segments (16):
  Segment 00:00:00.000-00:00:01.467: The video opens with a scenic aerial view of an island, featuring a small aircraft flying above the landscape. The screen displays the logos for 'Flight Simulator' and 'Microsoft Azure AI,' indicating a collaboration or integration between the two products. (Sentiment: Positive)
  Segment 00:00:01.467-00:00:03.233: A man is shown in an interview setting, sitting in a modern office environment. The transcript begins discussing neural TTS (Text-to-Speech) and the importance of good data for achieving a high-quality voice. (Sentiment: Neutral)
  Segment 00:00:03.233-00:00:07.367: The visuals shift to a digital audio waveform, emphasizing the technical aspect of TTS. The transcript explains that a universal TTS model was built using 3,000 hours of data, highlighting the scale and quality of the dataset. (Sentiment: Positive)
  Segment 00:00:07.367-00:00:08.200: Another man appears in an interview setting, continuing the discussion about the accumulation of data for the universal TTS model. The transcript notes that the model captures audio nuances for more natural voice generation. (Sentiment: Positive)
  Segment 00:00:08.200-00:00:11.367: The video transitions to an outdoor scene showing a large facility, likely a data center, set in a rural landscape. This visually supports the scale of infrastructure required for the TTS model. (Sentiment: Neutral)
  Segment 00:00:11.367-00:00:13.567: Inside a data center, rows of servers are shown, reinforcing the technological backbone of the TTS system. The transcript continues to emphasize the accumulation of data and the model's capabilities. (Sentiment: Neutral)
  Segment 00:00:13.567-00:00:16.100: The interview returns to the first man, who elaborates on the universal model's ability to generate natural voices. The transcript mentions the model's ability to capture nuances, supporting the visuals. (Sentiment: Positive)
  Segment 00:00:16.100-00:00:19.433: A biplane is seen flying over a coastal landscape, visually connecting the Flight Simulator experience to the advanced AI voice technology discussed earlier. (Sentiment: Positive)
  Segment 00:00:19.433-00:00:23.967: A scenic view of a castle with a plane flying overhead, further showcasing the immersive environments possible in Flight Simulator. The transcript highlights the naturalness of the generated voices. (Sentiment: Positive)
  Segment 00:00:23.967-00:00:30.033: A bald man is interviewed in a modern office setting. The transcript discusses the high fidelity of cognitive services offerings, noting that the voices sound much more like actual human voices. (Sentiment: Positive)
  Segment 00:00:30.033-00:00:33.200: The interview with the bald man continues, reinforcing the message about the realism and fidelity of the AI-generated voices. (Sentiment: Positive)
  Segment 00:00:33.200-00:00:35.267: The video shows an overhead view of an airplane on the tarmac, possibly preparing for pushback. The transcript transitions to a simulated ATC (Air Traffic Control) exchange, demonstrating the practical application of TTS in Flight Simulator. (Sentiment: Neutral)
  Segment 00:00:35.267-00:00:37.700: A ground crew member directs an Airbus aircraft, visually representing the realism and immersion of Flight Simulator. The transcript includes ATC communication, showing the integration of natural-sounding AI voices. (Sentiment: Positive)
  Segment 00:00:37.700-00:00:39.200: Ground crew members are seen walking on the tarmac near aircraft, continuing the realistic airport environment. The transcript features further ATC communication. (Sentiment: Neutral)
  Segment 00:00:39.200-00:00:42.033: A close-up of an Airbus aircraft at the gate, reinforcing the realism and detail in Flight Simulator. The transcript continues with ATC exchanges, demonstrating the natural voice output. (Sentiment: Positive)
  Segment 00:00:42.033-00:00:43.866: The video ends with the Microsoft logo and branding, signifying the conclusion of the demo and reinforcing the partnership between Flight Simulator and Microsoft Azure AI. (Sentiment: Positive)

Cleaning up: deleting analyzer 'my_video_analyzer_ID'...
Analyzer 'my_video_analyzer_ID' deleted successfully.

Tip

See the Java SDK samples for more examples of running analyzers.

Client library | Samples | SDK source

This guide shows how to use the Content Understanding JavaScript SDK to create a custom analyzer that extracts structured data from your content. Custom analyzers support document, image, audio, and video content types.

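Prerequisites

An active Azure subscription. If you don't have an Azure account, create a free account.
A Microsoft Foundry resource created in a supported region.
The resource endpoint and API key (found under Keys and Endpoint in the Azure portal).
Model deployment defaults configured for the resource. For setup instructions, see Models and deployments or this one-time configuration script.
An LTS version of Node.js.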

Set up

Create a new Node.js project.

mkdir custom-analyzer-tutorial
cd custom-analyzer-tutorial
npm init -y

Install the Content Understanding client library.
```
npm install @azure/ai-content-understanding
```
Optionally, install the Azure Identity library for Microsoft Entra authentication.
```
npm install @azure/identity
```

Set environment variables

To authenticate with the Content Understanding service, set the following environment variables to your own values before you run the samples.

CONTENTUNDERSTANDING_ENDPOINT - The endpoint of your Content Understanding resource.
CONTENTUNDERSTANDING_KEY - Your Content Understanding API key. You can omit it if you use DefaultAzureCredential with Microsoft Entra ID.

Windows

setx CONTENTUNDERSTANDING_ENDPOINT "your-endpoint"
setx CONTENTUNDERSTANDING_KEY "your-key"

Linux/macOS

export CONTENTUNDERSTANDING_ENDPOINT="your-endpoint"
export CONTENTUNDERSTANDING_KEY="your-key"

Create the client

const { AzureKeyCredential } =
    require("@azure/core-auth");
const {
    ContentUnderstandingClient,
} = require("@azure/ai-content-understanding");

const endpoint =
    process.env["CONTENTUNDERSTANDING_ENDPOINT"];
const key =
    process.env["CONTENTUNDERSTANDING_KEY"];

const client = new ContentUnderstandingClient(
    endpoint,
    new AzureKeyCredential(key)
);
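
The client setup above assumes that the environment variables are already set. As a minimal sketch, a small guard can fail fast with a readable message when the endpoint is missing. The helper name `readContentUnderstandingConfig` is hypothetical and not part of the SDK; it only reads the two variables named in this guide, and the endpoint value in the example is a placeholder.

```javascript
// Hypothetical helper (not part of the SDK): reads the two environment
// variables used in this guide and fails fast when the endpoint is missing.
function readContentUnderstandingConfig(env = process.env) {
    const endpoint =
        (env["CONTENTUNDERSTANDING_ENDPOINT"] ?? "").trim();
    const key =
        (env["CONTENTUNDERSTANDING_KEY"] ?? "").trim();

    if (!endpoint) {
        throw new Error(
            "CONTENTUNDERSTANDING_ENDPOINT is not set."
        );
    }

    // This guide treats the key as optional when Microsoft Entra ID is
    // used, so its absence is reported instead of raising an error.
    return {
        endpoint,
        key: key || undefined,
        hasKey: key.length > 0,
    };
}

// Example with an explicit object instead of the real process.env.
const config = readContentUnderstandingConfig({
    CONTENTUNDERSTANDING_ENDPOINT: "https://example.invalid/",
    CONTENTUNDERSTANDING_KEY: "",
});
console.log(config.endpoint, config.hasKey);
```

When `hasKey` is false, the calling code could switch to a Microsoft Entra credential instead of `AzureKeyCredential`.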

Create a custom analyzer

const analyzerId =
    `my_document_analyzer_${Math.floor(
        Date.now() / 1000
    )}`;

const analyzer = {
    baseAnalyzerId: "prebuilt-document",
    description:
        "Custom analyzer for extracting"
        + " company information",
    config: {
        enableFormula: true,
        enableLayout: true,
        enableOcr: true,
        estimateFieldSourceAndConfidence: true,
        returnDetails: true,
    },
    fieldSchema: {
        name: "company_schema",
        description:
            "Schema for extracting company"
            + " information",
        fields: {
            company_name: {
                type: "string",
                method: "extract",
                description:
                    "Name of the company",
            },
            total_amount: {
                type: "number",
                method: "extract",
                description:
                    "Total amount on the"
                    + " document",
            },
            document_summary: {
                type: "string",
                method: "generate",
                description:
                    "A brief summary of the"
                    + " document content",
            },
            document_type: {
                type: "string",
                method: "classify",
                description: "Type of document",
                enum: [
                    "invoice", "receipt",
                    "contract", "report", "other",
                ],
            },
        },
    },
    models: {
        completion: "gpt-4.1",
        // Required when using fieldSchema with the
        // prebuilt-document base analyzer
        embedding: "text-embedding-3-large",
    },
};

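// Note: `await` is only valid inside an async function or an
// ES module. In a CommonJS script that uses require(), wrap
// the statements that use `await` in an async function.
const poller = client.createAnalyzer(
    analyzerId, analyzer
);
await poller.pollUntilDone();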

const result = await client.getAnalyzer(
    analyzerId
);
console.log(
    `Analyzer '${analyzerId}' created`
    + ` successfully!`
);

if (result.description) {
    console.log(
        `  Description: ${result.description}`
    );
}

if (result.fieldSchema?.fields) {
    const fields = result.fieldSchema.fields;
    console.log(
        `  Fields`
        + ` (${Object.keys(fields).length}):`
    );
    for (const [name, fieldDef]
        of Object.entries(fields)) {
        const method =
            fieldDef.method ?? "auto";
        const fieldType =
            fieldDef.type ?? "unknown";
        console.log(
            `    - ${name}: `
            + `${fieldType} (${method})`
        );
    }
}

The example output looks like this:

Analyzer 'my_document_analyzer_ID' created successfully!
  Description: Custom analyzer for extracting company information
  Fields (4):
    - company_name: string (extract)
    - total_amount: number (extract)
    - document_summary: string (generate)
    - document_type: string (classify)

Tip

This code is based on the create analyzer sample in the SDK repository.
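
In the schema above, the `classify` field (`document_type`) carries an `enum` list of allowed labels. A minimal local check can catch a missing list before the create request is sent, assuming that every `classify` field in your schemas should list its labels the way the examples in this guide do. `findClassifyFieldsWithoutEnum` is a hypothetical helper, not an SDK function, and it doesn't replace the service's own validation.

```javascript
// Hypothetical helper: returns the names of fields that use
// method "classify" but don't provide a non-empty enum list.
function findClassifyFieldsWithoutEnum(fieldSchema) {
    const fields = fieldSchema?.fields ?? {};
    return Object.entries(fields)
        .filter(([, def]) => def.method === "classify")
        .filter(
            ([, def]) =>
                !Array.isArray(def.enum)
                || def.enum.length === 0
        )
        .map(([name]) => name);
}

// Illustrative schema; "priority" is a made-up field that
// deliberately omits its enum list.
const sampleSchema = {
    fields: {
        company_name: { type: "string", method: "extract" },
        document_type: {
            type: "string",
            method: "classify",
            enum: ["invoice", "receipt"],
        },
        priority: { type: "string", method: "classify" },
    },
};

console.log(findClassifyFieldsWithoutEnum(sampleSchema));
```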

const classifierId =
    `my_classifier_${Math.floor(
        Date.now() / 1000
    )}`;

console.log(
    `Creating classifier '${classifierId}'...`
);

const classifierAnalyzer = {
    baseAnalyzerId: "prebuilt-document",
    description:
        "Custom classifier for financial"
        + " document categorization",
    config: {
        returnDetails: true,
        enableSegment: true,
        contentCategories: {
            Loan_Application: {
                description:
                    "Documents submitted by"
                    + " individuals or"
                    + " businesses to request"
                    + " funding, typically"
                    + " including personal or"
                    + " business details,"
                    + " financial history,"
                    + " loan amount, purpose,"
                    + " and supporting"
                    + " documentation.",
            },
            Invoice: {
                description:
                    "Billing documents issued"
                    + " by sellers or service"
                    + " providers to request"
                    + " payment for goods or"
                    + " services, detailing"
                    + " items, prices, taxes,"
                    + " totals, and payment"
                    + " terms.",
            },
            Bank_Statement: {
                description:
                    "Official statements"
                    + " issued by banks that"
                    + " summarize account"
                    + " activity over a"
                    + " period, including"
                    + " deposits, withdrawals,"
                    + " fees, and balances.",
            },
        },
    },
    models: {
        completion: "gpt-4.1",
    },
};

const classifierPoller =
    client.createAnalyzer(
        classifierId, classifierAnalyzer
    );
await classifierPoller.pollUntilDone();

const classifierResult =
    await client.getAnalyzer(classifierId);

console.log(
    `Classifier '${classifierId}' created`
    + ` successfully!`
);

if (classifierResult.description) {
    console.log(
        `  Description: `
        + `${classifierResult.description}`
    );
}

Tip

This code is based on the create classifier sample for classification workflows.
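
The `contentCategories` object above maps each category name to an object with a `description`. As a sketch, the same shape can be built from a plain list of name and description pairs, which is convenient when the categories live in a separate configuration file. `buildContentCategories` is a hypothetical helper; only the output shape is taken from the classifier example, and the descriptions below are shortened for illustration.

```javascript
// Hypothetical helper: turns [{ name, description }, ...] into the
// { Name: { description } } shape used by contentCategories above.
function buildContentCategories(categories) {
    const result = {};
    for (const { name, description } of categories) {
        if (!name || !description) {
            throw new Error(
                "Each category needs a name and a description."
            );
        }
        if (Object.prototype.hasOwnProperty.call(result, name)) {
            throw new Error(`Duplicate category: ${name}`);
        }
        result[name] = { description };
    }
    return result;
}

const contentCategories = buildContentCategories([
    { name: "Invoice", description: "Billing documents." },
    { name: "Bank_Statement", description: "Account activity." },
]);

console.log(Object.keys(contentCategories));
```

Rejecting duplicates up front avoids one description silently overwriting another when the list is assembled from several sources.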

The following example creates a custom image analyzer, based on the prebuilt image analyzer, to process charts and graphs.

const analyzerId =
    `my_image_analyzer_${Math.floor(
        Date.now() / 1000
    )}`;

const analyzer = {
    baseAnalyzerId: "prebuilt-image",
    description:
        "Custom analyzer for charts and graphs",
    fieldSchema: {
        name: "chart_schema",
        description:
            "Schema for extracting chart"
            + " information",
        fields: {
            Title: {
                type: "string",
                description:
                    "Title of the chart",
            },
            ChartType: {
                type: "string",
                method: "classify",
                description: "Type of chart",
                enum: ["bar", "line", "pie"],
            },
        },
    },
    models: {
        completion: "gpt-4.1",
    },
};

const poller = client.createAnalyzer(
    analyzerId, analyzer
);
await poller.pollUntilDone();

const result = await client.getAnalyzer(
    analyzerId
);
console.log(
    `Analyzer '${analyzerId}' created`
    + ` successfully!`
);

if (result.description) {
    console.log(
        `  Description: ${result.description}`
    );
}

if (result.fieldSchema?.fields) {
    const fields = result.fieldSchema.fields;
    console.log(
        `  Fields`
        + ` (${Object.keys(fields).length}):`
    );
    for (const [name, fieldDef]
        of Object.entries(fields)) {
        const method =
            fieldDef.method ?? "auto";
        const fieldType =
            fieldDef.type ?? "unknown";
        console.log(
            `    - ${name}: `
            + `${fieldType} (${method})`
        );
    }
}

The example output looks like this:

Analyzer 'my_image_analyzer_ID' created successfully!
  Description: Custom analyzer for charts and graphs
  Fields (2):
    - Title: string (auto)
    - ChartType: string (classify)

Tip

This code adapts the create analyzer sample pattern for image content.

The following example creates a custom audio analyzer, based on the prebuilt audio analyzer, to process customer support call recordings.

const analyzerId =
    `my_audio_analyzer_${Math.floor(
        Date.now() / 1000
    )}`;

const analyzer = {
    baseAnalyzerId: "prebuilt-audio",
    description:
        "Custom analyzer for customer"
        + " support calls",
    config: {
        locales: ["en-US", "fr-FR"],
        returnDetails: true,
    },
    fieldSchema: {
        name: "call_center_schema",
        description:
            "Schema for analyzing customer"
            + " support calls",
        fields: {
            Summary: {
                type: "string",
                method: "generate",
                description:
                    "Summary of the call",
            },
            Sentiment: {
                type: "string",
                method: "classify",
                description:
                    "Overall sentiment of"
                    + " the call",
                enum: [
                    "Positive", "Neutral",
                    "Negative",
                ],
            },
            People: {
                type: "array",
                description:
                    "List of people mentioned",
                itemDefinition: {
                    type: "object",
                    properties: {
                        Name: {
                            type: "string",
                        },
                        Role: {
                            type: "string",
                        },
                    },
                },
            },
        },
    },
    models: {
        completion: "gpt-4.1",
    },
};

const poller = client.createAnalyzer(
    analyzerId, analyzer
);
await poller.pollUntilDone();

const result = await client.getAnalyzer(
    analyzerId
);
console.log(
    `Analyzer '${analyzerId}' created`
    + ` successfully!`
);

if (result.description) {
    console.log(
        `  Description: ${result.description}`
    );
}

if (result.fieldSchema?.fields) {
    const fields = result.fieldSchema.fields;
    console.log(
        `  Fields`
        + ` (${Object.keys(fields).length}):`
    );
    for (const [name, fieldDef]
        of Object.entries(fields)) {
        const method =
            fieldDef.method ?? "auto";
        const fieldType =
            fieldDef.type ?? "unknown";
        console.log(
            `    - ${name}: `
            + `${fieldType} (${method})`
        );
    }
}

The example output looks like this:

Analyzer 'my_audio_analyzer_ID' created successfully!
  Description: Custom analyzer for customer support calls
  Fields (3):
    - Summary: string (generate)
    - Sentiment: string (classify)
    - People: array (auto)

Tip

This code adapts the create analyzer sample pattern for audio content.
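
The output above lists `People` only as `array (auto)`, so the nested `Name` and `Role` properties aren't visible. A minimal sketch of a recursive listing follows, assuming nested definitions sit under `itemDefinition.properties` for arrays and under `properties` for objects, as in the schema above. `describeFields` is a hypothetical helper, and the `"auto"` and `"unknown"` fallbacks mirror the ones used in the printing loop in this guide.

```javascript
// Hypothetical helper: returns one line per field, indenting the
// nested properties of array and object fields.
function describeFields(fields, indent = "") {
    const lines = [];
    for (const [name, def] of Object.entries(fields ?? {})) {
        const type = def.type ?? "unknown";
        const method = def.method ?? "auto";
        lines.push(`${indent}- ${name}: ${type} (${method})`);

        // Arrays describe their elements in itemDefinition;
        // objects list their members in properties.
        const nested = def.type === "array"
            ? def.itemDefinition?.properties
            : def.properties;
        if (nested) {
            lines.push(
                ...describeFields(nested, indent + "  ")
            );
        }
    }
    return lines;
}

const sampleFieldDefs = {
    Sentiment: { type: "string", method: "classify" },
    People: {
        type: "array",
        itemDefinition: {
            type: "object",
            properties: {
                Name: { type: "string" },
                Role: { type: "string" },
            },
        },
    },
};

const described = describeFields(sampleFieldDefs);
console.log(described.join("\n"));
```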

The following example creates a custom video analyzer, based on the prebuilt video analyzer, to process product demos and reviews.

const analyzerId =
    `my_video_analyzer_${Math.floor(
        Date.now() / 1000
    )}`;

const analyzer = {
    baseAnalyzerId: "prebuilt-video",
    description:
        "Custom analyzer for product"
        + " demo videos",
    config: {
        locales: ["en-US", "fr-FR"],
        returnDetails: true,
    },
    fieldSchema: {
        name: "video_schema",
        description:
            "Schema for analyzing product"
            + " demo videos",
        fields: {
            Segments: {
                type: "array",
                itemDefinition: {
                    type: "object",
                    properties: {
                        SegmentId: {
                            type: "string",
                        },
                        Description: {
                            type: "string",
                            method: "generate",
                            description:
                                "Detailed summary"
                                + " of the video"
                                + " segment",
                        },
                        Sentiment: {
                            type: "string",
                            method: "classify",
                            enum: [
                                "Positive",
                                "Neutral",
                                "Negative",
                            ],
                        },
                    },
                },
            },
        },
    },
    models: {
        completion: "gpt-4.1",
    },
};

const poller = client.createAnalyzer(
    analyzerId, analyzer
);
await poller.pollUntilDone();

const result = await client.getAnalyzer(
    analyzerId
);
console.log(
    `Analyzer '${analyzerId}' created`
    + ` successfully!`
);

if (result.description) {
    console.log(
        `  Description: ${result.description}`
    );
}

if (result.fieldSchema?.fields) {
    const fields = result.fieldSchema.fields;
    console.log(
        `  Fields`
        + ` (${Object.keys(fields).length}):`
    );
    for (const [name, fieldDef]
        of Object.entries(fields)) {
        const method =
            fieldDef.method ?? "auto";
        const fieldType =
            fieldDef.type ?? "unknown";
        console.log(
            `    - ${name}: `
            + `${fieldType} (${method})`
        );
    }
}

The example output looks like this:

Analyzer 'my_video_analyzer_ID' created successfully!
  Description: Custom analyzer for product demo videos
  Fields (1):
    - Segments: array (auto)

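Tip

This code adapts the create analyzer sample pattern for video content.

Use the custom analyzer

After you create the analyzer, use it to analyze a document and extract the custom fields. Delete the analyzer when you no longer need it.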

const documentUrl =
    "https://raw.githubusercontent.com/"
    + "Azure-Samples/"
    + "azure-ai-content-understanding-assets/"
    + "main/document/invoice.pdf";

const analyzePoller = client.analyze(
    analyzerId, [{ url: documentUrl }]
);
const analyzeResult =
    await analyzePoller.pollUntilDone();

if (analyzeResult.contents
    && analyzeResult.contents.length > 0) {
    const content = analyzeResult.contents[0];
    if (content.fields) {
        const company =
            content.fields["company_name"];
        if (company) {
            console.log(
                `Company Name: `
                + `${company.value}`
            );
            console.log(
                `  Confidence: `
                + `${company.confidence}`
            );
        }

        const total =
            content.fields["total_amount"];
        if (total) {
            console.log(
                `Total Amount: `
                + `${total.value}`
            );
        }

        const summary =
            content.fields["document_summary"];
        if (summary) {
            console.log(
                `Summary: ${summary.value}`
            );
        }

        const docType =
            content.fields["document_type"];
        if (docType) {
            console.log(
                `Document Type: `
                + `${docType.value}`
            );
        }
    }
}

// --- Clean up ---
console.log(
    `\nCleaning up: deleting analyzer`
    + ` '${analyzerId}'...`
);
await client.deleteAnalyzer(analyzerId);
console.log(
    `Analyzer '${analyzerId}' deleted`
    + ` successfully.`
);

The example output looks like this:

Company Name: CONTOSO LTD.
  Confidence: 0.739
Total Amount: 610
Summary: This document is an invoice from CONTOSO LTD. to Microsoft Corporation for consulting, document, and printing services provided during the service period. It details line items, subtotal, sales tax, total, previous unpaid balance, and the final amount due.
Document Type: invoice

Cleaning up: deleting analyzer 'my_document_analyzer_ID'...
Analyzer 'my_document_analyzer_ID' deleted successfully.

Tip

See the JavaScript SDK samples for more examples of running analyzers.
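
The document output above reports a confidence score for `company_name`. As a sketch, fields whose score falls below a chosen threshold can be collected for manual review. This assumes that each field is an object with `value` and an optional numeric `confidence`, as accessed in the code above. `findLowConfidenceFields` is a hypothetical helper, the 0.8 threshold is an arbitrary illustration, and the sample confidence for `total_amount` is made up.

```javascript
// Hypothetical helper: lists fields whose confidence is a number
// below the threshold. Fields without a confidence are skipped.
function findLowConfidenceFields(fields, threshold = 0.8) {
    return Object.entries(fields ?? {})
        .filter(
            ([, field]) =>
                typeof field?.confidence === "number"
                && field.confidence < threshold
        )
        .map(([name, field]) => ({
            name,
            value: field.value,
            confidence: field.confidence,
        }));
}

// Illustrative data shaped like content.fields in the code above.
const sampleDocumentFields = {
    company_name: { value: "CONTOSO LTD.", confidence: 0.739 },
    total_amount: { value: 610, confidence: 0.95 },
    document_summary: { value: "An invoice." },
};

const needsReview =
    findLowConfidenceFields(sampleDocumentFields);
console.log(needsReview);
```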

After you create the analyzer, use it to analyze an image and extract the custom fields. Delete the analyzer when you no longer need it.

const imageUrl =
    "https://raw.githubusercontent.com/"
    + "Azure-Samples/"
    + "azure-ai-content-understanding-assets/"
    + "main/image/pieChart.jpg";

const analyzePoller = client.analyze(
    analyzerId, [{ url: imageUrl }]
);
const analyzeResult =
    await analyzePoller.pollUntilDone();

if (analyzeResult.contents
    && analyzeResult.contents.length > 0) {
    const content = analyzeResult.contents[0];
    if (content.fields) {
        const title =
            content.fields["Title"];
        if (title) {
            console.log(
                `Title: ${title.value}`
            );
        }

        const chartType =
            content.fields["ChartType"];
        if (chartType) {
            console.log(
                `Chart Type: `
                + `${chartType.value}`
            );
        }
    }
}

// --- Clean up ---
console.log(
    `\nCleaning up: deleting analyzer`
    + ` '${analyzerId}'...`
);
await client.deleteAnalyzer(analyzerId);
console.log(
    `Analyzer '${analyzerId}' deleted`
    + ` successfully.`
);

The example output looks like this:

Title: Distribution of Weekly Working Hours
Chart Type: pie

Cleaning up: deleting analyzer 'my_image_analyzer_ID'...
Analyzer 'my_image_analyzer_ID' deleted successfully.

Tip

See the JavaScript SDK samples for more examples of running analyzers.
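
In the samples in this guide, the cleanup step runs only if the analysis completes without throwing. One possible pattern, sketched here, is to wrap the work in `try`/`finally` so that the analyzer is deleted either way. `withAnalyzerCleanup` is a hypothetical helper, and `fakeClient` is a stand-in object used only so the sketch runs without a service call; the only method assumed is `deleteAnalyzer(analyzerId)`, as used in this guide.

```javascript
// Hypothetical helper: runs `work` and always deletes the analyzer
// afterward, whether `work` succeeds or throws.
async function withAnalyzerCleanup(client, analyzerId, work) {
    try {
        return await work();
    } finally {
        await client.deleteAnalyzer(analyzerId);
    }
}

// Stand-in client for illustration; it records deleted IDs
// instead of calling the service.
const fakeClient = {
    deleted: [],
    async deleteAnalyzer(id) {
        this.deleted.push(id);
    },
};

async function demo() {
    try {
        await withAnalyzerCleanup(
            fakeClient,
            "my_image_analyzer_ID",
            async () => {
                throw new Error("analysis failed");
            }
        );
    } catch (err) {
        console.log(`Caught: ${err.message}`);
    }
    console.log(fakeClient.deleted);
}

const demoDone = demo();
```

If the delete call itself throws inside `finally`, that error replaces the original one, so production code might log the delete failure instead of letting it propagate.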

After you create the analyzer, use it to analyze an audio file and extract the custom fields. Delete the analyzer when you no longer need it.

const audioUrl =
    "https://raw.githubusercontent.com/"
    + "Azure-Samples/"
    + "azure-ai-content-understanding-assets/"
    + "main/audio/callCenterRecording.mp3";

const analyzePoller = client.analyze(
    analyzerId, [{ url: audioUrl }]
);
const analyzeResult =
    await analyzePoller.pollUntilDone();

if (analyzeResult.contents
    && analyzeResult.contents.length > 0) {
    const content = analyzeResult.contents[0];
    if (content.fields) {
        const summary =
            content.fields["Summary"];
        if (summary) {
            console.log(
                `Summary: ${summary.value}`
            );
        }

        const sentiment =
            content.fields["Sentiment"];
        if (sentiment) {
            console.log(
                `Sentiment: `
                + `${sentiment.value}`
            );
        }
    }
}

// --- Clean up ---
console.log(
    `\nCleaning up: deleting analyzer`
    + ` '${analyzerId}'...`
);
await client.deleteAnalyzer(analyzerId);
console.log(
    `Analyzer '${analyzerId}' deleted`
    + ` successfully.`
);

The example output looks like this:

Summary: Maria Smith contacted Contoso to inquire about her current point balance. John Doe, the representative, verified her identity by requesting her date of birth and then provided her with her point balance of 599 points. Maria confirmed she needed no further information and ended the call.
Sentiment: Positive

Cleaning up: deleting analyzer 'my_audio_analyzer_ID'...
Analyzer 'my_audio_analyzer_ID' deleted successfully.

Tip

See the JavaScript SDK samples for more examples of running analyzers.
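
The audio schema in this guide also defines a `People` array, which the code above doesn't print. The sketch below flattens it, assuming that array items follow the same `value` nesting that the video example in this guide uses for `Segments`: each item exposes its properties under `value`, and each property has its own `value`. `extractPeople` is a hypothetical helper, and the sample data is illustrative rather than real service output.

```javascript
// Hypothetical helper: flattens the People array field into plain
// { name, role } objects, using "N/A" for missing properties.
function extractPeople(fields) {
    const items = fields?.People?.value ?? [];
    return items.map((item) => ({
        name: item.value?.Name?.value ?? "N/A",
        role: item.value?.Role?.value ?? "N/A",
    }));
}

// Illustrative data shaped like content.fields; the second entry
// deliberately omits Role to show the fallback.
const sampleAudioFields = {
    People: {
        value: [
            {
                value: {
                    Name: { value: "Maria Smith" },
                    Role: { value: "Customer" },
                },
            },
            { value: { Name: { value: "John Doe" } } },
        ],
    },
};

const people = extractPeople(sampleAudioFields);
for (const person of people) {
    console.log(`${person.name} (${person.role})`);
}
```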

After you create the analyzer, use it to analyze a video and extract the custom fields. Delete the analyzer when you no longer need it.

const videoUrl =
    "https://raw.githubusercontent.com/"
    + "Azure-Samples/"
    + "azure-ai-content-understanding-assets/"
    + "main/videos/sdk_samples/"
    + "FlightSimulator.mp4";

const analyzePoller = client.analyze(
    analyzerId, [{ url: videoUrl }]
);
const analyzeResult =
    await analyzePoller.pollUntilDone();

if (analyzeResult.contents
    && analyzeResult.contents.length > 0) {
    const content = analyzeResult.contents[0];
    console.log(
        `Content kind: ${content.kind}`
    );
    if (content.fields) {
        const segments =
            content.fields["Segments"];
        if (segments && segments.value) {
            console.log(
                `Segments`
                + ` (${segments.value.length}):`
            );
            for (const segment
                of segments.value) {
                const segId =
                    segment.value
                        ?.SegmentId?.value
                    ?? "N/A";
                const desc =
                    segment.value
                        ?.Description?.value
                    ?? "N/A";
                const sent =
                    segment.value
                        ?.Sentiment?.value
                    ?? "N/A";
                console.log(
                    `  Segment: ${segId}`
                );
                console.log(
                    `    Description:`
                    + ` ${desc}`
                );
                console.log(
                    `    Sentiment:`
                    + ` ${sent}`
                );
            }
        }
    }
}

// --- Clean up ---
console.log(
    `\nCleaning up: deleting analyzer`
    + ` '${analyzerId}'...`
);
await client.deleteAnalyzer(analyzerId);
console.log(
    `Analyzer '${analyzerId}' deleted`
    + ` successfully.`
);

The example output looks like this:

Content kind: audioVisual
Segments (16):
  Segment: 00:00:00.000-00:00:01.467
    Description: The video opens with a scenic aerial view of an island surrounded by turquoise water. A small airplane is flying over the landscape. The screen displays the logos for 'Flight Simulator' and 'Microsoft Azure AI', indicating a collaboration or integration between the two.
    Sentiment: Positive
  Segment: 00:00:01.467-00:00:03.233
    Description: A man is shown sitting in a modern office setting, likely preparing to speak or introduce the topic. The background features geometric wall lights and a plant, giving a professional and contemporary atmosphere.
    Sentiment: Neutral
  Segment: 00:00:03.233-00:00:07.367
    Description: The screen displays a digital audio waveform, suggesting a focus on audio technology. The accompanying transcript discusses the importance of good data for neural TTS (Text-to-Speech) to achieve a high-quality voice.
    Sentiment: Neutral
  Segment: 00:00:07.367-00:00:08.200
    Description: Another man is shown in a similar office environment, possibly continuing the explanation or providing additional information about the product.
    Sentiment: Neutral
  Segment: 00:00:08.200-00:00:11.367
    Description: The video transitions to an outdoor scene showing a large facility with multiple buildings, set in a rural landscape. This likely represents the data centers or infrastructure supporting the technology.
    Sentiment: Neutral
  Segment: 00:00:11.367-00:00:13.567
    Description: The camera moves inside a data center, showing rows of servers and high-tech equipment. This emphasizes the scale and capability of the infrastructure used for the TTS model.
    Sentiment: Neutral
  Segment: 00:00:13.567-00:00:16.100
    Description: The man from earlier is shown again in the office, likely elaborating on the accumulation of data and the universal TTS model, as mentioned in the transcript.
    Sentiment: Neutral
  Segment: 00:00:16.100-00:00:19.433
    Description: A biplane is seen flying over a coastal city with clear blue water and lush green hills, highlighting the realism and immersive visuals of the Flight Simulator.
    Sentiment: Positive
  Segment: 00:00:19.433-00:00:23.967
    Description: The video shows a castle surrounded by mountains and clouds, with a small aircraft flying nearby. This further showcases the detailed environments possible in the Flight Simulator.
    Sentiment: Positive
  Segment: 00:00:23.967-00:00:30.033
    Description: A bald man is interviewed in a modern office setting. The transcript discusses the high fidelity and naturalness of voices generated by cognitive services, suggesting he is explaining the benefits of the technology.
    Sentiment: Positive
  Segment: 00:00:30.033-00:00:33.200
    Description: The bald man continues speaking, possibly providing more details about the product's capabilities and its impact on user experience.
    Sentiment: Positive
  Segment: 00:00:33.200-00:00:35.267
    Description: The video shifts to an overhead view of an airplane on the runway, preparing for movement. This scene likely relates to the realism of the simulator and the integration of AI-driven voice technology.
    Sentiment: Neutral
  Segment: 00:00:35.267-00:00:37.700
    Description: A ground crew member directs an Airbus aircraft, with pilots visible in the cockpit. This scene emphasizes the operational realism and communication aspects in the simulator.
    Sentiment: Neutral
  Segment: 00:00:37.700-00:00:39.200
    Description: Two ground crew members walk near an aircraft on the tarmac, with airport buildings and other planes in the background. The environment is realistic and detailed.
    Sentiment: Neutral
  Segment: 00:00:39.200-00:00:42.033
    Description: A close-up of an Airbus aircraft at the gate, with sunlight illuminating the scene. This further highlights the visual fidelity and immersive experience of the simulator.
    Sentiment: Positive
  Segment: 00:00:42.033-00:00:43.866
    Description: The video ends with the Microsoft logo and branding, signaling the conclusion of the product demo and reinforcing the partnership between Flight Simulator and Microsoft Azure AI.
    Sentiment: Positive

Cleaning up: deleting analyzer 'my_video_analyzer_ID'...
Analyzer 'my_video_analyzer_ID' deleted successfully.

Tip

For more examples of running analyzers, see the JavaScript SDK samples.
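
The per-segment sentiment printed above can also be summarized as counts. The following is a minimal sketch, written in TypeScript to match the SDK sections later in this article; `tallySentiment` is a hypothetical helper, not part of the SDK, and it assumes you have already collected the sentiment labels into a plain string array.

```typescript
// Hypothetical helper (not part of the SDK): counts how often each
// sentiment label from the Segments field occurs.
type SentimentLabel = "Positive" | "Neutral" | "Negative";

function tallySentiment(labels: string[]): Record<SentimentLabel, number> {
    const counts: Record<SentimentLabel, number> = {
        Positive: 0,
        Neutral: 0,
        Negative: 0,
    };
    for (const label of labels) {
        // Labels outside the schema's enum (for example "N/A") are ignored.
        if (
            label === "Positive"
            || label === "Neutral"
            || label === "Negative"
        ) {
            counts[label] += 1;
        }
    }
    return counts;
}

// Labels of the first five segments in the example output above.
const firstFive = [
    "Positive", "Neutral", "Neutral", "Neutral", "Neutral",
];
const demoCounts = tallySentiment(firstFive);
console.log(demoCounts); // Positive: 1, Neutral: 4, Negative: 0
```

In the analysis loop, the labels could be gathered with the same expression the sample already uses for each segment, `segment.value?.Sentiment?.value ?? "N/A"`.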

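Client library | Samples | SDK source

This guide shows you how to use the Content Understanding TypeScript SDK to create a custom analyzer that extracts structured data from your content. Custom analyzers support document, image, audio, and video content types.

Prerequisites

An active Azure subscription. If you don't have an Azure account, create a free account.
A Microsoft Foundry resource created in a supported region.
Your resource endpoint and API key (found under Keys and Endpoint in the Azure portal).
Model deployment defaults configured for your resource. For setup instructions, see Models and deployments or this one-time configuration script.
Node.js LTS version.
TypeScript 5.x or later.

Setup

Create a new Node.js project.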

mkdir custom-analyzer-tutorial
cd custom-analyzer-tutorial
npm init -y

Install TypeScript and the Content Understanding client library.
```
npm install typescript ts-node @azure/ai-content-understanding
```
Optionally, install the Azure Identity library for Microsoft Entra authentication.
```
npm install @azure/identity
```

Set environment variables

To authenticate with the Content Understanding service, set the following environment variables to your own values before you run the samples.

CONTENTUNDERSTANDING_ENDPOINT - The endpoint of your Content Understanding resource.
CONTENTUNDERSTANDING_KEY - Your Content Understanding API key. You can omit it if you use Microsoft Entra ID with DefaultAzureCredential.

Windows

setx CONTENTUNDERSTANDING_ENDPOINT "your-endpoint"
setx CONTENTUNDERSTANDING_KEY "your-key"

Linux/macOS

export CONTENTUNDERSTANDING_ENDPOINT="your-endpoint"
export CONTENTUNDERSTANDING_KEY="your-key"
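
Because the key is optional when you use Microsoft Entra ID, it can help to validate the two variables in one place before creating a client. The following is a minimal sketch under that assumption; `readClientSettings`, `EnvMap`, and `ClientSettings` are hypothetical names, not part of the SDK, and the function takes the environment as a plain map so it can be tried without real values.

```typescript
// Hypothetical helper, not part of the SDK.
type EnvMap = Record<string, string | undefined>;

type ClientSettings = {
    endpoint: string;
    // "key" when an API key is set; "entra" when the caller should fall
    // back to a token credential such as DefaultAzureCredential.
    auth: { kind: "key"; key: string } | { kind: "entra" };
};

function readClientSettings(env: EnvMap): ClientSettings {
    const endpoint = env["CONTENTUNDERSTANDING_ENDPOINT"]?.trim();
    if (!endpoint) {
        throw new Error("CONTENTUNDERSTANDING_ENDPOINT is not set");
    }
    const key = env["CONTENTUNDERSTANDING_KEY"]?.trim();
    return {
        endpoint,
        auth: key ? { kind: "key", key } : { kind: "entra" },
    };
}

// Example with an in-memory map and a placeholder endpoint.
// In a Node.js script you would pass process.env instead.
const demoSettings = readClientSettings({
    CONTENTUNDERSTANDING_ENDPOINT: "https://example.invalid/",
});
console.log(demoSettings.auth.kind); // prints "entra"
```

Failing early on a missing endpoint gives a clearer message than a request error later in the script.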

Create the client

import { AzureKeyCredential } from "@azure/core-auth";
import {
    ContentUnderstandingClient,
} from "@azure/ai-content-understanding";
import type {
    ContentAnalyzer,
    ContentAnalyzerConfig,
    ContentFieldSchema,
} from "@azure/ai-content-understanding";

const endpoint =
    process.env["CONTENTUNDERSTANDING_ENDPOINT"]!;
const key =
    process.env["CONTENTUNDERSTANDING_KEY"]!;

const client = new ContentUnderstandingClient(
    endpoint,
    new AzureKeyCredential(key)
);

Create a custom analyzer

const analyzerId =
    `my_document_analyzer_${Math.floor(
        Date.now() / 1000
    )}`;

const fieldSchema: ContentFieldSchema = {
    name: "company_schema",
    description:
        "Schema for extracting company"
        + " information",
    fields: {
        company_name: {
            type: "string",
            method: "extract",
            description:
                "Name of the company",
        },
        total_amount: {
            type: "number",
            method: "extract",
            description:
                "Total amount on the document",
        },
        document_summary: {
            type: "string",
            method: "generate",
            description:
                "A brief summary of the"
                + " document content",
        },
        document_type: {
            type: "string",
            method: "classify",
            description: "Type of document",
            enum: [
                "invoice", "receipt",
                "contract", "report", "other",
            ],
        },
    },
};

const config: ContentAnalyzerConfig = {
    enableFormula: true,
    enableLayout: true,
    enableOcr: true,
    estimateFieldSourceAndConfidence: true,
    returnDetails: true,
};

const analyzer: ContentAnalyzer = {
    baseAnalyzerId: "prebuilt-document",
    description:
        "Custom analyzer for extracting"
        + " company information",
    config,
    fieldSchema,
    models: {
        completion: "gpt-4.1",
        // Required when using fieldSchema with the
        // prebuilt-document base analyzer.
        embedding: "text-embedding-3-large",
    },
} as unknown as ContentAnalyzer;

const poller = client.createAnalyzer(
    analyzerId, analyzer
);
await poller.pollUntilDone();

const result = await client.getAnalyzer(
    analyzerId
);
console.log(
    `Analyzer '${analyzerId}' created`
    + ` successfully!`
);

if (result.description) {
    console.log(
        `  Description: ${result.description}`
    );
}

if (result.fieldSchema?.fields) {
    const fields = result.fieldSchema.fields;
    console.log(
        `  Fields`
        + ` (${Object.keys(fields).length}):`
    );
    for (const [name, fieldDef]
        of Object.entries(fields)) {
        const method =
            fieldDef.method ?? "auto";
        const fieldType =
            fieldDef.type ?? "unknown";
        console.log(
            `    - ${name}: `
            + `${fieldType} (${method})`
        );
    }
}

The example output looks like this:

Analyzer 'my_document_analyzer_ID' created successfully!
  Description: Custom analyzer for extracting company information
  Fields (4):
    - company_name: string (extract)
    - total_amount: number (extract)
    - document_summary: string (generate)
    - document_type: string (classify)

Tip

This code is based on the create analyzer sample in the SDK repository.
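
The samples derive each analyzer ID from a prefix and the current Unix time in seconds. The following is a minimal sketch that wraps that expression in a function; `makeAnalyzerId` is a hypothetical helper, not part of the SDK.

```typescript
// Hypothetical helper: builds an ID of the form "<prefix>_<unix seconds>",
// mirroring the inline expression used in the samples.
function makeAnalyzerId(prefix: string, now: Date = new Date()): string {
    const unixSeconds = Math.floor(now.getTime() / 1000);
    return `${prefix}_${unixSeconds}`;
}

// Passing a fixed date makes the result reproducible.
const fixedDate = new Date(1700000000000);
console.log(makeAnalyzerId("my_document_analyzer", fixedDate));
// prints "my_document_analyzer_1700000000"
```

Two IDs generated with the same prefix within the same second are identical, so this scheme suits tutorials more than concurrent workloads.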

const classifierId =
    `my_classifier_${Math.floor(
        Date.now() / 1000
    )}`;

console.log(
    `Creating classifier '${classifierId}'...`
);

const classifierAnalyzer: ContentAnalyzer = {
    baseAnalyzerId: "prebuilt-document",
    description:
        "Custom classifier for financial"
        + " document categorization",
    config: {
        returnDetails: true,
        enableSegment: true,
        contentCategories: {
            Loan_Application: {
                description:
                    "Documents submitted by"
                    + " individuals or"
                    + " businesses to request"
                    + " funding, typically"
                    + " including personal or"
                    + " business details,"
                    + " financial history,"
                    + " loan amount, purpose,"
                    + " and supporting"
                    + " documentation.",
            },
            Invoice: {
                description:
                    "Billing documents issued"
                    + " by sellers or service"
                    + " providers to request"
                    + " payment for goods or"
                    + " services, detailing"
                    + " items, prices, taxes,"
                    + " totals, and payment"
                    + " terms.",
            },
            Bank_Statement: {
                description:
                    "Official statements"
                    + " issued by banks that"
                    + " summarize account"
                    + " activity over a"
                    + " period, including"
                    + " deposits, withdrawals,"
                    + " fees, and balances.",
            },
        },
    } as unknown as ContentAnalyzerConfig,
    models: {
        completion: "gpt-4.1",
    },
} as unknown as ContentAnalyzer;

const classifierPoller =
    client.createAnalyzer(
        classifierId, classifierAnalyzer
    );
await classifierPoller.pollUntilDone();

const classifierResult =
    await client.getAnalyzer(classifierId);

console.log(
    `Classifier '${classifierId}' created`
    + ` successfully!`
);

if (classifierResult.description) {
    console.log(
        `  Description: `
        + `${classifierResult.description}`
    );
}

Tip

This code is based on the create classifier sample for classification workflows.
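
The `contentCategories` object maps each category name to an object with a `description`. The following is a minimal sketch that builds that shape from a simpler name-to-text map; `buildContentCategories` and `CategoryMap` are hypothetical names, and the empty-description check is a local convention rather than a documented service rule.

```typescript
// Local stand-in for the shape used under config.contentCategories.
type CategoryMap = Record<string, { description: string }>;

// Hypothetical helper: turns { Name: "text" } into
// { Name: { description: "text" } } and rejects blank descriptions.
function buildContentCategories(
    descriptions: Record<string, string>
): CategoryMap {
    const categories: CategoryMap = {};
    for (const [category, text] of Object.entries(descriptions)) {
        const description = text.trim();
        if (description.length === 0) {
            throw new Error(
                `Category '${category}' needs a description`
            );
        }
        categories[category] = { description };
    }
    return categories;
}

// Shortened descriptions, for illustration only.
const demoCategories = buildContentCategories({
    Invoice: "Billing documents issued by sellers or service providers.",
    Bank_Statement: "Official statements issued by banks.",
});
console.log(Object.keys(demoCategories));
```

The returned object could then be assigned to `contentCategories` in the analyzer `config`, in place of the inline literal.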

The following example creates a custom image analyzer, based on the prebuilt image analyzer, for processing charts and graphs.

const analyzerId =
    `my_image_analyzer_${Math.floor(
        Date.now() / 1000
    )}`;

const fieldSchema: ContentFieldSchema = {
    name: "chart_schema",
    description:
        "Schema for extracting chart"
        + " information",
    fields: {
        Title: {
            type: "string",
            description:
                "Title of the chart",
        },
        ChartType: {
            type: "string",
            method: "classify",
            description: "Type of chart",
            enum: ["bar", "line", "pie"],
        },
    },
};

const analyzer: ContentAnalyzer = {
    baseAnalyzerId: "prebuilt-image",
    description:
        "Custom analyzer for charts"
        + " and graphs",
    fieldSchema,
    models: {
        completion: "gpt-4.1",
    },
} as unknown as ContentAnalyzer;

const poller = client.createAnalyzer(
    analyzerId, analyzer
);
await poller.pollUntilDone();

const result = await client.getAnalyzer(
    analyzerId
);
console.log(
    `Analyzer '${analyzerId}' created`
    + ` successfully!`
);

if (result.description) {
    console.log(
        `  Description: ${result.description}`
    );
}

if (result.fieldSchema?.fields) {
    const fields = result.fieldSchema.fields;
    console.log(
        `  Fields`
        + ` (${Object.keys(fields).length}):`
    );
    for (const [name, fieldDef]
        of Object.entries(fields)) {
        const method =
            fieldDef.method ?? "auto";
        const fieldType =
            fieldDef.type ?? "unknown";
        console.log(
            `    - ${name}: `
            + `${fieldType} (${method})`
        );
    }
}

The example output looks like this:

Analyzer 'my_image_analyzer_ID' created successfully!
  Description: Custom analyzer for charts and graphs
  Fields (2):
    - Title: string (auto)
    - ChartType: string (classify)

Tip

This code adapts the create analyzer sample pattern for image content.
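
The field listing in the output comes from a loop that falls back to `auto` when a field has no `method` and to `unknown` when it has no `type`. The following is a minimal sketch of that loop as a standalone function; `describeFields` and `FieldDefinition` are hypothetical names, and the local type only models the two properties the loop reads.

```typescript
// Minimal local model of a field definition; the SDK type has more members.
type FieldDefinition = { type?: string; method?: string };

// Hypothetical helper: formats each field as "<name>: <type> (<method>)",
// using the same fallbacks as the sample's printing loop.
function describeFields(
    fields: Record<string, FieldDefinition>
): string[] {
    return Object.entries(fields).map(([fieldName, def]) => {
        const method = def.method ?? "auto";
        const fieldType = def.type ?? "unknown";
        return `${fieldName}: ${fieldType} (${method})`;
    });
}

const chartLines = describeFields({
    Title: { type: "string" },
    ChartType: { type: "string", method: "classify" },
});
for (const line of chartLines) {
    console.log(`    - ${line}`);
}
```

Note that `auto` here is the label the sample prints when `method` is absent from the returned schema.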

The following example creates a custom audio analyzer, based on the prebuilt audio analyzer, for processing customer support call recordings.

const analyzerId =
    `my_audio_analyzer_${Math.floor(
        Date.now() / 1000
    )}`;

const fieldSchema: ContentFieldSchema = {
    name: "call_center_schema",
    description:
        "Schema for analyzing customer"
        + " support calls",
    fields: {
        Summary: {
            type: "string",
            method: "generate",
            description:
                "Summary of the call",
        },
        Sentiment: {
            type: "string",
            method: "classify",
            description:
                "Overall sentiment of"
                + " the call",
            enum: [
                "Positive", "Neutral",
                "Negative",
            ],
        },
        People: {
            type: "array",
            description:
                "List of people mentioned",
            itemDefinition: {
                type: "object",
                properties: {
                    Name: { type: "string" },
                    Role: { type: "string" },
                },
            },
        },
    },
};

const config: ContentAnalyzerConfig = {
    locales: ["en-US", "fr-FR"],
    returnDetails: true,
};

const analyzer: ContentAnalyzer = {
    baseAnalyzerId: "prebuilt-audio",
    description:
        "Custom analyzer for customer"
        + " support calls",
    config,
    fieldSchema,
    models: {
        completion: "gpt-4.1",
    },
} as unknown as ContentAnalyzer;

const poller = client.createAnalyzer(
    analyzerId, analyzer
);
await poller.pollUntilDone();

const result = await client.getAnalyzer(
    analyzerId
);
console.log(
    `Analyzer '${analyzerId}' created`
    + ` successfully!`
);

if (result.description) {
    console.log(
        `  Description: ${result.description}`
    );
}

if (result.fieldSchema?.fields) {
    const fields = result.fieldSchema.fields;
    console.log(
        `  Fields`
        + ` (${Object.keys(fields).length}):`
    );
    for (const [name, fieldDef]
        of Object.entries(fields)) {
        const method =
            fieldDef.method ?? "auto";
        const fieldType =
            fieldDef.type ?? "unknown";
        console.log(
            `    - ${name}: `
            + `${fieldType} (${method})`
        );
    }
}

The example output looks like this:

Analyzer 'my_audio_analyzer_ID' created successfully!
  Description: Custom analyzer for customer support calls
  Fields (3):
    - Summary: string (generate)
    - Sentiment: string (classify)
    - People: array (auto)

Tip

This code adapts the create analyzer sample pattern for audio content.

The following example creates a custom video analyzer, based on the prebuilt video analyzer, for processing product demos and reviews.

const analyzerId =
    `my_video_analyzer_${Math.floor(
        Date.now() / 1000
    )}`;

const fieldSchema: ContentFieldSchema = {
    name: "video_schema",
    description:
        "Schema for analyzing product"
        + " demo videos",
    fields: {
        Segments: {
            type: "array",
            itemDefinition: {
                type: "object",
                properties: {
                    SegmentId: {
                        type: "string",
                    },
                    Description: {
                        type: "string",
                        method: "generate",
                        description:
                            "Detailed summary"
                            + " of the video"
                            + " segment",
                    },
                    Sentiment: {
                        type: "string",
                        method: "classify",
                        enum: [
                            "Positive",
                            "Neutral",
                            "Negative",
                        ],
                    },
                },
            },
        },
    },
};

const config: ContentAnalyzerConfig = {
    locales: ["en-US", "fr-FR"],
    returnDetails: true,
};

const analyzer: ContentAnalyzer = {
    baseAnalyzerId: "prebuilt-video",
    description:
        "Custom analyzer for product"
        + " demo videos",
    config,
    fieldSchema,
    models: {
        completion: "gpt-4.1",
    },
} as unknown as ContentAnalyzer;

const poller = client.createAnalyzer(
    analyzerId, analyzer
);
await poller.pollUntilDone();

const result = await client.getAnalyzer(
    analyzerId
);
console.log(
    `Analyzer '${analyzerId}' created`
    + ` successfully!`
);

if (result.description) {
    console.log(
        `  Description: ${result.description}`
    );
}

if (result.fieldSchema?.fields) {
    const fields = result.fieldSchema.fields;
    console.log(
        `  Fields`
        + ` (${Object.keys(fields).length}):`
    );
    for (const [name, fieldDef]
        of Object.entries(fields)) {
        const method =
            fieldDef.method ?? "auto";
        const fieldType =
            fieldDef.type ?? "unknown";
        console.log(
            `    - ${name}: `
            + `${fieldType} (${method})`
        );
    }
}

The example output looks like this:

Analyzer 'my_video_analyzer_ID' created successfully!
  Description: Custom analyzer for product demo videos
  Fields (1):
    - Segments: array (auto)

Tip

This code adapts the create analyzer sample pattern for video content.
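
Every `classify` field in the schemas above is paired with an `enum` of allowed values, including fields nested under `itemDefinition.properties`. The following is a minimal sketch of a local check that flags `classify` fields without an `enum` before you send the schema. It assumes you want to enforce that pairing yourself; it is not a documented service validation, and `findClassifyWithoutEnum` and `SchemaNode` are hypothetical names.

```typescript
// Minimal local model of a schema node; the SDK type has more members.
type SchemaNode = {
    type?: string;
    method?: string;
    enum?: string[];
    properties?: Record<string, SchemaNode>;
    itemDefinition?: SchemaNode;
};

// Hypothetical helper: returns the paths of classify fields that
// have no enum values, walking nested objects and array items.
function findClassifyWithoutEnum(
    fields: Record<string, SchemaNode>,
    path: string = ""
): string[] {
    const problems: string[] = [];
    for (const [key, node] of Object.entries(fields)) {
        const here = path ? `${path}.${key}` : key;
        const hasEnum = node.enum !== undefined && node.enum.length > 0;
        if (node.method === "classify" && !hasEnum) {
            problems.push(here);
        }
        if (node.properties) {
            problems.push(
                ...findClassifyWithoutEnum(node.properties, here)
            );
        }
        if (node.itemDefinition?.properties) {
            problems.push(
                ...findClassifyWithoutEnum(
                    node.itemDefinition.properties, `${here}[]`
                )
            );
        }
    }
    return problems;
}

// The enum on Sentiment is omitted on purpose to trigger the check.
const videoFields: Record<string, SchemaNode> = {
    Segments: {
        type: "array",
        itemDefinition: {
            type: "object",
            properties: {
                SegmentId: { type: "string" },
                Sentiment: { type: "string", method: "classify" },
            },
        },
    },
};
console.log(findClassifyWithoutEnum(videoFields));
// → ["Segments[].Sentiment"]
```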

Use the custom analyzer

After you create the analyzer, use it to analyze a document and extract the custom fields. Delete the analyzer when you no longer need it.

const documentUrl =
    "https://raw.githubusercontent.com/"
    + "Azure-Samples/"
    + "azure-ai-content-understanding-assets/"
    + "main/document/invoice.pdf";

const analyzePoller = client.analyze(
    analyzerId, [{ url: documentUrl }]
);
const analyzeResult =
    await analyzePoller.pollUntilDone();

if (analyzeResult.contents
    && analyzeResult.contents.length > 0) {
    const content = analyzeResult.contents[0];
    if (content.fields) {
        const company =
            content.fields["company_name"];
        if (company) {
            console.log(
                `Company Name: `
                + `${company.value}`
            );
            console.log(
                `  Confidence: `
                + `${company.confidence}`
            );
        }

        const total =
            content.fields["total_amount"];
        if (total) {
            console.log(
                `Total Amount: `
                + `${total.value}`
            );
        }

        const summary =
            content.fields["document_summary"];
        if (summary) {
            console.log(
                `Summary: ${summary.value}`
            );
        }

        const docType =
            content.fields["document_type"];
        if (docType) {
            console.log(
                `Document Type: `
                + `${docType.value}`
            );
        }
    }
}

// --- Clean up ---
console.log(
    `\nCleaning up: deleting analyzer`
    + ` '${analyzerId}'...`
);
await client.deleteAnalyzer(analyzerId);
console.log(
    `Analyzer '${analyzerId}' deleted`
    + ` successfully.`
);

The example output looks like this:

Company Name: CONTOSO LTD.
  Confidence: 0.818
Total Amount: 610
Summary: This document is an invoice from CONTOSO LTD. to MICROSOFT CORPORATION for consulting, document, and printing services provided during the service period 10/14/2019 - 11/14/2019. It details line items, subtotal, sales tax, total, previous unpaid balance, and the final amount due.
Document Type: invoice

Cleaning up: deleting analyzer 'my_document_analyzer_ID'...
Analyzer 'my_document_analyzer_ID' deleted successfully.

Tip

For more examples of running analyzers, see the TypeScript SDK samples.
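
The document sample prints a `confidence` value for `company_name`. The following is a minimal sketch of how such scores might be used to decide which fields to accept automatically. The threshold of 0.8, the `splitByConfidence` helper, and the choice to route fields without a score to review are all assumptions made for illustration.

```typescript
// Minimal local model of an extracted field; the SDK type has more members.
type ExtractedField = { value?: unknown; confidence?: number };

// Hypothetical helper: separates field names into those at or above
// the threshold and those that should be reviewed.
function splitByConfidence(
    fields: Record<string, ExtractedField>,
    threshold: number
): { accepted: string[]; needsReview: string[] } {
    const accepted: string[] = [];
    const needsReview: string[] = [];
    for (const [fieldName, field] of Object.entries(fields)) {
        // Fields without a confidence score are sent to review as well.
        if (
            field.confidence !== undefined
            && field.confidence >= threshold
        ) {
            accepted.push(fieldName);
        } else {
            needsReview.push(fieldName);
        }
    }
    return { accepted, needsReview };
}

// company_name uses the confidence from the example output above;
// document_summary is shown without a score.
const review = splitByConfidence(
    {
        company_name: { value: "CONTOSO LTD.", confidence: 0.818 },
        document_summary: { value: "This document is an invoice." },
    },
    0.8
);
console.log(review.accepted, review.needsReview);
```

In the analysis script, `content.fields` could be passed in directly if its entries expose `confidence` as the sample suggests.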

After you create the analyzer, use it to analyze an image and extract the custom fields. Delete the analyzer when you no longer need it.

const imageUrl =
    "https://raw.githubusercontent.com/"
    + "Azure-Samples/"
    + "azure-ai-content-understanding-assets/"
    + "main/image/pieChart.jpg";

const analyzePoller = client.analyze(
    analyzerId, [{ url: imageUrl }]
);
const analyzeResult =
    await analyzePoller.pollUntilDone();

if (analyzeResult.contents
    && analyzeResult.contents.length > 0) {
    const content = analyzeResult.contents[0];
    if (content.fields) {
        const title =
            content.fields["Title"];
        if (title) {
            console.log(
                `Title: ${title.value}`
            );
        }

        const chartType =
            content.fields["ChartType"];
        if (chartType) {
            console.log(
                `Chart Type: `
                + `${chartType.value}`
            );
        }
    }
}

// --- Clean up ---
console.log(
    `\nCleaning up: deleting analyzer`
    + ` '${analyzerId}'...`
);
await client.deleteAnalyzer(analyzerId);
console.log(
    `Analyzer '${analyzerId}' deleted`
    + ` successfully.`
);

The example output looks like this:

Title: Distribution of Weekly Working Hours
Chart Type: pie

Cleaning up: deleting analyzer 'my_image_analyzer_ID'...
Analyzer 'my_image_analyzer_ID' deleted successfully.

Tip

For more examples of running analyzers, see the TypeScript SDK samples.
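
In the samples, the cleanup step runs only if analysis finishes without throwing. The following is a minimal sketch of one way to run cleanup in every case with `try`/`finally`; `withCleanup` is a hypothetical helper, and the two callbacks below are stand-ins for the real SDK calls.

```typescript
// Hypothetical helper: runs the work, then always runs cleanup,
// whether the work succeeded or threw.
async function withCleanup<T>(
    run: () => Promise<T>,
    cleanup: () => Promise<void>
): Promise<T> {
    try {
        return await run();
    } finally {
        await cleanup();
    }
}

// Stand-in callbacks that only record the call order.
const callOrder: string[] = [];
withCleanup(
    async () => {
        callOrder.push("analyze");
        return "done";
    },
    async () => {
        callOrder.push("delete");
    }
).then((outcome) => console.log(outcome, callOrder));
```

In the real script, `run` could wrap `client.analyze(...)` followed by `pollUntilDone()`, and `cleanup` could wrap `client.deleteAnalyzer(analyzerId)`.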

After you create the analyzer, use it to analyze an audio file and extract the custom fields. Delete the analyzer when you no longer need it.

const audioUrl =
    "https://raw.githubusercontent.com/"
    + "Azure-Samples/"
    + "azure-ai-content-understanding-assets/"
    + "main/audio/callCenterRecording.mp3";

const analyzePoller = client.analyze(
    analyzerId, [{ url: audioUrl }]
);
const analyzeResult =
    await analyzePoller.pollUntilDone();

if (analyzeResult.contents
    && analyzeResult.contents.length > 0) {
    const content = analyzeResult.contents[0];
    if (content.fields) {
        const summary =
            content.fields["Summary"];
        if (summary) {
            console.log(
                `Summary: ${summary.value}`
            );
        }

        const sentiment =
            content.fields["Sentiment"];
        if (sentiment) {
            console.log(
                `Sentiment: `
                + `${sentiment.value}`
            );
        }
    }
}

// --- Clean up ---
console.log(
    `\nCleaning up: deleting analyzer`
    + ` '${analyzerId}'...`
);
await client.deleteAnalyzer(analyzerId);
console.log(
    `Analyzer '${analyzerId}' deleted`
    + ` successfully.`
);

The example output looks like this:

Summary: Maria Smith contacted Contoso to inquire about her current point balance. John Doe, the representative, verified her identity by requesting her date of birth and then provided her with her point balance of 599 points. Maria confirmed she did not need further assistance, and the call ended amicably.
Sentiment: Positive

Cleaning up: deleting analyzer 'my_audio_analyzer_ID'...
Analyzer 'my_audio_analyzer_ID' deleted successfully.

Tip

For more examples of running analyzers, see the TypeScript SDK samples.
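
The audio schema also defines a `People` array, which the sample above does not print. The following is a minimal sketch of flattening such an array into plain objects. It assumes array items follow the nesting that the video sample reads (each item has a `value` object whose entries each have their own `value`); `toPlainRows` and the local types are hypothetical, and the input data is made up for illustration.

```typescript
// Local models of the nesting assumed for array-of-object fields.
type LeafField = { value?: unknown };
type ObjectField = { value?: Record<string, LeafField> };
type ArrayField = { value?: ObjectField[] };

// Hypothetical helper: unwraps each item into { key: leafValue }.
function toPlainRows(
    field: ArrayField | undefined
): Record<string, unknown>[] {
    const rows: Record<string, unknown>[] = [];
    for (const item of field?.value ?? []) {
        const cells: Record<string, LeafField> = item.value ?? {};
        const row: Record<string, unknown> = {};
        for (const [key, leaf] of Object.entries(cells)) {
            row[key] = leaf.value;
        }
        rows.push(row);
    }
    return rows;
}

// Made-up stand-in for content.fields["People"].
const peopleField: ArrayField = {
    value: [
        { value: { Name: { value: "Maria Smith" }, Role: { value: "Customer" } } },
        { value: { Name: { value: "John Doe" }, Role: { value: "Representative" } } },
    ],
};
console.log(toPlainRows(peopleField));
```

A missing field or an item without a `value` yields an empty list or an empty row instead of throwing.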

After you create the analyzer, use it to analyze a video and extract the custom fields. Delete the analyzer when you no longer need it.

const videoUrl =
    "https://raw.githubusercontent.com/"
    + "Azure-Samples/"
    + "azure-ai-content-understanding-assets/"
    + "main/videos/sdk_samples/"
    + "FlightSimulator.mp4";

const analyzePoller = client.analyze(
    analyzerId, [{ url: videoUrl }]
);
const analyzeResult =
    await analyzePoller.pollUntilDone();

if (analyzeResult.contents
    && analyzeResult.contents.length > 0) {
    const content = analyzeResult.contents[0];
    console.log(
        `Content kind: ${content.kind}`
    );
    if (content.fields) {
        const segments =
            content.fields["Segments"];
        if (segments && segments.value) {
            const segArray =
                segments.value as any[];
            console.log(
                `Segments`
                + ` (${segArray.length}):`
            );
            for (const segment
                of segArray) {
                const segId =
                    segment.value
                        ?.SegmentId?.value
                    ?? "N/A";
                const desc =
                    segment.value
                        ?.Description?.value
                    ?? "N/A";
                const sent =
                    segment.value
                        ?.Sentiment?.value
                    ?? "N/A";
                console.log(
                    `  Segment: ${segId}`
                );
                console.log(
                    `    Description:`
                    + ` ${desc}`
                );
                console.log(
                    `    Sentiment:`
                    + ` ${sent}`
                );
            }
        }
    }
}

// --- Clean up ---
console.log(
    `\nCleaning up: deleting analyzer`
    + ` '${analyzerId}'...`
);
await client.deleteAnalyzer(analyzerId);
console.log(
    `Analyzer '${analyzerId}' deleted`
    + ` successfully.`
);

The example output looks like this:

Content kind: audioVisual
Segments (16):
  Segment: 00:00:00.000-00:00:01.467
    Description: The video opens with a scenic aerial view of an island surrounded by blue water, featuring a small airplane flying over it. The screen displays the logos for 'Flight Simulator' and 'Microsoft Azure AI', indicating a collaboration or integration between the two.
    Sentiment: Positive
  Segment: 00:00:01.467-00:00:03.233
    Description: A man is shown sitting in a modern office environment, likely preparing to speak or introduce the topic. The background includes plants and geometric wall lights, giving a professional and contemporary feel.
    Sentiment: Neutral
  Segment: 00:00:03.233-00:00:07.367
    Description: The video transitions to a close-up of a digital audio waveform, visually representing sound data. This segment aligns with the audio discussing the importance of good data for neural TTS (Text-to-Speech) and the creation of a universal TTS model using extensive audio data.
    Sentiment: Positive
  Segment: 00:00:07.367-00:00:08.200
    Description: Another man appears in a similar office setting, possibly continuing the explanation or providing additional commentary.
    Sentiment: Neutral
  Segment: 00:00:08.200-00:00:11.367
    Description: The scene shifts to an outdoor view of a large facility surrounded by green fields and blue skies, likely representing a data center or infrastructure supporting the TTS technology.
    Sentiment: Positive
  Segment: 00:00:11.367-00:00:13.567
    Description: Inside a data center, rows of servers are shown, emphasizing the technological backbone and scale of the operation required for processing large amounts of audio data.
    Sentiment: Positive
  Segment: 00:00:13.567-00:00:16.100
    Description: The first man returns, continuing his explanation in the office setting. The audio mentions the accumulation of data to capture audio nuances and generate natural voices.
    Sentiment: Positive
  Segment: 00:00:16.100-00:00:19.433
    Description: A biplane is seen flying over a coastal landscape, showcasing the immersive visuals of Flight Simulator. This segment highlights the realism and beauty of the simulation.
    Sentiment: Positive
  Segment: 00:00:19.433-00:00:23.967
    Description: A plane flies past a castle set against a mountainous backdrop, further demonstrating the detailed environments in Flight Simulator.
    Sentiment: Positive
  Segment: 00:00:23.967-00:00:30.033
    Description: A bald man is interviewed in a modern office space, likely discussing the benefits of cognitive services offerings, such as higher fidelity and more human-like voices.
    Sentiment: Positive
  Segment: 00:00:30.033-00:00:33.200
    Description: The interview continues with the bald man, focusing on his commentary about the product's features and advantages.
    Sentiment: Positive
  Segment: 00:00:33.200-00:00:35.267
    Description: The video shifts to an overhead view of an airplane on the runway, preparing for movement, possibly referencing the realism of in-game operations.
    Sentiment: Neutral
  Segment: 00:00:35.267-00:00:37.700
    Description: A ground crew member directs an Airbus aircraft, highlighting the detailed simulation of airport operations in Flight Simulator.
    Sentiment: Positive
  Segment: 00:00:37.700-00:00:39.200
    Description: Two ground crew members walk near an aircraft on the tarmac, reinforcing the realistic airport environment and operations.
    Sentiment: Neutral
  Segment: 00:00:39.200-00:00:42.033
    Description: A close-up of an Airbus aircraft at the gate, with sunlight and clouds in the background, further showcasing the visual fidelity of Flight Simulator.
    Sentiment: Positive
  Segment: 00:00:42.033-00:00:43.866
    Description: The video concludes with the Microsoft logo and branding, signaling the end of the product demo and reinforcing the partnership.
    Sentiment: Positive

Cleaning up: deleting analyzer 'my_video_analyzer_ID'...
Analyzer 'my_video_analyzer_ID' deleted successfully.

Tip

For more examples of running analyzers, see the TypeScript SDK samples.
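
In the example output, each `SegmentId` looks like a start and end timestamp joined by a hyphen. The following is a minimal sketch that parses that form into seconds. The format is inferred from the sample output only, not from a documented contract, so `parseSegmentId` (a hypothetical helper) returns `undefined` for anything that doesn't match.

```typescript
// Pattern inferred from the sample output: HH:MM:SS.mmm-HH:MM:SS.mmm
const SEGMENT_ID_PATTERN =
    /^(\d{2}):(\d{2}):(\d{2}\.\d{3})-(\d{2}):(\d{2}):(\d{2}\.\d{3})$/;

// Hypothetical helper: converts a segment ID into start, end, and
// duration in seconds, or undefined if the ID has another shape.
function parseSegmentId(
    segmentId: string
): { start: number; end: number; duration: number } | undefined {
    const match = SEGMENT_ID_PATTERN.exec(segmentId);
    if (!match) {
        return undefined;
    }
    const start =
        Number(match[1]) * 3600
        + Number(match[2]) * 60
        + Number(match[3]);
    const end =
        Number(match[4]) * 3600
        + Number(match[5]) * 60
        + Number(match[6]);
    // Round to milliseconds to hide floating-point noise.
    const duration = Math.round((end - start) * 1000) / 1000;
    return { start, end, duration };
}

console.log(parseSegmentId("00:00:01.467-00:00:03.233"));
// → { start: 1.467, end: 3.233, duration: 1.766 }
```

The `"N/A"` fallback used by the sample loop does not match the pattern, so it yields `undefined` rather than a misleading range.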

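Related content

Review code samples: visual document search.
Review code samples: analyzer templates.
Explore more Python SDK samples.
Explore more .NET SDK samples.
Explore more Java SDK samples.
Explore more JavaScript SDK samples.
Explore more TypeScript SDK samples.
Try processing your document content with Content Understanding in Foundry.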


Last updated on 2026-03-31
