Create a custom analyzer

Content Understanding analyzers define how to process and extract insights from your content. They ensure uniform processing and a consistent output structure across all of your content, so you get reliable and predictable results. For common use cases, you can use the prebuilt analyzers. This guide shows how to customize these analyzers to better fit your needs.

This guide shows you how to use the Content Understanding REST API to create a custom analyzer that extracts structured data from your content.

Prerequisites

An active Azure subscription. If you don't have an Azure account, create one for free.
A Microsoft Foundry resource created in a supported region.
- The portal lists this resource under Foundry > Foundry.
Default model deployments set up for your Content Understanding resource. Setting the defaults creates a connection to the Microsoft Foundry models that Content Understanding uses for your requests. Choose one of the following methods:
- Portal
1. Go to the Content Understanding settings page.
2. Select the + Add resource button in the upper left.
3. Select the Foundry resource that you want to use, and then select Save.
  
  Make sure that the Enable autodeployment for required models if no defaults are available checkbox is selected. This selection ensures that your resource is fully set up with the required GPT-4.1, GPT-4.1-mini, and text-embedding-3-large models. Different prebuilt analyzers require different models.
These steps set up the connection between Content Understanding and the Foundry models in your Foundry resource.
- REST API
1. In your Foundry resource, create Foundry model deployments of GPT-4.1, GPT-4.1-mini, and text-embedding-3-large. For details about how to deploy these models, see Create model deployments in the Microsoft Foundry portal. Different prebuilt analyzers require different models, so you need to deploy all three.
2. Define the default model deployments at the resource level. Before you run the following cURL command, make these changes to the HTTP request:
  1. Replace {endpoint} and {key} with the corresponding values from your Foundry instance in the Azure portal.
  2. Replace {myGPT41Deployment}, {myGPT41MiniDeployment}, and {myEmbeddingDeployment} with the actual model deployment names from your Foundry resource.
```
curl -i -X PATCH "{endpoint}/contentunderstanding/defaults?api-version=2025-11-01" \
  -H "Ocp-Apim-Subscription-Key: {key}" \
  -H "Content-Type: application/json" \
  -d '{
        "modelDeployments": {
          "gpt-4.1": "{myGPT41Deployment}",
          "gpt-4.1-mini": "{myGPT41MiniDeployment}",
          "text-embedding-3-large": "{myEmbeddingDeployment}"
        }
      }'
```
cURL installed for your dev environment.
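
For scripted setup, the PATCH request that sets the default model deployments can also be prepared in Python with only the standard library. The following is a minimal sketch, not an official client: the function name build_defaults_request and the example endpoint, key, and deployment names are placeholders introduced for illustration, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_defaults_request(endpoint: str, key: str, deployments: dict) -> urllib.request.Request:
    """Build (but do not send) the PATCH request that sets default model deployments."""
    url = f"{endpoint.rstrip('/')}/contentunderstanding/defaults?api-version=2025-11-01"
    body = json.dumps({"modelDeployments": deployments}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
    )


# Placeholder values; replace them with your own endpoint, key, and deployment names.
request = build_defaults_request(
    "https://example-resource.example.com",
    "your-key",
    {
        "gpt-4.1": "myGPT41Deployment",
        "gpt-4.1-mini": "myGPT41MiniDeployment",
        "text-embedding-3-large": "myEmbeddingDeployment",
    },
)
print(request.get_method(), request.full_url)
# To send the request, pass it to urllib.request.urlopen(request).
```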

Define an analyzer schema

To create a custom analyzer, define a field schema that describes the structured data you want to extract. In the following example, you create an analyzer based on the prebuilt document analyzer to process receipts.

Create a JSON file named receipt.json with the following content:

{
  "description": "Sample receipt analyzer",
  "baseAnalyzerId": "prebuilt-document",
  "models": {
    "completion": "gpt-4.1",
    "embedding": "text-embedding-3-large"
  },
  "config": {
    "returnDetails": true,
    "enableFormula": false,
    "estimateFieldSourceAndConfidence": true,
    "tableFormat": "html"
  },
  "fieldSchema": {
    "fields": {
      "VendorName": {
        "type": "string",
        "method": "extract",
        "description": "Vendor issuing the receipt"
      },
      "Items": {
        "type": "array",
        "method": "extract",
        "items": {
          "type": "object",
          "properties": {
            "Description": {
              "type": "string",
              "method": "extract",
              "description": "Description of the item"
            },
            "Amount": {
              "type": "number",
              "method": "extract",
              "description": "Amount of the item"
            }
          }
        }
      }
    }
  }
}
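
Before you submit a schema, it can help to list every field it declares, including the nested ones, to confirm the types and methods are what you intended. The following is a minimal sketch that walks a fieldSchema dictionary shaped like the one in receipt.json. The helper name list_fields and the path notation (for example, Items[].Amount) are conventions introduced for this illustration, not part of the service.

```python
def list_fields(fields: dict, prefix: str = "") -> list:
    """Return (path, type, method) for every field, including nested ones."""
    rows = []
    for name, definition in fields.items():
        path = f"{prefix}{name}"
        rows.append((path, definition.get("type"), definition.get("method")))
        if definition.get("type") == "array":
            # Array items that are objects carry their own named properties.
            items = definition.get("items", {})
            rows.extend(list_fields(items.get("properties", {}), prefix=f"{path}[]."))
        elif definition.get("type") == "object":
            rows.extend(list_fields(definition.get("properties", {}), prefix=f"{path}."))
    return rows


# Trimmed copy of the fields from receipt.json (descriptions omitted).
receipt_fields = {
    "VendorName": {"type": "string", "method": "extract"},
    "Items": {
        "type": "array",
        "method": "extract",
        "items": {
            "type": "object",
            "properties": {
                "Description": {"type": "string", "method": "extract"},
                "Amount": {"type": "number", "method": "extract"},
            },
        },
    },
}

for row in list_fields(receipt_fields):
    print(row)
```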

If you have several types of documents to process but want to categorize and analyze only the receipts, create an analyzer that categorizes the documents first and then routes them to the analyzer that you created earlier. Use the following schema.

Create a JSON file named categorize.json with the following content:

{
  "baseAnalyzerId": "prebuilt-document",
  // Use the base analyzer to invoke the document specific capabilities.

  // Specify the models the analyzer should use: one of the supported completion models and, where needed, one of the supported embedding models. The specific deployment used during analysis is set on the resource or provided in the analyze request.
  "models": {
      "completion": "gpt-4.1"
    },
  "config": {
    // Enable splitting of the input into segments. Set this property to false if you only expect a single document within the input file. When specified and enableSegment=false, the whole content will be classified into one of the categories.
    "enableSegment": false,

    "contentCategories": {
      // Category name.
      "receipt": {
        // Description to help with classification and splitting.
        "description": "Any images or documents of receipts",

        // Define the analyzer that any content classified as a receipt should be routed to
        "analyzerId": "receipt"
      },

      "invoice": {
        "description": "Any images or documents of invoice",
        "analyzerId": "prebuilt-invoice"
      },
      "policeReport": {
        "description": "A police or law enforcement report detailing the events that lead to the loss."
        // Don't perform analysis for this category.
      }

    },

    // Omit original content object and only return content objects from additional analysis.
    "omitContent": true
  }

  //You can use fieldSchema here to define fields that are needed from the entire input content.

}
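
The categorize.json listing contains // comments, which standard JSON doesn't allow. Whether the service tolerates them isn't stated here, so a cautious option is to strip them before you send the file. The following is a minimal sketch under that assumption: strip_line_comments is a hypothetical helper, it handles only // line comments (not /* */ blocks), and it leaves // sequences inside string values, such as URLs, untouched.

```python
import json


def strip_line_comments(text: str) -> str:
    """Remove // comments that appear outside of JSON string literals."""
    out = []
    in_string = False
    escaped = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            i += 1
        elif ch == '"':
            in_string = True
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Skip to the end of the line; the newline itself is kept.
            while i < len(text) and text[i] != "\n":
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# Small illustrative input, not the full categorize.json file.
sample = """{
  // Category routing
  "analyzerId": "receipt", // trailing comment
  "url": "https://example.com/receipt.png"
}"""

cleaned = json.loads(strip_line_comments(sample))
print(cleaned)
```

To apply this to the real file, read categorize.json, pass its text through the helper, and write the result with json.dumps before you use it as the request body.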

To analyze images, create a JSON file named request_body.json with the following content:

{
  "description": "Sample image analyzer for charts and graphs",
  "baseAnalyzerId": "prebuilt-image",
  "models": {
      "completion": "gpt-4.1"
    },
 "fieldSchema": {
    "fields": {
      "Title": {
        "type": "string"
      },
      "ChartType": {
        "type": "string",
        "method": "classify",
        "enum": [ "bar", "line", "pie" ]
      }
    }
  }
}

To analyze audio, create a JSON file named request_body.json with the following content:

{
  "description": "Sample customer support call analyzer",
  "baseAnalyzerId": "prebuilt-audio",
  "config": {
    "locales": ["en-US", "fr-FR"],
    "returnDetails": true
  },
  "fieldSchema": {
    "fields": {
      "Summary": {
        "type": "string",
        "method": "generate"
      },
      "Sentiment": {
        "type": "string",
        "method": "classify",
        "enum": ["Positive", "Neutral", "Negative"]
      },
      "People": {
        "type": "array",
        "description": "List of people mentioned",
        "items": {
          "type": "object",
          "properties": {
            "Name": { "type": "string" },
            "Role": { "type": "string" }
          }
        }
      }
    }
  }
}

To analyze video, create a JSON file named request_body.json with the following content:

{
  "description": "Sample product demo video analyzer",
  "baseAnalyzerId": "prebuilt-video",
  "models": {
      "completion": "gpt-4.1"
    },
  "config": {
    "locales": ["en-US", "fr-FR"],
    "returnDetails": true,
    "disableFaceBlurring": false
  },
   "fieldSchema": {
    "fields": {
      "Segments": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "SegmentId": {
              "type": "string"
            },
            "Description": {
              "type": "string",
              "method": "generate",
              "description": "Detailed summary of the video segment, focusing on product characteristics, lighting, and color palette."
            },
            "Sentiment": {
              "type": "string",
              "method": "classify",
              "enum": ["Positive", "Neutral", "Negative"]
            }
          }
        }
      }
    }
  }
}

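Create the analyzer

PUT request

Create the receipt analyzer first, and then create the categorize analyzer. The categorize schema routes receipts to the analyzer whose ID is receipt, so use that ID when you create the receipt analyzer.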

curl -i -X PUT "{endpoint}/contentunderstanding/analyzers/{analyzerId}?api-version=2025-11-01" \
  -H "Ocp-Apim-Subscription-Key: {key}" \
  -H "Content-Type: application/json" \
  -d @receipt.json

curl -i -X PUT "{endpoint}/contentunderstanding/analyzers/{analyzerId}?api-version=2025-11-01" \
  -H "Ocp-Apim-Subscription-Key: {key}" \
  -H "Content-Type: application/json" \
  -d @request_body.json

curl -i -X PUT "{endpoint}/contentunderstanding/analyzers/{analyzerId}?api-version=2025-11-01" \
  -H "Ocp-Apim-Subscription-Key: {key}" \
  -H "Content-Type: application/json" \
  -d @request_body.json

curl -i -X PUT "{endpoint}/contentunderstanding/analyzers/{analyzerId}?api-version=2025-11-01" \
  -H "Ocp-Apim-Subscription-Key: {key}" \
  -H "Content-Type: application/json" \
  -d @request_body.json

PUT response

The 201 Created response includes an Operation-Location header with a URL that you can use to track the status of this asynchronous analyzer creation operation.

201 Created
Operation-Location: {endpoint}/contentunderstanding/analyzers/{analyzerId}/operations/{operationId}?api-version=2025-11-01

After the operation finishes, an HTTP GET on the operation location URL returns "status": "succeeded".

curl -i -X GET "{endpoint}/contentunderstanding/analyzers/{analyzerId}/operations/{operationId}?api-version=2025-11-01" \
  -H "Ocp-Apim-Subscription-Key: {key}"

Analyze a file

Submit the file

You can now use the custom analyzer that you created to process files and extract the fields that you defined in the schema.

Before you run the cURL command, make the following changes to the HTTP request:

Replace {endpoint} and {key} with the endpoint and key values from your Foundry instance in the Azure portal.
Replace {analyzerId} with the name of the custom analyzer that you created. For the document example, use the analyzer that you created with the categorize.json file.
Replace {fileUrl} with a publicly accessible URL of the file to analyze, such as a path to an Azure Storage blob with a shared access signature (SAS), or one of the following sample URLs:
- Document: https://github.com/Azure-Samples/azure-ai-content-understanding-python/raw/refs/heads/main/data/receipt.png
- Image: https://github.com/Azure-Samples/azure-ai-content-understanding-python/raw/refs/heads/main/data/pieChart.jpg
- Audio: https://github.com/Azure-Samples/azure-ai-content-understanding-python/raw/refs/heads/main/data/audio.wav
- Video: https://github.com/Azure-Samples/azure-ai-content-understanding-python/raw/refs/heads/main/data/FlightSimulator.mp4

POST request

This example uses the custom analyzer that you created with the categorize.json file to analyze a receipt.

curl -i -X POST "{endpoint}/contentunderstanding/analyzers/{analyzerId}:analyze?api-version=2025-11-01" \
  -H "Ocp-Apim-Subscription-Key: {key}" \
  -H "Content-Type: application/json" \
  -d '{
        "inputs":[
          {
            "url": "https://github.com/Azure-Samples/azure-ai-content-understanding-python/raw/refs/heads/main/data/receipt.png"
          }          
        ]
      }'

This example uses the custom analyzer that you created to analyze an image of a chart or graph.

curl -i -X POST "{endpoint}/contentunderstanding/analyzers/{analyzerId}:analyze?api-version=2025-11-01" \
  -H "Ocp-Apim-Subscription-Key: {key}" \
  -H "Content-Type: application/json" \
  -d '{
        "inputs":[
          {
            "url": "https://github.com/Azure-Samples/azure-ai-content-understanding-python/raw/refs/heads/main/data/pieChart.jpg"
          }          
        ]
      }'

This example uses the custom analyzer that you created to analyze a customer support call recording.

curl -i -X POST "{endpoint}/contentunderstanding/analyzers/{analyzerId}:analyze?api-version=2025-11-01" \
  -H "Ocp-Apim-Subscription-Key: {key}" \
  -H "Content-Type: application/json" \
  -d '{
        "inputs":[
          {
            "url": "https://github.com/Azure-Samples/azure-ai-content-understanding-python/raw/refs/heads/main/data/audio.wav"
          }          
        ]
      }'

This example uses the custom analyzer that you created to analyze a product demo video.

curl -i -X POST "{endpoint}/contentunderstanding/analyzers/{analyzerId}:analyze?api-version=2025-11-01" \
  -H "Ocp-Apim-Subscription-Key: {key}" \
  -H "Content-Type: application/json" \
  -d '{
        "inputs":[
          {
            "url": "https://github.com/Azure-Samples/azure-ai-content-understanding-python/raw/refs/heads/main/data/FlightSimulator.mp4"
          }          
        ]
      }'

POST response

The 202 Accepted response includes the {resultId} that you can use to track the status of this asynchronous operation.

{
  "id": {resultId},
  "status": "Running",
  "result": {
    "analyzerId": {analyzerId},
    "apiVersion": "2025-11-01",
    "createdAt": "YYYY-MM-DDTHH:MM:SSZ",
    "warnings": [],
    "contents": []
  }
}

Get the analysis result

Use the Operation-Location from the POST response to get the analysis result.

GET request

curl -i -X GET "{endpoint}/contentunderstanding/analyzerResults/{resultId}?api-version=2025-11-01" \
  -H "Ocp-Apim-Subscription-Key: {key}"
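
If you work from the Operation-Location header rather than the response body, you may want to pull the result ID out of the URL. The following is a minimal sketch that assumes the header value has the same shape as the GET URL shown above (a path that ends in analyzerResults/{resultId}). The helper name and the example URL are hypothetical.

```python
from urllib.parse import urlparse


def result_id_from_operation_location(operation_location: str) -> str:
    """Extract {resultId} from a URL whose path ends in analyzerResults/{resultId}."""
    path = urlparse(operation_location).path
    parts = [part for part in path.split("/") if part]
    if len(parts) < 2 or parts[-2] != "analyzerResults":
        raise ValueError(f"Unexpected Operation-Location format: {operation_location}")
    return parts[-1]


# Placeholder URL for illustration only.
example_location = (
    "https://example-resource.example.com/contentunderstanding/"
    "analyzerResults/abc123?api-version=2025-11-01"
)
print(result_id_from_operation_location(example_location))
```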

GET response

The 200 OK response includes a status field that shows the progress of the operation.

The status is Succeeded if the operation finished successfully.
If the status is Running or NotStarted, call the API again, either manually or with a script. Wait at least one second between requests.
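
The polling guidance above can be sketched as a small loop. This is a minimal illustration, not an official client: wait_for_result is a hypothetical helper, the fetch argument stands in for whatever function performs the GET request and returns the parsed JSON body, and the status comparison is case-insensitive as a defensive assumption. The demo uses canned responses so it runs without network access.

```python
import time


def wait_for_result(fetch, interval_seconds=1.0, max_attempts=120, sleep=time.sleep):
    """Call fetch() until the status is no longer Running or NotStarted."""
    for _ in range(max_attempts):
        response = fetch()
        status = str(response.get("status", "")).lower()
        if status in ("running", "notstarted"):
            # Wait at least one second between requests.
            sleep(interval_seconds)
            continue
        return response
    raise TimeoutError(f"Operation still pending after {max_attempts} attempts")


# Canned responses that simulate the service; a real fetch would issue the GET request.
canned = iter([
    {"status": "NotStarted"},
    {"status": "Running"},
    {"status": "Succeeded", "result": {"contents": []}},
])
waits = []
final = wait_for_result(lambda: next(canned), sleep=waits.append)
print(final["status"], "after", len(waits), "waits")
```

The helper returns any response whose status is not pending, so the caller still needs to check that the final status is Succeeded before reading the result.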

Sample response

{
  "id": {resultId},
  "status": "Succeeded",
  "result": {
    "analyzerId": {analyzerId},
    "apiVersion": "2025-11-01",
    "createdAt": "YYYY-MM-DDTHH:MM:SSZ",
    "warnings": [],
    "contents": [
      {
        "path": "input1/segment1",
        "category": "receipt",
        "markdown": "Contoso\n\n123 Main Street\nRedmond, WA 98052\n\n987-654-3210\n\n6/10/2019 13:59\nSales Associate: Paul\n\n\n<table>\n<tr>\n<td>2 Surface Pro 6</td>\n<td>$1,998.00</td>\n</tr>\n<tr>\n<td>3 Surface Pen</td>\n<td>$299.97</td>\n</tr>\n</table> ...",
        "fields": {
          "VendorName": {
            "type": "string",
            "valueString": "Contoso",
            "spans": [{"offset": 0,"length": 7}],
            "confidence": 0.996,
            "source": "D(1,774.0000,72.0000,974.0000,70.0000,974.0000,111.0000,774.0000,113.0000)"
          },
          "Items": {
            "type": "array",
            "valueArray": [
              {
                "type": "object",
                "valueObject": {
                  "Description": {
                    "type": "string",
                    "valueString": "2 Surface Pro 6",
                    "spans": [ { "offset": 115, "length": 15}],
                    "confidence": 0.423,
                    "source": "D(1,704.0000,482.0000,875.0000,482.0000,875.0000,508.0000,704.0000,508.0000)"
                  },
                  "Amount": {
                    "type": "number",
                    "valueNumber": 1998,
                    "spans": [{ "offset": 140,"length": 9}
                    ],
                    "confidence": 0.957,
                    "source": "D(1,952.0000,482.0000,1048.0000,482.0000,1048.0000,508.0000,952.0000,509.0000)"
                  }
                }
              }, ...
            ]
          }
        },
        "kind": "document",
        "startPageNumber": 1,
        "endPageNumber": 1,
        "unit": "pixel",
        "pages": [
          {
            "pageNumber": 1,
            "angle": -0.0944,
            "width": 1743,
            "height": 878
          }
        ],
        "analyzerId": "{analyzerId}",
        "mimeType": "image/png"
      }
    ]
  },
  "usage": {
    "documentPages": 1,
    "tokens": {
      "contextualization": 1000
    }
  }
}

{
  "id": {resultId},
  "status": "Succeeded",
  "result": {
    "analyzerId": {analyzerId},
    "apiVersion": "2025-11-01",
    "createdAt": "YYYY-MM-DDTHH:MM:SSZ",
    "warnings": [],
    "contents": [
      {
        "markdown": "![image](image)\n",
        "fields": {
          "Title": {
            "type": "string",
            "valueString": "Weekly Work Hours Distribution"
          },
          "ChartType": {
            "type": "string",
            "valueString": "pie"
          }
        },
       "kind": "document",
        "startPageNumber": 1,
        "endPageNumber": 1,
        "unit": "pixel",
        "pages": [
          {
            "pageNumber": 1
          }
        ],
        "analyzerId": "{analyzerId}",
        "mimeType": "image/jpeg"
      }
    ]
  },
  "usage": {
    "tokens": {
      "contextualization": 1000
    }
  }
}

{
  "id": {resultId},
  "status": "Succeeded",
  "result": {
    "analyzerId": {analyzerId},
    "apiVersion": "2025-11-01",
    "createdAt": "YYYY-MM-DDTHH:MM:SSZ",
    "warnings": [],
    "contents": [
      {
        "markdown": "# Audio: 00:00.000 => 01:54.670\nTranscript\n```\n<v Agent>Thank you for calling Woodgrove Travel...\n<v Customer>Hi Isabella, my name is John Smith...\n<v Agent>Could you provide flight details?\n<v Customer>Contoso Airways, flight CA123...\n<v Agent>Sorry to hear that...\n<v Customer>Flight delay made me miss meeting...\n<v Agent>We'll offer a partial refund...\n<v Customer>Thanks, appreciate your help!\n```",
        "fields": {
          "Summary": {
            "type": "string",
            "valueString": "John Smith contacted Woodgrove Travel to report a negative experience with a flight on Contoso Airways ..."
          },
          "Sentiment": {
            "type": "string",
            "valueString": "Positive"
          },
          "People": {
            "type": "array",
            "valueArray": [
              {
                "type": "object",
                "valueObject": {
                  "Name": {
                    "type": "string",
                    "valueString": "Isabella Taylor"
                  },
                  "Role": {
                    "type": "string",
                    "valueString": "Agent"
                  }
                }
              }, ...
            ]
          }
        },
        "kind": "audioVisual",
        "startTimeMs": 0,
        "endTimeMs": 114670,
        "transcriptPhrases": [
          {
            "speaker": "Agent",
            "startTimeMs": 80,
            "endTimeMs": 2160,
            "text": "Thank you for calling Woodgrove Travel.",
            "words": []
          }, ...

        ]
      }
    ]
  },
  "usage": {
    "audioHours": 0.032,
    "tokens": {
      "contextualization": 3194.445
    }
  }
}

{
  "id": {resultId},
  "status": "Succeeded",
  "result": {
    "analyzerId": {analyzerId},
    "apiVersion": "2025-11-01",
    "createdAt": "YYYY-MM-DDTHH:MM:SSZ",
    "warnings": [],
    "contents": [
      {
        "markdown": "# Video: 00:00 => 00:43\n## Segment 1: Island view\nTranscript\n```\n00:01 --> 00:06\n<Speaker 1>Good data improves TTS.\n```\nKey Frames: ![](keyFrame.726.jpg) ## Segment 2: Data center\nTranscript\n```\n00:07 --> 00:13\n<Speaker 2>We trained on 3,000 hours.\n```\nKey Frames: ![](keyFrame.2046.jpg) ![](keyFrame.4884.jpg)",
        "fields": {
          "Segments": {
            "type": "array",
            "valueArray": [
              {
                "type": "object",
                "valueObject": {
                  
                  "SegmentId": {
                    "type": "string",
                    "valueString": "00:00:00.000-00:00:01.467"
                  },
                  "Description": {
                    "type": "string",
                    "valueString": "The video opens with a dramatic aerial shot of a small airplane flying over a tropical island surrounded by turquoise waters. The logos for 'Flight Simulator' and 'Microsoft Azure AI' are prominently displayed, indicating a collaboration or feature integration between the two."
                  },
                  "Sentiment": {
                    "type": "string",
                    "valueString": "Positive"
                  }
                }
              }, ...
            ]
          }
        },
        "kind": "audioVisual",
        "startTimeMs": 0,
        "endTimeMs": 43866,
        "width": 1080,
        "height": 608,
        "keyFrameTimesMs": [733, ... , 43233],
        "transcriptPhrases": [
          {
            "speaker": "Speaker 1",
            "startTimeMs": 1360,
            "endTimeMs": 6640,
            "text": "When it comes to the neural TTS, in order to get a good voice, it's better to have good data.",
            "words": []
          }, ...
        ],
        "cameraShotTimesMs": [1467, ... , 42033],
        "segments": [
          {
            "startTimeMs": 0,
            "endTimeMs": 1467,
            "description": "The video begins with a scenic aerial view of an island, showcasing the collaboration between Flight Simulator and Microsoft Azure AI.",
            "segmentId": "1"
          }, ...
        ]
      }
    ]
  },
  "usage": {
    "videoHours": 0.013,
    "tokens": {
      "contextualization": 12222.223
    }
  }
}
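
In the sample responses, each field stores its value under a type-specific key (valueString, valueNumber, valueArray, or valueObject). The following is a minimal sketch that converts such a fields object into plain Python values. The helper name field_value is hypothetical, and only the four field types that appear in the samples are handled.

```python
def field_value(field: dict):
    """Convert one field from a result into a plain Python value."""
    kind = field.get("type")
    if kind == "array":
        return [field_value(item) for item in field.get("valueArray", [])]
    if kind == "object":
        return {name: field_value(sub) for name, sub in field.get("valueObject", {}).items()}
    if kind == "number":
        return field.get("valueNumber")
    if kind == "string":
        return field.get("valueString")
    raise ValueError(f"Unhandled field type: {kind!r}")


# Trimmed copy of the fields from the receipt sample response.
sample_fields = {
    "VendorName": {"type": "string", "valueString": "Contoso", "confidence": 0.996},
    "Items": {
        "type": "array",
        "valueArray": [
            {
                "type": "object",
                "valueObject": {
                    "Description": {"type": "string", "valueString": "2 Surface Pro 6"},
                    "Amount": {"type": "number", "valueNumber": 1998},
                },
            }
        ],
    },
}

values = {name: field_value(field) for name, field in sample_fields.items()}
print(values)
```

This sketch discards confidence, spans, and source. If you need to review low-confidence extractions, such as the 0.423 confidence on the item description in the receipt sample, read those properties from the original field objects.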

Client library | Samples | SDK source

This guide shows you how to use the Content Understanding Python SDK to create a custom analyzer that extracts structured data from your content. Custom analyzers support document, image, audio, and video content types.

Prerequisites

An active Azure subscription. If you don't have an Azure account, create one for free.
A Microsoft Foundry resource created in a supported region.
Your resource endpoint and API key (found under Keys and Endpoint in the Azure portal).
Default model deployments configured for your resource. For setup instructions, see Models and deployments or this one-time configuration script.
Python 3.9 or later.

Set up

Install the Content Understanding client library for Python with pip:
```
pip install azure-ai-contentunderstanding
```
Optionally, install the Azure Identity library for Microsoft Entra authentication:
```
pip install azure-identity
```

Set up environment variables

To authenticate with the Content Understanding service, set the environment variables with your own values before you run the samples:

CONTENTUNDERSTANDING_ENDPOINT - the endpoint of your Content Understanding resource.
CONTENTUNDERSTANDING_KEY - your Content Understanding API key (optional if you use Microsoft Entra ID with DefaultAzureCredential).

Windows

setx CONTENTUNDERSTANDING_ENDPOINT "your-endpoint"
setx CONTENTUNDERSTANDING_KEY "your-key"

Linux / macOS

export CONTENTUNDERSTANDING_ENDPOINT="your-endpoint"
export CONTENTUNDERSTANDING_KEY="your-key"
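
Because the key is optional, a script can check the environment up front and fail with a clear message if the endpoint is missing. The following is a minimal sketch that uses only the standard library. The helper name load_settings and the example endpoint are placeholders introduced for illustration. A key of None signals that the caller should use a Microsoft Entra ID credential instead.

```python
import os


def load_settings(env=None) -> dict:
    """Read the endpoint (required) and key (optional) from environment variables."""
    env = os.environ if env is None else env
    endpoint = env.get("CONTENTUNDERSTANDING_ENDPOINT", "").strip()
    if not endpoint:
        raise RuntimeError("Set CONTENTUNDERSTANDING_ENDPOINT before running the samples.")
    # An empty or missing key becomes None.
    key = env.get("CONTENTUNDERSTANDING_KEY", "").strip() or None
    return {"endpoint": endpoint.rstrip("/"), "key": key}


# A dictionary stands in for os.environ so the example runs anywhere.
settings = load_settings({"CONTENTUNDERSTANDING_ENDPOINT": "https://example-resource.example.com/"})
print(settings)
```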

Create the client

Import the required libraries and models, and then create the client with your resource endpoint and credentials.

import os
import time
from azure.ai.contentunderstanding import ContentUnderstandingClient
from azure.core.credentials import AzureKeyCredential

endpoint = os.environ["CONTENTUNDERSTANDING_ENDPOINT"]
key = os.environ["CONTENTUNDERSTANDING_KEY"]

client = ContentUnderstandingClient(
    endpoint=endpoint,
    credential=AzureKeyCredential(key),
)

Create a custom analyzer

The following example creates a custom document analyzer based on the prebuilt document base analyzer. It defines fields by using three extraction methods: extract for literal text, generate for AI-generated fields or interpretations, and classify for categorization.

from azure.ai.contentunderstanding.models import (
    ContentAnalyzer,
    ContentAnalyzerConfig,
    ContentFieldSchema,
    ContentFieldDefinition,
    ContentFieldType,
    GenerationMethod,
)

# Generate a unique analyzer ID
analyzer_id = f"my_document_analyzer_{int(time.time())}"

# Define field schema with custom fields
field_schema = ContentFieldSchema(
    name="company_schema",
    description="Schema for extracting company information",
    fields={
        "company_name": ContentFieldDefinition(
            type=ContentFieldType.STRING,
            method=GenerationMethod.EXTRACT,
            description="Name of the company",
            estimate_source_and_confidence=True,
        ),
        "total_amount": ContentFieldDefinition(
            type=ContentFieldType.NUMBER,
            method=GenerationMethod.EXTRACT,
            description="Total amount on the document",
            estimate_source_and_confidence=True,
        ),
        "document_summary": ContentFieldDefinition(
            type=ContentFieldType.STRING,
            method=GenerationMethod.GENERATE,
            description=(
                "A brief summary of the document content"
            ),
        ),
        "document_type": ContentFieldDefinition(
            type=ContentFieldType.STRING,
            method=GenerationMethod.CLASSIFY,
            description="Type of document",
            enum=[
                "invoice", "receipt", "contract",
                "report", "other",
            ],
        ),
    },
)

# Create analyzer configuration
config = ContentAnalyzerConfig(
    enable_formula=True,
    enable_layout=True,
    enable_ocr=True,
    estimate_field_source_and_confidence=True,
    return_details=True,
)

# Create the analyzer with field schema
analyzer = ContentAnalyzer(
    base_analyzer_id="prebuilt-document",
    description=(
        "Custom analyzer for extracting company information"
    ),
    config=config,
    field_schema=field_schema,
    models={
        "completion": "gpt-4.1",
        "embedding": "text-embedding-3-large",
    }, # Required when using field_schema and prebuilt-document base analyzer
)

# Create the analyzer
poller = client.begin_create_analyzer(
    analyzer_id=analyzer_id,
    resource=analyzer,
)
result = poller.result() # Wait for creation to complete

# Get the full analyzer details after creation
result = client.get_analyzer(analyzer_id=analyzer_id)
print(f"Analyzer '{analyzer_id}' created successfully!")

if result.description:
    print(f"  Description: {result.description}")

if result.field_schema and result.field_schema.fields:
    print(f"  Fields ({len(result.field_schema.fields)}):")
    for field_name, field_def in result.field_schema.fields.items():
        method = field_def.method if field_def.method else "auto"
        field_type = field_def.type if field_def.type else "unknown"
        print(f"    - {field_name}: {field_type} ({method})")

The output looks similar to the following example:

Analyzer 'my_document_analyzer_ID' created successfully!
  Description: Custom analyzer for extracting company information
  Fields (4):
    - company_name: ContentFieldType.STRING (GenerationMethod.EXTRACT)
    - total_amount: ContentFieldType.NUMBER (GenerationMethod.EXTRACT)
    - document_summary: ContentFieldType.STRING (GenerationMethod.GENERATE)
    - document_type: ContentFieldType.STRING (GenerationMethod.CLASSIFY)

Tip

This code is based on the create analyzer sample in the SDK repository.

Optionally, you can create a classifier analyzer to categorize documents and use its results to route documents to prebuilt analyzers or to custom analyzers that you created. The following example creates a custom analyzer for a classification workflow.

import time
from azure.ai.contentunderstanding.models import (
    ContentAnalyzer,
    ContentAnalyzerConfig,
    ContentCategoryDefinition,
)

# Generate a unique analyzer ID
analyzer_id = f"my_classifier_{int(time.time())}"

print(f"Creating classifier '{analyzer_id}'...")

# Define content categories for classification
categories = {
    "Loan_Application": ContentCategoryDefinition(
        description="Documents submitted by individuals or businesses to request funding, "
        "typically including personal or business details, financial history, "
        "loan amount, purpose, and supporting documentation."
    ),
    "Invoice": ContentCategoryDefinition(
        description="Billing documents issued by sellers or service providers to request "
        "payment for goods or services, detailing items, prices, taxes, totals, "
        "and payment terms."
    ),
    "Bank_Statement": ContentCategoryDefinition(
        description="Official statements issued by banks that summarize account activity "
        "over a period, including deposits, withdrawals, fees, and balances."
    ),
}

# Create analyzer configuration
config = ContentAnalyzerConfig(
    return_details=True,
    enable_segment=True,  # Enable automatic segmentation by category
    content_categories=categories,
)

# Create the classifier analyzer
classifier = ContentAnalyzer(
    base_analyzer_id="prebuilt-document",
    description="Custom classifier for financial document categorization",
    config=config,
    models={"completion": "gpt-4.1"},
)

# Create the classifier
poller = client.begin_create_analyzer(
    analyzer_id=analyzer_id,
    resource=classifier,
)
result = poller.result()  # Wait for creation to complete

# Get the full analyzer details after creation
result = client.get_analyzer(analyzer_id=analyzer_id)

print(f"Classifier '{analyzer_id}' created successfully!")
if result.description:
    print(f"  Description: {result.description}")

Tip

This code is based on the create classifier sample in the SDK repository.

The following example creates a custom image analyzer based on the prebuilt image base analyzer to process charts and graphs.

from azure.ai.contentunderstanding.models import (
    ContentAnalyzer,
    ContentFieldSchema,
    ContentFieldDefinition,
    ContentFieldType,
    GenerationMethod,
)

# Generate a unique analyzer ID
analyzer_id = f"my_image_analyzer_{int(time.time())}"

# Define field schema with custom fields
field_schema = ContentFieldSchema(
    name="chart_schema",
    description=(
        "Schema for extracting chart information"
    ),
    fields={
        "Title": ContentFieldDefinition(
            type=ContentFieldType.STRING,
            description="Title of the chart",
        ),
        "ChartType": ContentFieldDefinition(
            type=ContentFieldType.STRING,
            method=GenerationMethod.CLASSIFY,
            description="Type of chart",
            enum=["bar", "line", "pie"],
        ),
    },
)

# Create the analyzer with field schema
analyzer = ContentAnalyzer(
    base_analyzer_id="prebuilt-image",
    description=(
        "Custom analyzer for charts and graphs"
    ),
    field_schema=field_schema,
    models={
        "completion": "gpt-4.1",
    },
)

# Create the analyzer
poller = client.begin_create_analyzer(
    analyzer_id=analyzer_id,
    resource=analyzer,
)
result = poller.result()  # Wait for creation to complete

# Get the full analyzer details after creation
result = client.get_analyzer(analyzer_id=analyzer_id)
print(f"Analyzer '{analyzer_id}' created successfully!")

if result.description:
    print(f"  Description: {result.description}")

if result.field_schema and result.field_schema.fields:
    print(f"  Fields ({len(result.field_schema.fields)}):")
    for field_name, field_def in result.field_schema.fields.items():
        method = field_def.method if field_def.method else "auto"
        field_type = field_def.type if field_def.type else "unknown"
        print(f"    - {field_name}: {field_type} ({method})")

Example output looks like:

Analyzer 'my_image_analyzer_ID' created successfully!
  Description: Custom analyzer for charts and graphs
  Fields (2):
    - Title: ContentFieldType.STRING (auto)
    - ChartType: ContentFieldType.STRING (GenerationMethod.CLASSIFY)

Tip

This code adapts the create analyzer sample pattern for image content.
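
The listing above prints enum members in their `ClassName.MEMBER` form. If you prefer plain string values, a small helper can normalize them. The following is a minimal sketch, not part of the SDK: `FieldType` and `Method` are hypothetical stand-ins for `ContentFieldType` and `GenerationMethod`, and their string values are assumptions made for illustration.

```python
from enum import Enum


# Hypothetical stand-ins for the SDK enums; the string values are assumptions.
class FieldType(str, Enum):
    STRING = "string"


class Method(str, Enum):
    CLASSIFY = "classify"


def label(member, default):
    """Return the plain string value of an enum member, or a default."""
    if member is None:
        return default
    # Enum members expose .value; plain strings pass through unchanged.
    return str(getattr(member, "value", member))


def describe_field(name, field_type, method):
    """Build one display line for a field definition."""
    return f"{name}: {label(field_type, 'unknown')} ({label(method, 'auto')})"


print(describe_field("Title", FieldType.STRING, None))
print(describe_field("ChartType", FieldType.STRING, Method.CLASSIFY))
```

Reading `.value` explicitly keeps the printed text the same regardless of how a given Python version formats enum members in f-strings.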

The following example creates a custom audio analyzer based on the prebuilt audio analyzer for processing customer support call recordings.

import time

from azure.ai.contentunderstanding.models import (
    ContentAnalyzer,
    ContentAnalyzerConfig,
    ContentFieldSchema,
    ContentFieldDefinition,
    ContentFieldType,
    GenerationMethod,
)

# Generate a unique analyzer ID
analyzer_id = f"my_audio_analyzer_{int(time.time())}"

# Define field schema with custom fields
field_schema = ContentFieldSchema(
    name="call_center_schema",
    description=(
        "Schema for analyzing customer support calls"
    ),
    fields={
        "Summary": ContentFieldDefinition(
            type=ContentFieldType.STRING,
            method=GenerationMethod.GENERATE,
            description="Summary of the call",
        ),
        "Sentiment": ContentFieldDefinition(
            type=ContentFieldType.STRING,
            method=GenerationMethod.CLASSIFY,
            description="Overall sentiment of the call",
            enum=["Positive", "Neutral", "Negative"],
        ),
        "People": ContentFieldDefinition(
            type=ContentFieldType.ARRAY,
            description="List of people mentioned",
            item_definition=ContentFieldDefinition(
                type=ContentFieldType.OBJECT,
                properties={
                    "Name": ContentFieldDefinition(
                        type=ContentFieldType.STRING,
                    ),
                    "Role": ContentFieldDefinition(
                        type=ContentFieldType.STRING,
                    ),
                },
            ),
        ),
    },
)

# Create analyzer configuration
config = ContentAnalyzerConfig(
    locales=["en-US", "fr-FR"],
    return_details=True,
)

# Create the analyzer with field schema
analyzer = ContentAnalyzer(
    base_analyzer_id="prebuilt-audio",
    description=(
        "Custom analyzer for customer support calls"
    ),
    config=config,
    field_schema=field_schema,
    models={
        "completion": "gpt-4.1",
    },
)

# Create the analyzer
poller = client.begin_create_analyzer(
    analyzer_id=analyzer_id,
    resource=analyzer,
)
result = poller.result()  # Wait for creation to complete

# Get the full analyzer details after creation
result = client.get_analyzer(analyzer_id=analyzer_id)
print(f"Analyzer '{analyzer_id}' created successfully!")

if result.description:
    print(f"  Description: {result.description}")

if result.field_schema and result.field_schema.fields:
    print(f"  Fields ({len(result.field_schema.fields)}):")
    for field_name, field_def in result.field_schema.fields.items():
        method = field_def.method if field_def.method else "auto"
        field_type = field_def.type if field_def.type else "unknown"
        print(f"    - {field_name}: {field_type} ({method})")

Example output looks like:

Analyzer 'my_audio_analyzer_ID' created successfully!
  Description: Custom analyzer for customer support calls
  Fields (3):
    - Summary: ContentFieldType.STRING (GenerationMethod.GENERATE)
    - Sentiment: ContentFieldType.STRING (GenerationMethod.CLASSIFY)
    - People: ContentFieldType.ARRAY (auto)

Tip

This code adapts the create analyzer sample pattern for audio content.
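
The listing loop above prints only top-level fields, so the `Name` and `Role` properties nested under `People` don't appear in the output. A recursive walk can surface them. This is a minimal sketch under stated assumptions: the `field` helper builds hypothetical stand-in objects that only mimic the `type`, `item_definition`, and `properties` attributes used in the schema above, and the type strings are illustrative.

```python
from types import SimpleNamespace


def field(type_, item_definition=None, properties=None):
    """Hypothetical stand-in for a field definition object."""
    return SimpleNamespace(
        type=type_,
        item_definition=item_definition,
        properties=properties,
    )


def walk_fields(fields, prefix=""):
    """Yield (path, type) pairs for every field, including nested ones."""
    for name, definition in fields.items():
        path = f"{prefix}{name}"
        yield path, definition.type
        item = getattr(definition, "item_definition", None)
        if item is not None:
            # Array field: describe the element definition under "path[]".
            yield from walk_fields({"[]": item}, prefix=path)
        props = getattr(definition, "properties", None)
        if props:
            # Object field: describe each property under "path.".
            yield from walk_fields(props, prefix=f"{path}.")


schema = {
    "People": field(
        "array",
        item_definition=field(
            "object",
            properties={"Name": field("string"), "Role": field("string")},
        ),
    ),
}

for path, type_ in walk_fields(schema):
    print(f"{path}: {type_}")
```

With a real analyzer you would pass `result.field_schema.fields` instead of the stand-in `schema`, provided the returned objects expose the same attribute names.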

The following example creates a custom video analyzer based on the prebuilt video base analyzer for processing product demos and reviews.

import time

from azure.ai.contentunderstanding.models import (
    ContentAnalyzer,
    ContentAnalyzerConfig,
    ContentFieldSchema,
    ContentFieldDefinition,
    ContentFieldType,
    GenerationMethod,
)

# Generate a unique analyzer ID
analyzer_id = f"my_video_analyzer_{int(time.time())}"

# Define field schema with custom fields
field_schema = ContentFieldSchema(
    name="video_schema",
    description=(
        "Schema for analyzing product demo videos"
    ),
    fields={
        "Segments": ContentFieldDefinition(
            type=ContentFieldType.ARRAY,
            item_definition=ContentFieldDefinition(
                type=ContentFieldType.OBJECT,
                properties={
                    "SegmentId": ContentFieldDefinition(
                        type=ContentFieldType.STRING,
                    ),
                    "Description": ContentFieldDefinition(
                        type=ContentFieldType.STRING,
                        method=GenerationMethod.GENERATE,
                        description=(
                            "Detailed summary of the "
                            "video segment"
                        ),
                    ),
                    "Sentiment": ContentFieldDefinition(
                        type=ContentFieldType.STRING,
                        method=GenerationMethod.CLASSIFY,
                        enum=[
                            "Positive", "Neutral",
                            "Negative",
                        ],
                    ),
                },
            ),
        ),
    },
)

# Create analyzer configuration
config = ContentAnalyzerConfig(
    locales=["en-US", "fr-FR"],
    return_details=True,
)

# Create the analyzer with field schema
analyzer = ContentAnalyzer(
    base_analyzer_id="prebuilt-video",
    description=(
        "Custom analyzer for product demo videos"
    ),
    config=config,
    field_schema=field_schema,
    models={
        "completion": "gpt-4.1",
    },
)

# Create the analyzer
poller = client.begin_create_analyzer(
    analyzer_id=analyzer_id,
    resource=analyzer,
)
result = poller.result()  # Wait for creation to complete

# Get the full analyzer details after creation
result = client.get_analyzer(analyzer_id=analyzer_id)
print(f"Analyzer '{analyzer_id}' created successfully!")

if result.description:
    print(f"  Description: {result.description}")

if result.field_schema and result.field_schema.fields:
    print(f"  Fields ({len(result.field_schema.fields)}):")
    for field_name, field_def in result.field_schema.fields.items():
        method = field_def.method if field_def.method else "auto"
        field_type = field_def.type if field_def.type else "unknown"
        print(f"    - {field_name}: {field_type} ({method})")

Example output looks like:

Analyzer 'my_video_analyzer_ID' created successfully!
  Description: Custom analyzer for product demo videos
  Fields (1):
    - Segments: ContentFieldType.ARRAY (auto)

Tip

This code adapts the create analyzer sample pattern for video content.

Use the custom analyzer

After you create the analyzer, use it to analyze a document and extract the custom fields. Delete the analyzer when you no longer need it.

# --- Use the custom document analyzer ---
from azure.ai.contentunderstanding.models import AnalysisInput

print("\nAnalyzing document...")
document_url = (
    "https://raw.githubusercontent.com/"
    "Azure-Samples/"
    "azure-ai-content-understanding-assets/"
    "main/document/invoice.pdf"
)

poller = client.begin_analyze(
    analyzer_id=analyzer_id,
    inputs=[AnalysisInput(url=document_url)],
)
result = poller.result()

if result.contents and len(result.contents) > 0:
    content = result.contents[0]
    if content.fields:
        company = content.fields.get("company_name")
        if company:
            print(f"Company Name: {company.value}")
            if company.confidence is not None:
                print(
                    f"  Confidence:"
                    f" {company.confidence:.2f}"
                )

        total = content.fields.get("total_amount")
        if total:
            print(f"Total Amount: {total.value}")

        summary = content.fields.get(
            "document_summary"
        )
        if summary:
            print(f"Summary: {summary.value}")

        doc_type = content.fields.get("document_type")
        if doc_type:
            print(f"Document Type: {doc_type.value}")
else:
    print("No content returned from analysis.")

# --- Clean up ---
print(f"\nCleaning up: deleting analyzer '{analyzer_id}'...")
client.delete_analyzer(analyzer_id=analyzer_id)
print(f"Analyzer '{analyzer_id}' deleted successfully.")

Example output looks like:

Analyzing document...
Company Name: CONTOSO LTD.
  Confidence: 0.81
Total Amount: 610.0
Summary: This document is an invoice from CONTOSO LTD. to Microsoft Corporation for consulting, document, and printing services provided during the service period. It details line items, subtotal, sales tax, total, previous unpaid balance, and the final amount due.
Document Type: invoice

Cleaning up: deleting analyzer 'my_document_analyzer_ID'...
Analyzer 'my_document_analyzer_ID' deleted successfully.

Tip

See more examples of running analyzers in the SDK samples.
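
When you read several fields, the repeated lookup-and-check pattern can be folded into one helper. The following is a minimal sketch, assuming each field object exposes `value` and, optionally, `confidence` as in the sample above. `format_field` is a hypothetical helper, and the `SimpleNamespace` objects stand in for real analysis results.

```python
from types import SimpleNamespace


def format_field(fields, name, label):
    """Return a display line for a field, or None when it is missing."""
    field = fields.get(name)
    if field is None or field.value is None:
        return None
    line = f"{label}: {field.value}"
    confidence = getattr(field, "confidence", None)
    # Compare against None so that a confidence of 0.0 is still reported.
    if confidence is not None:
        line += f" (confidence {confidence:.2f})"
    return line


# Stand-in data shaped like the sample output; not a real service response.
fields = {
    "company_name": SimpleNamespace(value="CONTOSO LTD.", confidence=0.81),
    "total_amount": SimpleNamespace(value=610.0, confidence=None),
}

for name, label in [
    ("company_name", "Company Name"),
    ("total_amount", "Total Amount"),
    ("document_type", "Document Type"),
]:
    line = format_field(fields, name, label)
    if line:
        print(line)
```

Returning `None` for a missing field keeps the caller free to decide whether to skip it or report it.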

After you create the analyzer, use it to analyze an image and extract the custom fields. Delete the analyzer when you no longer need it.

from azure.ai.contentunderstanding.models import AnalysisInput

# --- Use the custom image analyzer ---
print("\nAnalyzing image...")
image_url = (
    "https://raw.githubusercontent.com/"
    "Azure-Samples/"
    "azure-ai-content-understanding-assets/"
    "main/image/pieChart.jpg"
)

poller = client.begin_analyze(
    analyzer_id=analyzer_id,
    inputs=[AnalysisInput(url=image_url)],
)
result = poller.result()

if result.contents and len(result.contents) > 0:
    content = result.contents[0]
    if content.fields:
        title = content.fields.get("Title")
        if title:
            print(f"Title: {title.value}")

        chart_type = content.fields.get("ChartType")
        if chart_type:
            print(f"Chart Type: {chart_type.value}")
else:
    print("No content returned from analysis.")

# --- Clean up ---
print(f"\nCleaning up: deleting analyzer '{analyzer_id}'...")
client.delete_analyzer(analyzer_id=analyzer_id)
print(f"Analyzer '{analyzer_id}' deleted successfully.")

Example output looks like:

Analyzing image...
Title: Distribution of Weekly Working Hours
Chart Type: pie

Cleaning up: deleting analyzer 'my_image_analyzer_ID'...
Analyzer 'my_image_analyzer_ID' deleted successfully.

Tip

See more examples of running analyzers in the SDK samples.

After you create the analyzer, use it to analyze an audio file and extract the custom fields. Delete the analyzer when you no longer need it.

from azure.ai.contentunderstanding.models import AnalysisInput

print("\nAnalyzing audio...")
audio_url = (
    "https://raw.githubusercontent.com/"
    "Azure-Samples/"
    "azure-ai-content-understanding-assets/"
    "main/audio/callCenterRecording.mp3"
)

poller = client.begin_analyze(
    analyzer_id=analyzer_id,
    inputs=[AnalysisInput(url=audio_url)],
)
result = poller.result()

if result.contents and len(result.contents) > 0:
    content = result.contents[0]
    if content.fields:
        summary = content.fields.get("Summary")
        if summary:
            print(f"Summary: {summary.value}")

        sentiment = content.fields.get("Sentiment")
        if sentiment:
            print(f"Sentiment: {sentiment.value}")
else:
    print("No content returned from analysis.")

# --- Clean up ---
print(f"\nCleaning up: deleting analyzer '{analyzer_id}'...")
client.delete_analyzer(analyzer_id=analyzer_id)
print(f"Analyzer '{analyzer_id}' deleted successfully.")

Example output looks like:

Analyzing audio...
Summary: Maria Smith contacted Contoso to inquire about her current point balance. John Doe, the representative, verified her identity by requesting her date of birth and informed her that her balance is 599 points. Maria confirmed she needed no further information and ended the call.
Sentiment: Positive

Cleaning up: deleting analyzer 'my_audio_analyzer_ID'...
Analyzer 'my_audio_analyzer_ID' deleted successfully.

Tip

See more examples of running analyzers in the SDK samples.

After you create the analyzer, use it to analyze a video and extract the custom fields. Delete the analyzer when you no longer need it.

from azure.ai.contentunderstanding.models import AnalysisInput

print("\nAnalyzing video...")
video_url = (
    "https://raw.githubusercontent.com/"
    "Azure-Samples/"
    "azure-ai-content-understanding-assets/"
    "main/videos/sdk_samples/FlightSimulator.mp4"
)

poller = client.begin_analyze(
    analyzer_id=analyzer_id,
    inputs=[AnalysisInput(url=video_url)],
)
result = poller.result()

if result.contents and len(result.contents) > 0:
    content = result.contents[0]
    if content.fields:
        segments = content.fields.get("Segments")
        if segments and segments.value:
            print(f"Found {len(segments.value)} segments")
            for i, segment in enumerate(
                segments.value
            ):
                if segment.value:
                    seg_id = segment.value.get(
                        "SegmentId"
                    )
                    desc = segment.value.get(
                        "Description"
                    )
                    print(f"Segment {i + 1}:")
                    if seg_id:
                        print(
                            f"  ID: {seg_id.value}"
                        )
                    if desc:
                        print(
                            f"  Desc: {desc.value}"
                        )
else:
    print("No content returned from analysis.")

# --- Clean up ---
print(f"\nCleaning up: deleting analyzer '{analyzer_id}'...")
client.delete_analyzer(analyzer_id=analyzer_id)
print(f"Analyzer '{analyzer_id}' deleted successfully.")

Example output looks like:

Analyzing video...
Found 16 segments
Segment 1:
  ID: 00:00:00.000-00:00:01.467
  Desc: The video opens with a scenic aerial view of an island, featuring a small airplane flying over the landscape. The screen displays the logos for 'Flight Simulator' and 'Microsoft Azure AI,' indicating a collaboration or integration between the two products.
Segment 2:
  ID: 00:00:01.467-00:00:03.233
  Desc: A man is shown sitting in a modern office setting, likely preparing to speak or introduce the topic. The background features geometric wall decorations and a plant, giving a professional and contemporary atmosphere.
Segment 3:
  ID: 00:00:03.233-00:00:07.367
  Desc: The segment displays a close-up of audio waveforms on a screen, visually representing sound data. This is accompanied by narration about the importance of good data for neural TTS (Text-to-Speech) and the process of building a universal TTS model using 3,000 hours of data.
Segment 4:
  ID: 00:00:07.367-00:00:08.200
  Desc: Another man appears in a similar office environment, possibly continuing the explanation or providing additional insights about the TTS model.
Segment 5:
  ID: 00:00:08.200-00:00:11.367
  Desc: The video transitions to an outdoor scene showing a large facility surrounded by fields, likely representing a data center or server farm. This visual supports the narration about accumulating large amounts of data for the universal TTS model.
Segment 6:
  ID: 00:00:11.367-00:00:13.567
  Desc: Inside a data center, rows of servers are shown, emphasizing the technological infrastructure required for processing and storing vast amounts of audio data.
Segment 7:
  ID: 00:00:13.567-00:00:16.100
  Desc: The first man returns, continuing his explanation in the office setting. The narration discusses how the universal model captures audio nuances to generate more natural voices.
Segment 8:
  ID: 00:00:16.100-00:00:19.433
  Desc: A biplane is seen flying over a picturesque landscape, reinforcing the connection to Flight Simulator and showcasing the realism enabled by advanced AI voice technology.
Segment 9:
  ID: 00:00:19.433-00:00:23.967
  Desc: A plane flies past a castle surrounded by lush greenery and mountains, further highlighting the immersive environments possible in Flight Simulator. The narration continues to emphasize the natural quality of AI-generated voices.
Segment 10:
  ID: 00:00:23.967-00:00:30.033
  Desc: A bald man is interviewed in a modern office space, discussing the high fidelity and human-like quality of voices produced by cognitive services offerings. The background features glass walls and plants, maintaining a professional tone.
Segment 11:
  ID: 00:00:30.033-00:00:33.200
  Desc: The interview continues with the bald man, focusing on the benefits of the AI voice technology. The setting remains consistent, reinforcing the credibility and expertise of the speaker.
Segment 12:
  ID: 00:00:33.200-00:00:35.267
  Desc: The video shifts to a top-down view of an airplane on a runway, preparing for movement. This visual ties back to the Flight Simulator theme and the realism of the simulation.
Segment 13:
  ID: 00:00:35.267-00:00:37.700
  Desc: A ground crew member directs an Airbus aircraft, with pilots visible in the cockpit. The scene demonstrates realistic airport operations, likely enhanced by AI-driven voice interactions.
Segment 14:
  ID: 00:00:37.700-00:00:39.200
  Desc: Two ground crew members walk near an airplane on the tarmac, with airport buildings in the background. The visuals continue to showcase the detailed simulation environment.
Segment 15:
  ID: 00:00:39.200-00:00:42.033
  Desc: A close-up of an Airbus aircraft at the gate, with sunlight illuminating the scene. The realism of the simulation is highlighted, possibly referencing the natural-sounding AI voices used in communications.
Segment 16:
  ID: 00:00:42.033-00:00:43.866
  Desc: The video concludes with the Microsoft logo and branding, signaling the end of the product demo and reinforcing the association with Microsoft Azure AI.

Cleaning up: deleting analyzer 'my_video_analyzer_ID'...
Analyzer 'my_video_analyzer_ID' deleted successfully.

Tip

See more examples of running analyzers in the SDK samples.
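
The `SegmentId` values in the sample output look like `start-end` timestamp pairs. If you need segment durations, you could parse them as sketched below. This assumes the `HH:MM:SS.mmm-HH:MM:SS.mmm` shape seen in the sample output. The field is model-generated, so that format isn't guaranteed, and `parse_segment_id` is a hypothetical helper rather than an SDK function.

```python
def to_seconds(timestamp):
    """Convert an 'HH:MM:SS.mmm' timestamp to seconds."""
    hours, minutes, seconds = timestamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def parse_segment_id(segment_id):
    """Split a 'start-end' segment ID into (start_seconds, end_seconds)."""
    start, end = segment_id.split("-")
    return to_seconds(start), to_seconds(end)


start, end = parse_segment_id("00:00:01.467-00:00:03.233")
print(f"Segment runs from {start:.3f}s to {end:.3f}s ({end - start:.3f}s long)")
```

A malformed ID raises `ValueError` from the unpacking or the numeric conversion, so wrap the call in `try`/`except` if you expect irregular values.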

Client library | Samples | SDK source

This guide shows you how to use the Content Understanding .NET SDK to create a custom analyzer that extracts structured data from your content. Custom analyzers support document, image, audio, and video content types.

Prerequisites

An active Azure subscription. If you don't have an Azure account, create one for free.
A Microsoft Foundry resource created in a supported region.
Your resource endpoint and API key (found under Keys and Endpoint in the Azure portal).
Model deployment defaults configured for your resource. For setup instructions, see Models and deployments or this one-time configuration script.
The current version of .NET.

Setup

Create a new .NET console application:

dotnet new console -n CustomAnalyzerTutorial
cd CustomAnalyzerTutorial

Install the Content Understanding client library for .NET:
```
dotnet add package Azure.AI.ContentUnderstanding
```
Optionally, install the Azure Identity library for Microsoft Entra authentication:
```
dotnet add package Azure.Identity
```

Set up environment variables

To authenticate with the Content Understanding service, set the following environment variables with your own values before running the sample:

CONTENTUNDERSTANDING_ENDPOINT - the endpoint of your Content Understanding resource.
CONTENTUNDERSTANDING_KEY - your Content Understanding API key (optional if you use Microsoft Entra ID with DefaultAzureCredential).

Windows

setx CONTENTUNDERSTANDING_ENDPOINT "your-endpoint"
setx CONTENTUNDERSTANDING_KEY "your-key"

Linux / macOS

export CONTENTUNDERSTANDING_ENDPOINT="your-endpoint"
export CONTENTUNDERSTANDING_KEY="your-key"

Create the client

using Azure;
using Azure.AI.ContentUnderstanding;

string endpoint = Environment.GetEnvironmentVariable(
    "CONTENTUNDERSTANDING_ENDPOINT");
string key = Environment.GetEnvironmentVariable(
    "CONTENTUNDERSTANDING_KEY");

var client = new ContentUnderstandingClient(
    new Uri(endpoint),
    new AzureKeyCredential(key)
);

Create a custom analyzer

The following example creates a custom document analyzer based on the prebuilt document analyzer. It defines fields using three extraction methods: extract for literal text, generate for AI-generated summaries, and classify for categorization.

string analyzerId =
    $"my_document_analyzer_{DateTimeOffset.UtcNow.ToUnixTimeSeconds()}";

var fieldSchema = new ContentFieldSchema(
    new Dictionary<string, ContentFieldDefinition>
    {
        ["company_name"] = new ContentFieldDefinition
        {
            Type = ContentFieldType.String,
            Method = GenerationMethod.Extract,
            Description = "Name of the company"
        },
        ["total_amount"] = new ContentFieldDefinition
        {
            Type = ContentFieldType.Number,
            Method = GenerationMethod.Extract,
            Description =
                "Total amount on the document"
        },
        ["document_summary"] = new ContentFieldDefinition
        {
            Type = ContentFieldType.String,
            Method = GenerationMethod.Generate,
            Description =
                "A brief summary of the document content"
        },
        ["document_type"] = new ContentFieldDefinition
        {
            Type = ContentFieldType.String,
            Method = GenerationMethod.Classify,
            Description = "Type of document"
        }
    })
{
    Name = "company_schema",
    Description =
        "Schema for extracting company information"
};

fieldSchema.Fields["document_type"].Enum.Add("invoice");
fieldSchema.Fields["document_type"].Enum.Add("receipt");
fieldSchema.Fields["document_type"].Enum.Add("contract");
fieldSchema.Fields["document_type"].Enum.Add("report");
fieldSchema.Fields["document_type"].Enum.Add("other");

var config = new ContentAnalyzerConfig
{
    EnableFormula = true,
    EnableLayout = true,
    EnableOcr = true,
    EstimateFieldSourceAndConfidence = true,
    ShouldReturnDetails = true
};

var customAnalyzer = new ContentAnalyzer
{
    BaseAnalyzerId = "prebuilt-document",
    Description =
        "Custom analyzer for extracting"
        + " company information",
    Config = config,
    FieldSchema = fieldSchema
};

customAnalyzer.Models["completion"] = "gpt-4.1";
customAnalyzer.Models["embedding"] =
    "text-embedding-3-large"; // Required when using field_schema and prebuilt-document base analyzer

var operation = await client.CreateAnalyzerAsync(
    WaitUntil.Completed,
    analyzerId,
    customAnalyzer);

ContentAnalyzer result = operation.Value;
Console.WriteLine(
    $"Analyzer '{analyzerId}'"
    + " created successfully!");

// Get the full analyzer details after creation
var analyzerDetails =
    await client.GetAnalyzerAsync(analyzerId);
result = analyzerDetails.Value;

if (result.Description != null)
{
    Console.WriteLine(
        $"  Description: {result.Description}");
}

if (result.FieldSchema?.Fields != null)
{
    Console.WriteLine(
        $"  Fields"
        + $" ({result.FieldSchema.Fields.Count}):");
    foreach (var kvp
        in result.FieldSchema.Fields)
    {
        var method =
            kvp.Value.Method?.ToString()
            ?? "auto";
        var fieldType =
            kvp.Value.Type?.ToString()
            ?? "unknown";
        Console.WriteLine(
            $"    - {kvp.Key}:"
            + $" {fieldType} ({method})");
    }
}

Example output looks like:

Analyzer 'my_document_analyzer_ID' created successfully!
  Description: Custom analyzer for extracting company information
  Fields (4):
    - company_name: string (extract)
    - total_amount: number (extract)
    - document_summary: string (generate)
    - document_type: string (classify)

Tip

This code is based on the Create Analyzer sample in the SDK repository.

// Generate a unique analyzer ID
string classifierId =
    $"my_classifier_{DateTimeOffset.UtcNow.ToUnixTimeSeconds()}";

Console.WriteLine(
    $"Creating classifier '{classifierId}'...");

// Define content categories for classification
var classifierConfig = new ContentAnalyzerConfig
{
    ShouldReturnDetails = true,
    EnableSegment = true
};

classifierConfig.ContentCategories
    .Add("Loan_Application",
        new ContentCategoryDefinition
        {
            Description =
                "Documents submitted by individuals"
                + " or businesses to request"
                + " funding, typically including"
                + " personal or business details,"
                + " financial history, loan amount,"
                + " purpose, and supporting"
                + " documentation."
        });

classifierConfig.ContentCategories
    .Add("Invoice",
        new ContentCategoryDefinition
        {
            Description =
                "Billing documents issued by"
                + " sellers or service providers"
                + " to request payment for goods"
                + " or services, detailing items,"
                + " prices, taxes, totals, and"
                + " payment terms."
        });

classifierConfig.ContentCategories
    .Add("Bank_Statement",
        new ContentCategoryDefinition
        {
            Description =
                "Official statements issued by"
                + " banks that summarize account"
                + " activity over a period,"
                + " including deposits,"
                + " withdrawals, fees,"
                + " and balances."
        });

// Create the classifier analyzer
var classifierAnalyzer = new ContentAnalyzer
{
    BaseAnalyzerId = "prebuilt-document",
    Description =
        "Custom classifier for financial"
        + " document categorization",
    Config = classifierConfig
};

classifierAnalyzer.Models["completion"] =
    "gpt-4.1";

var classifierOp =
    await client.CreateAnalyzerAsync(
        WaitUntil.Completed,
        classifierId,
        classifierAnalyzer);

// Get the full classifier details
var classifierDetails =
    await client.GetAnalyzerAsync(classifierId);
var classifierResult =
    classifierDetails.Value;

Console.WriteLine(
    $"Classifier '{classifierId}'"
    + " created successfully!");

if (classifierResult.Description != null)
{
    Console.WriteLine(
        $"  Description:"
        + $" {classifierResult.Description}");
}

Tip

This code is based on the Create Classifier sample for classification workflows.

The following example creates a custom image analyzer based on the prebuilt image analyzer for processing charts and graphs.

string analyzerId =
    $"my_image_analyzer_{DateTimeOffset.UtcNow.ToUnixTimeSeconds()}";

var fieldSchema = new ContentFieldSchema(
    new Dictionary<string, ContentFieldDefinition>
    {
        ["Title"] = new ContentFieldDefinition
        {
            Type = ContentFieldType.String,
            Description = "Title of the chart"
        },
        ["ChartType"] = new ContentFieldDefinition
        {
            Type = ContentFieldType.String,
            Method = GenerationMethod.Classify,
            Description = "Type of chart"
        }
    })
{
    Name = "chart_schema",
    Description =
        "Schema for extracting chart information"
};

fieldSchema.Fields["ChartType"].Enum.Add("bar");
fieldSchema.Fields["ChartType"].Enum.Add("line");
fieldSchema.Fields["ChartType"].Enum.Add("pie");

var customAnalyzer = new ContentAnalyzer
{
    BaseAnalyzerId = "prebuilt-image",
    Description =
        "Custom analyzer for charts and graphs",
    FieldSchema = fieldSchema
};

customAnalyzer.Models["completion"] = "gpt-4.1";

var operation = await client.CreateAnalyzerAsync(
    WaitUntil.Completed,
    analyzerId,
    customAnalyzer);

ContentAnalyzer result = operation.Value;
Console.WriteLine(
    $"Analyzer '{analyzerId}'"
    + " created successfully!");

// Get the full analyzer details after creation
var analyzerDetails =
    await client.GetAnalyzerAsync(analyzerId);
result = analyzerDetails.Value;

if (result.Description != null)
{
    Console.WriteLine(
        $"  Description: {result.Description}");
}

if (result.FieldSchema?.Fields != null)
{
    Console.WriteLine(
        $"  Fields"
        + $" ({result.FieldSchema.Fields.Count}):");
    foreach (var kvp
        in result.FieldSchema.Fields)
    {
        var method =
            kvp.Value.Method?.ToString()
            ?? "auto";
        var fieldType =
            kvp.Value.Type?.ToString()
            ?? "unknown";
        Console.WriteLine(
            $"    - {kvp.Key}:"
            + $" {fieldType} ({method})");
    }
}

Example output looks like:

Analyzer 'my_image_analyzer_ID' created successfully!
  Description: Custom analyzer for charts and graphs
  Fields (2):
    - Title: string (auto)
    - ChartType: string (classify)

Tip

This code adapts the Create Analyzer sample pattern for image content.

The following example creates a custom audio analyzer based on the prebuilt audio analyzer for processing customer support call recordings.

string analyzerId =
    $"my_audio_analyzer_{DateTimeOffset.UtcNow.ToUnixTimeSeconds()}";

var fieldSchema = new ContentFieldSchema(
    new Dictionary<string, ContentFieldDefinition>
    {
        ["Summary"] = new ContentFieldDefinition
        {
            Type = ContentFieldType.String,
            Method = GenerationMethod.Generate,
            Description = "Summary of the call"
        },
        ["Sentiment"] = new ContentFieldDefinition
        {
            Type = ContentFieldType.String,
            Method = GenerationMethod.Classify,
            Description =
                "Overall sentiment of the call"
        },
    })
{
    Name = "call_center_schema",
    Description =
        "Schema for analyzing customer"
        + " support calls"
};

fieldSchema.Fields["Sentiment"]
    .Enum.Add("Positive");
fieldSchema.Fields["Sentiment"]
    .Enum.Add("Neutral");
fieldSchema.Fields["Sentiment"]
    .Enum.Add("Negative");

var config = new ContentAnalyzerConfig
{
    ShouldReturnDetails = true
};

config.Locales.Add("en-US");
config.Locales.Add("fr-FR");

var customAnalyzer = new ContentAnalyzer
{
    BaseAnalyzerId = "prebuilt-audio",
    Description =
        "Custom analyzer for customer"
        + " support calls",
    Config = config,
    FieldSchema = fieldSchema
};

customAnalyzer.Models["completion"] = "gpt-4.1";

var operation = await client.CreateAnalyzerAsync(
    WaitUntil.Completed,
    analyzerId,
    customAnalyzer);

ContentAnalyzer result = operation.Value;
Console.WriteLine(
    $"Analyzer '{analyzerId}'"
    + " created successfully!");

// Get the full analyzer details after creation
var analyzerDetails =
    await client.GetAnalyzerAsync(analyzerId);
result = analyzerDetails.Value;

if (result.Description != null)
{
    Console.WriteLine(
        $"  Description: {result.Description}");
}

if (result.FieldSchema?.Fields != null)
{
    Console.WriteLine(
        $"  Fields"
        + $" ({result.FieldSchema.Fields.Count}):");
    foreach (var kvp
        in result.FieldSchema.Fields)
    {
        var method =
            kvp.Value.Method?.ToString()
            ?? "auto";
        var fieldType =
            kvp.Value.Type?.ToString()
            ?? "unknown";
        Console.WriteLine(
            $"    - {kvp.Key}:"
            + $" {fieldType} ({method})");
    }
}

The example output looks like this:

Analyzer 'my_audio_analyzer_ID' created successfully!
  Description: Custom analyzer for customer support calls
  Fields (2):
    - Summary: string (generate)
    - Sentiment: string (classify)

Tip

This code adapts the Create Analyzer sample pattern for audio content.

The following example creates a custom video analyzer, based on the prebuilt video analyzer, to process product demos and reviews.

string analyzerId =
    $"my_video_analyzer_{DateTimeOffset.UtcNow.ToUnixTimeSeconds()}";

var segmentItemDef = new ContentFieldDefinition
{
    Type = ContentFieldType.Object
};
segmentItemDef.Properties.Add("SegmentId",
    new ContentFieldDefinition
    {
        Type = ContentFieldType.String
    });
segmentItemDef.Properties.Add("Description",
    new ContentFieldDefinition
    {
        Type = ContentFieldType.String,
        Method = GenerationMethod.Generate,
        Description =
            "Detailed summary of the "
            + "video segment"
    });
segmentItemDef.Properties.Add("Sentiment",
    new ContentFieldDefinition
    {
        Type = ContentFieldType.String,
        Method = GenerationMethod.Classify
    });

var segmentsDef = new ContentFieldDefinition
{
    Type = ContentFieldType.Array
};
segmentsDef.ItemDefinition = segmentItemDef;

var fieldSchema = new ContentFieldSchema(
    new Dictionary<string, ContentFieldDefinition>
    {
        ["Segments"] = segmentsDef
    })
{
    Name = "video_schema",
    Description =
        "Schema for analyzing product"
        + " demo videos"
};

var sentimentDef =
    fieldSchema.Fields["Segments"]
        .ItemDefinition.Properties["Sentiment"];
sentimentDef.Enum.Add("Positive");
sentimentDef.Enum.Add("Neutral");
sentimentDef.Enum.Add("Negative");

var config = new ContentAnalyzerConfig
{
    ShouldReturnDetails = true
};

config.Locales.Add("en-US");
config.Locales.Add("fr-FR");

var customAnalyzer = new ContentAnalyzer
{
    BaseAnalyzerId = "prebuilt-video",
    Description =
        "Custom analyzer for product"
        + " demo videos",
    Config = config,
    FieldSchema = fieldSchema
};

customAnalyzer.Models["completion"] = "gpt-4.1";

var operation = await client.CreateAnalyzerAsync(
    WaitUntil.Completed,
    analyzerId,
    customAnalyzer);

ContentAnalyzer result = operation.Value;
Console.WriteLine(
    $"Analyzer '{analyzerId}'"
    + " created successfully!");

// Get the full analyzer details after creation
var analyzerDetails =
    await client.GetAnalyzerAsync(analyzerId);
result = analyzerDetails.Value;

if (result.Description != null)
{
    Console.WriteLine(
        $"  Description: {result.Description}");
}

if (result.FieldSchema?.Fields != null)
{
    Console.WriteLine(
        $"  Fields"
        + $" ({result.FieldSchema.Fields.Count}):");
    foreach (var kvp
        in result.FieldSchema.Fields)
    {
        var method =
            kvp.Value.Method?.ToString()
            ?? "auto";
        var fieldType =
            kvp.Value.Type?.ToString()
            ?? "unknown";
        Console.WriteLine(
            $"    - {kvp.Key}:"
            + $" {fieldType} ({method})");
    }
}

The example output looks like this:

Analyzer 'my_video_analyzer_ID' created successfully!
  Description: Custom analyzer for product demo videos
  Fields (1):
    - Segments: Array (auto)

Tip

This code adapts the Create Analyzer sample pattern for video content.

Use the custom analyzer

After you create the analyzer, use it to analyze documents and extract the custom fields. Delete the analyzer when you no longer need it.

var documentUrl = new Uri(
    "https://raw.githubusercontent.com/"
    + "Azure-Samples/"
    + "azure-ai-content-understanding-assets/"
    + "main/document/invoice.pdf"
);

var analyzeOperation = await client.AnalyzeAsync(
    WaitUntil.Completed,
    analyzerId,
    inputs: new[] {
        new AnalysisInput { Uri = documentUrl }
    });

var analyzeResult = analyzeOperation.Value;

if (analyzeResult.Contents?.FirstOrDefault()
    is DocumentContent content)
{
    if (content.Fields.TryGetValue(
        "company_name", out var companyField))
    {
        var name =
            companyField is ContentStringField sf
                ? sf.Value : null;
        Console.WriteLine(
            $"Company Name: "
            + $"{name ?? "(not found)"}");
        Console.WriteLine(
            "  Confidence: "
            + (companyField.Confidence?
                .ToString("F2") ?? "N/A"));
    }

    if (content.Fields.TryGetValue(
        "total_amount", out var totalField))
    {
        var total =
            totalField is ContentNumberField nf
                ? nf.Value : null;
        Console.WriteLine(
            $"Total Amount: {total}");
    }

    if (content.Fields.TryGetValue(
        "document_summary", out var summaryField))
    {
        var summary =
            summaryField is ContentStringField sf
                ? sf.Value : null;
        Console.WriteLine(
            $"Summary: "
            + $"{summary ?? "(not found)"}");
    }

    if (content.Fields.TryGetValue(
        "document_type", out var typeField))
    {
        var docType =
            typeField is ContentStringField sf
                ? sf.Value : null;
        Console.WriteLine(
            $"Document Type: "
            + $"{docType ?? "(not found)"}");
    }
}

// --- Clean up ---
Console.WriteLine(
    $"\nCleaning up: deleting analyzer"
    + $" '{analyzerId}'...");
await client.DeleteAnalyzerAsync(analyzerId);
Console.WriteLine(
    $"Analyzer '{analyzerId}'"
    + " deleted successfully.");

The example output looks like this:

Company Name: CONTOSO LTD.
  Confidence: 0.88
Total Amount: 610
Summary: This document is an invoice from CONTOSO LTD. to MICROSOFT CORPORATION for consulting services, document fees, and printing fees, detailing service periods, billing and shipping addresses, itemized charges, and the total amount due.
Document Type: invoice

Cleaning up: deleting analyzer 'my_document_analyzer_ID'...
Analyzer 'my_document_analyzer_ID' deleted successfully.

Tip

See more examples of running analyzers in the .NET SDK samples.

After you create the analyzer, use it to analyze images and extract the custom fields. Delete the analyzer when you no longer need it.

var imageUrl = new Uri(
    "https://raw.githubusercontent.com/"
    + "Azure-Samples/"
    + "azure-ai-content-understanding-assets/"
    + "main/image/pieChart.jpg"
);

var analyzeOperation = await client.AnalyzeAsync(
    WaitUntil.Completed,
    analyzerId,
    inputs: new[] {
        new AnalysisInput { Uri = imageUrl }
    });

var analyzeResult = analyzeOperation.Value;

if (analyzeResult.Contents?.FirstOrDefault()
    is DocumentContent content)
{
    if (content.Fields.TryGetValue(
        "Title", out var titleField))
    {
        var title =
            titleField is ContentStringField sf
                ? sf.Value : null;
        Console.WriteLine(
            $"Title: {title ?? "(not found)"}");
    }

    if (content.Fields.TryGetValue(
        "ChartType", out var chartField))
    {
        var chartType =
            chartField is ContentStringField sf
                ? sf.Value : null;
        Console.WriteLine(
            $"Chart Type: "
            + $"{chartType ?? "(not found)"}");
    }
}

// --- Clean up ---
Console.WriteLine(
    $"\nCleaning up: deleting analyzer"
    + $" '{analyzerId}'...");
await client.DeleteAnalyzerAsync(analyzerId);
Console.WriteLine(
    $"Analyzer '{analyzerId}'"
    + " deleted successfully.");

The example output looks like this:

Title: Distribution of Weekly Working Hours
Chart Type: pie

Cleaning up: deleting analyzer 'my_image_analyzer_ID'...
Analyzer 'my_image_analyzer_ID' deleted successfully.

Tip

See more examples of running analyzers in the .NET SDK samples.

After you create the analyzer, use it to analyze audio files and extract the custom fields. Delete the analyzer when you no longer need it.

var audioUrl = new Uri(
    "https://raw.githubusercontent.com/"
    + "Azure-Samples/"
    + "azure-ai-content-understanding-assets/"
    + "main/audio/callCenterRecording.mp3"
);

var analyzeOperation = await client.AnalyzeAsync(
    WaitUntil.Completed,
    analyzerId,
    inputs: new[] {
        new AnalysisInput { Uri = audioUrl }
    });

var analyzeResult = analyzeOperation.Value;

if (analyzeResult.Contents?.Count > 0)
{
    var content = analyzeResult.Contents[0];
    if (content.Fields != null)
    {
        if (content.Fields.TryGetValue(
            "Summary", out var summaryField))
        {
            var summary =
                summaryField
                    is ContentStringField sf
                    ? sf.Value : null;
            Console.WriteLine(
                $"Summary: "
                + $"{summary ?? "(not found)"}");
        }

        if (content.Fields.TryGetValue(
            "Sentiment", out var sentField))
        {
            var sentiment =
                sentField
                    is ContentStringField sf
                    ? sf.Value : null;
            Console.WriteLine(
                $"Sentiment: "
                + $"{sentiment ?? "(not found)"}");
        }
    }
}

// --- Clean up ---
Console.WriteLine(
    $"\nCleaning up: deleting analyzer"
    + $" '{analyzerId}'...");
await client.DeleteAnalyzerAsync(analyzerId);
Console.WriteLine(
    $"Analyzer '{analyzerId}'"
    + " deleted successfully.");

The example output looks like this:

Summary: Maria Smith contacted Contoso to inquire about her current point balance. John Doe, the representative, verified her identity by requesting her date of birth and informed her that her balance is 599 points. Maria confirmed she needed no further information and ended the call.
Sentiment: Positive

Cleaning up: deleting analyzer 'my_audio_analyzer_ID'...
Analyzer 'my_audio_analyzer_ID' deleted successfully.

Tip

See more examples of running analyzers in the .NET SDK samples.

After you create the analyzer, use it to analyze videos and extract the custom fields. Delete the analyzer when you no longer need it.

var videoUrl = new Uri(
    "https://raw.githubusercontent.com/"
    + "Azure-Samples/"
    + "azure-ai-content-understanding-assets/"
    + "main/videos/sdk_samples/"
    + "FlightSimulator.mp4"
);

var analyzeOperation = await client.AnalyzeAsync(
    WaitUntil.Completed,
    analyzerId,
    inputs: new[] {
        new AnalysisInput { Uri = videoUrl }
    });

var analyzeResult = analyzeOperation.Value;

if (analyzeResult.Contents?.Count > 0)
{
    var content = analyzeResult.Contents[0];
    if (content.Fields != null
        && content.Fields.TryGetValue(
            "Segments", out var segmentsField)
        && segmentsField
            is ContentArrayField segmentsArr)
    {
        Console.WriteLine(
            $"Segments ({segmentsArr.Count}):");
        for (int i = 0;
            i < segmentsArr.Count; i++)
        {
            if (segmentsArr[i]
                is ContentObjectField segObj
                && segObj.Value != null)
            {
                Console.WriteLine(
                    $"  Segment {i + 1}:");
                if (segObj.Value.TryGetValue(
                    "Description",
                    out var descField))
                {
                    var desc =
                        descField
                            is ContentStringField sf
                            ? sf.Value : null;
                    Console.WriteLine(
                        $"    Description: "
                        + $"{desc ?? "(none)"}");
                }
                if (segObj.Value.TryGetValue(
                    "Sentiment",
                    out var sentField))
                {
                    var sent =
                        sentField
                            is ContentStringField sf
                            ? sf.Value : null;
                    Console.WriteLine(
                        $"    Sentiment: "
                        + $"{sent ?? "(none)"}");
                }
            }
        }
    }
}

// --- Clean up ---
Console.WriteLine(
    $"\nCleaning up: deleting analyzer"
    + $" '{analyzerId}'...");
await client.DeleteAnalyzerAsync(analyzerId);
Console.WriteLine(
    $"Analyzer '{analyzerId}'"
    + " deleted successfully.");

The example output looks like this:

Segments (16):
  Segment 1:
    Description: The video opens with a scenic aerial view of an island, featuring a small airplane flying over the landscape. The screen displays the logos for 'Flight Simulator' and 'Microsoft Azure AI,' indicating a collaboration or integration between the two.
    Sentiment: Positive
  Segment 2:
    Description: A man is shown sitting in a modern office environment, likely preparing to speak or introduce the topic. The background features geometric wall lights and a plant, giving a professional and contemporary feel.
    Sentiment: Neutral
  Segment 3:
    Description: The segment displays a close-up of audio waveforms on a screen, visually representing sound data. The accompanying audio discusses the importance of good data for neural TTS (Text-to-Speech) to achieve a high-quality voice.
    Sentiment: Neutral
  Segment 4:
    Description: Another man appears in a similar office setting, possibly continuing the explanation or providing additional commentary about the TTS model.
    Sentiment: Neutral
  Segment 5:
    Description: The video transitions to an outdoor scene showing a large facility surrounded by fields under a clear sky. This likely represents the data centers or infrastructure used for building the universal TTS model.
    Sentiment: Neutral
  Segment 6:
    Description: The segment moves inside a data center, showing rows of servers and high-tech equipment. This visual emphasizes the scale and technological sophistication behind the TTS model's development.
    Sentiment: Neutral
  Segment 7:
    Description: The first man returns, continuing his explanation in the office setting. The transcript mentions accumulating large amounts of data to capture audio nuances and generate natural voices.
    Sentiment: Positive
  Segment 8:
    Description: A biplane is shown flying over a picturesque landscape, highlighting the realism and immersive experience of the Flight Simulator. This visual connects the product's capabilities to the natural-sounding voices enabled by Azure AI.
    Sentiment: Positive
  Segment 9:
    Description: The segment features a plane flying near a castle surrounded by lush greenery and mountains. The visuals reinforce the immersive environments possible in Flight Simulator, enhanced by advanced AI voice technology.
    Sentiment: Positive
  Segment 10:
    Description: A bald man is interviewed in a modern office space, likely discussing the benefits of cognitive services offerings, such as higher fidelity and more human-like voices, as mentioned in the transcript.
    Sentiment: Positive
  Segment 11:
    Description: The interview continues with the bald man, focusing on the advantages of Azure AI's TTS technology. The transcript notes that the voices sound much more like actual human voices.
    Sentiment: Positive
  Segment 12:
    Description: The video shifts to an overhead view of an airplane on the runway, possibly preparing for pushback. This visual ties into the transcript mentioning 'Orlando ground 9555 requesting the end of pushback.'
    Sentiment: Neutral
  Segment 13:
    Description: A ground crew member directs an Airbus aircraft, with pilots visible in the cockpit. The transcript includes communication about pushback, demonstrating realistic voice interactions in the simulator.
    Sentiment: Neutral
  Segment 14:
    Description: Ground crew members are seen walking near airplanes on the tarmac, reinforcing the realism and operational detail in the Flight Simulator environment.
    Sentiment: Neutral
  Segment 15:
    Description: A close-up of an Airbus aircraft at the gate, with the transcript confirming the end of pushback. This segment highlights the simulator's attention to detail and realistic voice communications.
    Sentiment: Neutral
  Segment 16:
    Description: The video concludes with the Microsoft logo and branding, signaling the end of the product demo and reinforcing the partnership between Flight Simulator and Microsoft Azure AI.
    Sentiment: Positive

Cleaning up: deleting analyzer 'my_video_analyzer_ID'...
Analyzer 'my_video_analyzer_ID' deleted successfully.

Tip

See more examples of running analyzers in the .NET SDK samples.

Client library | Samples | SDK source

This guide shows you how to use the Content Understanding Java SDK to create a custom analyzer that extracts structured data from your content. Custom analyzers support document, image, audio, and video content types.

Prerequisites

An active Azure subscription. If you don't have an Azure account, create one for free.
A Microsoft Foundry resource created in a supported region.
Your resource endpoint and API key (found under Keys and Endpoint in the Azure portal).
Default model deployments configured for your resource. For setup instructions, see Models and deployments or this one-time configuration script.
Java Development Kit (JDK) version 8 or later.
Apache Maven.

Set up

Create a new Maven project:

mvn archetype:generate -DgroupId=com.example \
    -DartifactId=custom-analyzer-tutorial \
    -DarchetypeArtifactId=maven-archetype-quickstart \
    -DinteractiveMode=false
cd custom-analyzer-tutorial

Add the Content Understanding dependency to the <dependencies> section of your pom.xml file:

<dependency>
    <groupId>com.azure</groupId>
    <artifactId>azure-ai-contentunderstanding</artifactId>
    <version>1.0.0</version>
</dependency>

Optionally, add the Azure Identity library for Microsoft Entra authentication:

<dependency>
    <groupId>com.azure</groupId>
    <artifactId>azure-identity</artifactId>
    <version>1.14.2</version>
</dependency>

Set up environment variables

To authenticate with the Content Understanding service, set the following environment variables to your own values before you run the samples:

CONTENTUNDERSTANDING_ENDPOINT - the endpoint of your Content Understanding resource.
CONTENTUNDERSTANDING_KEY - your Content Understanding API key (optional if you use Microsoft Entra ID with DefaultAzureCredential).

Windows

setx CONTENTUNDERSTANDING_ENDPOINT "your-endpoint"
setx CONTENTUNDERSTANDING_KEY "your-key"

Linux / macOS

export CONTENTUNDERSTANDING_ENDPOINT="your-endpoint"
export CONTENTUNDERSTANDING_KEY="your-key"
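
The samples read these variables with System.getenv, which returns null when a variable isn't set. The following is a minimal sketch, not part of the SDK, of a hypothetical helper named requireSetting that fails fast with a clear message instead of passing a null value on to the client builder. The class and method names are assumptions made for illustration.

```java
import java.util.Map;

public class EnvSettings {
    // Hypothetical helper (not an SDK API): returns the trimmed value of
    // `name` from `env`, or throws if the variable is missing or blank.
    public static String requireSetting(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException(
                "Environment variable " + name + " is not set");
        }
        return value.trim();
    }

    public static void main(String[] args) {
        // System.getenv() with no arguments returns the process environment as a map.
        Map<String, String> env = System.getenv();
        try {
            String endpoint =
                requireSetting(env, "CONTENTUNDERSTANDING_ENDPOINT");
            System.out.println("Endpoint: " + endpoint);
        } catch (IllegalStateException e) {
            // Report the missing variable instead of failing later with a null value.
            System.out.println(e.getMessage());
        }
    }
}
```

Taking the environment as a Map parameter keeps the helper easy to exercise with test values; in an application you would pass System.getenv().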

Create the client

import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import com.azure.core.credential.AzureKeyCredential;
import com.azure.core.util.polling.SyncPoller;
import com.azure.ai.contentunderstanding.ContentUnderstandingClient;
import com.azure.ai.contentunderstanding.ContentUnderstandingClientBuilder;
import com.azure.ai.contentunderstanding.models.*;

String endpoint =
    System.getenv("CONTENTUNDERSTANDING_ENDPOINT");
String key =
    System.getenv("CONTENTUNDERSTANDING_KEY");

ContentUnderstandingClient client =
    new ContentUnderstandingClientBuilder()
        .endpoint(endpoint)
        .credential(new AzureKeyCredential(key))
        .buildClient();

Create a custom analyzer

String analyzerId =
    "my_document_analyzer_"
    + System.currentTimeMillis();

Map<String, ContentFieldDefinition> fields =
    new HashMap<>();

ContentFieldDefinition companyNameDef =
    new ContentFieldDefinition();
companyNameDef.setType(ContentFieldType.STRING);
companyNameDef.setMethod(
    GenerationMethod.EXTRACT);
companyNameDef.setDescription(
    "Name of the company");
fields.put("company_name", companyNameDef);

ContentFieldDefinition totalAmountDef =
    new ContentFieldDefinition();
totalAmountDef.setType(ContentFieldType.NUMBER);
totalAmountDef.setMethod(
    GenerationMethod.EXTRACT);
totalAmountDef.setDescription(
    "Total amount on the document");
fields.put("total_amount", totalAmountDef);

ContentFieldDefinition summaryDef =
    new ContentFieldDefinition();
summaryDef.setType(ContentFieldType.STRING);
summaryDef.setMethod(
    GenerationMethod.GENERATE);
summaryDef.setDescription(
    "A brief summary of the document content");
fields.put("document_summary", summaryDef);

ContentFieldDefinition documentTypeDef =
    new ContentFieldDefinition();
documentTypeDef.setType(ContentFieldType.STRING);
documentTypeDef.setMethod(
    GenerationMethod.CLASSIFY);
documentTypeDef.setDescription(
    "Type of document");
documentTypeDef.setEnumProperty(
    Arrays.asList(
        "invoice", "receipt", "contract",
        "report", "other"
    ));
fields.put("document_type", documentTypeDef);

ContentFieldSchema fieldSchema =
    new ContentFieldSchema();
fieldSchema.setName("company_schema");
fieldSchema.setDescription(
    "Schema for extracting company information");
fieldSchema.setFields(fields);

Map<String, String> models = new HashMap<>();
models.put("completion", "gpt-4.1");
models.put("embedding", "text-embedding-3-large"); // Required when using field_schema and prebuilt-document base analyzer

ContentAnalyzer customAnalyzer =
    new ContentAnalyzer()
        .setBaseAnalyzerId("prebuilt-document")
        .setDescription(
            "Custom analyzer for extracting"
            + " company information")
        .setConfig(new ContentAnalyzerConfig()
            .setOcrEnabled(true)
            .setLayoutEnabled(true)
            .setFormulaEnabled(true)
            .setEstimateFieldSourceAndConfidence(
                true)
            .setReturnDetails(true))
        .setFieldSchema(fieldSchema)
        .setModels(models);

SyncPoller<ContentAnalyzerOperationStatus,
    ContentAnalyzer> operation =
    client.beginCreateAnalyzer(
        analyzerId, customAnalyzer, true);

ContentAnalyzer result =
    operation.getFinalResult();
System.out.println(
    "Analyzer '" + analyzerId
    + "' created successfully!");

if (result.getDescription() != null) {
    System.out.println(
        "  Description: "
        + result.getDescription());
}

if (result.getFieldSchema() != null
    && result.getFieldSchema()
        .getFields() != null) {
    System.out.println(
        "  Fields ("
        + result.getFieldSchema()
            .getFields().size() + "):");
    result.getFieldSchema().getFields()
        .forEach((fieldName, fieldDef) -> {
            String method =
                fieldDef.getMethod() != null
                    ? fieldDef.getMethod()
                        .toString()
                    : "auto";
            String type =
                fieldDef.getType() != null
                    ? fieldDef.getType()
                        .toString()
                    : "unknown";
            System.out.println(
                "    - " + fieldName
                + ": " + type
                + " (" + method + ")");
        });
}

The example output looks like this:

Analyzer 'my_document_analyzer_ID' created successfully!
  Description: Custom analyzer for extracting company information
  Fields (4):
    - total_amount: number (extract)
    - company_name: string (extract)
    - document_summary: string (generate)
    - document_type: string (classify)
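
The fields in this listing don't appear in the order in which the code defined them (total_amount is printed first), so don't rely on the iteration order of the returned map. If you want a stable listing, one option is to sort by field name before printing. The following is a minimal sketch under that assumption; it uses plain strings as stand-ins for the type and method text that the sample reads from each field definition, and the class name SortedFieldListing is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SortedFieldListing {
    // Builds one output line per field, sorted alphabetically by field name.
    // The String values stand in for the "type (method)" text that the
    // sample derives from each field definition.
    public static List<String> describe(Map<String, String> fields) {
        List<String> lines = new ArrayList<>();
        // Copying into a TreeMap orders the entries by key.
        for (Map.Entry<String, String> entry
                : new TreeMap<>(fields).entrySet()) {
            lines.add("    - " + entry.getKey() + ": " + entry.getValue());
        }
        return lines;
    }

    public static void main(String[] args) {
        Map<String, String> fields = new HashMap<>();
        fields.put("total_amount", "number (extract)");
        fields.put("company_name", "string (extract)");
        fields.put("document_summary", "string (generate)");
        fields.put("document_type", "string (classify)");

        for (String line : describe(fields)) {
            System.out.println(line);
        }
    }
}
```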

Tip

This code is based on the Create Analyzer sample in the SDK repository.

// Generate a unique analyzer ID
String classifierId =
    "my_classifier_" + System.currentTimeMillis();

System.out.println(
    "Creating classifier '"
    + classifierId + "'...");

// Define content categories for classification
Map<String, ContentCategoryDefinition>
    categories = new HashMap<>();

categories.put("Loan_Application",
    new ContentCategoryDefinition()
        .setDescription(
            "Documents submitted by individuals"
            + " or businesses to request funding,"
            + " typically including personal or"
            + " business details, financial"
            + " history, loan amount, purpose,"
            + " and supporting documentation."));

categories.put("Invoice",
    new ContentCategoryDefinition()
        .setDescription(
            "Billing documents issued by sellers"
            + " or service providers to request"
            + " payment for goods or services,"
            + " detailing items, prices, taxes,"
            + " totals, and payment terms."));

categories.put("Bank_Statement",
    new ContentCategoryDefinition()
        .setDescription(
            "Official statements issued by banks"
            + " that summarize account activity"
            + " over a period, including deposits,"
            + " withdrawals, fees,"
            + " and balances."));

// Create the classifier
Map<String, String> classifierModels =
    new HashMap<>();
classifierModels.put("completion", "gpt-4.1");

ContentAnalyzer classifier =
    new ContentAnalyzer()
        .setBaseAnalyzerId("prebuilt-document")
        .setDescription(
            "Custom classifier for financial"
            + " document categorization")
        .setConfig(new ContentAnalyzerConfig()
            .setReturnDetails(true)
            .setSegmentEnabled(true)
            .setContentCategories(categories))
        .setModels(classifierModels);

SyncPoller<ContentAnalyzerOperationStatus,
    ContentAnalyzer> classifierOp =
    client.beginCreateAnalyzer(
        classifierId, classifier, true);
classifierOp.getFinalResult();

// Get the full classifier details
ContentAnalyzer classifierResult =
    client.getAnalyzer(classifierId);

System.out.println(
    "Classifier '" + classifierId
    + "' created successfully!");

if (classifierResult.getDescription() != null) {
    System.out.println(
        "  Description: "
        + classifierResult.getDescription());
}

Tip

This code is based on the Create Classifier sample for classification workflows.

The following example creates a custom image analyzer, based on the prebuilt image analyzer, to process charts and graphs.

String analyzerId =
    "my_image_analyzer_"
    + System.currentTimeMillis();

Map<String, ContentFieldDefinition> fields =
    new HashMap<>();

ContentFieldDefinition titleDef =
    new ContentFieldDefinition();
titleDef.setType(ContentFieldType.STRING);
titleDef.setDescription("Title of the chart");
fields.put("Title", titleDef);

ContentFieldDefinition chartTypeDef =
    new ContentFieldDefinition();
chartTypeDef.setType(ContentFieldType.STRING);
chartTypeDef.setMethod(
    GenerationMethod.CLASSIFY);
chartTypeDef.setDescription("Type of chart");
chartTypeDef.setEnumProperty(
    Arrays.asList("bar", "line", "pie"));
fields.put("ChartType", chartTypeDef);

ContentFieldSchema fieldSchema =
    new ContentFieldSchema();
fieldSchema.setName("chart_schema");
fieldSchema.setDescription(
    "Schema for extracting chart information");
fieldSchema.setFields(fields);

Map<String, String> models = new HashMap<>();
models.put("completion", "gpt-4.1");

ContentAnalyzer customAnalyzer =
    new ContentAnalyzer()
        .setBaseAnalyzerId("prebuilt-image")
        .setDescription(
            "Custom analyzer for charts"
            + " and graphs")
        .setFieldSchema(fieldSchema)
        .setModels(models);

SyncPoller<ContentAnalyzerOperationStatus,
    ContentAnalyzer> operation =
    client.beginCreateAnalyzer(
        analyzerId, customAnalyzer, true);

ContentAnalyzer result =
    operation.getFinalResult();
System.out.println(
    "Analyzer '" + analyzerId
    + "' created successfully!");

if (result.getDescription() != null) {
    System.out.println(
        "  Description: "
        + result.getDescription());
}

if (result.getFieldSchema() != null
    && result.getFieldSchema()
        .getFields() != null) {
    System.out.println(
        "  Fields ("
        + result.getFieldSchema()
            .getFields().size() + "):");
    result.getFieldSchema().getFields()
        .forEach((fieldName, fieldDef) -> {
            String method =
                fieldDef.getMethod() != null
                    ? fieldDef.getMethod()
                        .toString()
                    : "auto";
            String type =
                fieldDef.getType() != null
                    ? fieldDef.getType()
                        .toString()
                    : "unknown";
            System.out.println(
                "    - " + fieldName
                + ": " + type
                + " (" + method + ")");
        });
}

The example output looks like this:

Analyzer 'my_image_analyzer_ID' created successfully!
  Description: Custom analyzer for charts and graphs
  Fields (2):
    - Title: string (auto)
    - ChartType: string (classify)

Tip

This code adapts the Create Analyzer sample pattern for image content.

The following example creates a custom audio analyzer, based on the prebuilt audio analyzer, to process customer support call recordings.

String analyzerId =
    "my_audio_analyzer_"
    + System.currentTimeMillis();

Map<String, ContentFieldDefinition> fields =
    new HashMap<>();

ContentFieldDefinition summaryDef =
    new ContentFieldDefinition();
summaryDef.setType(ContentFieldType.STRING);
summaryDef.setMethod(
    GenerationMethod.GENERATE);
summaryDef.setDescription("Summary of the call");
fields.put("Summary", summaryDef);

ContentFieldDefinition sentimentDef =
    new ContentFieldDefinition();
sentimentDef.setType(ContentFieldType.STRING);
sentimentDef.setMethod(
    GenerationMethod.CLASSIFY);
sentimentDef.setDescription(
    "Overall sentiment of the call");
sentimentDef.setEnumProperty(
    Arrays.asList(
        "Positive", "Neutral", "Negative"));
fields.put("Sentiment", sentimentDef);

// Define "People" as an array of objects
Map<String, ContentFieldDefinition> personProps =
    new HashMap<>();
ContentFieldDefinition nameDef =
    new ContentFieldDefinition();
nameDef.setType(ContentFieldType.STRING);
personProps.put("Name", nameDef);
ContentFieldDefinition roleDef =
    new ContentFieldDefinition();
roleDef.setType(ContentFieldType.STRING);
personProps.put("Role", roleDef);

ContentFieldDefinition personItemDef =
    new ContentFieldDefinition();
personItemDef.setType(ContentFieldType.OBJECT);
personItemDef.setProperties(personProps);

ContentFieldDefinition peopleDef =
    new ContentFieldDefinition();
peopleDef.setType(ContentFieldType.ARRAY);
peopleDef.setDescription(
    "List of people mentioned");
peopleDef.setItemDefinition(personItemDef);
fields.put("People", peopleDef);

ContentFieldSchema fieldSchema =
    new ContentFieldSchema();
fieldSchema.setName("call_center_schema");
fieldSchema.setDescription(
    "Schema for analyzing customer"
    + " support calls");
fieldSchema.setFields(fields);

Map<String, String> models = new HashMap<>();
models.put("completion", "gpt-4.1");

ContentAnalyzer customAnalyzer =
    new ContentAnalyzer()
        .setBaseAnalyzerId("prebuilt-audio")
        .setDescription(
            "Custom analyzer for customer"
            + " support calls")
        .setConfig(new ContentAnalyzerConfig()
            .setLocales(
                Arrays.asList("en-US", "fr-FR"))
            .setReturnDetails(true))
        .setFieldSchema(fieldSchema)
        .setModels(models);

SyncPoller<ContentAnalyzerOperationStatus,
    ContentAnalyzer> operation =
    client.beginCreateAnalyzer(
        analyzerId, customAnalyzer, true);

ContentAnalyzer result =
    operation.getFinalResult();
System.out.println(
    "Analyzer '" + analyzerId
    + "' created successfully!");

if (result.getDescription() != null) {
    System.out.println(
        "  Description: "
        + result.getDescription());
}

if (result.getFieldSchema() != null
    && result.getFieldSchema()
        .getFields() != null) {
    System.out.println(
        "  Fields ("
        + result.getFieldSchema()
            .getFields().size() + "):");
    result.getFieldSchema().getFields()
        .forEach((fieldName, fieldDef) -> {
            String method =
                fieldDef.getMethod() != null
                    ? fieldDef.getMethod()
                        .toString()
                    : "auto";
            String type =
                fieldDef.getType() != null
                    ? fieldDef.getType()
                        .toString()
                    : "unknown";
            System.out.println(
                "    - " + fieldName
                + ": " + type
                + " (" + method + ")");
        });
}

The sample output looks like:

Analyzer 'my_audio_analyzer_ID' created successfully!
  Description: Custom analyzer for customer support calls
  Fields (3):
    - People: array (auto)
    - Summary: string (generate)
    - Sentiment: string (classify)

Tip

This code adapts the Create Analyzer sample pattern for audio content.

The following example creates a custom video analyzer based on the prebuilt video analyzer to process product demos and reviews.

String analyzerId =
    "my_video_analyzer_"
    + System.currentTimeMillis();

// Define segment properties
Map<String, ContentFieldDefinition> segProps =
    new HashMap<>();
ContentFieldDefinition segIdDef =
    new ContentFieldDefinition();
segIdDef.setType(ContentFieldType.STRING);
segProps.put("SegmentId", segIdDef);

ContentFieldDefinition descDef =
    new ContentFieldDefinition();
descDef.setType(ContentFieldType.STRING);
descDef.setMethod(GenerationMethod.GENERATE);
descDef.setDescription(
    "Detailed summary of the video segment");
segProps.put("Description", descDef);

ContentFieldDefinition sentDef =
    new ContentFieldDefinition();
sentDef.setType(ContentFieldType.STRING);
sentDef.setMethod(GenerationMethod.CLASSIFY);
sentDef.setEnumProperty(
    Arrays.asList(
        "Positive", "Neutral", "Negative"));
segProps.put("Sentiment", sentDef);

ContentFieldDefinition segItemDef =
    new ContentFieldDefinition();
segItemDef.setType(ContentFieldType.OBJECT);
segItemDef.setProperties(segProps);

Map<String, ContentFieldDefinition> fields =
    new HashMap<>();
ContentFieldDefinition segmentsDef =
    new ContentFieldDefinition();
segmentsDef.setType(ContentFieldType.ARRAY);
segmentsDef.setItemDefinition(segItemDef);
fields.put("Segments", segmentsDef);

ContentFieldSchema fieldSchema =
    new ContentFieldSchema();
fieldSchema.setName("video_schema");
fieldSchema.setDescription(
    "Schema for analyzing product demo videos");
fieldSchema.setFields(fields);

Map<String, String> models = new HashMap<>();
models.put("completion", "gpt-4.1");

ContentAnalyzer customAnalyzer =
    new ContentAnalyzer()
        .setBaseAnalyzerId("prebuilt-video")
        .setDescription(
            "Custom analyzer for product"
            + " demo videos")
        .setConfig(new ContentAnalyzerConfig()
            .setLocales(
                Arrays.asList("en-US", "fr-FR"))
            .setReturnDetails(true))
        .setFieldSchema(fieldSchema)
        .setModels(models);

SyncPoller<ContentAnalyzerOperationStatus,
    ContentAnalyzer> operation =
    client.beginCreateAnalyzer(
        analyzerId, customAnalyzer, true);

ContentAnalyzer result =
    operation.getFinalResult();
System.out.println(
    "Analyzer '" + analyzerId
    + "' created successfully!");

if (result.getDescription() != null) {
    System.out.println(
        "  Description: "
        + result.getDescription());
}

if (result.getFieldSchema() != null
    && result.getFieldSchema()
        .getFields() != null) {
    System.out.println(
        "  Fields ("
        + result.getFieldSchema()
            .getFields().size() + "):");
    result.getFieldSchema().getFields()
        .forEach((fieldName, fieldDef) -> {
            String method =
                fieldDef.getMethod() != null
                    ? fieldDef.getMethod()
                        .toString()
                    : "auto";
            String type =
                fieldDef.getType() != null
                    ? fieldDef.getType()
                        .toString()
                    : "unknown";
            System.out.println(
                "    - " + fieldName
                + ": " + type
                + " (" + method + ")");
        });
}

The sample output looks like:

Analyzer 'my_video_analyzer_ID' created successfully!
  Description: Custom analyzer for product demo videos
  Fields (1):
    - Segments: array (auto)

Tip

This code adapts the Create Analyzer sample pattern for video content.

Use the custom analyzer

After you create the analyzer, use it to analyze a document and extract the custom fields. Delete the analyzer when you no longer need it.

String documentUrl =
    "https://raw.githubusercontent.com/"
    + "Azure-Samples/"
    + "azure-ai-content-understanding-assets/"
    + "main/document/invoice.pdf";

AnalysisInput input = new AnalysisInput();
input.setUrl(documentUrl);

SyncPoller<ContentAnalyzerAnalyzeOperationStatus,
    AnalysisResult> analyzeOperation =
    client.beginAnalyze(
        analyzerId, Arrays.asList(input));

AnalysisResult analyzeResult =
    analyzeOperation.getFinalResult();

if (analyzeResult.getContents() != null
    && !analyzeResult.getContents().isEmpty()
    && analyzeResult.getContents().get(0)
        instanceof DocumentContent) {
    DocumentContent content =
        (DocumentContent) analyzeResult
            .getContents().get(0);

    ContentField companyField =
        content.getFields() != null
            ? content.getFields()
                .get("company_name") : null;
    if (companyField
        instanceof ContentStringField) {
        ContentStringField sf =
            (ContentStringField) companyField;
        System.out.println(
            "Company Name: " + sf.getValue());
        System.out.println(
            "  Confidence: "
            + companyField.getConfidence());
    }

    ContentField totalField =
        content.getFields() != null
            ? content.getFields()
                .get("total_amount") : null;
    if (totalField != null) {
        System.out.println(
            "Total Amount: "
            + totalField.getValue());
    }

    ContentField summaryField =
        content.getFields() != null
            ? content.getFields()
                .get("document_summary") : null;
    if (summaryField
        instanceof ContentStringField) {
        ContentStringField sf =
            (ContentStringField) summaryField;
        System.out.println(
            "Summary: " + sf.getValue());
    }

    ContentField typeField =
        content.getFields() != null
            ? content.getFields()
                .get("document_type") : null;
    if (typeField
        instanceof ContentStringField) {
        ContentStringField sf =
            (ContentStringField) typeField;
        System.out.println(
            "Document Type: " + sf.getValue());
    }
}

// --- Clean up ---
System.out.println(
    "\nCleaning up: deleting analyzer '"
    + analyzerId + "'...");
client.deleteAnalyzer(analyzerId);
System.out.println(
    "Analyzer '" + analyzerId
    + "' deleted successfully.");

The sample output looks like:

Company Name: CONTOSO LTD.
  Confidence: 0.781
Total Amount: 610.0
Summary: This document is an invoice from CONTOSO LTD. to Microsoft Corporation for consulting services, document fees, and printing fees, detailing service dates, itemized charges, taxes, and the total amount due.
Document Type: invoice

Cleaning up: deleting analyzer 'my_document_analyzer_ID'...
Analyzer 'my_document_analyzer_ID' deleted successfully.

Tip

See more examples of running analyzers in the Java SDK samples.

After you create the analyzer, use it to analyze an image and extract the custom fields. Delete the analyzer when you no longer need it.

String imageUrl =
    "https://raw.githubusercontent.com/"
    + "Azure-Samples/"
    + "azure-ai-content-understanding-assets/"
    + "main/image/pieChart.jpg";

AnalysisInput input = new AnalysisInput();
input.setUrl(imageUrl);

SyncPoller<ContentAnalyzerAnalyzeOperationStatus,
    AnalysisResult> analyzeOperation =
    client.beginAnalyze(
        analyzerId, Arrays.asList(input));

AnalysisResult analyzeResult =
    analyzeOperation.getFinalResult();

if (analyzeResult.getContents() != null
    && !analyzeResult.getContents().isEmpty()) {
    var content =
        analyzeResult.getContents().get(0);

    if (content.getFields() != null) {
        ContentField titleField =
            content.getFields().get("Title");
        if (titleField
            instanceof ContentStringField) {
            System.out.println(
                "Title: "
                + ((ContentStringField) titleField)
                    .getValue());
        }

        ContentField chartField =
            content.getFields().get("ChartType");
        if (chartField
            instanceof ContentStringField) {
            System.out.println(
                "Chart Type: "
                + ((ContentStringField) chartField)
                    .getValue());
        }
    }
}

// --- Clean up ---
System.out.println(
    "\nCleaning up: deleting analyzer '"
    + analyzerId + "'...");
client.deleteAnalyzer(analyzerId);
System.out.println(
    "Analyzer '" + analyzerId
    + "' deleted successfully.");

The sample output looks like:

Title: Weekly Working Hours Distribution
Chart Type: pie

Cleaning up: deleting analyzer 'my_image_analyzer_ID'...
Analyzer 'my_image_analyzer_ID' deleted successfully.

Tip

See more examples of running analyzers in the Java SDK samples.

After you create the analyzer, use it to analyze an audio file and extract the custom fields. Delete the analyzer when you no longer need it.

String audioUrl =
    "https://raw.githubusercontent.com/"
    + "Azure-Samples/"
    + "azure-ai-content-understanding-assets/"
    + "main/audio/callCenterRecording.mp3";

AnalysisInput input = new AnalysisInput();
input.setUrl(audioUrl);

SyncPoller<ContentAnalyzerAnalyzeOperationStatus,
    AnalysisResult> analyzeOperation =
    client.beginAnalyze(
        analyzerId, Arrays.asList(input));

AnalysisResult analyzeResult =
    analyzeOperation.getFinalResult();

if (analyzeResult.getContents() != null
    && !analyzeResult.getContents().isEmpty()) {
    var content =
        analyzeResult.getContents().get(0);

    if (content.getFields() != null) {
        ContentField summaryField =
            content.getFields().get("Summary");
        if (summaryField
            instanceof ContentStringField) {
            System.out.println(
                "Summary: "
                + ((ContentStringField)
                    summaryField)
                    .getValue());
        }

        ContentField sentField =
            content.getFields().get("Sentiment");
        if (sentField
            instanceof ContentStringField) {
            System.out.println(
                "Sentiment: "
                + ((ContentStringField) sentField)
                    .getValue());
        }
    }
}

// --- Clean up ---
System.out.println(
    "\nCleaning up: deleting analyzer '"
    + analyzerId + "'...");
client.deleteAnalyzer(analyzerId);
System.out.println(
    "Analyzer '" + analyzerId
    + "' deleted successfully.");

The sample output looks like:

Summary: Maria Smith contacted Contoso to inquire about her current point balance. John Doe, the customer service representative, confirmed her identity by requesting her date of birth and provided her point balance. The conversation ended politely with no further requests.
Sentiment: Positive

Cleaning up: deleting analyzer 'my_audio_analyzer_ID'...
Analyzer 'my_audio_analyzer_ID' deleted successfully.

Tip

See more examples of running analyzers in the Java SDK samples.

After you create the analyzer, use it to analyze a video and extract the custom fields. Delete the analyzer when you no longer need it.

String videoUrl =
    "https://raw.githubusercontent.com/"
    + "Azure-Samples/"
    + "azure-ai-content-understanding-assets/"
    + "main/videos/sdk_samples/"
    + "FlightSimulator.mp4";

AnalysisInput input = new AnalysisInput();
input.setUrl(videoUrl);

SyncPoller<ContentAnalyzerAnalyzeOperationStatus,
    AnalysisResult> analyzeOperation =
    client.beginAnalyze(
        analyzerId, Arrays.asList(input));

AnalysisResult analyzeResult =
    analyzeOperation.getFinalResult();

if (analyzeResult.getContents() != null
    && !analyzeResult.getContents().isEmpty()) {
    var content =
        analyzeResult.getContents().get(0);
    System.out.println(
        "Content kind: " + content.getKind());
    if (content.getFields() != null) {
        ContentField segmentsField =
            content.getFields().get("Segments");
        if (segmentsField
            instanceof ContentArrayField) {
            ContentArrayField segments =
                (ContentArrayField) segmentsField;
            System.out.println(
                "Segments (" + segments.size()
                + "):");
            for (int i = 0;
                i < segments.size(); i++) {
                ContentField seg = segments.get(i);
                if (seg instanceof
                    ContentObjectField) {
                    ContentObjectField obj =
                        (ContentObjectField) seg;
                    ContentField idField =
                        obj.getFieldOrDefault(
                            "SegmentId");
                    ContentField descField =
                        obj.getFieldOrDefault(
                            "Description");
                    ContentField sentField =
                        obj.getFieldOrDefault(
                            "Sentiment");
                    String segId = idField != null
                        ? String.valueOf(
                            idField.getValue())
                        : "N/A";
                    String desc = descField != null
                        ? String.valueOf(
                            descField.getValue())
                        : "N/A";
                    String sent = sentField != null
                        ? String.valueOf(
                            sentField.getValue())
                        : "N/A";
                    System.out.println(
                        "  Segment " + segId
                        + ": " + desc
                        + " (Sentiment: "
                        + sent + ")");
                }
            }
        }
    }
}

// --- Clean up ---
System.out.println(
    "\nCleaning up: deleting analyzer '"
    + analyzerId + "'...");
client.deleteAnalyzer(analyzerId);
System.out.println(
    "Analyzer '" + analyzerId
    + "' deleted successfully.");

The sample output looks like:

Content kind: audioVisual
Segments (16):
  Segment 00:00:00.000-00:00:01.467: The video opens with a scenic aerial view of an island, featuring a small aircraft flying above the landscape. The screen displays the logos for 'Flight Simulator' and 'Microsoft Azure AI,' indicating a collaboration or integration between the two products. (Sentiment: Positive)
  Segment 00:00:01.467-00:00:03.233: A man is shown in an interview setting, sitting in a modern office environment. The transcript begins discussing neural TTS (Text-to-Speech) and the importance of good data for achieving a high-quality voice. (Sentiment: Neutral)
  Segment 00:00:03.233-00:00:07.367: The visuals shift to a digital audio waveform, emphasizing the technical aspect of TTS. The transcript explains that a universal TTS model was built using 3,000 hours of data, highlighting the scale and quality of the dataset. (Sentiment: Positive)
  Segment 00:00:07.367-00:00:08.200: Another man appears in an interview setting, continuing the discussion about the accumulation of data for the universal TTS model. The transcript notes that the model captures audio nuances for more natural voice generation. (Sentiment: Positive)
  Segment 00:00:08.200-00:00:11.367: The video transitions to an outdoor scene showing a large facility, likely a data center, set in a rural landscape. This visually supports the scale of infrastructure required for the TTS model. (Sentiment: Neutral)
  Segment 00:00:11.367-00:00:13.567: Inside a data center, rows of servers are shown, reinforcing the technological backbone of the TTS system. The transcript continues to emphasize the accumulation of data and the model's capabilities. (Sentiment: Neutral)
  Segment 00:00:13.567-00:00:16.100: The interview returns to the first man, who elaborates on the universal model's ability to generate natural voices. The transcript mentions the model's ability to capture nuances, supporting the visuals. (Sentiment: Positive)
  Segment 00:00:16.100-00:00:19.433: A biplane is seen flying over a coastal landscape, visually connecting the Flight Simulator experience to the advanced AI voice technology discussed earlier. (Sentiment: Positive)
  Segment 00:00:19.433-00:00:23.967: A scenic view of a castle with a plane flying overhead, further showcasing the immersive environments possible in Flight Simulator. The transcript highlights the naturalness of the generated voices. (Sentiment: Positive)
  Segment 00:00:23.967-00:00:30.033: A bald man is interviewed in a modern office setting. The transcript discusses the high fidelity of cognitive services offerings, noting that the voices sound much more like actual human voices. (Sentiment: Positive)
  Segment 00:00:30.033-00:00:33.200: The interview with the bald man continues, reinforcing the message about the realism and fidelity of the AI-generated voices. (Sentiment: Positive)
  Segment 00:00:33.200-00:00:35.267: The video shows an overhead view of an airplane on the tarmac, possibly preparing for pushback. The transcript transitions to a simulated ATC (Air Traffic Control) exchange, demonstrating the practical application of TTS in Flight Simulator. (Sentiment: Neutral)
  Segment 00:00:35.267-00:00:37.700: A ground crew member directs an Airbus aircraft, visually representing the realism and immersion of Flight Simulator. The transcript includes ATC communication, showing the integration of natural-sounding AI voices. (Sentiment: Positive)
  Segment 00:00:37.700-00:00:39.200: Ground crew members are seen walking on the tarmac near aircraft, continuing the realistic airport environment. The transcript features further ATC communication. (Sentiment: Neutral)
  Segment 00:00:39.200-00:00:42.033: A close-up of an Airbus aircraft at the gate, reinforcing the realism and detail in Flight Simulator. The transcript continues with ATC exchanges, demonstrating the natural voice output. (Sentiment: Positive)
  Segment 00:00:42.033-00:00:43.866: The video ends with the Microsoft logo and branding, signifying the conclusion of the demo and reinforcing the partnership between Flight Simulator and Microsoft Azure AI. (Sentiment: Positive)

Cleaning up: deleting analyzer 'my_video_analyzer_ID'...
Analyzer 'my_video_analyzer_ID' deleted successfully.

Tip

See more examples of running analyzers in the Java SDK samples.

Client library | Samples | SDK source

This guide shows you how to use the Content Understanding JavaScript SDK to create a custom analyzer that extracts structured data from your content. Custom analyzers support document, image, audio, and video content types.

Prerequisites

An active Azure subscription. If you don't have an Azure account, create one for free.
A Microsoft Foundry resource created in a supported region.
Your resource endpoint and API key (found under Keys and Endpoint in the Azure portal).
Default model deployments configured for your resource. See Models and deployments or this one-time configuration script for setup instructions.
Node.js LTS version.

Setup

Create a new Node.js project:

mkdir custom-analyzer-tutorial
cd custom-analyzer-tutorial
npm init -y

Install the Content Understanding client library:

npm install @azure/ai-content-understanding

Optionally, install the Azure Identity library for Microsoft Entra authentication:
```
npm install @azure/identity
```

Set up environment variables

To authenticate with the Content Understanding service, set the following environment variables with your own values before you run the samples:

CONTENTUNDERSTANDING_ENDPOINT - the endpoint of your Content Understanding resource.
CONTENTUNDERSTANDING_KEY - your Content Understanding API key (optional if you use Microsoft Entra ID with DefaultAzureCredential).

Windows

setx CONTENTUNDERSTANDING_ENDPOINT "your-endpoint"
setx CONTENTUNDERSTANDING_KEY "your-key"

Linux / macOS

export CONTENTUNDERSTANDING_ENDPOINT="your-endpoint"
export CONTENTUNDERSTANDING_KEY="your-key"

Create the client

const { AzureKeyCredential } =
    require("@azure/core-auth");
const {
    ContentUnderstandingClient,
} = require("@azure/ai-content-understanding");

const endpoint =
    process.env["CONTENTUNDERSTANDING_ENDPOINT"];
const key =
    process.env["CONTENTUNDERSTANDING_KEY"];

const client = new ContentUnderstandingClient(
    endpoint,
    new AzureKeyCredential(key)
);
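
The client code above assumes that both environment variables are set. Because the key is optional when you authenticate with Microsoft Entra ID, a small check can fail fast on a missing endpoint and report which credential type to construct. The following is a minimal sketch: `resolveAuthConfig` and its return shape are hypothetical names introduced for illustration and are not part of the SDK.

```javascript
// Hypothetical helper (not part of the SDK): decide how to authenticate
// based on the environment variables described in this guide.
function resolveAuthConfig(env) {
    const endpoint = (env.CONTENTUNDERSTANDING_ENDPOINT ?? "").trim();
    if (!endpoint) {
        throw new Error("CONTENTUNDERSTANDING_ENDPOINT is not set.");
    }

    const key = (env.CONTENTUNDERSTANDING_KEY ?? "").trim();
    // With a key, build an AzureKeyCredential. Without one, fall back to
    // Microsoft Entra ID, for example DefaultAzureCredential.
    return key
        ? { endpoint, mode: "key", key }
        : { endpoint, mode: "entra" };
}

// Example with a placeholder endpoint and no key.
const authConfig = resolveAuthConfig({
    CONTENTUNDERSTANDING_ENDPOINT: "https://example.invalid/",
});
console.log(authConfig.mode); // prints "entra"
```

In a real script you would pass `process.env` and then create either `new AzureKeyCredential(authConfig.key)` or a `DefaultAzureCredential` from `@azure/identity`, depending on `authConfig.mode`.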

Create a custom analyzer

const analyzerId =
    `my_document_analyzer_${Math.floor(
        Date.now() / 1000
    )}`;

const analyzer = {
    baseAnalyzerId: "prebuilt-document",
    description:
        "Custom analyzer for extracting"
        + " company information",
    config: {
        enableFormula: true,
        enableLayout: true,
        enableOcr: true,
        estimateFieldSourceAndConfidence: true,
        returnDetails: true,
    },
    fieldSchema: {
        name: "company_schema",
        description:
            "Schema for extracting company"
            + " information",
        fields: {
            company_name: {
                type: "string",
                method: "extract",
                description:
                    "Name of the company",
            },
            total_amount: {
                type: "number",
                method: "extract",
                description:
                    "Total amount on the"
                    + " document",
            },
            document_summary: {
                type: "string",
                method: "generate",
                description:
                    "A brief summary of the"
                    + " document content",
            },
            document_type: {
                type: "string",
                method: "classify",
                description: "Type of document",
                enum: [
                    "invoice", "receipt",
                    "contract", "report", "other",
                ],
            },
        },
    },
    models: {
        completion: "gpt-4.1",
        embedding: "text-embedding-3-large", // Required when using fieldSchema with the prebuilt-document base analyzer
    },
};

const poller = client.createAnalyzer(
    analyzerId, analyzer
);
await poller.pollUntilDone();

const result = await client.getAnalyzer(
    analyzerId
);
console.log(
    `Analyzer '${analyzerId}' created`
    + ` successfully!`
);

if (result.description) {
    console.log(
        `  Description: ${result.description}`
    );
}

if (result.fieldSchema?.fields) {
    const fields = result.fieldSchema.fields;
    console.log(
        `  Fields`
        + ` (${Object.keys(fields).length}):`
    );
    for (const [name, fieldDef]
        of Object.entries(fields)) {
        const method =
            fieldDef.method ?? "auto";
        const fieldType =
            fieldDef.type ?? "unknown";
        console.log(
            `    - ${name}: `
            + `${fieldType} (${method})`
        );
    }
}

The sample output looks like:

Analyzer 'my_document_analyzer_ID' created successfully!
  Description: Custom analyzer for extracting company information
  Fields (4):
    - company_name: string (extract)
    - total_amount: number (extract)
    - document_summary: string (generate)
    - document_type: string (classify)

Tip

This code is based on the Create Analyzer sample in the SDK repository.
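
Before you send a field schema to the service, it can help to catch obvious omissions locally. The following is a minimal sketch of such a check. `findSchemaProblems` is a hypothetical helper, and its rules (every field has a `type`, `classify` fields list `enum` values, `array` fields have an `itemDefinition`) are assumptions that mirror the samples in this guide, not documented service requirements.

```javascript
// Hypothetical local check (not part of the SDK). The rules are assumptions
// drawn from the sample schemas in this guide.
function findSchemaProblems(fields, path = "") {
    const problems = [];
    for (const [name, def] of Object.entries(fields ?? {})) {
        const fieldPath = path ? `${path}.${name}` : name;

        if (!def.type) {
            problems.push(`${fieldPath}: missing type`);
        }
        if (def.method === "classify"
            && !(Array.isArray(def.enum) && def.enum.length > 0)) {
            problems.push(
                `${fieldPath}: classify field has no enum values`);
        }
        if (def.type === "array" && !def.itemDefinition) {
            problems.push(
                `${fieldPath}: array field has no itemDefinition`);
        }

        // Recurse into nested object properties and array items.
        if (def.type === "object") {
            problems.push(
                ...findSchemaProblems(def.properties, fieldPath));
        }
        if (def.type === "array"
            && def.itemDefinition?.type === "object") {
            problems.push(...findSchemaProblems(
                def.itemDefinition.properties, `${fieldPath}[]`));
        }
    }
    return problems;
}

// Example: document_type omits its enum list on purpose.
const draftFields = {
    company_name: { type: "string", method: "extract" },
    document_type: { type: "string", method: "classify" },
};
console.log(findSchemaProblems(draftFields));
```

The example reports a single problem for `document_type`. An empty array means the schema passed these local checks, which doesn't guarantee that the service accepts it.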

const classifierId =
    `my_classifier_${Math.floor(
        Date.now() / 1000
    )}`;

console.log(
    `Creating classifier '${classifierId}'...`
);

const classifierAnalyzer = {
    baseAnalyzerId: "prebuilt-document",
    description:
        "Custom classifier for financial"
        + " document categorization",
    config: {
        returnDetails: true,
        enableSegment: true,
        contentCategories: {
            Loan_Application: {
                description:
                    "Documents submitted by"
                    + " individuals or"
                    + " businesses to request"
                    + " funding, typically"
                    + " including personal or"
                    + " business details,"
                    + " financial history,"
                    + " loan amount, purpose,"
                    + " and supporting"
                    + " documentation.",
            },
            Invoice: {
                description:
                    "Billing documents issued"
                    + " by sellers or service"
                    + " providers to request"
                    + " payment for goods or"
                    + " services, detailing"
                    + " items, prices, taxes,"
                    + " totals, and payment"
                    + " terms.",
            },
            Bank_Statement: {
                description:
                    "Official statements"
                    + " issued by banks that"
                    + " summarize account"
                    + " activity over a"
                    + " period, including"
                    + " deposits, withdrawals,"
                    + " fees, and balances.",
            },
        },
    },
    models: {
        completion: "gpt-4.1",
    },
};

const classifierPoller =
    client.createAnalyzer(
        classifierId, classifierAnalyzer
    );
await classifierPoller.pollUntilDone();

const classifierResult =
    await client.getAnalyzer(classifierId);

console.log(
    `Classifier '${classifierId}' created`
    + ` successfully!`
);

if (classifierResult.description) {
    console.log(
        `  Description: `
        + `${classifierResult.description}`
    );
}

Tip

This code is based on the Create Classifier sample for classification workflows.

The following example creates a custom image analyzer based on the prebuilt image analyzer to process charts and graphs.

const analyzerId =
    `my_image_analyzer_${Math.floor(
        Date.now() / 1000
    )}`;

const analyzer = {
    baseAnalyzerId: "prebuilt-image",
    description:
        "Custom analyzer for charts and graphs",
    fieldSchema: {
        name: "chart_schema",
        description:
            "Schema for extracting chart"
            + " information",
        fields: {
            Title: {
                type: "string",
                description:
                    "Title of the chart",
            },
            ChartType: {
                type: "string",
                method: "classify",
                description: "Type of chart",
                enum: ["bar", "line", "pie"],
            },
        },
    },
    models: {
        completion: "gpt-4.1",
    },
};

const poller = client.createAnalyzer(
    analyzerId, analyzer
);
await poller.pollUntilDone();

const result = await client.getAnalyzer(
    analyzerId
);
console.log(
    `Analyzer '${analyzerId}' created`
    + ` successfully!`
);

if (result.description) {
    console.log(
        `  Description: ${result.description}`
    );
}

if (result.fieldSchema?.fields) {
    const fields = result.fieldSchema.fields;
    console.log(
        `  Fields`
        + ` (${Object.keys(fields).length}):`
    );
    for (const [name, fieldDef]
        of Object.entries(fields)) {
        const method =
            fieldDef.method ?? "auto";
        const fieldType =
            fieldDef.type ?? "unknown";
        console.log(
            `    - ${name}: `
            + `${fieldType} (${method})`
        );
    }
}

The sample output looks like:

Analyzer 'my_image_analyzer_ID' created successfully!
  Description: Custom analyzer for charts and graphs
  Fields (2):
    - Title: string (auto)
    - ChartType: string (classify)

Tip

This code adapts the Create Analyzer sample pattern for image content.

The following example creates a custom audio analyzer based on the prebuilt audio analyzer to process customer support call recordings.

const analyzerId =
    `my_audio_analyzer_${Math.floor(
        Date.now() / 1000
    )}`;

const analyzer = {
    baseAnalyzerId: "prebuilt-audio",
    description:
        "Custom analyzer for customer"
        + " support calls",
    config: {
        locales: ["en-US", "fr-FR"],
        returnDetails: true,
    },
    fieldSchema: {
        name: "call_center_schema",
        description:
            "Schema for analyzing customer"
            + " support calls",
        fields: {
            Summary: {
                type: "string",
                method: "generate",
                description:
                    "Summary of the call",
            },
            Sentiment: {
                type: "string",
                method: "classify",
                description:
                    "Overall sentiment of"
                    + " the call",
                enum: [
                    "Positive", "Neutral",
                    "Negative",
                ],
            },
            People: {
                type: "array",
                description:
                    "List of people mentioned",
                itemDefinition: {
                    type: "object",
                    properties: {
                        Name: {
                            type: "string",
                        },
                        Role: {
                            type: "string",
                        },
                    },
                },
            },
        },
    },
    models: {
        completion: "gpt-4.1",
    },
};

const poller = client.createAnalyzer(
    analyzerId, analyzer
);
await poller.pollUntilDone();

const result = await client.getAnalyzer(
    analyzerId
);
console.log(
    `Analyzer '${analyzerId}' created`
    + ` successfully!`
);

if (result.description) {
    console.log(
        `  Description: ${result.description}`
    );
}

if (result.fieldSchema?.fields) {
    const fields = result.fieldSchema.fields;
    console.log(
        `  Fields`
        + ` (${Object.keys(fields).length}):`
    );
    for (const [name, fieldDef]
        of Object.entries(fields)) {
        const method =
            fieldDef.method ?? "auto";
        const fieldType =
            fieldDef.type ?? "unknown";
        console.log(
            `    - ${name}: `
            + `${fieldType} (${method})`
        );
    }
}

The sample output looks like:

Analyzer 'my_audio_analyzer_ID' created successfully!
  Description: Custom analyzer for customer support calls
  Fields (3):
    - Summary: string (generate)
    - Sentiment: string (classify)
    - People: array (auto)

Tip

This code adapts the create Analyzer sample pattern for audio content.

The following example creates a custom video analyzer, based on the prebuilt video analyzer, for processing product demos and reviews.

const analyzerId =
    `my_video_analyzer_${Math.floor(
        Date.now() / 1000
    )}`;

const analyzer = {
    baseAnalyzerId: "prebuilt-video",
    description:
        "Custom analyzer for product"
        + " demo videos",
    config: {
        locales: ["en-US", "fr-FR"],
        returnDetails: true,
    },
    fieldSchema: {
        name: "video_schema",
        description:
            "Schema for analyzing product"
            + " demo videos",
        fields: {
            Segments: {
                type: "array",
                itemDefinition: {
                    type: "object",
                    properties: {
                        SegmentId: {
                            type: "string",
                        },
                        Description: {
                            type: "string",
                            method: "generate",
                            description:
                                "Detailed summary"
                                + " of the video"
                                + " segment",
                        },
                        Sentiment: {
                            type: "string",
                            method: "classify",
                            enum: [
                                "Positive",
                                "Neutral",
                                "Negative",
                            ],
                        },
                    },
                },
            },
        },
    },
    models: {
        completion: "gpt-4.1",
    },
};

const poller = client.createAnalyzer(
    analyzerId, analyzer
);
await poller.pollUntilDone();

const result = await client.getAnalyzer(
    analyzerId
);
console.log(
    `Analyzer '${analyzerId}' created`
    + ` successfully!`
);

if (result.description) {
    console.log(
        `  Description: ${result.description}`
    );
}

if (result.fieldSchema?.fields) {
    const fields = result.fieldSchema.fields;
    console.log(
        `  Fields`
        + ` (${Object.keys(fields).length}):`
    );
    for (const [name, fieldDef]
        of Object.entries(fields)) {
        const method =
            fieldDef.method ?? "auto";
        const fieldType =
            fieldDef.type ?? "unknown";
        console.log(
            `    - ${name}: `
            + `${fieldType} (${method})`
        );
    }
}

The output looks similar to the following:

Analyzer 'my_video_analyzer_ID' created successfully!
  Description: Custom analyzer for product demo videos
  Fields (1):
    - Segments: array (auto)

Tip

This code adapts the create Analyzer sample pattern for video content.
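
The listing above prints only top-level fields, so the nested properties of `Segments` do not appear in the output. A minimal sketch of a recursive walk follows. The `FieldDef` interface and the `describeFields` helper are hypothetical: they mirror only the `type`, `method`, `itemDefinition`, and `properties` keys used in this guide and are not SDK types.

```typescript
// Hypothetical local shape mirroring the schema keys used in this guide.
interface FieldDef {
    type?: string;
    method?: string;
    itemDefinition?: FieldDef;
    properties?: Record<string, FieldDef>;
}

// Returns one line per field, indenting nested properties by two spaces.
function describeFields(
    fields: Record<string, FieldDef>,
    indent = "    "
): string[] {
    const lines: string[] = [];
    for (const [name, def] of Object.entries(fields)) {
        const method = def.method ?? "auto";
        const fieldType = def.type ?? "unknown";
        lines.push(`${indent}- ${name}: ${fieldType} (${method})`);
        // Objects carry properties directly; arrays describe their items.
        const nested = def.properties ?? def.itemDefinition?.properties;
        if (nested) {
            lines.push(...describeFields(nested, indent + "  "));
        }
    }
    return lines;
}

// Same structure as the video schema defined above.
const videoFields: Record<string, FieldDef> = {
    Segments: {
        type: "array",
        itemDefinition: {
            type: "object",
            properties: {
                SegmentId: { type: "string" },
                Description: { type: "string", method: "generate" },
                Sentiment: { type: "string", method: "classify" },
            },
        },
    },
};

console.log(describeFields(videoFields).join("\n"));
```

With a real analyzer, `result.fieldSchema.fields` could be passed in place of `videoFields`, provided the returned objects expose the same keys.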

Use the custom analyzer

After you create the analyzer, use it to analyze a document and extract the custom fields. Delete the analyzer when you no longer need it.

const documentUrl =
    "https://raw.githubusercontent.com/"
    + "Azure-Samples/"
    + "azure-ai-content-understanding-assets/"
    + "main/document/invoice.pdf";

const analyzePoller = client.analyze(
    analyzerId, [{ url: documentUrl }]
);
const analyzeResult =
    await analyzePoller.pollUntilDone();

if (analyzeResult.contents
    && analyzeResult.contents.length > 0) {
    const content = analyzeResult.contents[0];
    if (content.fields) {
        const company =
            content.fields["company_name"];
        if (company) {
            console.log(
                `Company Name: `
                + `${company.value}`
            );
            console.log(
                `  Confidence: `
                + `${company.confidence}`
            );
        }

        const total =
            content.fields["total_amount"];
        if (total) {
            console.log(
                `Total Amount: `
                + `${total.value}`
            );
        }

        const summary =
            content.fields["document_summary"];
        if (summary) {
            console.log(
                `Summary: ${summary.value}`
            );
        }

        const docType =
            content.fields["document_type"];
        if (docType) {
            console.log(
                `Document Type: `
                + `${docType.value}`
            );
        }
    }
}

// --- Clean up ---
console.log(
    `\nCleaning up: deleting analyzer`
    + ` '${analyzerId}'...`
);
await client.deleteAnalyzer(analyzerId);
console.log(
    `Analyzer '${analyzerId}' deleted`
    + ` successfully.`
);

The output looks similar to the following:

Company Name: CONTOSO LTD.
  Confidence: 0.739
Total Amount: 610
Summary: This document is an invoice from CONTOSO LTD. to Microsoft Corporation for consulting, document, and printing services provided during the service period. It details line items, subtotal, sales tax, total, previous unpaid balance, and the final amount due.
Document Type: invoice

Cleaning up: deleting analyzer 'my_document_analyzer_ID'...
Analyzer 'my_document_analyzer_ID' deleted successfully.

Tip

For more examples of running analyzers, see the JavaScript SDK samples.
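
The code above looks up each field by name. When the schema changes often, looping over whatever fields came back can be less brittle. The sketch below assumes each field exposes an optional `value` and an optional `confidence`, as the `company` field does above. `FieldResult` and `formatFields` are hypothetical names, not SDK types, and `due_date` is a made-up field that only illustrates a missing value.

```typescript
// Hypothetical minimal shape: only the two properties read in this guide.
interface FieldResult {
    value?: unknown;
    confidence?: number;
}

// Formats every returned field, appending the confidence when present.
function formatFields(
    fields: Record<string, FieldResult | undefined>
): string[] {
    const lines: string[] = [];
    for (const [name, field] of Object.entries(fields)) {
        if (!field || field.value === undefined) {
            lines.push(`${name}: (no value)`);
            continue;
        }
        const confidence =
            field.confidence !== undefined
                ? ` [confidence ${field.confidence.toFixed(2)}]`
                : "";
        lines.push(`${name}: ${String(field.value)}${confidence}`);
    }
    return lines;
}

// Stand-in data shaped like the sample output above.
const sampleFields: Record<string, FieldResult | undefined> = {
    company_name: { value: "CONTOSO LTD.", confidence: 0.739 },
    total_amount: { value: 610 },
    document_type: { value: "invoice" },
    due_date: undefined, // hypothetical field with no result
};

console.log(formatFields(sampleFields).join("\n"));
```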

After you create the analyzer, use it to analyze an image and extract the custom fields. Delete the analyzer when you no longer need it.

const imageUrl =
    "https://raw.githubusercontent.com/"
    + "Azure-Samples/"
    + "azure-ai-content-understanding-assets/"
    + "main/image/pieChart.jpg";

const analyzePoller = client.analyze(
    analyzerId, [{ url: imageUrl }]
);
const analyzeResult =
    await analyzePoller.pollUntilDone();

if (analyzeResult.contents
    && analyzeResult.contents.length > 0) {
    const content = analyzeResult.contents[0];
    if (content.fields) {
        const title =
            content.fields["Title"];
        if (title) {
            console.log(
                `Title: ${title.value}`
            );
        }

        const chartType =
            content.fields["ChartType"];
        if (chartType) {
            console.log(
                `Chart Type: `
                + `${chartType.value}`
            );
        }
    }
}

// --- Clean up ---
console.log(
    `\nCleaning up: deleting analyzer`
    + ` '${analyzerId}'...`
);
await client.deleteAnalyzer(analyzerId);
console.log(
    `Analyzer '${analyzerId}' deleted`
    + ` successfully.`
);

The output looks similar to the following:

Title: Distribution of Weekly Working Hours
Chart Type: pie

Cleaning up: deleting analyzer 'my_image_analyzer_ID'...
Analyzer 'my_image_analyzer_ID' deleted successfully.

Tip

For more examples of running analyzers, see the JavaScript SDK samples.
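
A `classify` field is expected to return one of the values listed in its `enum`. A defensive check such as the following can catch a mismatch before the value reaches downstream code. This is a sketch only: `isAllowedValue` and `describeChart` are hypothetical helpers, and the allowed list repeats the `ChartType` enum (`bar`, `line`, `pie`) from the image analyzer schema in this guide.

```typescript
// Allowed values copied from the ChartType enum in the analyzer schema.
const CHART_TYPES = ["bar", "line", "pie"] as const;
type ChartType = (typeof CHART_TYPES)[number];

// Type guard: narrows an unknown field value to one of the enum members.
function isAllowedValue(value: unknown): value is ChartType {
    return (
        typeof value === "string"
        && (CHART_TYPES as readonly string[]).includes(value)
    );
}

// Produces a display line, flagging values outside the enum.
function describeChart(value: unknown): string {
    return isAllowedValue(value)
        ? `Chart Type: ${value}`
        : `Unexpected chart type: ${String(value)}`;
}

console.log(describeChart("pie"));
console.log(describeChart("scatter"));
```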

After you create the analyzer, use it to analyze an audio file and extract the custom fields. Delete the analyzer when you no longer need it.

const audioUrl =
    "https://raw.githubusercontent.com/"
    + "Azure-Samples/"
    + "azure-ai-content-understanding-assets/"
    + "main/audio/callCenterRecording.mp3";

const analyzePoller = client.analyze(
    analyzerId, [{ url: audioUrl }]
);
const analyzeResult =
    await analyzePoller.pollUntilDone();

if (analyzeResult.contents
    && analyzeResult.contents.length > 0) {
    const content = analyzeResult.contents[0];
    if (content.fields) {
        const summary =
            content.fields["Summary"];
        if (summary) {
            console.log(
                `Summary: ${summary.value}`
            );
        }

        const sentiment =
            content.fields["Sentiment"];
        if (sentiment) {
            console.log(
                `Sentiment: `
                + `${sentiment.value}`
            );
        }
    }
}

// --- Clean up ---
console.log(
    `\nCleaning up: deleting analyzer`
    + ` '${analyzerId}'...`
);
await client.deleteAnalyzer(analyzerId);
console.log(
    `Analyzer '${analyzerId}' deleted`
    + ` successfully.`
);

The output looks similar to the following:

Summary: Maria Smith contacted Contoso to inquire about her current point balance. John Doe, the representative, verified her identity by requesting her date of birth and then provided her with her point balance of 599 points. Maria confirmed she needed no further information and ended the call.
Sentiment: Positive

Cleaning up: deleting analyzer 'my_audio_analyzer_ID'...
Analyzer 'my_audio_analyzer_ID' deleted successfully.

Tip

For more examples of running analyzers, see the JavaScript SDK samples.
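
The samples above delete the analyzer as their last step, so an exception thrown during analysis would leave the analyzer behind. One option is to wrap the work in `try`/`finally`. In this sketch, `AnalyzerDeleter` is a hypothetical interface that declares only the `deleteAnalyzer` call used above, `withAnalyzerCleanup` is an illustrative helper rather than part of the SDK, and `fakeClient` stands in for a real client.

```typescript
// Hypothetical minimal interface: just the delete call used in this guide.
interface AnalyzerDeleter {
    deleteAnalyzer(analyzerId: string): Promise<unknown>;
}

// Runs `work`, then deletes the analyzer whether or not `work` threw.
async function withAnalyzerCleanup<T>(
    client: AnalyzerDeleter,
    analyzerId: string,
    work: () => Promise<T>
): Promise<T> {
    try {
        return await work();
    } finally {
        await client.deleteAnalyzer(analyzerId);
    }
}

// Stand-in client that records deletions instead of calling the service.
const deleted: string[] = [];
const fakeClient: AnalyzerDeleter = {
    async deleteAnalyzer(id: string) {
        deleted.push(id);
    },
};

async function demo(): Promise<void> {
    try {
        await withAnalyzerCleanup(
            fakeClient,
            "my_audio_analyzer",
            async () => {
                throw new Error("analysis failed");
            }
        );
    } catch (err) {
        console.log(`Caught: ${(err as Error).message}`);
    }
    console.log(`Deleted: ${deleted.join(", ")}`);
}

const demoDone = demo();
```

With a real client, the `work` callback would contain the `client.analyze(...)` call and the field handling shown above.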

After you create the analyzer, use it to analyze a video and extract the custom fields. Delete the analyzer when you no longer need it.

const videoUrl =
    "https://raw.githubusercontent.com/"
    + "Azure-Samples/"
    + "azure-ai-content-understanding-assets/"
    + "main/videos/sdk_samples/"
    + "FlightSimulator.mp4";

const analyzePoller = client.analyze(
    analyzerId, [{ url: videoUrl }]
);
const analyzeResult =
    await analyzePoller.pollUntilDone();

if (analyzeResult.contents
    && analyzeResult.contents.length > 0) {
    const content = analyzeResult.contents[0];
    console.log(
        `Content kind: ${content.kind}`
    );
    if (content.fields) {
        const segments =
            content.fields["Segments"];
        if (segments && segments.value) {
            console.log(
                `Segments`
                + ` (${segments.value.length}):`
            );
            for (const segment
                of segments.value) {
                const segId =
                    segment.value
                        ?.SegmentId?.value
                    ?? "N/A";
                const desc =
                    segment.value
                        ?.Description?.value
                    ?? "N/A";
                const sent =
                    segment.value
                        ?.Sentiment?.value
                    ?? "N/A";
                console.log(
                    `  Segment: ${segId}`
                );
                console.log(
                    `    Description:`
                    + ` ${desc}`
                );
                console.log(
                    `    Sentiment:`
                    + ` ${sent}`
                );
            }
        }
    }
}

// --- Clean up ---
console.log(
    `\nCleaning up: deleting analyzer`
    + ` '${analyzerId}'...`
);
await client.deleteAnalyzer(analyzerId);
console.log(
    `Analyzer '${analyzerId}' deleted`
    + ` successfully.`
);

The output looks similar to the following:

Content kind: audioVisual
Segments (16):
  Segment: 00:00:00.000-00:00:01.467
    Description: The video opens with a scenic aerial view of an island surrounded by turquoise water. A small airplane is flying over the landscape. The screen displays the logos for 'Flight Simulator' and 'Microsoft Azure AI', indicating a collaboration or integration between the two.
    Sentiment: Positive
  Segment: 00:00:01.467-00:00:03.233
    Description: A man is shown sitting in a modern office setting, likely preparing to speak or introduce the topic. The background features geometric wall lights and a plant, giving a professional and contemporary atmosphere.
    Sentiment: Neutral
  Segment: 00:00:03.233-00:00:07.367
    Description: The screen displays a digital audio waveform, suggesting a focus on audio technology. The accompanying transcript discusses the importance of good data for neural TTS (Text-to-Speech) to achieve a high-quality voice.
    Sentiment: Neutral
  Segment: 00:00:07.367-00:00:08.200
    Description: Another man is shown in a similar office environment, possibly continuing the explanation or providing additional information about the product.
    Sentiment: Neutral
  Segment: 00:00:08.200-00:00:11.367
    Description: The video transitions to an outdoor scene showing a large facility with multiple buildings, set in a rural landscape. This likely represents the data centers or infrastructure supporting the technology.
    Sentiment: Neutral
  Segment: 00:00:11.367-00:00:13.567
    Description: The camera moves inside a data center, showing rows of servers and high-tech equipment. This emphasizes the scale and capability of the infrastructure used for the TTS model.
    Sentiment: Neutral
  Segment: 00:00:13.567-00:00:16.100
    Description: The man from earlier is shown again in the office, likely elaborating on the accumulation of data and the universal TTS model, as mentioned in the transcript.
    Sentiment: Neutral
  Segment: 00:00:16.100-00:00:19.433
    Description: A biplane is seen flying over a coastal city with clear blue water and lush green hills, highlighting the realism and immersive visuals of the Flight Simulator.
    Sentiment: Positive
  Segment: 00:00:19.433-00:00:23.967
    Description: The video shows a castle surrounded by mountains and clouds, with a small aircraft flying nearby. This further showcases the detailed environments possible in the Flight Simulator.
    Sentiment: Positive
  Segment: 00:00:23.967-00:00:30.033
    Description: A bald man is interviewed in a modern office setting. The transcript discusses the high fidelity and naturalness of voices generated by cognitive services, suggesting he is explaining the benefits of the technology.
    Sentiment: Positive
  Segment: 00:00:30.033-00:00:33.200
    Description: The bald man continues speaking, possibly providing more details about the product's capabilities and its impact on user experience.
    Sentiment: Positive
  Segment: 00:00:33.200-00:00:35.267
    Description: The video shifts to an overhead view of an airplane on the runway, preparing for movement. This scene likely relates to the realism of the simulator and the integration of AI-driven voice technology.
    Sentiment: Neutral
  Segment: 00:00:35.267-00:00:37.700
    Description: A ground crew member directs an Airbus aircraft, with pilots visible in the cockpit. This scene emphasizes the operational realism and communication aspects in the simulator.
    Sentiment: Neutral
  Segment: 00:00:37.700-00:00:39.200
    Description: Two ground crew members walk near an aircraft on the tarmac, with airport buildings and other planes in the background. The environment is realistic and detailed.
    Sentiment: Neutral
  Segment: 00:00:39.200-00:00:42.033
    Description: A close-up of an Airbus aircraft at the gate, with sunlight illuminating the scene. This further highlights the visual fidelity and immersive experience of the simulator.
    Sentiment: Positive
  Segment: 00:00:42.033-00:00:43.866
    Description: The video ends with the Microsoft logo and branding, signaling the conclusion of the product demo and reinforcing the partnership between Flight Simulator and Microsoft Azure AI.
    Sentiment: Positive

Cleaning up: deleting analyzer 'my_video_analyzer_ID'...
Analyzer 'my_video_analyzer_ID' deleted successfully.

Tip

For more examples of running analyzers, see the JavaScript SDK samples.
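
In the sample output, each `SegmentId` looks like a `start-end` pair of `HH:MM:SS.mmm` timestamps. Assuming that format holds (it is inferred from the output above, not from a documented contract), a small helper can turn an ID into a duration. `parseTimestampMs` and `segmentDurationSeconds` are hypothetical names.

```typescript
// Converts "HH:MM:SS.mmm" to whole milliseconds; throws on other input.
function parseTimestampMs(text: string): number {
    const match = /^(\d{2}):(\d{2}):(\d{2})\.(\d{3})$/.exec(text);
    if (!match) {
        throw new Error(`Unrecognized timestamp: ${text}`);
    }
    const [hours, minutes, seconds, millis] =
        match.slice(1).map(Number);
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis;
}

// Assumes the ID is "<start>-<end>", as in the sample output.
function segmentDurationSeconds(segmentId: string): number {
    const parts = segmentId.split("-");
    if (parts.length !== 2) {
        throw new Error(`Unrecognized segment ID: ${segmentId}`);
    }
    // Subtract in integer milliseconds to avoid floating-point noise.
    const durationMs =
        parseTimestampMs(parts[1]) - parseTimestampMs(parts[0]);
    return durationMs / 1000;
}

// The first two segment IDs from the sample output.
const sampleIds = [
    "00:00:00.000-00:00:01.467",
    "00:00:01.467-00:00:03.233",
];
for (const id of sampleIds) {
    console.log(`${id}: ${segmentDurationSeconds(id)} s`);
}
```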

Client library | Samples | SDK source

This guide shows you how to use the Content Understanding TypeScript SDK to create a custom analyzer that extracts structured data from your content. Custom analyzers support document, image, audio, and video content types.

Prerequisites

An active Azure subscription. If you don't have an Azure account, create one for free.
A Microsoft Foundry resource created in a supported region.
Your resource endpoint and API key (found under Keys and Endpoint in the Azure portal).
Default model deployments configured for your resource. For setup instructions, see Models and deployments or this one-time configuration script.
Node.js LTS version.
TypeScript 5.x or later.

Setup

Create a new Node.js project:

mkdir custom-analyzer-tutorial
cd custom-analyzer-tutorial
npm init -y

Install TypeScript and the Content Understanding client library:

npm install typescript ts-node @azure/ai-content-understanding

Optionally, install the Azure Identity library for Microsoft Entra authentication:
```
npm install @azure/identity
```

Set up environment variables

To authenticate with the Content Understanding service, set the following environment variables to your own values before you run the sample:

CONTENTUNDERSTANDING_ENDPOINT - the endpoint of your Content Understanding resource.
CONTENTUNDERSTANDING_KEY - your Content Understanding API key (optional if you use Microsoft Entra ID with DefaultAzureCredential).

Windows

setx CONTENTUNDERSTANDING_ENDPOINT "your-endpoint"
setx CONTENTUNDERSTANDING_KEY "your-key"

Linux / macOS

export CONTENTUNDERSTANDING_ENDPOINT="your-endpoint"
export CONTENTUNDERSTANDING_KEY="your-key"
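
The client code that follows reads both variables with a non-null assertion (`!`), which hides a missing value until the first request fails. A sketch of an explicit check is shown here. `requireSetting` is a hypothetical helper, and it takes the environment as a parameter so that it can be exercised without touching `process.env`. The endpoint value is a placeholder.

```typescript
// Returns the named setting, or throws a descriptive error if it is unset.
function requireSetting(
    env: Record<string, string | undefined>,
    name: string
): string {
    const value = env[name];
    if (value === undefined || value.trim() === "") {
        throw new Error(`Environment variable ${name} is not set`);
    }
    return value;
}

// Stand-in environment; in a real script, pass process.env instead.
const exampleEnv: Record<string, string | undefined> = {
    CONTENTUNDERSTANDING_ENDPOINT: "https://example.invalid/",
};

const endpointValue = requireSetting(
    exampleEnv,
    "CONTENTUNDERSTANDING_ENDPOINT"
);
console.log(`Endpoint: ${endpointValue}`);

try {
    // The key is missing from the stand-in environment, so this throws.
    requireSetting(exampleEnv, "CONTENTUNDERSTANDING_KEY");
} catch (err) {
    console.log((err as Error).message);
}
```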

Create the client

import { AzureKeyCredential } from "@azure/core-auth";
import {
    ContentUnderstandingClient,
} from "@azure/ai-content-understanding";
import type {
    ContentAnalyzer,
    ContentAnalyzerConfig,
    ContentFieldSchema,
} from "@azure/ai-content-understanding";

const endpoint =
    process.env["CONTENTUNDERSTANDING_ENDPOINT"]!;
const key =
    process.env["CONTENTUNDERSTANDING_KEY"]!;

const client = new ContentUnderstandingClient(
    endpoint,
    new AzureKeyCredential(key)
);

Create a custom analyzer

const analyzerId =
    `my_document_analyzer_${Math.floor(
        Date.now() / 1000
    )}`;

const fieldSchema: ContentFieldSchema = {
    name: "company_schema",
    description:
        "Schema for extracting company"
        + " information",
    fields: {
        company_name: {
            type: "string",
            method: "extract",
            description:
                "Name of the company",
        },
        total_amount: {
            type: "number",
            method: "extract",
            description:
                "Total amount on the document",
        },
        document_summary: {
            type: "string",
            method: "generate",
            description:
                "A brief summary of the"
                + " document content",
        },
        document_type: {
            type: "string",
            method: "classify",
            description: "Type of document",
            enum: [
                "invoice", "receipt",
                "contract", "report", "other",
            ],
        },
    },
};

const config: ContentAnalyzerConfig = {
    enableFormula: true,
    enableLayout: true,
    enableOcr: true,
    estimateFieldSourceAndConfidence: true,
    returnDetails: true,
};

const analyzer: ContentAnalyzer = {
    baseAnalyzerId: "prebuilt-document",
    description:
        "Custom analyzer for extracting"
        + " company information",
    config,
    fieldSchema,
    models: {
        completion: "gpt-4.1",
        // Required when fieldSchema is used with the
        // prebuilt-document base analyzer.
        embedding: "text-embedding-3-large",
    },
} as unknown as ContentAnalyzer;

const poller = client.createAnalyzer(
    analyzerId, analyzer
);
await poller.pollUntilDone();

const result = await client.getAnalyzer(
    analyzerId
);
console.log(
    `Analyzer '${analyzerId}' created`
    + ` successfully!`
);

if (result.description) {
    console.log(
        `  Description: ${result.description}`
    );
}

if (result.fieldSchema?.fields) {
    const fields = result.fieldSchema.fields;
    console.log(
        `  Fields`
        + ` (${Object.keys(fields).length}):`
    );
    for (const [name, fieldDef]
        of Object.entries(fields)) {
        const method =
            fieldDef.method ?? "auto";
        const fieldType =
            fieldDef.type ?? "unknown";
        console.log(
            `    - ${name}: `
            + `${fieldType} (${method})`
        );
    }
}

The output looks similar to the following:

Analyzer 'my_document_analyzer_ID' created successfully!
  Description: Custom analyzer for extracting company information
  Fields (4):
    - company_name: string (extract)
    - total_amount: number (extract)
    - document_summary: string (generate)
    - document_type: string (classify)

Tip

This code is based on the create Analyzer sample in the SDK repository.

const classifierId =
    `my_classifier_${Math.floor(
        Date.now() / 1000
    )}`;

console.log(
    `Creating classifier '${classifierId}'...`
);

const classifierAnalyzer: ContentAnalyzer = {
    baseAnalyzerId: "prebuilt-document",
    description:
        "Custom classifier for financial"
        + " document categorization",
    config: {
        returnDetails: true,
        enableSegment: true,
        contentCategories: {
            Loan_Application: {
                description:
                    "Documents submitted by"
                    + " individuals or"
                    + " businesses to request"
                    + " funding, typically"
                    + " including personal or"
                    + " business details,"
                    + " financial history,"
                    + " loan amount, purpose,"
                    + " and supporting"
                    + " documentation.",
            },
            Invoice: {
                description:
                    "Billing documents issued"
                    + " by sellers or service"
                    + " providers to request"
                    + " payment for goods or"
                    + " services, detailing"
                    + " items, prices, taxes,"
                    + " totals, and payment"
                    + " terms.",
            },
            Bank_Statement: {
                description:
                    "Official statements"
                    + " issued by banks that"
                    + " summarize account"
                    + " activity over a"
                    + " period, including"
                    + " deposits, withdrawals,"
                    + " fees, and balances.",
            },
        },
    } as unknown as ContentAnalyzerConfig,
    models: {
        completion: "gpt-4.1",
    },
} as unknown as ContentAnalyzer;

const classifierPoller =
    client.createAnalyzer(
        classifierId, classifierAnalyzer
    );
await classifierPoller.pollUntilDone();

const classifierResult =
    await client.getAnalyzer(classifierId);

console.log(
    `Classifier '${classifierId}' created`
    + ` successfully!`
);

if (classifierResult.description) {
    console.log(
        `  Description: `
        + `${classifierResult.description}`
    );
}

Tip

This code is based on the create Classifier sample for classification workflows.
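
In the classifier above, each entry in `contentCategories` maps a category name to an object with a `description`. When the list grows, keeping the categories as data and building the object from it may be easier to maintain than inline string concatenation. In the sketch below, `CategorySpec` and `buildContentCategories` are hypothetical, the descriptions are shortened, and the result has the same `{ name: { description } }` shape used above.

```typescript
// Hypothetical input shape for one category.
interface CategorySpec {
    name: string;
    description: string;
}

// Builds the { name: { description } } map, rejecting duplicates
// and empty descriptions.
function buildContentCategories(
    specs: CategorySpec[]
): Record<string, { description: string }> {
    const seen = new Set<string>();
    const categories: Record<string, { description: string }> = {};
    for (const spec of specs) {
        if (seen.has(spec.name)) {
            throw new Error(`Duplicate category: ${spec.name}`);
        }
        if (spec.description.trim() === "") {
            throw new Error(`Missing description: ${spec.name}`);
        }
        seen.add(spec.name);
        categories[spec.name] = { description: spec.description };
    }
    return categories;
}

// Shortened versions of the categories defined above.
const categorySpecs: CategorySpec[] = [
    { name: "Loan_Application", description: "Requests for funding." },
    { name: "Invoice", description: "Billing documents." },
    { name: "Bank_Statement", description: "Account activity summaries." },
];

const contentCategories = buildContentCategories(categorySpecs);
console.log(Object.keys(contentCategories).join(", "));
```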

The following example creates a custom image analyzer, based on the prebuilt image analyzer, for processing charts and graphs.

const analyzerId =
    `my_image_analyzer_${Math.floor(
        Date.now() / 1000
    )}`;

const fieldSchema: ContentFieldSchema = {
    name: "chart_schema",
    description:
        "Schema for extracting chart"
        + " information",
    fields: {
        Title: {
            type: "string",
            description:
                "Title of the chart",
        },
        ChartType: {
            type: "string",
            method: "classify",
            description: "Type of chart",
            enum: ["bar", "line", "pie"],
        },
    },
};

const analyzer: ContentAnalyzer = {
    baseAnalyzerId: "prebuilt-image",
    description:
        "Custom analyzer for charts"
        + " and graphs",
    fieldSchema,
    models: {
        completion: "gpt-4.1",
    },
} as unknown as ContentAnalyzer;

const poller = client.createAnalyzer(
    analyzerId, analyzer
);
await poller.pollUntilDone();

const result = await client.getAnalyzer(
    analyzerId
);
console.log(
    `Analyzer '${analyzerId}' created`
    + ` successfully!`
);

if (result.description) {
    console.log(
        `  Description: ${result.description}`
    );
}

if (result.fieldSchema?.fields) {
    const fields = result.fieldSchema.fields;
    console.log(
        `  Fields`
        + ` (${Object.keys(fields).length}):`
    );
    for (const [name, fieldDef]
        of Object.entries(fields)) {
        const method =
            fieldDef.method ?? "auto";
        const fieldType =
            fieldDef.type ?? "unknown";
        console.log(
            `    - ${name}: `
            + `${fieldType} (${method})`
        );
    }
}

The output looks similar to the following:

Analyzer 'my_image_analyzer_ID' created successfully!
  Description: Custom analyzer for charts and graphs
  Fields (2):
    - Title: string (auto)
    - ChartType: string (classify)

Tip

This code adapts the create Analyzer sample pattern for image content.

The following example creates a custom audio analyzer, based on the prebuilt audio analyzer, for processing customer support call recordings.

const analyzerId =
    `my_audio_analyzer_${Math.floor(
        Date.now() / 1000
    )}`;

const fieldSchema: ContentFieldSchema = {
    name: "call_center_schema",
    description:
        "Schema for analyzing customer"
        + " support calls",
    fields: {
        Summary: {
            type: "string",
            method: "generate",
            description:
                "Summary of the call",
        },
        Sentiment: {
            type: "string",
            method: "classify",
            description:
                "Overall sentiment of"
                + " the call",
            enum: [
                "Positive", "Neutral",
                "Negative",
            ],
        },
        People: {
            type: "array",
            description:
                "List of people mentioned",
            itemDefinition: {
                type: "object",
                properties: {
                    Name: { type: "string" },
                    Role: { type: "string" },
                },
            },
        },
    },
};

const config: ContentAnalyzerConfig = {
    locales: ["en-US", "fr-FR"],
    returnDetails: true,
};

const analyzer: ContentAnalyzer = {
    baseAnalyzerId: "prebuilt-audio",
    description:
        "Custom analyzer for customer"
        + " support calls",
    config,
    fieldSchema,
    models: {
        completion: "gpt-4.1",
    },
} as unknown as ContentAnalyzer;

const poller = client.createAnalyzer(
    analyzerId, analyzer
);
await poller.pollUntilDone();

const result = await client.getAnalyzer(
    analyzerId
);
console.log(
    `Analyzer '${analyzerId}' created`
    + ` successfully!`
);

if (result.description) {
    console.log(
        `  Description: ${result.description}`
    );
}

if (result.fieldSchema?.fields) {
    const fields = result.fieldSchema.fields;
    console.log(
        `  Fields`
        + ` (${Object.keys(fields).length}):`
    );
    for (const [name, fieldDef]
        of Object.entries(fields)) {
        const method =
            fieldDef.method ?? "auto";
        const fieldType =
            fieldDef.type ?? "unknown";
        console.log(
            `    - ${name}: `
            + `${fieldType} (${method})`
        );
    }
}

The output looks similar to the following:

Analyzer 'my_audio_analyzer_ID' created successfully!
  Description: Custom analyzer for customer support calls
  Fields (3):
    - Summary: string (generate)
    - Sentiment: string (classify)
    - People: array (auto)

Tip

This code adapts the create Analyzer sample pattern for audio content.

The following example creates a custom video analyzer, based on the prebuilt video analyzer, for processing product demos and reviews.

const analyzerId =
    `my_video_analyzer_${Math.floor(
        Date.now() / 1000
    )}`;

const fieldSchema: ContentFieldSchema = {
    name: "video_schema",
    description:
        "Schema for analyzing product"
        + " demo videos",
    fields: {
        Segments: {
            type: "array",
            itemDefinition: {
                type: "object",
                properties: {
                    SegmentId: {
                        type: "string",
                    },
                    Description: {
                        type: "string",
                        method: "generate",
                        description:
                            "Detailed summary"
                            + " of the video"
                            + " segment",
                    },
                    Sentiment: {
                        type: "string",
                        method: "classify",
                        enum: [
                            "Positive",
                            "Neutral",
                            "Negative",
                        ],
                    },
                },
            },
        },
    },
};

const config: ContentAnalyzerConfig = {
    locales: ["en-US", "fr-FR"],
    returnDetails: true,
};

const analyzer: ContentAnalyzer = {
    baseAnalyzerId: "prebuilt-video",
    description:
        "Custom analyzer for product"
        + " demo videos",
    config,
    fieldSchema,
    models: {
        completion: "gpt-4.1",
    },
} as unknown as ContentAnalyzer;

const poller = client.createAnalyzer(
    analyzerId, analyzer
);
await poller.pollUntilDone();

const result = await client.getAnalyzer(
    analyzerId
);
console.log(
    `Analyzer '${analyzerId}' created`
    + ` successfully!`
);

if (result.description) {
    console.log(
        `  Description: ${result.description}`
    );
}

if (result.fieldSchema?.fields) {
    const fields = result.fieldSchema.fields;
    console.log(
        `  Fields`
        + ` (${Object.keys(fields).length}):`
    );
    for (const [name, fieldDef]
        of Object.entries(fields)) {
        const method =
            fieldDef.method ?? "auto";
        const fieldType =
            fieldDef.type ?? "unknown";
        console.log(
            `    - ${name}: `
            + `${fieldType} (${method})`
        );
    }
}

The output looks similar to the following:

Analyzer 'my_video_analyzer_ID' created successfully!
  Description: Custom analyzer for product demo videos
  Fields (1):
    - Segments: array (auto)

Tip

This code adapts the create Analyzer sample pattern for video content.

Use the custom analyzer

After you create the analyzer, use it to analyze a document and extract the custom fields. Delete the analyzer when you no longer need it.

const documentUrl =
    "https://raw.githubusercontent.com/"
    + "Azure-Samples/"
    + "azure-ai-content-understanding-assets/"
    + "main/document/invoice.pdf";

const analyzePoller = client.analyze(
    analyzerId, [{ url: documentUrl }]
);
const analyzeResult =
    await analyzePoller.pollUntilDone();

if (analyzeResult.contents
    && analyzeResult.contents.length > 0) {
    const content = analyzeResult.contents[0];
    if (content.fields) {
        const company =
            content.fields["company_name"];
        if (company) {
            console.log(
                `Company Name: `
                + `${company.value}`
            );
            console.log(
                `  Confidence: `
                + `${company.confidence}`
            );
        }

        const total =
            content.fields["total_amount"];
        if (total) {
            console.log(
                `Total Amount: `
                + `${total.value}`
            );
        }

        const summary =
            content.fields["document_summary"];
        if (summary) {
            console.log(
                `Summary: ${summary.value}`
            );
        }

        const docType =
            content.fields["document_type"];
        if (docType) {
            console.log(
                `Document Type: `
                + `${docType.value}`
            );
        }
    }
}

// --- Clean up ---
console.log(
    `\nCleaning up: deleting analyzer`
    + ` '${analyzerId}'...`
);
await client.deleteAnalyzer(analyzerId);
console.log(
    `Analyzer '${analyzerId}' deleted`
    + ` successfully.`
);

The output looks similar to the following example:

Company Name: CONTOSO LTD.
  Confidence: 0.818
Total Amount: 610
Summary: This document is an invoice from CONTOSO LTD. to MICROSOFT CORPORATION for consulting, document, and printing services provided during the service period 10/14/2019 - 11/14/2019. It details line items, subtotal, sales tax, total, previous unpaid balance, and the final amount due.
Document Type: invoice

Cleaning up: deleting analyzer 'my_document_analyzer_ID'...
Analyzer 'my_document_analyzer_ID' deleted successfully.
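
The sample prints a confidence score for `company_name` (0.818 in the output above). The following is a minimal sketch, not part of the SDK, of how you might flag extracted fields whose confidence falls below a cutoff so that a person can review them. The `ScoredField` shape, the `lowConfidenceFields` helper, and the 0.9 threshold are all assumptions made for illustration; fields that carry no confidence score are skipped rather than flagged.

```typescript
// Hypothetical, simplified shape of an extracted field.
// The real SDK type may expose more properties.
interface ScoredField {
    value?: unknown;
    confidence?: number;
}

// Return the names of fields that report a confidence
// score below the given threshold.
function lowConfidenceFields(
    fields: Record<string, ScoredField>,
    threshold: number
): string[] {
    return Object.entries(fields)
        .filter(([, field]) =>
            field.confidence !== undefined
            && field.confidence < threshold)
        .map(([name]) => name);
}

// Values copied from the sample output above. The output
// does not print a confidence for total_amount, so none
// is set here.
const sampleFields: Record<string, ScoredField> = {
    company_name: {
        value: "CONTOSO LTD.",
        confidence: 0.818,
    },
    total_amount: { value: 610 },
};

// Flags company_name because 0.818 < 0.9.
console.log(lowConfidenceFields(sampleFields, 0.9));
```

A suitable threshold depends on your data and on how costly an extraction error is, so treat 0.9 as a placeholder.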

Tip

For more examples of running analyzers, see the TypeScript SDK samples.

After you create the analyzer, use it to analyze images and extract the custom fields. Delete the analyzer when you no longer need it.

const imageUrl =
    "https://raw.githubusercontent.com/"
    + "Azure-Samples/"
    + "azure-ai-content-understanding-assets/"
    + "main/image/pieChart.jpg";

const analyzePoller = client.analyze(
    analyzerId, [{ url: imageUrl }]
);
const analyzeResult =
    await analyzePoller.pollUntilDone();

if (analyzeResult.contents
    && analyzeResult.contents.length > 0) {
    const content = analyzeResult.contents[0];
    if (content.fields) {
        const title =
            content.fields["Title"];
        if (title) {
            console.log(
                `Title: ${title.value}`
            );
        }

        const chartType =
            content.fields["ChartType"];
        if (chartType) {
            console.log(
                `Chart Type: `
                + `${chartType.value}`
            );
        }
    }
}

// --- Clean up ---
console.log(
    `\nCleaning up: deleting analyzer`
    + ` '${analyzerId}'...`
);
await client.deleteAnalyzer(analyzerId);
console.log(
    `Analyzer '${analyzerId}' deleted`
    + ` successfully.`
);

The output looks similar to the following example:

Title: Distribution of Weekly Working Hours
Chart Type: pie

Cleaning up: deleting analyzer 'my_image_analyzer_ID'...
Analyzer 'my_image_analyzer_ID' deleted successfully.
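
The samples repeat the same lookup, null check, and `console.log` for every field. As an optional refactoring sketch, the hypothetical `formatFields` helper below takes a map from field name to display label and returns one line per field that is present. The `FieldLike` type is a simplified stand-in assumed for illustration, not the SDK's own field type.

```typescript
// Hypothetical, simplified stand-in for an extracted field.
interface FieldLike {
    value?: unknown;
}

// Build "Label: value" lines for each requested field
// that exists in the result; missing fields are skipped.
function formatFields(
    fields: Record<string, FieldLike | undefined>,
    labels: Record<string, string>
): string[] {
    const lines: string[] = [];
    for (const [name, label] of Object.entries(labels)) {
        const field = fields[name];
        if (field) {
            lines.push(`${label}: ${String(field.value)}`);
        }
    }
    return lines;
}

// Values copied from the sample output above.
const imageFields: Record<string, FieldLike> = {
    Title: {
        value: "Distribution of Weekly Working Hours",
    },
    ChartType: { value: "pie" },
};

const imageLines = formatFields(imageFields, {
    Title: "Title",
    ChartType: "Chart Type",
    Legend: "Legend", // not in the result, so skipped
});
for (const line of imageLines) {
    console.log(line);
}
```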

Tip

For more examples of running analyzers, see the TypeScript SDK samples.

After you create the analyzer, use it to analyze audio files and extract the custom fields. Delete the analyzer when you no longer need it.

const audioUrl =
    "https://raw.githubusercontent.com/"
    + "Azure-Samples/"
    + "azure-ai-content-understanding-assets/"
    + "main/audio/callCenterRecording.mp3";

const analyzePoller = client.analyze(
    analyzerId, [{ url: audioUrl }]
);
const analyzeResult =
    await analyzePoller.pollUntilDone();

if (analyzeResult.contents
    && analyzeResult.contents.length > 0) {
    const content = analyzeResult.contents[0];
    if (content.fields) {
        const summary =
            content.fields["Summary"];
        if (summary) {
            console.log(
                `Summary: ${summary.value}`
            );
        }

        const sentiment =
            content.fields["Sentiment"];
        if (sentiment) {
            console.log(
                `Sentiment: `
                + `${sentiment.value}`
            );
        }
    }
}

// --- Clean up ---
console.log(
    `\nCleaning up: deleting analyzer`
    + ` '${analyzerId}'...`
);
await client.deleteAnalyzer(analyzerId);
console.log(
    `Analyzer '${analyzerId}' deleted`
    + ` successfully.`
);

The output looks similar to the following example:

Summary: Maria Smith contacted Contoso to inquire about her current point balance. John Doe, the representative, verified her identity by requesting her date of birth and then provided her with her point balance of 599 points. Maria confirmed she did not need further assistance, and the call ended amicably.
Sentiment: Positive

Cleaning up: deleting analyzer 'my_audio_analyzer_ID'...
Analyzer 'my_audio_analyzer_ID' deleted successfully.

Tip

For more examples of running analyzers, see the TypeScript SDK samples.

After you create the analyzer, use it to analyze videos and extract the custom fields. Delete the analyzer when you no longer need it.

const videoUrl =
    "https://raw.githubusercontent.com/"
    + "Azure-Samples/"
    + "azure-ai-content-understanding-assets/"
    + "main/videos/sdk_samples/"
    + "FlightSimulator.mp4";

const analyzePoller = client.analyze(
    analyzerId, [{ url: videoUrl }]
);
const analyzeResult =
    await analyzePoller.pollUntilDone();

if (analyzeResult.contents
    && analyzeResult.contents.length > 0) {
    const content = analyzeResult.contents[0];
    console.log(
        `Content kind: ${content.kind}`
    );
    if (content.fields) {
        const segments =
            content.fields["Segments"];
        if (segments && segments.value) {
            const segArray =
                segments.value as any[];
            console.log(
                `Segments`
                + ` (${segArray.length}):`
            );
            for (const segment
                of segArray) {
                const segId =
                    segment.value
                        ?.SegmentId?.value
                    ?? "N/A";
                const desc =
                    segment.value
                        ?.Description?.value
                    ?? "N/A";
                const sent =
                    segment.value
                        ?.Sentiment?.value
                    ?? "N/A";
                console.log(
                    `  Segment: ${segId}`
                );
                console.log(
                    `    Description:`
                    + ` ${desc}`
                );
                console.log(
                    `    Sentiment:`
                    + ` ${sent}`
                );
            }
        }
    }
}

// --- Clean up ---
console.log(
    `\nCleaning up: deleting analyzer`
    + ` '${analyzerId}'...`
);
await client.deleteAnalyzer(analyzerId);
console.log(
    `Analyzer '${analyzerId}' deleted`
    + ` successfully.`
);

The output looks similar to the following example:

Content kind: audioVisual
Segments (16):
  Segment: 00:00:00.000-00:00:01.467
    Description: The video opens with a scenic aerial view of an island surrounded by blue water, featuring a small airplane flying over it. The screen displays the logos for 'Flight Simulator' and 'Microsoft Azure AI', indicating a collaboration or integration between the two.
    Sentiment: Positive
  Segment: 00:00:01.467-00:00:03.233
    Description: A man is shown sitting in a modern office environment, likely preparing to speak or introduce the topic. The background includes plants and geometric wall lights, giving a professional and contemporary feel.
    Sentiment: Neutral
  Segment: 00:00:03.233-00:00:07.367
    Description: The video transitions to a close-up of a digital audio waveform, visually representing sound data. This segment aligns with the audio discussing the importance of good data for neural TTS (Text-to-Speech) and the creation of a universal TTS model using extensive audio data.
    Sentiment: Positive
  Segment: 00:00:07.367-00:00:08.200
    Description: Another man appears in a similar office setting, possibly continuing the explanation or providing additional commentary.
    Sentiment: Neutral
  Segment: 00:00:08.200-00:00:11.367
    Description: The scene shifts to an outdoor view of a large facility surrounded by green fields and blue skies, likely representing a data center or infrastructure supporting the TTS technology.
    Sentiment: Positive
  Segment: 00:00:11.367-00:00:13.567
    Description: Inside a data center, rows of servers are shown, emphasizing the technological backbone and scale of the operation required for processing large amounts of audio data.
    Sentiment: Positive
  Segment: 00:00:13.567-00:00:16.100
    Description: The first man returns, continuing his explanation in the office setting. The audio mentions the accumulation of data to capture audio nuances and generate natural voices.
    Sentiment: Positive
  Segment: 00:00:16.100-00:00:19.433
    Description: A biplane is seen flying over a coastal landscape, showcasing the immersive visuals of Flight Simulator. This segment highlights the realism and beauty of the simulation.
    Sentiment: Positive
  Segment: 00:00:19.433-00:00:23.967
    Description: A plane flies past a castle set against a mountainous backdrop, further demonstrating the detailed environments in Flight Simulator.
    Sentiment: Positive
  Segment: 00:00:23.967-00:00:30.033
    Description: A bald man is interviewed in a modern office space, likely discussing the benefits of cognitive services offerings, such as higher fidelity and more human-like voices.
    Sentiment: Positive
  Segment: 00:00:30.033-00:00:33.200
    Description: The interview continues with the bald man, focusing on his commentary about the product's features and advantages.
    Sentiment: Positive
  Segment: 00:00:33.200-00:00:35.267
    Description: The video shifts to an overhead view of an airplane on the runway, preparing for movement, possibly referencing the realism of in-game operations.
    Sentiment: Neutral
  Segment: 00:00:35.267-00:00:37.700
    Description: A ground crew member directs an Airbus aircraft, highlighting the detailed simulation of airport operations in Flight Simulator.
    Sentiment: Positive
  Segment: 00:00:37.700-00:00:39.200
    Description: Two ground crew members walk near an aircraft on the tarmac, reinforcing the realistic airport environment and operations.
    Sentiment: Neutral
  Segment: 00:00:39.200-00:00:42.033
    Description: A close-up of an Airbus aircraft at the gate, with sunlight and clouds in the background, further showcasing the visual fidelity of Flight Simulator.
    Sentiment: Positive
  Segment: 00:00:42.033-00:00:43.866
    Description: The video concludes with the Microsoft logo and branding, signaling the end of the product demo and reinforcing the partnership.
    Sentiment: Positive

Cleaning up: deleting analyzer 'my_video_analyzer_ID'...
Analyzer 'my_video_analyzer_ID' deleted successfully.
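
In the output above, each segment ID is a time range such as `00:00:03.233-00:00:07.367`. The following sketch shows one way to turn such a range into a duration in milliseconds, for example to find the longest segment. It assumes the `HH:MM:SS.mmm-HH:MM:SS.mmm` format seen in this sample output, which may not hold for every analyzer; `timestampToMs` and `segmentDurationMs` are hypothetical helpers, not SDK functions.

```typescript
// Convert "HH:MM:SS.mmm" to milliseconds. Integer
// arithmetic avoids floating-point rounding issues.
function timestampToMs(timestamp: string): number {
    const match =
        /^(\d{2}):(\d{2}):(\d{2})\.(\d{3})$/
            .exec(timestamp);
    if (!match) {
        throw new Error(
            `Unexpected timestamp: ${timestamp}`
        );
    }
    const hours = Number(match[1]);
    const minutes = Number(match[2]);
    const seconds = Number(match[3]);
    const millis = Number(match[4]);
    return ((hours * 60 + minutes) * 60 + seconds)
        * 1000 + millis;
}

// Compute the length of a "start-end" segment ID
// in milliseconds.
function segmentDurationMs(segmentId: string): number {
    const [start, end] = segmentId.split("-");
    if (start === undefined || end === undefined) {
        throw new Error(
            `Unexpected segment ID: ${segmentId}`
        );
    }
    return timestampToMs(end) - timestampToMs(start);
}

// Segment ID copied from the sample output above.
// 7367 ms - 3233 ms = 4134 ms.
console.log(
    segmentDurationMs("00:00:03.233-00:00:07.367")
);
```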

Tip

For more examples of running analyzers, see the TypeScript SDK samples.

Review the code sample: visual document search.
Review the code sample: analyzer templates.
Explore more Python SDK samples.
Explore more .NET SDK samples.
Explore more Java SDK samples.
Explore more JavaScript SDK samples.
Explore more TypeScript SDK samples.
Try processing your document content by using Content Understanding in Foundry.

Last updated on 2026-03-31