Content Analyzers - Create Or Replace
Create a new analyzer asynchronously.
PUT {endpoint}/contentunderstanding/analyzers/{analyzerId}?api-version=2025-05-01-preview
URI Parameters
Name | In | Required | Type | Description |
---|---|---|---|---|
analyzerId | path | True | string pattern: ^[a-zA-Z0-9._-]{1,64}$ | The unique identifier of the analyzer. |
endpoint | path | True | string (uri) | Content Understanding service endpoint. |
api-version | query | True | string minLength: 1 | The API version to use for this operation. |
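The URI parameters above can be checked on the client before a request is sent. The following is a minimal sketch, not an official client: the helper name `build_analyzer_url` and the constants are hypothetical, while the path, pattern, and API version are copied from this page.

```python
import re
from urllib.parse import quote

# Pattern and API version as documented on this page.
ANALYZER_ID_PATTERN = re.compile(r"^[a-zA-Z0-9._-]{1,64}$")
API_VERSION = "2025-05-01-preview"


def build_analyzer_url(endpoint: str, analyzer_id: str, api_version: str = API_VERSION) -> str:
    """Build the PUT URL for an analyzer (hypothetical helper, not part of any SDK)."""
    # fullmatch rejects a trailing newline that a bare "$" would tolerate.
    if not ANALYZER_ID_PATTERN.fullmatch(analyzer_id):
        raise ValueError(f"analyzerId {analyzer_id!r} does not match {ANALYZER_ID_PATTERN.pattern}")
    if not api_version:
        raise ValueError("api-version must have at least one character")
    base = endpoint.rstrip("/")
    return (
        f"{base}/contentunderstanding/analyzers/{analyzer_id}"
        f"?api-version={quote(api_version, safe='')}"
    )
```

Validating the identifier locally only avoids an obviously malformed request; the service remains the authority on which identifiers it accepts.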
Request Header
Name | Required | Type | Description |
---|---|---|---|
x-ms-client-request-id | | string (uuid) | An opaque, globally-unique, client-generated string identifier for the request. |
Request Body
Name | Type | Description |
---|---|---|
baseAnalyzerId | string pattern: ^[a-zA-Z0-9._-]{1,64}$ | The analyzer to incrementally train from. |
config | ContentAnalyzerConfig | Analyzer configuration settings. |
description | string | A description of the analyzer. |
fieldSchema | FieldSchema | The schema of fields to be extracted. |
knowledgeSources | KnowledgeSource[]: ReferenceKnowledgeSource[] | Additional knowledge sources used to enhance the analyzer. |
mode | AnalysisMode | The analysis mode: standard, pro. Default is standard. |
processingLocation | ProcessingLocation | The location where the data may be processed. |
tags | object | Tags associated with the analyzer. |
trainingData | DataSource: BlobDataSource | The data source containing training data for the analyzer. |
Responses
Name | Type | Description |
---|---|---|
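200 OK | ContentAnalyzer | The request has succeeded. Headers: Operation-Location: string |
201 Created | ContentAnalyzer | The request has succeeded and a new resource has been created as a result. Headers: Operation-Location: string |
Other Status Codes | Azure.Core.Foundations.ErrorResponse | An unexpected error response. Headers: x-ms-error-code: string |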
Security
Ocp-Apim-Subscription-Key
Type: apiKey
In: header
OAuth2Auth
Type: oauth2
Flow: accessCode
Authorization URL: https://login.microsoftonline.com/common/oauth2/authorize
Token URL: https://login.microsoftonline.com/common/oauth2/token
Scopes
Name | Description |
---|---|
https://cognitiveservices.azure.com/.default |
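The two security schemes above translate into different request headers. The sketch below shows one way to assemble them, assuming the OAuth2 access token has already been obtained elsewhere and is sent as a standard `Authorization: Bearer` header. The function name `build_headers` is hypothetical, and the `Content-Type` value is an assumption based on the JSON request body.

```python
import uuid
from typing import Optional


def build_headers(
    api_key: Optional[str] = None,
    bearer_token: Optional[str] = None,
    client_request_id: Optional[str] = None,
) -> dict:
    """Return request headers for exactly one of the two documented security schemes."""
    if (api_key is None) == (bearer_token is None):
        raise ValueError("Provide exactly one of api_key or bearer_token.")
    headers = {
        # Assumption: the JSON body is sent as application/json.
        "Content-Type": "application/json",
        # Optional header from the Request Header table; a fresh UUID when not supplied.
        "x-ms-client-request-id": client_request_id or str(uuid.uuid4()),
    }
    if api_key is not None:
        headers["Ocp-Apim-Subscription-Key"] = api_key
    else:
        # Assumption: token acquired for the https://cognitiveservices.azure.com/.default scope.
        headers["Authorization"] = f"Bearer {bearer_token}"
    return headers
```

Requiring exactly one credential keeps the two schemes from being mixed in a single request by accident.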
Examples
Create or Replace Analyzer
Sample request
PUT {endpoint}/contentunderstanding/analyzers/myAnalyzer?api-version=2025-05-01-preview
{
  "description": "My analyzer",
  "tags": {
    "createdBy": "John"
  },
  "baseAnalyzerId": "prebuilt-documentAnalyzer",
  "config": {
    "enableFormula": false,
    "returnDetails": true
  },
  "fieldSchema": {
    "name": "MyForm",
    "description": "My form",
    "fields": {
      "Company": {
        "type": "string",
        "description": "Name of company."
      }
    },
    "definitions": {}
  },
  "trainingData": {
    "kind": "blob",
    "containerUrl": "https://myStorageAccount.blob.core.windows.net/myContainer?mySasToken",
    "prefix": "trainingData",
    "fileListPath": "trainingData/fileList.jsonl"
  }
}
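A request like the sample above could be assembled with the Python standard library as sketched here. This is an illustration under stated assumptions rather than an official client: `build_create_request` is a hypothetical helper, the key value is a placeholder, and the `Content-Type` header is assumed. The request is only constructed, not sent.

```python
import json
import urllib.request


def build_create_request(url: str, headers: dict, analyzer: dict) -> urllib.request.Request:
    """Return an unsent PUT request whose body is the analyzer definition as JSON."""
    body = json.dumps(analyzer).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")


# Placeholder values for illustration only.
url = (
    "https://myendpoint.cognitiveservices.azure.com/contentunderstanding/"
    "analyzers/myAnalyzer?api-version=2025-05-01-preview"
)
headers = {"Content-Type": "application/json", "Ocp-Apim-Subscription-Key": "<your-key>"}
analyzer = {
    "description": "My analyzer",
    "baseAnalyzerId": "prebuilt-documentAnalyzer",
    "fieldSchema": {
        "name": "MyForm",
        "fields": {"Company": {"type": "string", "description": "Name of company."}},
    },
}
request = build_create_request(url, headers, analyzer)
# To send it, pass `request` to urllib.request.urlopen and read the
# Operation-Location response header shown in the sample response.
```

Keeping construction separate from sending makes the request easy to inspect or log before any network call is made.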
Sample response
Operation-Location: https://myendpoint.cognitiveservices.azure.com/contentunderstanding/analyzers/myAnalyzer/operations/3b31320d-8bab-4f88-b19c-2322a7f11034?api-version=2025-05-01-preview
{
  "analyzerId": "myAnalyzer",
  "description": "My analyzer",
  "tags": {
    "createdBy": "John"
  },
  "status": "creating",
  "createdAt": "2025-05-01T18:46:36.051Z",
  "lastModifiedAt": "2025-05-01T18:46:36.051Z",
  "baseAnalyzerId": "prebuilt-documentAnalyzer",
  "config": {
    "locales": null,
    "enableFace": false,
    "enableOcr": true,
    "enableLayout": true,
    "enableFormula": false,
    "returnDetails": true
  },
  "fieldSchema": {
    "name": "MyForm",
    "description": "My form",
    "fields": {
      "Company": {
        "type": "string",
        "description": "Name of company."
      }
    },
    "definitions": {}
  },
  "trainingData": {
    "kind": "blob",
    "containerUrl": "https://myStorageAccount.blob.core.windows.net/myContainer",
    "prefix": "trainingData",
    "fileListPath": "trainingData/fileList.jsonl"
  }
}
Definitions
Name | Description |
---|---|
AnalysisMode | The analysis mode: standard, pro. Default is standard. |
Azure.Core.Foundations.Error | The error object. |
Azure.Core.Foundations.ErrorResponse | A response containing error details. |
Azure.Core.Foundations.InnerError | An object containing more specific information about the error. As per Microsoft One API guidelines - https://github.com/microsoft/api-guidelines/blob/vNext/azure/Guidelines.md#handling-errors. |
BlobDataSource | Blob storage data source. |
ContentAnalyzer | Analyzer that extracts content and fields from multimodal documents. |
ContentAnalyzerConfig | Configuration settings for an analyzer. |
DataSourceKind | Data source kind. |
FieldDefinition | Definition of the field using a JSON Schema like syntax. |
FieldSchema | Schema of fields to be extracted from documents. |
FieldType | Semantic data type of the field value. |
GenerationMethod | Generation method. |
KnowledgeSourceKind | Knowledge source kind. |
ProcessingLocation | The location where the data may be processed. |
ReferenceKnowledgeSource | File knowledge source. |
ResourceStatus | Status of a resource. |
SegmentationMode | Segmentation mode used to split audio/visual content. |
TableFormat | Representation format of tables in analyze result markdown. |
AnalysisMode
The analysis mode: standard, pro. Default is standard.
Value | Description |
---|---|
standard | Standard analysis mode. |
pro | Pro analysis mode. |
Azure.Core.Foundations.Error
The error object.
Name | Type | Description |
---|---|---|
code | string | One of a server-defined set of error codes. |
details | Azure.Core.Foundations.Error[] | An array of details about specific errors that led to this reported error. |
innererror | Azure.Core.Foundations.InnerError | An object containing more specific information than the current object about the error. |
message | string | A human-readable representation of the error. |
target | string | The target of the error. |
Azure.Core.Foundations.ErrorResponse
A response containing error details.
Name | Type | Description |
---|---|---|
error | Azure.Core.Foundations.Error | The error object. |
Azure.Core.Foundations.InnerError
An object containing more specific information about the error. As per Microsoft One API guidelines - https://github.com/microsoft/api-guidelines/blob/vNext/azure/Guidelines.md#handling-errors.
Name | Type | Description |
---|---|---|
code | string | One of a server-defined set of error codes. |
innererror | Azure.Core.Foundations.InnerError | Inner error. |
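Because `innererror` can nest, the most specific error code sits at the deepest level of an error response. The sketch below walks that chain. The function name is hypothetical and the error codes used in the usage example are invented placeholders, not codes documented for this service.

```python
from typing import Optional


def most_specific_error_code(error_response: dict) -> Optional[str]:
    """Return the deepest non-empty `code` found along the innererror chain."""
    error = error_response.get("error") or {}
    code = error.get("code")
    inner = error.get("innererror")
    while isinstance(inner, dict):
        # Keep the outer code when an inner level omits its own.
        code = inner.get("code") or code
        inner = inner.get("innererror")
    return code


# Invented payload for illustration; real codes are defined by the server.
example = {
    "error": {
        "code": "OuterCode",
        "message": "Something went wrong.",
        "innererror": {"code": "MiddleCode", "innererror": {"code": "InnerCode"}},
    }
}
deepest = most_specific_error_code(example)
```

Falling back to the outer code means callers always receive the best available identifier, even when the response carries no `innererror` at all.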
BlobDataSource
Blob storage data source.
Name | Type | Description |
---|---|---|
containerUrl | string (uri) | The URL of the blob container. |
fileListPath | string | An optional path to a file listing specific blobs to include. |
kind | string: blob | The kind of data source. |
prefix | string | An optional prefix to filter blobs within the container. |
ContentAnalyzer
Analyzer that extracts content and fields from multimodal documents.
Name | Type | Default value | Description |
---|---|---|---|
analyzerId | string pattern: ^[a-zA-Z0-9._-]{1,64}$ | | The unique identifier of the analyzer. |
baseAnalyzerId | string pattern: ^[a-zA-Z0-9._-]{1,64}$ | | The analyzer to incrementally train from. |
config | ContentAnalyzerConfig | | Analyzer configuration settings. |
createdAt | string (date-time) | | The date and time when the analyzer was created. |
description | string | | A description of the analyzer. |
fieldSchema | FieldSchema | | The schema of fields to be extracted. |
knowledgeSources | KnowledgeSource[]: ReferenceKnowledgeSource[] | | Additional knowledge sources used to enhance the analyzer. |
lastModifiedAt | string (date-time) | | The date and time when the analyzer was last modified. |
mode | AnalysisMode | standard | The analysis mode: standard, pro. Default is standard. |
processingLocation | ProcessingLocation | geography | The location where the data may be processed. |
status | ResourceStatus | | The status of the analyzer. |
tags | object | | Tags associated with the analyzer. |
trainingData | DataSource: BlobDataSource | | The data source containing training data for the analyzer. |
warnings | Azure.Core.Foundations.Error[] | | Warnings encountered while creating the analyzer. |
ContentAnalyzerConfig
Configuration settings for an analyzer.
Name | Type | Default value | Description |
---|---|---|---|
disableContentFiltering | boolean | | Disable content filtering that detects and prevents the output of harmful content. |
disableFaceBlurring | boolean | | Disable the default blurring of faces for privacy while processing the content. |
enableFace | boolean | | Enable face detection. |
enableFormula | boolean | | Enable mathematical formula detection. |
enableLayout | boolean | | Enable layout analysis. |
enableOcr | boolean | | Enable optical character recognition (OCR). |
estimateFieldSourceAndConfidence | boolean | | Return grounding source and confidence for extracted fields. |
locales | string[] | | List of locale hints for speech transcription. |
personDirectoryId | string | | Specify the person directory used for identifying detected faces. |
returnDetails | boolean | | Return all content details. |
segmentationDefinition | string | | Segmentation definition for use with custom segmentation mode. |
segmentationMode | SegmentationMode | noSegmentation | Segmentation mode used to split audio/visual content. |
tableFormat | TableFormat | html | Representation format of tables in analyze result markdown. |
DataSourceKind
Data source kind.
Value | Description |
---|---|
blob | A blob storage data source. |
FieldDefinition
Definition of the field using a JSON Schema like syntax.
Name | Type | Default value | Description |
---|---|---|---|
$ref | string | | Reference to another field definition. |
description | string | | Field description. |
enum | string[] | | Enumeration of possible field values. |
enumDescriptions | object | | Descriptions for each enumeration value. |
examples | string[] | | Examples of field values. |
items | FieldDefinition | | Field type schema of each array element, if type is array. |
method | GenerationMethod | generate | Generation method. |
properties | <string, FieldDefinition> | | Named sub-fields, if type is object. |
type | FieldType | | Semantic data type of the field value. |
FieldSchema
Schema of fields to be extracted from documents.
Name | Type | Description |
---|---|---|
definitions | <string, FieldDefinition> | Additional definitions referenced by the fields in the schema. |
description | string | A description of the field schema. |
fields | <string, FieldDefinition> | The fields defined in the schema. |
name | string | The name of the field schema. |
FieldType
Semantic data type of the field value.
Value | Description |
---|---|
string | Plain text. |
date | Date, normalized to ISO 8601 (YYYY-MM-DD) format. |
time | Time, normalized to ISO 8601 (hh:mm:ss) format. |
number | Number as double precision floating point. |
integer | Integer as 64-bit signed integer. |
boolean | Boolean value. |
array | List of subfields of the same type. |
object | Named list of subfields. |
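The field types above, together with the `items` and `properties` members of FieldDefinition, suggest a simple local check of a `fields` map before it is submitted. The following is a rough sketch: the function name is hypothetical, and treating a missing `items` or `properties` as a problem is an assumption rather than a documented service rule.

```python
# The eight FieldType values listed on this page.
FIELD_TYPES = {"string", "date", "time", "number", "integer", "boolean", "array", "object"}


def find_field_problems(fields: dict, prefix: str = "") -> list:
    """Return human-readable problems found in a `fields` or `properties` map."""
    problems = []
    for name, definition in fields.items():
        path = f"{prefix}{name}"
        if "$ref" in definition:
            # References point at entries under `definitions`; not resolved in this sketch.
            continue
        field_type = definition.get("type")
        if field_type not in FIELD_TYPES:
            problems.append(f"{path}: unknown type {field_type!r}")
        elif field_type == "array":
            items = definition.get("items")
            if not isinstance(items, dict):
                problems.append(f"{path}: array field has no items definition")
            else:
                problems.extend(find_field_problems({"[]": items}, prefix=path))
        elif field_type == "object":
            properties = definition.get("properties")
            if not isinstance(properties, dict):
                problems.append(f"{path}: object field has no properties")
            else:
                problems.extend(find_field_problems(properties, prefix=f"{path}."))
    return problems
```

For example, a schema whose nested `Qty` sub-field declares the type `int` is reported at the path `Items[].Qty`, because only `integer` appears in the FieldType list.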
GenerationMethod
Generation method.
Value | Description |
---|---|
generate | Values are generated freely based on the content. |
extract | Values are extracted as they appear in the content. |
classify | Values are classified against a predefined set of categories. |
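As an illustration of the `classify` method, a field definition could pair it with the `enum` and `enumDescriptions` members of FieldDefinition. The helper below is hypothetical, and the shape of `enumDescriptions` (an object mapping each enumeration value to its description) is an assumption inferred from its documented type and description.

```python
def classify_field(description: str, categories: dict) -> dict:
    """Build a string field classified into the given categories (value -> description)."""
    return {
        "type": "string",
        "method": "classify",
        "description": description,
        "enum": list(categories),
        # Assumed shape: one description per enumeration value.
        "enumDescriptions": dict(categories),
    }


document_type = classify_field(
    "Kind of document.",
    {
        "invoice": "A bill requesting payment.",
        "receipt": "Proof of a completed payment.",
    },
)
```

The resulting dictionary can be placed under a name in the `fields` map of a field schema.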
KnowledgeSourceKind
Knowledge source kind.
Value | Description |
---|---|
reference | A reference knowledge source. |
ProcessingLocation
The location where the data may be processed.
Value | Description |
---|---|
geography | Data may be processed in the same geography as the resource. |
dataZone | Data may be processed in the same data zone as the resource. |
global | Data may be processed in any Azure data center globally. |
ReferenceKnowledgeSource
File knowledge source.
Name | Type | Description |
---|---|---|
containerUrl | string (uri) | The URL of the blob container. |
fileListPath | string | Path to a file listing specific blobs to include. |
kind | string: reference | The kind of knowledge source. |
prefix | string | An optional prefix to filter blobs within the container. |
ResourceStatus
Status of a resource.
Value | Description |
---|---|
creating | The resource is being created. |
ready | The resource is ready. |
deleting | The resource is being deleted. |
failed | The resource failed during creation. |
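Since creation is asynchronous, a client usually waits until the analyzer leaves the `creating` status. The loop below is a sketch under assumptions: `fetch_analyzer` is a hypothetical callable that returns the current ContentAnalyzer as a dictionary (how it retrieves the resource is outside this page), and the polling interval and attempt limit are arbitrary.

```python
import time


def wait_until_ready(fetch_analyzer, poll_seconds=2.0, max_polls=30, sleep=time.sleep):
    """Poll until the analyzer status is `ready`; raise on any other terminal status."""
    for _ in range(max_polls):
        analyzer = fetch_analyzer()
        status = analyzer.get("status")
        if status == "ready":
            return analyzer
        if status != "creating":
            # Covers `failed`, `deleting`, and anything unexpected.
            raise RuntimeError(f"Analyzer ended in status {status!r}")
        sleep(poll_seconds)
    raise TimeoutError(f"Analyzer still creating after {max_polls} polls")
```

Passing `sleep` as a parameter lets tests replace the delay with a no-op, so the loop can be exercised without waiting.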
SegmentationMode
Segmentation mode used to split audio/visual content.
Value | Description |
---|---|
noSegmentation | No segmentation. |
auto | Automatic segmentation. |
custom | Segment according to custom segmentation definition. |
TableFormat
Representation format of tables in analyze result markdown.
Value | Description |
---|---|
html | Represent tables using HTML table elements: <table>, <th>, <tr>, <td>. |