Content Analyzers - Create Or Replace

Create a new analyzer, or replace an existing one, asynchronously.

PUT {endpoint}/contentunderstanding/analyzers/{analyzerId}?api-version=2025-05-01-preview

URI Parameters

Name In Required Type Description
analyzerId
path True

string

pattern: ^[a-zA-Z0-9._-]{1,64}$

The unique identifier of the analyzer.

endpoint
path True

string (uri)

Content Understanding service endpoint.

api-version
query True

string

minLength: 1

The API version to use for this operation.
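
The URI parameters above can be assembled as in the following minimal Python sketch. The helper names `is_valid_analyzer_id` and `build_analyzer_url` are hypothetical and not part of any SDK; only the path template, the `analyzerId` pattern, and the `api-version` value come from this page.

```python
import re
from urllib.parse import quote, urlencode

# Pattern documented for analyzerId on this page.
ANALYZER_ID_PATTERN = re.compile(r"^[a-zA-Z0-9._-]{1,64}$")


def is_valid_analyzer_id(analyzer_id: str) -> bool:
    """Return True if the ID matches the documented pattern."""
    return ANALYZER_ID_PATTERN.fullmatch(analyzer_id) is not None


def build_analyzer_url(endpoint: str, analyzer_id: str,
                       api_version: str = "2025-05-01-preview") -> str:
    """Build the PUT URL for the Create Or Replace operation (hypothetical helper)."""
    if not is_valid_analyzer_id(analyzer_id):
        raise ValueError(f"analyzerId does not match the documented pattern: {analyzer_id!r}")
    if len(api_version) < 1:
        raise ValueError("api-version must have at least one character")
    base = endpoint.rstrip("/")
    query = urlencode({"api-version": api_version})
    return f"{base}/contentunderstanding/analyzers/{quote(analyzer_id, safe='')}?{query}"
```

Validating the ID on the client side is optional; it simply surfaces a malformed ID before the request is sent.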

Request Header

Name Required Type Description
x-ms-client-request-id

string (uuid)

An opaque, globally-unique, client-generated string identifier for the request.
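
As an illustration of the header above, a client could generate the request identifier with Python's standard `uuid` module. The function name `new_client_request_headers` is hypothetical; using a random (version 4) UUID is an assumption, since this page only requires a UUID-formatted string.

```python
import uuid


def new_client_request_headers() -> dict:
    """Return a header dict with a freshly generated client request ID."""
    # uuid4() yields a random UUID; any valid UUID string should fit the documented type.
    return {"x-ms-client-request-id": str(uuid.uuid4())}
```

Logging this value alongside the request makes it easier to correlate client-side and service-side traces.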

Request Body

Name Type Description
baseAnalyzerId

string

pattern: ^[a-zA-Z0-9._-]{1,64}$

The analyzer to incrementally train from.

config

ContentAnalyzerConfig

Analyzer configuration settings.

description

string

A description of the analyzer.

fieldSchema

FieldSchema

The schema of fields to be extracted.

knowledgeSources KnowledgeSource[]:

ReferenceKnowledgeSource[]

Additional knowledge sources used to enhance the analyzer.

mode

AnalysisMode

The analysis mode: standard, pro. Default is standard.

processingLocation

ProcessingLocation

The location where the data may be processed.

tags

object

Tags associated with the analyzer.

trainingData DataSource:

BlobDataSource

The data source containing training data for the analyzer.

Responses

Name Type Description
200 OK

ContentAnalyzer

The request has succeeded.

Headers

  • Operation-Location: string
  • x-ms-client-request-id: string
201 Created

ContentAnalyzer

The request has succeeded and a new resource has been created as a result.

Headers

  • Operation-Location: string
  • x-ms-client-request-id: string
Other Status Codes

Azure.Core.Foundations.ErrorResponse

An unexpected error response.

Headers

  • x-ms-error-code: string

Security

Ocp-Apim-Subscription-Key

Type: apiKey
In: header

OAuth2Auth

Type: oauth2
Flow: accessCode
Authorization URL: https://login.microsoftonline.com/common/oauth2/authorize
Token URL: https://login.microsoftonline.com/common/oauth2/token

Scopes

Name Description
https://cognitiveservices.azure.com/.default
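
The two security schemes above translate into request headers roughly as follows. This is a sketch under assumptions: `auth_headers` is a hypothetical helper, and sending the OAuth2 access token as an `Authorization: Bearer` header follows the usual OAuth2 bearer convention rather than anything stated on this page. Acquiring the token for the `https://cognitiveservices.azure.com/.default` scope is out of scope here.

```python
from typing import Optional


def auth_headers(api_key: Optional[str] = None,
                 bearer_token: Optional[str] = None) -> dict:
    """Return authentication headers for exactly one of the two schemes."""
    if (api_key is None) == (bearer_token is None):
        raise ValueError("Provide exactly one of api_key or bearer_token")
    if api_key is not None:
        # apiKey scheme: the key travels in the Ocp-Apim-Subscription-Key header.
        return {"Ocp-Apim-Subscription-Key": api_key}
    # OAuth2 scheme (assumed bearer-token usage).
    return {"Authorization": f"Bearer {bearer_token}"}
```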

Examples

Create or Replace Analyzer

Sample request

PUT {endpoint}/contentunderstanding/analyzers/myAnalyzer?api-version=2025-05-01-preview

{
  "description": "My analyzer",
  "tags": {
    "createdBy": "John"
  },
  "baseAnalyzerId": "prebuilt-documentAnalyzer",
  "config": {
    "enableFormula": false,
    "returnDetails": true
  },
  "fieldSchema": {
    "name": "MyForm",
    "description": "My form",
    "fields": {
      "Company": {
        "type": "string",
        "description": "Name of company."
      }
    },
    "definitions": {}
  },
  "trainingData": {
    "kind": "blob",
    "containerUrl": "https://myStorageAccount.blob.core.windows.net/myContainer?mySasToken",
    "prefix": "trainingData",
    "fileListPath": "trainingData/fileList.jsonl"
  }
}
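
For readers calling the operation from code, the sample request above could be prepared with Python's standard library as sketched below. `build_create_request` is a hypothetical helper, the `Content-Type: application/json` header is an assumption, and API-key authentication is chosen only for brevity. The function builds the request without sending it.

```python
import json
import urllib.request


def build_create_request(endpoint: str, analyzer_id: str, body: dict, api_key: str,
                         api_version: str = "2025-05-01-preview") -> urllib.request.Request:
    """Prepare (but do not send) the PUT request for Create Or Replace."""
    url = (f"{endpoint.rstrip('/')}/contentunderstanding/analyzers/"
           f"{analyzer_id}?api-version={api_version}")
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "application/json",    # assumed content type
            "Ocp-Apim-Subscription-Key": api_key,  # apiKey security scheme
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the `Operation-Location` value could then be read from the response headers.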

Sample response

Operation-Location: https://myendpoint.cognitiveservices.azure.com/contentunderstanding/analyzers/myAnalyzer/operations/3b31320d-8bab-4f88-b19c-2322a7f11034?api-version=2025-05-01-preview
{
  "analyzerId": "myAnalyzer",
  "description": "My analyzer",
  "tags": {
    "createdBy": "John"
  },
  "status": "creating",
  "createdAt": "2025-05-01T18:46:36.051Z",
  "lastModifiedAt": "2025-05-01T18:46:36.051Z",
  "baseAnalyzerId": "prebuilt-documentAnalyzer",
  "config": {
    "locales": null,
    "enableFace": false,
    "enableOcr": true,
    "enableLayout": true,
    "enableFormula": false,
    "returnDetails": true
  },
  "fieldSchema": {
    "name": "MyForm",
    "description": "My form",
    "fields": {
      "Company": {
        "type": "string",
        "description": "Name of company."
      }
    },
    "definitions": {}
  },
  "trainingData": {
    "kind": "blob",
    "containerUrl": "https://myStorageAccount.blob.core.windows.net/myContainer",
    "prefix": "trainingData",
    "fileListPath": "trainingData/fileList.jsonl"
  }
}

Definitions

Name Description
AnalysisMode

The analysis mode: standard, pro. Default is standard.

Azure.Core.Foundations.Error

The error object.

Azure.Core.Foundations.ErrorResponse

A response containing error details.

Azure.Core.Foundations.InnerError

An object containing more specific information about the error, as per the Microsoft One API guidelines: https://github.com/microsoft/api-guidelines/blob/vNext/azure/Guidelines.md#handling-errors.

BlobDataSource

Blob storage data source.

ContentAnalyzer

Analyzer that extracts content and fields from multimodal documents.

ContentAnalyzerConfig

Configuration settings for an analyzer.

DataSourceKind

Data source kind.

FieldDefinition

Definition of the field using a JSON Schema-like syntax.

FieldSchema

Schema of fields to be extracted from documents.

FieldType

Semantic data type of the field value.

GenerationMethod

Generation method.

KnowledgeSourceKind

Knowledge source kind.

ProcessingLocation

The location where the data may be processed.

ReferenceKnowledgeSource

File knowledge source.

ResourceStatus

Status of a resource.

SegmentationMode

Segmentation mode used to split audio/visual content.

TableFormat

Representation format of tables in analyze result markdown.

AnalysisMode

The analysis mode: standard, pro. Default is standard.

Value Description
standard

Standard analysis mode.

pro

Pro analysis mode.

Azure.Core.Foundations.Error

The error object.

Name Type Description
code

string

One of a server-defined set of error codes.

details

Azure.Core.Foundations.Error[]

An array of details about specific errors that led to this reported error.

innererror

Azure.Core.Foundations.InnerError

An object containing more specific information than the current object about the error.

message

string

A human-readable representation of the error.

target

string

The target of the error.

Azure.Core.Foundations.ErrorResponse

A response containing error details.

Name Type Description
error

Azure.Core.Foundations.Error

The error object.

Azure.Core.Foundations.InnerError

An object containing more specific information about the error, as per the Microsoft One API guidelines: https://github.com/microsoft/api-guidelines/blob/vNext/azure/Guidelines.md#handling-errors.

Name Type Description
code

string

One of a server-defined set of error codes.

innererror

Azure.Core.Foundations.InnerError

Inner error.
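
Because `innererror` nests recursively, a client may want the most specific error code in the chain. The sketch below walks an `ErrorResponse` payload that has already been parsed into a Python dict. `most_specific_error_code` is a hypothetical helper, and the error codes in the usage example are invented for illustration.

```python
from typing import Optional


def most_specific_error_code(error_response: dict) -> Optional[str]:
    """Return the code of the deepest innererror, falling back to outer codes."""
    node = error_response.get("error") or {}
    code = node.get("code")
    # Follow the innererror chain as long as it contains nested objects.
    while isinstance(node.get("innererror"), dict):
        node = node["innererror"]
        code = node.get("code", code)
    return code
```

Usage, with made-up codes:

```python
payload = {"error": {"code": "InvalidRequest", "message": "Bad input.",
                     "innererror": {"code": "ExampleInnerCode"}}}
print(most_specific_error_code(payload))  # ExampleInnerCode
```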

BlobDataSource

Blob storage data source.

Name Type Description
containerUrl

string (uri)

The URL of the blob container.

fileListPath

string

An optional path to a file listing specific blobs to include.

kind string:

blob

The kind of data source.

prefix

string

An optional prefix to filter blobs within the container.

ContentAnalyzer

Analyzer that extracts content and fields from multimodal documents.

Name Type Default value Description
analyzerId

string

pattern: ^[a-zA-Z0-9._-]{1,64}$

The unique identifier of the analyzer.

baseAnalyzerId

string

pattern: ^[a-zA-Z0-9._-]{1,64}$

The analyzer to incrementally train from.

config

ContentAnalyzerConfig

Analyzer configuration settings.

createdAt

string (date-time)

The date and time when the analyzer was created.

description

string

A description of the analyzer.

fieldSchema

FieldSchema

The schema of fields to be extracted.

knowledgeSources KnowledgeSource[]:

ReferenceKnowledgeSource[]

Additional knowledge sources used to enhance the analyzer.

lastModifiedAt

string (date-time)

The date and time when the analyzer was last modified.

mode

AnalysisMode

standard

The analysis mode: standard, pro. Default is standard.

processingLocation

ProcessingLocation

geography

The location where the data may be processed.

status

ResourceStatus

The status of the analyzer.

tags

object

Tags associated with the analyzer.

trainingData DataSource:

BlobDataSource

The data source containing training data for the analyzer.

warnings

Azure.Core.Foundations.Error[]

Warnings encountered while creating the analyzer.

ContentAnalyzerConfig

Configuration settings for an analyzer.

Name Type Default value Description
disableContentFiltering

boolean

Disable content filtering that detects and prevents the output of harmful content.

disableFaceBlurring

boolean

Disable the default blurring of faces for privacy while processing the content.

enableFace

boolean

Enable face detection.

enableFormula

boolean

Enable mathematical formula detection.

enableLayout

boolean

Enable layout analysis.

enableOcr

boolean

Enable optical character recognition (OCR).

estimateFieldSourceAndConfidence

boolean

Return grounding source and confidence for extracted fields.

locales

string[]

List of locale hints for speech transcription.

personDirectoryId

string

Specify the person directory used for identifying detected faces.

returnDetails

boolean

Return all content details.

segmentationDefinition

string

Segmentation definition for use with custom segmentation mode.

segmentationMode

SegmentationMode

noSegmentation

Segmentation mode used to split audio/visual content.

tableFormat

TableFormat

html

Representation format of tables in analyze result markdown.

DataSourceKind

Data source kind.

Value Description
blob

A blob storage data source.

FieldDefinition

Definition of the field using a JSON Schema-like syntax.

Name Type Default value Description
$ref

string

Reference to another field definition.

description

string

Field description.

enum

string[]

Enumeration of possible field values.

enumDescriptions

object

Descriptions for each enumeration value.

examples

string[]

Examples of field values.

items

FieldDefinition

Field type schema of each array element, if type is array.

method

GenerationMethod

generate

Generation method.

properties

<string,  FieldDefinition>

Named sub-fields, if type is object.

type

FieldType

Semantic data type of the field value.

FieldSchema

Schema of fields to be extracted from documents.

Name Type Description
definitions

<string,  FieldDefinition>

Additional definitions referenced by the fields in the schema.

description

string

A description of the field schema.

fields

<string,  FieldDefinition>

The fields defined in the schema.

name

string

The name of the field schema.

FieldType

Semantic data type of the field value.

Value Description
string

Plain text.

date

Date, normalized to ISO 8601 (YYYY-MM-DD) format.

time

Time, normalized to ISO 8601 (hh:mm:ss) format.

number

Number as double precision floating point.

integer

Integer as 64-bit signed integer.

boolean

Boolean value.

array

List of subfields of the same type.

object

Named list of subfields.

GenerationMethod

Generation method.

Value Description
generate

Values are generated freely based on the content.

extract

Values are extracted as they appear in the content.

classify

Values are classified against a predefined set of categories.
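
To show how FieldSchema, FieldDefinition, FieldType, and GenerationMethod fit together, here is a small Python sketch that builds a nested schema and checks it against the enumerated values listed above. The field names, the `check_field` helper, and the rule that a definition must carry either `type` or `$ref` are assumptions made for illustration; the service may validate differently.

```python
FIELD_TYPES = {"string", "date", "time", "number", "integer", "boolean", "array", "object"}
GENERATION_METHODS = {"generate", "extract", "classify"}


def check_field(defn: dict, path: str) -> None:
    """Raise ValueError if a field definition uses values not listed on this page."""
    if "$ref" in defn:
        return  # references are not resolved in this sketch
    if defn.get("type") not in FIELD_TYPES:
        raise ValueError(f"{path}: unknown type {defn.get('type')!r}")
    if "method" in defn and defn["method"] not in GENERATION_METHODS:
        raise ValueError(f"{path}: unknown method {defn['method']!r}")
    if defn["type"] == "array":
        if "items" not in defn:
            raise ValueError(f"{path}: array fields need 'items'")
        check_field(defn["items"], f"{path}[]")
    if defn["type"] == "object":
        for name, sub in defn.get("properties", {}).items():
            check_field(sub, f"{path}.{name}")


# Hypothetical schema: a plain field, a classified field, and an array of objects.
schema = {
    "name": "MyForm",
    "description": "My form",
    "fields": {
        "Company": {"type": "string", "method": "extract", "description": "Name of company."},
        "Kind": {"type": "string", "method": "classify", "enum": ["invoice", "receipt"]},
        "Items": {"type": "array", "items": {
            "type": "object",
            "properties": {"Amount": {"type": "number"}},
        }},
    },
    "definitions": {},
}

for field_name, field_defn in schema["fields"].items():
    check_field(field_defn, field_name)
```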

KnowledgeSourceKind

Knowledge source kind.

Value Description
reference

A reference knowledge source.

ProcessingLocation

The location where the data may be processed.

Value Description
geography

Data may be processed in the same geography as the resource.

dataZone

Data may be processed in the same data zone as the resource.

global

Data may be processed in any Azure data center globally.

ReferenceKnowledgeSource

File knowledge source.

Name Type Description
containerUrl

string (uri)

The URL of the blob container.

fileListPath

string

Path to a file listing specific blobs to include.

kind string:

reference

The kind of knowledge source.

prefix

string

An optional prefix to filter blobs within the container.

ResourceStatus

Status of a resource.

Value Description
creating

The resource is being created.

ready

The resource is ready.

deleting

The resource is being deleted.

failed

The resource failed during creation.
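
Since creation is asynchronous and the sample response starts in the `creating` state, a client typically polls until the status settles. The following Python sketch shows one way to do that using the ResourceStatus values above. `wait_until_ready` is a hypothetical helper; how `get_status` obtains the current status (for example, by re-reading the analyzer or the operation URL) is left to the caller, and the interval and timeout defaults are arbitrary.

```python
import time
from typing import Callable


def wait_until_ready(get_status: Callable[[], str],
                     interval_s: float = 2.0,
                     timeout_s: float = 300.0,
                     sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll get_status() until it returns 'ready'; raise on failure or timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status == "ready":
            return status
        if status in ("failed", "deleting"):
            raise RuntimeError(f"Analyzer ended in status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError("Analyzer was not ready before the timeout")
        sleep(interval_s)  # status is 'creating' (or unrecognized): wait and retry
```

The `sleep` parameter is injectable so the loop can be exercised in tests without real delays.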

SegmentationMode

Segmentation mode used to split audio/visual content.

Value Description
noSegmentation

No segmentation.

auto

Automatic segmentation.

custom

Segment according to custom segmentation definition.

TableFormat

Representation format of tables in analyze result markdown.

Value Description
html

Represent tables using HTML table elements: <table>, <th>, <tr>, <td>.