Indexers - Create


Service: Search Service

API Version: 2023-10-01-Preview

Creates a new indexer.

POST {endpoint}/indexers?api-version=2023-10-01-Preview

URI Parameters

Name	In	Required	Type	Description
endpoint	path	True	string	The endpoint URL of the search service.
api-version	query	True	string	The client API version.

Request Header

Name	Required	Type	Description
x-ms-client-request-id		string (uuid)	The tracking ID sent with the request to help with debugging.
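
The optional tracking header can be produced as in the following minimal Python sketch. The helper name `build_headers` and the `Content-Type` header are assumptions made for illustration; this page only defines `x-ms-client-request-id`.

```python
import uuid


def build_headers(request_id=None):
    """Return request headers carrying a UUID tracking ID (hypothetical helper)."""
    # Generate a fresh random UUID when the caller does not supply one.
    tracking_id = str(request_id or uuid.uuid4())
    # uuid.UUID raises ValueError if the value is not a well-formed UUID.
    uuid.UUID(tracking_id)
    return {
        "Content-Type": "application/json",  # assumption: JSON request body
        "x-ms-client-request-id": tracking_id,
    }
```

Logging the generated ID on the client side makes it possible to quote it later when investigating a failed request.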

Request Body

Name	Required	Type	Description
dataSourceName	True	string	The name of the datasource from which this indexer reads data.
name	True	string	The name of the indexer.
targetIndexName	True	string	The name of the index to which this indexer writes data.
@odata.etag		string	The ETag of the indexer.
cache		SearchIndexerCache	Adds caching to an enrichment pipeline to allow for incremental modification steps without having to rebuild the index every time.
description		string	The description of the indexer.
disabled		boolean	A value indicating whether the indexer is disabled. Default is false.
encryptionKey		SearchResourceEncryptionKey	A description of an encryption key that you create in Azure Key Vault. This key is used to provide an additional level of encryption-at-rest for your indexer definition (as well as indexer execution status) when you want full assurance that no one, not even Microsoft, can decrypt them. Once you have encrypted your indexer definition, it will always remain encrypted. The search service will ignore attempts to set this property to null. You can change this property as needed if you want to rotate your encryption key; your indexer definition (and indexer execution status) will be unaffected. Encryption with customer-managed keys is not available for free search services, and is only available for paid services created on or after January 1, 2019.
fieldMappings		FieldMapping[]	Defines mappings between fields in the data source and corresponding target fields in the index.
outputFieldMappings		FieldMapping[]	Output field mappings are applied after enrichment and immediately before indexing.
parameters		IndexingParameters	Parameters for indexer execution.
schedule		IndexingSchedule	The schedule for this indexer.
skillsetName		string	The name of the skillset executing with this indexer.

Responses

Name	Type	Description
201 Created	SearchIndexer
Other Status Codes	SearchError	Error response.

Examples

SearchServiceCreateIndexer

Sample request

HTTP

POST https://myservice.search.windows.net/indexers?api-version=2023-10-01-Preview

{
  "name": "myindexer",
  "description": "a cool indexer",
  "dataSourceName": "mydatasource",
  "targetIndexName": "orders",
  "schedule": {
    "interval": "PT1H",
    "startTime": "2015-01-01T00:00:00Z"
  },
  "parameters": {
    "maxFailedItems": 10,
    "maxFailedItemsPerBatch": 5
  },
  "encryptionKey": {
    "keyVaultKeyName": "myUserManagedEncryptionKey-createdinAzureKeyVault",
    "keyVaultKeyVersion": "myKeyVersion-32charAlphaNumericString",
    "keyVaultUri": "https://myKeyVault.vault.azure.net",
    "accessCredentials": {
      "applicationId": "00000000-0000-0000-0000-000000000000",
      "applicationSecret": "<applicationSecret>"
    }
  }
}

Sample response

Status code: 201

{
  "name": "myindexer",
  "description": "a cool indexer",
  "dataSourceName": "mydatasource",
  "targetIndexName": "orders",
  "schedule": {
    "interval": "PT1H",
    "startTime": "2015-01-01T00:00:00Z"
  },
  "parameters": {
    "maxFailedItems": 10,
    "maxFailedItemsPerBatch": 5
  },
  "fieldMappings": [],
  "disabled": false,
  "encryptionKey": {
    "keyVaultKeyName": "myUserManagedEncryptionKey-createdinAzureKeyVault",
    "keyVaultKeyVersion": "myKeyVersion-32charAlphaNumericString",
    "keyVaultUri": "https://myKeyVault.vault.azure.net",
    "accessCredentials": {
      "applicationId": "00000000-0000-0000-0000-000000000000",
      "applicationSecret": null
    }
  }
}
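
The sample request above could be assembled from Python roughly as follows. This is a sketch using only the standard library: the function name `build_create_indexer_request` is hypothetical, and authentication headers are left out because this page does not describe them. The request is built but not sent.

```python
import json
import urllib.request

API_VERSION = "2023-10-01-Preview"


def build_create_indexer_request(endpoint, indexer):
    """Build (but do not send) the POST request that creates an indexer."""
    # The three properties marked as required in the Request Body table.
    for required in ("name", "dataSourceName", "targetIndexName"):
        if not indexer.get(required):
            raise ValueError(f"missing required property: {required}")
    url = f"{endpoint.rstrip('/')}/indexers?api-version={API_VERSION}"
    return urllib.request.Request(
        url,
        data=json.dumps(indexer).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_create_indexer_request(
    "https://myservice.search.windows.net",
    {
        "name": "myindexer",
        "dataSourceName": "mydatasource",
        "targetIndexName": "orders",
        "schedule": {"interval": "PT1H", "startTime": "2015-01-01T00:00:00Z"},
    },
)
```

Passing `request` to `urllib.request.urlopen` would send it. According to the Responses table, a successful call returns 201 Created with a SearchIndexer body.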

Definitions

Name	Description
AzureActiveDirectoryApplicationCredentials	Credentials of a registered application created for your search service, used for authenticated access to the encryption keys stored in Azure Key Vault.
BlobIndexerDataToExtract	Specifies the data to extract from Azure blob storage and tells the indexer which data to extract from image content when "imageAction" is set to a value other than "none". This applies to embedded image content in a .PDF or other application, or image files such as .jpg and .png, in Azure blobs.
BlobIndexerImageAction	Determines how to process embedded images and image files in Azure blob storage. Setting the "imageAction" configuration to any value other than "none" requires that a skillset also be attached to that indexer.
BlobIndexerParsingMode	Represents the parsing mode for indexing from an Azure blob data source.
BlobIndexerPDFTextRotationAlgorithm	Determines algorithm for text extraction from PDF files in Azure blob storage.
FieldMapping	Defines a mapping between a field in a data source and a target field in an index.
FieldMappingFunction	Represents a function that transforms a value from a data source before indexing.
IndexerExecutionEnvironment	Specifies the environment in which the indexer should execute.
IndexingParameters	Represents parameters for indexer execution.
IndexingParametersConfiguration	A dictionary of indexer-specific configuration properties. Each name is the name of a specific property. Each value must be of a primitive type.
IndexingSchedule	Represents a schedule for indexer execution.
SearchError	Describes an error condition for the API.
SearchIndexer	Represents an indexer.
SearchIndexerCache
SearchIndexerDataNoneIdentity	Clears the identity property of a datasource.
SearchIndexerDataUserAssignedIdentity	Specifies the identity for a datasource to use.
SearchResourceEncryptionKey	A customer-managed encryption key in Azure Key Vault. Keys that you create and manage can be used to encrypt or decrypt data-at-rest, such as indexes and synonym maps.

AzureActiveDirectoryApplicationCredentials

Credentials of a registered application created for your search service, used for authenticated access to the encryption keys stored in Azure Key Vault.

Name	Type	Description
applicationId	string	An AAD Application ID that was granted the required access permissions to the Azure Key Vault that is to be used when encrypting your data at rest. The Application ID should not be confused with the Object ID for your AAD Application.
applicationSecret	string	The authentication key of the specified AAD application.

BlobIndexerDataToExtract

Specifies the data to extract from Azure blob storage and tells the indexer which data to extract from image content when "imageAction" is set to a value other than "none". This applies to embedded image content in a .PDF or other application, or image files such as .jpg and .png, in Azure blobs.

Name	Type	Description
allMetadata	string	Extracts metadata provided by the Azure blob storage subsystem and the content-type specific metadata (for example, metadata unique to just .png files is indexed).
contentAndMetadata	string	Extracts all metadata and textual content from each blob.
storageMetadata	string	Indexes just the standard blob properties and user-specified metadata.

BlobIndexerImageAction

Determines how to process embedded images and image files in Azure blob storage. Setting the "imageAction" configuration to any value other than "none" requires that a skillset also be attached to that indexer.

Name	Type	Description
generateNormalizedImagePerPage	string	Extracts text from images (for example, the word "STOP" from a traffic stop sign), and embeds it into the content field, but treats PDF files differently in that each page will be rendered as an image and normalized accordingly, instead of extracting embedded images. Non-PDF file types will be treated the same as if "generateNormalizedImages" was set.
generateNormalizedImages	string	Extracts text from images (for example, the word "STOP" from a traffic stop sign), and embeds it into the content field. This action requires that "dataToExtract" is set to "contentAndMetadata". A normalized image refers to additional processing resulting in uniform image output, sized and rotated to promote consistent rendering when you include images in visual search results. This information is generated for each image when you use this option.
none	string	Ignores embedded images or image files in the data set. This is the default.
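
The two constraints stated in this table (a skillset is required for any value other than "none", and "generateNormalizedImages" requires "dataToExtract" set to "contentAndMetadata") can be checked on the client before the request is sent. The following is a sketch only; `check_image_action` is a hypothetical helper, and the service remains the authority on what it accepts.

```python
def check_image_action(indexer):
    """Return a list of problems with the imageAction settings of an indexer definition."""
    config = (indexer.get("parameters") or {}).get("configuration") or {}
    # "none" is the documented default for imageAction.
    action = config.get("imageAction", "none")
    problems = []
    if action != "none" and not indexer.get("skillsetName"):
        problems.append("imageAction other than 'none' requires a skillset")
    # "contentAndMetadata" is the documented default for dataToExtract.
    data_to_extract = config.get("dataToExtract", "contentAndMetadata")
    if action == "generateNormalizedImages" and data_to_extract != "contentAndMetadata":
        problems.append("generateNormalizedImages requires dataToExtract = contentAndMetadata")
    return problems
```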

BlobIndexerParsingMode

Represents the parsing mode for indexing from an Azure blob data source.

Name	Type	Description
default	string	Set to default for normal file processing.
delimitedText	string	Set to delimitedText when blobs are plain CSV files.
json	string	Set to json to extract structured content from JSON files.
jsonArray	string	Set to jsonArray to extract individual elements of a JSON array as separate documents.
jsonLines	string	Set to jsonLines to extract individual JSON entities, separated by a new line, as separate documents.
text	string	Set to text to improve indexing performance on plain text files in blob storage.

BlobIndexerPDFTextRotationAlgorithm

Determines algorithm for text extraction from PDF files in Azure blob storage.

Name	Type	Description
detectAngles	string	May produce better and more readable text extraction from PDF files that have rotated text within them. Note that there may be a small performance impact when this parameter is used. This parameter only applies to PDF files, and only to PDFs with embedded text. If the rotated text appears within an embedded image in the PDF, this parameter does not apply.
none	string	Leverages normal text extraction. This is the default.

FieldMapping

Defines a mapping between a field in a data source and a target field in an index.

Name	Type	Description
mappingFunction	FieldMappingFunction	A function to apply to each source field value before indexing.
sourceFieldName	string	The name of the field in the data source.
targetFieldName	string	The name of the target field in the index. Same as the source field name by default.

FieldMappingFunction

Represents a function that transforms a value from a data source before indexing.

Name	Type	Description
name	string	The name of the field mapping function.
parameters	object	A dictionary of parameter name/value pairs to pass to the function. Each value must be of a primitive type.
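
A FieldMapping with an optional FieldMappingFunction could be assembled as in the sketch below. The helper `field_mapping` and the field names are hypothetical, and `base64Encode` is used only as an illustrative function name; this page does not list the supported functions.

```python
def field_mapping(source, target=None, function=None, **parameters):
    """Build a FieldMapping object (hypothetical helper)."""
    mapping = {"sourceFieldName": source}
    # targetFieldName defaults to the source field name, so omit it when equal.
    if target is not None and target != source:
        mapping["targetFieldName"] = target
    if function is not None:
        mapping["mappingFunction"] = {"name": function}
        if parameters:
            # Each parameter value must be of a primitive type.
            mapping["mappingFunction"]["parameters"] = parameters
    return mapping


field_mappings = [
    field_mapping("orderId", "id", function="base64Encode"),  # illustrative names
    field_mapping("customer"),
]
```

The resulting list can be assigned to either `fieldMappings` or `outputFieldMappings` in the request body.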

IndexerExecutionEnvironment

Specifies the environment in which the indexer should execute.

Name	Type	Description
private	string	Indicates that the indexer should run with the environment provisioned specifically for the search service. This should only be specified as the execution environment if the indexer needs to access resources securely over shared private link resources.
standard	string	Indicates that the search service can determine where the indexer should execute. This is the default environment when nothing is specified and is the recommended value.

IndexingParameters

Represents parameters for indexer execution.

Name	Type	Default value	Description
batchSize	integer		The number of items that are read from the data source and indexed as a single batch in order to improve performance. The default depends on the data source type.
configuration	IndexingParametersConfiguration		A dictionary of indexer-specific configuration properties. Each name is the name of a specific property. Each value must be of a primitive type.
maxFailedItems	integer	0	The maximum number of items that can fail indexing for indexer execution to still be considered successful. -1 means no limit. Default is 0.
maxFailedItemsPerBatch	integer	0	The maximum number of items in a single batch that can fail indexing for the batch to still be considered successful. -1 means no limit. Default is 0.

IndexingParametersConfiguration

A dictionary of indexer-specific configuration properties. Each name is the name of a specific property. Each value must be of a primitive type.

Name	Type	Default value	Description
allowSkillsetToReadFileData	boolean	False	If true, will create a path /document/file_data that is an object representing the original file data downloaded from your blob data source. This allows you to pass the original file data to a custom skill for processing within the enrichment pipeline, or to the Document Extraction skill.
dataToExtract	BlobIndexerDataToExtract	contentAndMetadata	Specifies the data to extract from Azure blob storage and tells the indexer which data to extract from image content when "imageAction" is set to a value other than "none". This applies to embedded image content in a .PDF or other application, or image files such as .jpg and .png, in Azure blobs.
delimitedTextDelimiter	string		For CSV blobs, specifies the end-of-line single-character delimiter for CSV files where each line starts a new document (for example, "|").
delimitedTextHeaders	string		For CSV blobs, specifies a comma-delimited list of column headers, useful for mapping source fields to destination fields in an index.
documentRoot	string		For JSON arrays, given a structured or semi-structured document, you can specify a path to the array using this property.
excludedFileNameExtensions	string		Comma-delimited list of filename extensions to ignore when processing from Azure blob storage. For example, you could exclude ".png, .mp4" to skip over those files during indexing.
executionEnvironment	IndexerExecutionEnvironment	standard	Specifies the environment in which the indexer should execute.
failOnUnprocessableDocument	boolean	False	For Azure blobs, set to false if you want to continue indexing if a document fails indexing.
failOnUnsupportedContentType	boolean	False	For Azure blobs, set to false if you want to continue indexing when an unsupported content type is encountered, and you don't know all the content types (file extensions) in advance.
firstLineContainsHeaders	boolean	True	For CSV blobs, indicates that the first (non-blank) line of each blob contains headers.
imageAction	BlobIndexerImageAction	none	Determines how to process embedded images and image files in Azure blob storage. Setting the "imageAction" configuration to any value other than "none" requires that a skillset also be attached to that indexer.
indexStorageMetadataOnlyForOversizedDocuments	boolean	False	For Azure blobs, set this property to true to still index storage metadata for blob content that is too large to process. Oversized blobs are treated as errors by default. For limits on blob size, see https://docs.microsoft.com/azure/search/search-limits-quotas-capacity.
indexedFileNameExtensions	string		Comma-delimited list of filename extensions to select when processing from Azure blob storage. For example, you could focus indexing on specific application files ".docx, .pptx, .msg" to specifically include those file types.
parsingMode	BlobIndexerParsingMode	default	Represents the parsing mode for indexing from an Azure blob data source.
pdfTextRotationAlgorithm	BlobIndexerPDFTextRotationAlgorithm	none	Determines algorithm for text extraction from PDF files in Azure blob storage.
queryTimeout	string	00:05:00	Increases the timeout beyond the 5-minute default for Azure SQL database data sources, specified in the format "hh:mm:ss".
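
Because every configuration value must be of a primitive type, list-like settings such as file extensions are passed as comma-delimited strings. The sketch below builds an IndexingParameters object for pipe-delimited CSV blobs and rejects non-primitive values early. The helper `indexing_parameters` is hypothetical, and the property names are taken from the tables above.

```python
PRIMITIVES = (str, int, float, bool)


def indexing_parameters(max_failed_items=0, max_failed_items_per_batch=0, **configuration):
    """Build an IndexingParameters object, rejecting non-primitive configuration values."""
    for key, value in configuration.items():
        if not isinstance(value, PRIMITIVES):
            raise TypeError(
                f"configuration[{key!r}] must be a primitive, got {type(value).__name__}"
            )
    parameters = {
        "maxFailedItems": max_failed_items,
        "maxFailedItemsPerBatch": max_failed_items_per_batch,
    }
    if configuration:
        parameters["configuration"] = configuration
    return parameters


csv_parameters = indexing_parameters(
    max_failed_items=10,
    parsingMode="delimitedText",
    firstLineContainsHeaders=True,
    delimitedTextDelimiter="|",
)
```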

IndexingSchedule

Represents a schedule for indexer execution.

Name	Type	Description
interval	string	The interval of time between indexer executions.
startTime	string	The time when an indexer should start running.
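
The sample request uses "PT1H" for `interval` and "2015-01-01T00:00:00Z" for `startTime`, which suggests an ISO 8601 duration and a UTC timestamp. The sketch below produces both forms from Python values under that assumption. The helper names are hypothetical, and this page does not state the allowed range for the interval.

```python
from datetime import datetime, timedelta, timezone


def iso8601_duration(delta):
    """Format a positive timedelta as an ISO 8601 duration such as PT1H or P1DT30M."""
    total = int(delta.total_seconds())
    if total <= 0:
        raise ValueError("interval must be positive")
    days, rest = divmod(total, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    date_part = f"{days}D" if days else ""
    time_part = "".join(
        f"{value}{unit}"
        for value, unit in ((hours, "H"), (minutes, "M"), (seconds, "S"))
        if value
    )
    return "P" + date_part + ("T" + time_part if time_part else "")


def indexing_schedule(interval, start_time=None):
    """Build an IndexingSchedule object (hypothetical helper)."""
    schedule = {"interval": iso8601_duration(interval)}
    if start_time is not None:
        # Pass a timezone-aware datetime; it is converted to UTC and written
        # in the trailing-Z form shown in the sample request.
        utc = start_time.astimezone(timezone.utc)
        schedule["startTime"] = utc.strftime("%Y-%m-%dT%H:%M:%SZ")
    return schedule


hourly = indexing_schedule(timedelta(hours=1), datetime(2015, 1, 1, tzinfo=timezone.utc))
```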

SearchError

Describes an error condition for the API.

Name	Type	Description
code	string	One of a server-defined set of error codes.
details	SearchError[]	An array of details about specific errors that led to this reported error.
message	string	A human-readable representation of the error.
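
Since `details` is itself an array of SearchError objects, an error can be nested to any depth. A short recursive walk flattens it for logging, as sketched below. The function name and the sample codes and messages are hypothetical, and the function expects the SearchError object itself; whether the response wraps it in an outer envelope is not stated on this page.

```python
def flatten_search_error(error, depth=0):
    """Yield (depth, code, message) for an error and all of its nested details."""
    yield depth, error.get("code"), error.get("message")
    for detail in error.get("details") or []:
        yield from flatten_search_error(detail, depth + 1)


sample_error = {  # hypothetical payload for illustration
    "code": "ExampleCode",
    "message": "The request is invalid.",
    "details": [
        {"code": "ExampleDetail", "message": "Property 'name' is required."},
    ],
}

for depth, code, message in flatten_search_error(sample_error):
    print("  " * depth + f"{code}: {message}")
```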

SearchIndexer

Represents an indexer.

Name	Type	Default value	Description
@odata.etag	string		The ETag of the indexer.
cache	SearchIndexerCache		Adds caching to an enrichment pipeline to allow for incremental modification steps without having to rebuild the index every time.
dataSourceName	string		The name of the datasource from which this indexer reads data.
description	string		The description of the indexer.
disabled	boolean	False	A value indicating whether the indexer is disabled. Default is false.
encryptionKey	SearchResourceEncryptionKey		A description of an encryption key that you create in Azure Key Vault. This key is used to provide an additional level of encryption-at-rest for your indexer definition (as well as indexer execution status) when you want full assurance that no one, not even Microsoft, can decrypt them. Once you have encrypted your indexer definition, it will always remain encrypted. The search service will ignore attempts to set this property to null. You can change this property as needed if you want to rotate your encryption key; your indexer definition (and indexer execution status) will be unaffected. Encryption with customer-managed keys is not available for free search services, and is only available for paid services created on or after January 1, 2019.
fieldMappings	FieldMapping[]		Defines mappings between fields in the data source and corresponding target fields in the index.
name	string		The name of the indexer.
outputFieldMappings	FieldMapping[]		Output field mappings are applied after enrichment and immediately before indexing.
parameters	IndexingParameters		Parameters for indexer execution.
schedule	IndexingSchedule		The schedule for this indexer.
skillsetName	string		The name of the skillset executing with this indexer.
targetIndexName	string		The name of the index to which this indexer writes data.

SearchIndexerCache

Name	Type	Description
enableReprocessing	boolean	Specifies whether incremental reprocessing is enabled.
identity	SearchIndexerDataIdentity: SearchIndexerDataNoneIdentity SearchIndexerDataUserAssignedIdentity	The user-assigned managed identity used for connections to the enrichment cache. If the connection string indicates an identity (ResourceId) and it's not specified, the system-assigned managed identity is used. On updates to the indexer, if the identity is unspecified, the value remains unchanged. If set to "none", the value of this property is cleared.
storageConnectionString	string	The connection string to the storage account where the cache data will be persisted.

SearchIndexerDataNoneIdentity

Clears the identity property of a datasource.

Name	Type	Description
@odata.type	string: #Microsoft.Azure.Search.DataNoneIdentity	A URI fragment specifying the type of identity.

SearchIndexerDataUserAssignedIdentity

Specifies the identity for a datasource to use.

Name	Type	Description
@odata.type	string: #Microsoft.Azure.Search.DataUserAssignedIdentity	A URI fragment specifying the type of identity.
userAssignedIdentity	string	The fully qualified Azure resource ID of a user-assigned managed identity, typically in the form "/subscriptions/12345678-1234-1234-1234-1234567890ab/resourceGroups/rg/providers/Microsoft.ManagedIdentity/userAssignedIdentities/myId", that should have been assigned to the search service.
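
The two identity shapes differ only in their `@odata.type` discriminator and the extra `userAssignedIdentity` property. A sketch of building each one follows. The names `NONE_IDENTITY` and `user_assigned_identity` are hypothetical, and the resource ID check is a light sanity test rather than full validation.

```python
# Clears the identity property when sent on an update.
NONE_IDENTITY = {"@odata.type": "#Microsoft.Azure.Search.DataNoneIdentity"}


def user_assigned_identity(resource_id):
    """Build a SearchIndexerDataUserAssignedIdentity object."""
    # Check only that the ID has the general shape described in the table.
    marker = "/providers/Microsoft.ManagedIdentity/userAssignedIdentities/"
    if marker not in resource_id:
        raise ValueError("expected a user-assigned managed identity resource ID")
    return {
        "@odata.type": "#Microsoft.Azure.Search.DataUserAssignedIdentity",
        "userAssignedIdentity": resource_id,
    }


identity = user_assigned_identity(
    "/subscriptions/12345678-1234-1234-1234-1234567890ab/resourceGroups/rg"
    "/providers/Microsoft.ManagedIdentity/userAssignedIdentities/myId"
)
```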

SearchResourceEncryptionKey

A customer-managed encryption key in Azure Key Vault. Keys that you create and manage can be used to encrypt or decrypt data-at-rest, such as indexes and synonym maps.

Name	Type	Description
accessCredentials	AzureActiveDirectoryApplicationCredentials	Optional Azure Active Directory credentials used for accessing your Azure Key Vault. Not required if using managed identity instead.
identity	SearchIndexerDataIdentity: SearchIndexerDataNoneIdentity SearchIndexerDataUserAssignedIdentity	An explicit managed identity to use for this encryption key. If not specified and the access credentials property is null, the system-assigned managed identity is used. On update to the resource, if the explicit identity is unspecified, it remains unchanged. If "none" is specified, the value of this property is cleared.
keyVaultKeyName	string	The name of your Azure Key Vault key to be used to encrypt your data at rest.
keyVaultKeyVersion	string	The version of your Azure Key Vault key to be used to encrypt your data at rest.
keyVaultUri	string	The URI of your Azure Key Vault, also referred to as DNS name, that contains the key to be used to encrypt your data at rest. An example URI might be `https://my-keyvault-name.vault.azure.net`.
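
As the table describes, the key can be accessed with application credentials, with an explicit managed identity, or, when neither is given, with the system-assigned managed identity. The sketch below builds the object for each case. The helper `encryption_key` is hypothetical and deliberately does not decide whether supplying both options at once is valid, because this page does not say.

```python
def encryption_key(vault_uri, key_name, key_version,
                   application_id=None, application_secret=None, identity=None):
    """Build a SearchResourceEncryptionKey object (hypothetical helper)."""
    key = {
        "keyVaultUri": vault_uri,
        "keyVaultKeyName": key_name,
        "keyVaultKeyVersion": key_version,
    }
    if application_id is not None:
        # Optional application credentials; not required with managed identity.
        key["accessCredentials"] = {
            "applicationId": application_id,
            "applicationSecret": application_secret,
        }
    if identity is not None:
        # An explicit managed identity object for this key.
        key["identity"] = identity
    return key


# With no credentials and no identity, the system-assigned identity is used.
key_with_system_identity = encryption_key(
    "https://my-keyvault-name.vault.azure.net",
    "myUserManagedEncryptionKey-createdinAzureKeyVault",
    "myKeyVersion-32charAlphaNumericString",
)
```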
