Skillsets - Create
Creates a new skillset in a search service.
POST {endpoint}/skillsets?api-version=2023-10-01-Preview
URI Parameters
Name | In | Required | Type | Description |
---|---|---|---|---|
endpoint | path | True | string | The endpoint URL of the search service. |
api-version | query | True | string | Client API version. |
Request Header
Name | Required | Type | Description |
---|---|---|---|
x-ms-client-request-id | | string uuid | The tracking ID sent with the request to help with debugging. |
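As a minimal sketch of how the URI parameters and the request header fit together, the following Python builds (but does not send) the POST request. The function name is hypothetical, and the `api-key` and `Content-Type` headers are assumptions: they are not listed in the tables above, so check how your service authenticates before relying on them.

```python
import json
import urllib.request


def build_create_skillset_request(endpoint, api_version, skillset, api_key, request_id=None):
    """Build (but do not send) the POST request that creates a skillset."""
    # {endpoint}/skillsets?api-version=... as shown in the request line.
    url = f"{endpoint.rstrip('/')}/skillsets?api-version={api_version}"
    headers = {
        "Content-Type": "application/json",
        # Assumption: key-based authentication through an "api-key" header.
        "api-key": api_key,
    }
    if request_id is not None:
        # Optional tracking ID (a UUID string) that helps with debugging.
        headers["x-ms-client-request-id"] = request_id
    data = json.dumps(skillset).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


request = build_create_skillset_request(
    "https://myservice.search.windows.net",
    "2023-10-01-Preview",
    {"name": "demoskillset", "skills": []},
    api_key="<admin-key>",
    request_id="00000000-0000-0000-0000-000000000000",
)
print(request.get_method(), request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send it; that step is left out here so the sketch runs without a live service.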
Request Body
Name | Required | Type | Description |
---|---|---|---|
name | True | string | The name of the skillset. |
skills | True | SearchIndexerSkill[]: | A list of skills in the skillset. |
@odata.etag | | string | The ETag of the skillset. |
cognitiveServices | | CognitiveServicesAccount: | Details about the Azure AI service to be used when running skills. |
description | | string | The description of the skillset. |
encryptionKey | | | A description of an encryption key that you create in Azure Key Vault. This key is used to provide an additional level of encryption-at-rest for your skillset definition when you want full assurance that no one, not even Microsoft, can decrypt your skillset definition. Once you have encrypted your skillset definition, it will always remain encrypted. The search service will ignore attempts to set this property to null. You can change this property as needed if you want to rotate your encryption key; your skillset definition will be unaffected. Encryption with customer-managed keys is not available for free search services, and is only available for paid services created on or after January 1, 2019. |
indexProjections | | | Definition of additional projections to secondary search index(es). |
knowledgeStore | | | Definition of additional projections to Azure blob, table, or files, of enriched data. |
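Only `name` and `skills` are marked as required in the table above. The sketch below checks a candidate body for those two properties before it is sent; the helper name is hypothetical, and the check covers only presence, not the per-skill rules that the service enforces.

```python
REQUIRED_PROPERTIES = ("name", "skills")


def missing_required_properties(skillset):
    """Return the required top-level properties that are absent or null."""
    return [
        prop for prop in REQUIRED_PROPERTIES
        if prop not in skillset or skillset[prop] is None
    ]


# A small body that reuses the language detection skill from the sample request.
minimal_skillset = {
    "name": "demoskillset",
    "skills": [
        {
            "@odata.type": "#Microsoft.Skills.Text.LanguageDetectionSkill",
            "inputs": [{"name": "text", "source": "/document/content"}],
            "outputs": [{"name": "languageCode", "targetName": "languageCode"}],
        }
    ],
}

print(missing_required_properties(minimal_skillset))
print(missing_required_properties({"description": "no name, no skills"}))
```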
Responses
Name | Type | Description |
---|---|---|
201 Created | | The skillset is successfully created. |
Other Status Codes | | Error response. |
Examples
SearchServiceCreateSkillset
Sample request
POST https://myservice.search.windows.net/skillsets?api-version=2023-10-01-Preview
{
  "name": "demoskillset",
  "description": "Extract entities, detect language and extract key-phrases",
  "skills": [
    {
      "@odata.type": "#Microsoft.Skills.Text.V3.EntityRecognitionSkill",
      "categories": [
        "organization"
      ],
      "defaultLanguageCode": "en",
      "minimumPrecision": 0.7,
      "inputs": [
        {
          "name": "text",
          "source": "/document/content"
        }
      ],
      "outputs": [
        {
          "name": "organizations",
          "targetName": "organizations"
        }
      ]
    },
    {
      "@odata.type": "#Microsoft.Skills.Text.LanguageDetectionSkill",
      "inputs": [
        {
          "name": "text",
          "source": "/document/content"
        }
      ],
      "outputs": [
        {
          "name": "languageCode",
          "targetName": "languageCode"
        }
      ]
    },
    {
      "@odata.type": "#Microsoft.Skills.Text.SplitSkill",
      "textSplitMode": "pages",
      "maximumPageLength": 4000,
      "inputs": [
        {
          "name": "text",
          "source": "/document/content"
        },
        {
          "name": "languageCode",
          "source": "/document/languageCode"
        }
      ],
      "outputs": [
        {
          "name": "textItems",
          "targetName": "pages"
        }
      ]
    },
    {
      "@odata.type": "#Microsoft.Skills.Text.KeyPhraseExtractionSkill",
      "context": "/document/pages/*",
      "inputs": [
        {
          "name": "text",
          "source": "/document/pages/*"
        },
        {
          "name": "languageCode",
          "source": "/document/languageCode"
        }
      ],
      "outputs": [
        {
          "name": "keyPhrases",
          "targetName": "keyPhrases"
        }
      ]
    },
    {
      "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
      "name": "MyCustomWebApiSkill",
      "uri": "https://contoso.example.org",
      "httpMethod": "POST",
      "timeout": "PT30S",
      "batchSize": 1,
      "inputs": [
        {
          "name": "text",
          "source": "/document/pages/*"
        },
        {
          "name": "languageCode",
          "source": "/document/languageCode"
        }
      ],
      "outputs": [
        {
          "name": "customresult",
          "targetName": "result"
        }
      ],
      "httpHeaders": {}
    }
  ],
  "knowledgeStore": {
    "storageConnectionString": "DefaultEndpointsProtocol=https;AccountName=myStorage;AccountKey=myStorageKey;EndpointSuffix=core.windows.net",
    "projections": [
      {
        "tables": [
          {
            "tableName": "Reviews",
            "generatedKeyName": "ReviewId",
            "source": "/document/Review",
            "sourceContext": null,
            "inputs": []
          },
          {
            "tableName": "Sentences",
            "generatedKeyName": "SentenceId",
            "source": "/document/Review/Sentences/*",
            "sourceContext": null,
            "inputs": []
          },
          {
            "tableName": "KeyPhrases",
            "generatedKeyName": "KeyPhraseId",
            "source": "/document/Review/Sentences/*/KeyPhrases",
            "sourceContext": null,
            "inputs": []
          },
          {
            "tableName": "Entities",
            "generatedKeyName": "EntityId",
            "source": "/document/Review/Sentences/*/Entities/*",
            "sourceContext": null,
            "inputs": []
          }
        ]
      },
      {
        "objects": [
          {
            "storageContainer": "Reviews",
            "source": "/document/Review",
            "generatedKeyName": "/document/Review/Id"
          }
        ]
      }
    ]
  },
  "encryptionKey": {
    "keyVaultKeyName": "myUserManagedEncryptionKey-createdinAzureKeyVault",
    "keyVaultKeyVersion": "myKeyVersion-32charAlphaNumericString",
    "keyVaultUri": "https://myKeyVault.vault.azure.net",
    "accessCredentials": {
      "applicationId": "00000000-0000-0000-0000-000000000000",
      "applicationSecret": "<applicationSecret>"
    }
  }
}
Sample response
{
  "name": "demoskillset",
  "description": "Extract entities, detect language and extract key-phrases",
  "skills": [
    {
      "@odata.type": "#Microsoft.Skills.Text.V3.EntityRecognitionSkill",
      "name": "#1",
      "description": null,
      "context": null,
      "inputs": [
        {
          "name": "text",
          "source": "/document/content"
        }
      ],
      "outputs": [
        {
          "name": "organizations",
          "targetName": "organizations"
        }
      ],
      "categories": [
        "organization"
      ],
      "defaultLanguageCode": "en",
      "minimumPrecision": 0.7
    },
    {
      "@odata.type": "#Microsoft.Skills.Text.LanguageDetectionSkill",
      "name": "#2",
      "description": null,
      "context": null,
      "inputs": [
        {
          "name": "text",
          "source": "/document/content"
        }
      ],
      "outputs": [
        {
          "name": "languageCode",
          "targetName": "languageCode"
        }
      ]
    },
    {
      "@odata.type": "#Microsoft.Skills.Text.SplitSkill",
      "name": "#3",
      "description": null,
      "context": null,
      "inputs": [
        {
          "name": "text",
          "source": "/document/content"
        },
        {
          "name": "languageCode",
          "source": "/document/languageCode"
        }
      ],
      "outputs": [
        {
          "name": "textItems",
          "targetName": "pages"
        }
      ],
      "defaultLanguageCode": null,
      "textSplitMode": "pages",
      "maximumPageLength": 4000
    },
    {
      "@odata.type": "#Microsoft.Skills.Text.KeyPhraseExtractionSkill",
      "name": "#4",
      "description": null,
      "context": "/document/pages/*",
      "inputs": [
        {
          "name": "text",
          "source": "/document/pages/*"
        },
        {
          "name": "languageCode",
          "source": "/document/languageCode"
        }
      ],
      "outputs": [
        {
          "name": "keyPhrases",
          "targetName": "keyPhrases"
        }
      ],
      "defaultLanguageCode": null,
      "maxKeyPhraseCount": null
    },
    {
      "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
      "name": "MyCustomWebApiSkill",
      "description": null,
      "context": "/document",
      "uri": "https://contoso.example.org",
      "httpMethod": "POST",
      "timeout": "PT30S",
      "batchSize": 1,
      "degreeOfParallelism": null,
      "inputs": [
        {
          "name": "text",
          "source": "/document/pages/*"
        },
        {
          "name": "languageCode",
          "source": "/document/languageCode"
        }
      ],
      "outputs": [
        {
          "name": "customresult",
          "targetName": "result"
        }
      ],
      "httpHeaders": {}
    }
  ],
  "encryptionKey": {
    "keyVaultKeyName": "myUserManagedEncryptionKey-createdinAzureKeyVault",
    "keyVaultKeyVersion": "myKeyVersion-32charAlphaNumericString",
    "keyVaultUri": "https://myKeyVault.vault.azure.net",
    "accessCredentials": {
      "applicationId": "00000000-0000-0000-0000-000000000000",
      "applicationSecret": null
    }
  }
}
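In the sample response, the four skills that were submitted without a `name` come back as "#1" through "#4", while the named Web API skill keeps its name. The sketch below reproduces that naming locally (the character '#' followed by the skill's 1-based position in the `skills` array); the helper name is hypothetical, and the service remains the authority on the names it assigns.

```python
def with_default_names(skills):
    """Return copies of the skills, naming unnamed ones '#<1-based index>'."""
    named = []
    for index, skill in enumerate(skills, start=1):
        copy = dict(skill)
        if not copy.get("name"):
            copy["name"] = f"#{index}"
        named.append(copy)
    return named


sample_skills = [
    {"@odata.type": "#Microsoft.Skills.Text.V3.EntityRecognitionSkill"},
    {"@odata.type": "#Microsoft.Skills.Text.LanguageDetectionSkill"},
    {"@odata.type": "#Microsoft.Skills.Text.SplitSkill"},
    {"@odata.type": "#Microsoft.Skills.Text.KeyPhraseExtractionSkill"},
    {"@odata.type": "#Microsoft.Skills.Custom.WebApiSkill", "name": "MyCustomWebApiSkill"},
]

print([skill["name"] for skill in with_default_names(sample_skills)])
```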
Definitions
Name | Description |
---|---|
Aml | The AML skill allows you to extend AI enrichment with a custom Azure Machine Learning (AML) model. Once an AML model is trained and deployed, an AML skill integrates it into AI enrichment. |
Azure | Credentials of a registered application created for your search service, used for authenticated access to the encryption keys stored in Azure Key Vault. |
Azure | Allows you to generate a vector embedding for a given text input using the Azure OpenAI resource. |
Cognitive | The multi-region account key of an Azure AI service resource that's attached to a skillset. |
Conditional | A skill that enables scenarios that require a Boolean operation to determine the data to assign to an output. |
Custom | An object that contains information about the matches that were found, and related metadata. |
Custom | A complex object that can be used to specify alternative spellings or synonyms to the root entity name. |
Custom | A skill that looks for text from a custom, user-defined list of words and phrases. |
Custom | The language codes supported for input text by CustomEntityLookupSkill. |
Default | An empty object that represents the default Azure AI service resource for a skillset. |
Document | A skill that extracts content from a file within the enrichment pipeline. |
Entity | A string indicating what entity categories to return. |
Entity | Using the Text Analytics API, extracts linked entities from text. |
Entity | This skill is deprecated. Use the V3.EntityRecognitionSkill instead. |
Entity | Deprecated. The language codes supported for input text by EntityRecognitionSkill. |
Entity | Using the Text Analytics API, extracts entities of different types from text. |
Image | A skill that analyzes image files. It extracts a rich set of visual features based on the image content. |
Image | The language codes supported for input by ImageAnalysisSkill. |
Image | A string indicating which domain-specific details to return. |
Index | Defines behavior of the index projections in relation to the rest of the indexer. |
Input | Input field mapping for a skill. |
Key | A skill that uses text analytics for key phrase extraction. |
Key | The language codes supported for input text by KeyPhraseExtractionSkill. |
Language | A skill that detects the language of input text and reports a single language code for every document submitted on the request. The language code is paired with a score indicating the confidence of the analysis. |
Line | Defines the sequence of characters to use between the lines of text recognized by the OCR skill. The default value is "space". |
Merge | A skill for merging two or more strings into a single unified string, with an optional user-defined delimiter separating each component part. |
Ocr | A skill that extracts text from image files. |
Ocr | The language codes supported for input by OcrSkill. |
Output | Output field mapping for a skill. |
PIIDetection | Using the Text Analytics API, extracts personal information from an input text and gives you the option of masking it. |
PIIDetection | A string indicating what maskingMode to use to mask the personal information detected in the input text. |
Search | Describes an error condition for the API. |
Search | Clears the identity property of a datasource. |
Search | Specifies the identity for a datasource to use. |
Search | Definition of additional projections to secondary search indexes. |
Search | Description for what data to store in the designated search index. |
Search | A dictionary of index projection-specific configuration properties. Each name is the name of a specific property. Each value must be of a primitive type. |
Search | Definition of additional projections to Azure blob, table, or files, of enriched data. |
Search | Projection definition for what data to store in Azure Files. |
Search | Projection definition for what data to store in Azure Blob. |
Search | A dictionary of knowledge store-specific configuration properties. Each name is the name of a specific property. Each value must be of a primitive type. |
Search | Container object for various projection selectors. |
Search | Description for what data to store in Azure Tables. |
Search | A list of skills. |
Search | A customer-managed encryption key in Azure Key Vault. Keys that you create and manage can be used to encrypt or decrypt data-at-rest, such as indexes and synonym maps. |
Sentiment | This skill is deprecated. Use the V3.SentimentSkill instead. |
Sentiment | Deprecated. The language codes supported for input text by SentimentSkill. |
Sentiment | Using the Text Analytics API, evaluates unstructured text and, for each record, provides sentiment labels (such as "negative", "neutral", and "positive") based on the highest confidence score found by the service at the sentence and document level. |
Shaper | A skill for reshaping the outputs. It creates a complex type to support composite fields (also known as multipart fields). |
Split | A skill to split a string into chunks of text. |
Split | The language codes supported for input text by SplitSkill. |
Text | A value indicating which split mode to perform. |
Text | A skill to translate text from one language to another. |
Text | The language codes supported for input text by TextTranslationSkill. |
Visual | The strings indicating what visual feature types to return. |
Web | A skill that can call a Web API endpoint, allowing you to extend a skillset by having it call your custom code. |
AmlSkill
The AML skill allows you to extend AI enrichment with a custom Azure Machine Learning (AML) model. Once an AML model is trained and deployed, an AML skill integrates it into AI enrichment.
Name | Type | Description |
---|---|---|
@odata.type | string: #Microsoft. | A URI fragment specifying the type of skill. |
context | string | Represents the level at which operations take place, such as the document root or document content (for example, /document or /document/content). The default is /document. |
degreeOfParallelism | integer | (Optional) When specified, indicates the number of calls the indexer will make in parallel to the endpoint you have provided. You can decrease this value if your endpoint is failing under too high a request load, or raise it if your endpoint is able to accept more requests and you would like an increase in the performance of the indexer. If not set, a default value of 5 is used. The degreeOfParallelism can be set to a maximum of 10 and a minimum of 1. |
description | string | The description of the skill which describes the inputs, outputs, and usage of the skill. |
inputs | | Inputs of the skills could be a column in the source data set, or the output of an upstream skill. |
key | string | (Required for key authentication) The key for the AML service. |
name | string | The name of the skill which uniquely identifies it within the skillset. A skill with no name defined will be given a default name of its 1-based index in the skills array, prefixed with the character '#'. |
outputs | | The output of a skill is either a field in a search index, or a value that can be consumed as an input by another skill. |
region | string | (Optional for token authentication) The region the AML service is deployed in. |
resourceId | string | (Required for token authentication) The Azure Resource Manager resource ID of the AML service. It should be in the format subscriptions/{guid}/resourceGroups/{resource-group-name}/Microsoft.MachineLearningServices/workspaces/{workspace-name}/services/{service_name}. |
timeout | string | (Optional) When specified, indicates the timeout for the HTTP client making the API call. |
uri | string | (Required for no authentication or key authentication) The scoring URI of the AML service to which the JSON payload will be sent. Only the https URI scheme is allowed. |
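The AmlSkill properties above imply three authentication setups (token, key, or none) and a bounded `degreeOfParallelism`. The following sketch illustrates one way to read a skill definition under those rules. Both helper names are hypothetical, and the order in which the modes are inferred is an assumption; the table only states which properties each mode requires.

```python
def aml_auth_mode(skill):
    """Infer the authentication setup of an AmlSkill definition (assumed order)."""
    if skill.get("resourceId"):
        return "token"  # resourceId is required for token authentication
    if skill.get("uri") and skill.get("key"):
        return "key"    # uri and key are required for key authentication
    if skill.get("uri"):
        return "none"   # uri alone corresponds to no authentication
    return "invalid"


def effective_degree_of_parallelism(skill):
    """Apply the documented default (5) and range (1 to 10)."""
    value = skill.get("degreeOfParallelism")
    if value is None:
        return 5
    if not 1 <= value <= 10:
        raise ValueError("degreeOfParallelism must be between 1 and 10")
    return value


key_skill = {"uri": "https://contoso.example.org/score", "key": "<key>"}
print(aml_auth_mode(key_skill), effective_degree_of_parallelism(key_skill))
```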
AzureActiveDirectoryApplicationCredentials
Credentials of a registered application created for your search service, used for authenticated access to the encryption keys stored in Azure Key Vault.
Name | Type | Description |
---|---|---|
applicationId | string | An AAD Application ID that was granted the required access permissions to the Azure Key Vault that is to be used when encrypting your data at rest. The Application ID should not be confused with the Object ID for your AAD Application. |
applicationSecret | string | The authentication key of the specified AAD application. |
AzureOpenAIEmbeddingSkill
Allows you to generate a vector embedding for a given text input using the Azure OpenAI resource.
Name | Type | Description |
---|---|---|
@odata.type | string: #Microsoft. | A URI fragment specifying the type of skill. |
apiKey | string | API key for the designated Azure OpenAI resource. |
authIdentity | SearchIndexerDataIdentity: | The user-assigned managed identity used for outbound connections. |
context | string | Represents the level at which operations take place, such as the document root or document content (for example, /document or /document/content). The default is /document. |
deploymentId | string | ID of your Azure OpenAI model deployment on the designated resource. |
description | string | The description of the skill which describes the inputs, outputs, and usage of the skill. |
inputs | | Inputs of the skills could be a column in the source data set, or the output of an upstream skill. |
name | string | The name of the skill which uniquely identifies it within the skillset. A skill with no name defined will be given a default name of its 1-based index in the skills array, prefixed with the character '#'. |
outputs | | The output of a skill is either a field in a search index, or a value that can be consumed as an input by another skill. |
resourceUri | string | The resource URI for your Azure OpenAI resource. |
CognitiveServicesAccountKey
The multi-region account key of an Azure AI service resource that's attached to a skillset.
Name | Type | Description |
---|---|---|
@odata.type | string: #Microsoft. | A URI fragment specifying the type of Azure AI service resource attached to a skillset. |
description | string | Description of the Azure AI service resource attached to a skillset. |
key | string | The key used to provision the Azure AI service resource attached to a skillset. |
ConditionalSkill
A skill that enables scenarios that require a Boolean operation to determine the data to assign to an output.
Name | Type | Description |
---|---|---|
@odata.type | string: #Microsoft. | A URI fragment specifying the type of skill. |
context | string | Represents the level at which operations take place, such as the document root or document content (for example, /document or /document/content). The default is /document. |
description | string | The description of the skill which describes the inputs, outputs, and usage of the skill. |
inputs | | Inputs of the skills could be a column in the source data set, or the output of an upstream skill. |
name | string | The name of the skill which uniquely identifies it within the skillset. A skill with no name defined will be given a default name of its 1-based index in the skills array, prefixed with the character '#'. |
outputs | | The output of a skill is either a field in a search index, or a value that can be consumed as an input by another skill. |
CustomEntity
An object that contains information about the matches that were found, and related metadata.
Name | Type | Description |
---|---|---|
accentSensitive | boolean | Defaults to false. Boolean value denoting whether comparisons with the entity name should be sensitive to accent. |
aliases | | An array of complex objects that can be used to specify alternative spellings or synonyms to the root entity name. |
caseSensitive | boolean | Defaults to false. Boolean value denoting whether comparisons with the entity name should be sensitive to character casing. Sample case insensitive matches of "Microsoft" could be: microsoft, microSoft, MICROSOFT. |
defaultAccentSensitive | boolean | Changes the default accent sensitivity value for this entity. It can be used to change the default value of all aliases accentSensitive values. |
defaultCaseSensitive | boolean | Changes the default case sensitivity value for this entity. It can be used to change the default value of all aliases caseSensitive values. |
defaultFuzzyEditDistance | integer | Changes the default fuzzy edit distance value for this entity. It can be used to change the default value of all aliases fuzzyEditDistance values. |
description | string | This field can be used as a passthrough for custom metadata about the matched text(s). The value of this field will appear with every match of its entity in the skill output. |
fuzzyEditDistance | integer | Defaults to 0. Maximum value of 5. Denotes the acceptable number of divergent characters that would still constitute a match with the entity name. The smallest possible fuzziness for any given match is returned. For instance, if the edit distance is set to 3, "Windows10" would still match "Windows", "Windows10" and "Windows 7". When case sensitivity is set to false, case differences do NOT count towards fuzziness tolerance, but otherwise do. |
id | string | This field can be used as a passthrough for custom metadata about the matched text(s). The value of this field will appear with every match of its entity in the skill output. |
name | string | The top-level entity descriptor. Matches in the skill output will be grouped by this name, and it should represent the "normalized" form of the text being found. |
subtype | string | This field can be used as a passthrough for custom metadata about the matched text(s). The value of this field will appear with every match of its entity in the skill output. |
type | string | This field can be used as a passthrough for custom metadata about the matched text(s). The value of this field will appear with every match of its entity in the skill output. |
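To make the `fuzzyEditDistance` description concrete, the sketch below checks the "Windows10" example with a plain Levenshtein distance. Using Levenshtein here is an assumption, since this reference does not name the algorithm the service applies, and the function names are hypothetical. Accent sensitivity is left out of the sketch.

```python
def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions, and substitutions cost 1."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def is_fuzzy_match(entity_name, candidate, fuzzy_edit_distance=0, case_sensitive=False):
    """Check a candidate against an entity name within the allowed distance."""
    if not 0 <= fuzzy_edit_distance <= 5:
        raise ValueError("fuzzyEditDistance must be between 0 and 5")
    if not case_sensitive:
        # With case sensitivity off, case differences do not count as edits.
        entity_name, candidate = entity_name.casefold(), candidate.casefold()
    return edit_distance(entity_name, candidate) <= fuzzy_edit_distance


for text in ("Windows", "Windows10", "Windows 7"):
    print(text, is_fuzzy_match("Windows10", text, fuzzy_edit_distance=3))
```

All three candidates from the description fall within a distance of 3 of "Windows10" under this measure, which agrees with the example in the table.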
CustomEntityAlias
A complex object that can be used to specify alternative spellings or synonyms to the root entity name.
Name | Type | Description |
---|---|---|
accentSensitive | boolean | Determine if the alias is accent sensitive. |
caseSensitive | boolean | Determine if the alias is case sensitive. |
fuzzyEditDistance | integer | Determine the fuzzy edit distance of the alias. |
text | string | The text of the alias. |
CustomEntityLookupSkill
A skill that looks for text from a custom, user-defined list of words and phrases.
Name | Type | Description |
---|---|---|
@odata.type | string: #Microsoft. | A URI fragment specifying the type of skill. |
context | string | Represents the level at which operations take place, such as the document root or document content (for example, /document or /document/content). The default is /document. |
defaultLanguageCode | | A value indicating which language code to use. Default is |
description | string | The description of the skill which describes the inputs, outputs, and usage of the skill. |
entitiesDefinitionUri | string | Path to a JSON or CSV file containing all the target text to match against. This entity definition is read at the beginning of an indexer run. Any updates to this file during an indexer run will not take effect until subsequent runs. This config must be accessible over HTTPS. |
globalDefaultAccentSensitive | boolean | A global flag for AccentSensitive. If AccentSensitive is not set in CustomEntity, this value will be the default value. |
globalDefaultCaseSensitive | boolean | A global flag for CaseSensitive. If CaseSensitive is not set in CustomEntity, this value will be the default value. |
globalDefaultFuzzyEditDistance | integer | A global flag for FuzzyEditDistance. If FuzzyEditDistance is not set in CustomEntity, this value will be the default value. |
inlineEntitiesDefinition | | The inline CustomEntity definition. |
inputs | | Inputs of the skills could be a column in the source data set, or the output of an upstream skill. |
name | string | The name of the skill which uniquely identifies it within the skillset. A skill with no name defined will be given a default name of its 1-based index in the skills array, prefixed with the character '#'. |
outputs | | The output of a skill is either a field in a search index, or a value that can be consumed as an input by another skill. |
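The global flags on the skill, the `default*` properties on CustomEntity, and the per-alias values form a fallback chain. The sketch below resolves `caseSensitive` along that chain. The function name is hypothetical, and one step is an assumption that this reference does not state: an alias with no value of its own and no entity-level default is taken to fall back to the skill's global flag.

```python
def resolve_case_sensitivity(skill, entity, alias=None):
    """Resolve the effective caseSensitive flag for an entity name or an alias."""
    if alias is not None:
        # Alias value first, then the entity-wide default for aliases.
        candidates = [alias.get("caseSensitive"), entity.get("defaultCaseSensitive")]
    else:
        candidates = [entity.get("caseSensitive")]
    # Skill-level flag (assumed to apply to aliases as well).
    candidates.append(skill.get("globalDefaultCaseSensitive"))
    for value in candidates:
        if value is not None:
            return value
    return False  # documented default when nothing is set


skill = {"globalDefaultCaseSensitive": True}
entity = {"name": "Microsoft", "defaultCaseSensitive": False,
          "aliases": [{"text": "MSFT"}]}

print(resolve_case_sensitivity(skill, entity))
print(resolve_case_sensitivity(skill, entity, entity["aliases"][0]))
```

The same lookup order could be applied to `accentSensitive` and `fuzzyEditDistance`, with 0 as the final fallback for the latter.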
CustomEntityLookupSkillLanguage
The language codes supported for input text by CustomEntityLookupSkill.
Name | Type | Description |
---|---|---|
da | string | Danish |
de | string | German |
en | string | English |
es | string | Spanish |
fi | string | Finnish |
fr | string | French |
it | string | Italian |
ko | string | Korean |
pt | string | Portuguese |
DefaultCognitiveServicesAccount
An empty object that represents the default Azure AI service resource for a skillset.
Name | Type | Description |
---|---|---|
@odata.type | string: #Microsoft. | A URI fragment specifying the type of Azure AI service resource attached to a skillset. |
description | string | Description of the Azure AI service resource attached to a skillset. |
DocumentExtractionSkill
A skill that extracts content from a file within the enrichment pipeline.
Name | Type | Description |
---|---|---|
@odata.type | string: #Microsoft. | A URI fragment specifying the type of skill. |
configuration | object | A dictionary of configurations for the skill. |
context | string | Represents the level at which operations take place, such as the document root or document content (for example, /document or /document/content). The default is /document. |
dataToExtract | string | The type of data to be extracted for the skill. Will be set to 'contentAndMetadata' if not defined. |
description | string | The description of the skill which describes the inputs, outputs, and usage of the skill. |
inputs | | Inputs of the skills could be a column in the source data set, or the output of an upstream skill. |
name | string | The name of the skill which uniquely identifies it within the skillset. A skill with no name defined will be given a default name of its 1-based index in the skills array, prefixed with the character '#'. |
outputs | | The output of a skill is either a field in a search index, or a value that can be consumed as an input by another skill. |
parsingMode | string | The parsingMode for the skill. Will be set to 'default' if not defined. |
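As a small illustration of the two defaults stated above, the sketch below fills in `dataToExtract` and `parsingMode` when a DocumentExtractionSkill definition leaves them out. The helper name is hypothetical; the service applies these defaults itself, so this is only useful for previewing the effective definition.

```python
DOCUMENT_EXTRACTION_DEFAULTS = {
    "dataToExtract": "contentAndMetadata",
    "parsingMode": "default",
}


def with_document_extraction_defaults(skill):
    """Return a copy of the skill with the documented defaults filled in."""
    explicit = {key: value for key, value in skill.items() if value is not None}
    return {**DOCUMENT_EXTRACTION_DEFAULTS, **explicit}


# Hypothetical skill definition that sets only parsingMode.
print(with_document_extraction_defaults({"parsingMode": "json", "dataToExtract": None}))
```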
EntityCategory
A string indicating what entity categories to return.
Name | Type | Description |
---|---|---|
datetime | string | Entities describing a date and time. |
email | string | Entities describing an email address. |
location | string | Entities describing a physical location. |
organization | string | Entities describing an organization. |
person | string | Entities describing a person. |
quantity | string | Entities describing a quantity. |
url | string | Entities describing a URL. |
EntityLinkingSkill
Using the Text Analytics API, extracts linked entities from text.
Name | Type | Description |
---|---|---|
@odata.type | string: #Microsoft. | A URI fragment specifying the type of skill. |
context | string | Represents the level at which operations take place, such as the document root or document content (for example, /document or /document/content). The default is /document. |
defaultLanguageCode | string | A value indicating which language code to use. Default is |
description | string | The description of the skill which describes the inputs, outputs, and usage of the skill. |
inputs | | Inputs of the skills could be a column in the source data set, or the output of an upstream skill. |
minimumPrecision | number | A value between 0 and 1 that can be used to include only entities whose confidence score is greater than the value specified. If not set (default), or if explicitly set to null, all entities will be included. |
modelVersion | string | The version of the model to use when calling the Text Analytics service. It will default to the latest available when not specified. We recommend you do not specify this value unless absolutely necessary. |
name | string | The name of the skill which uniquely identifies it within the skillset. A skill with no name defined will be given a default name of its 1-based index in the skills array, prefixed with the character '#'. |
outputs | | The output of a skill is either a field in a search index, or a value that can be consumed as an input by another skill. |
EntityRecognitionSkill
This skill is deprecated. Use the V3.EntityRecognitionSkill instead.
Name | Type | Description |
---|---|---|
@odata.type | string: #Microsoft. | A URI fragment specifying the type of skill. |
categories | | A list of entity categories that should be extracted. |
context | string | Represents the level at which operations take place, such as the document root or document content (for example, /document or /document/content). The default is /document. |
defaultLanguageCode | | A value indicating which language code to use. Default is |
description | string | The description of the skill which describes the inputs, outputs, and usage of the skill. |
includeTypelessEntities | boolean | Determines whether or not to include entities which are well known but don't conform to a pre-defined type. If this configuration is not set (default), set to null or set to false, entities which don't conform to one of the pre-defined types will not be surfaced. |
inputs | | Inputs of the skills could be a column in the source data set, or the output of an upstream skill. |
minimumPrecision | number | A value between 0 and 1 that can be used to include only entities whose confidence score is greater than the value specified. If not set (default), or if explicitly set to null, all entities will be included. |
name | string | The name of the skill which uniquely identifies it within the skillset. A skill with no name defined will be given a default name of its 1-based index in the skills array, prefixed with the character '#'. |
outputs | | The output of a skill is either a field in a search index, or a value that can be consumed as an input by another skill. |
EntityRecognitionSkillLanguage
Deprecated. The language codes supported for input text by EntityRecognitionSkill.
Name | Type | Description |
---|---|---|
ar | string | Arabic |
cs | string | Czech |
da | string | Danish |
de | string | German |
el | string | Greek |
en | string | English |
es | string | Spanish |
fi | string | Finnish |
fr | string | French |
hu | string | Hungarian |
it | string | Italian |
ja | string | Japanese |
ko | string | Korean |
nl | string | Dutch |
no | string | Norwegian (Bokmaal) |
pl | string | Polish |
pt-BR | string | Portuguese (Brazil) |
pt-PT | string | Portuguese (Portugal) |
ru | string | Russian |
sv | string | Swedish |
tr | string | Turkish |
zh-Hans | string | Chinese-Simplified |
zh-Hant | string | Chinese-Traditional |
EntityRecognitionSkillV3
Using the Text Analytics API, extracts entities of different types from text.
Name | Type | Description |
---|---|---|
@odata.type | string: #Microsoft. | A URI fragment specifying the type of skill. |
categories | string[] | A list of entity categories that should be extracted. |
context | string | Represents the level at which operations take place, such as the document root or document content (for example, /document or /document/content). The default is /document. |
defaultLanguageCode | string | A value indicating which language code to use. Default is |
description | string | The description of the skill which describes the inputs, outputs, and usage of the skill. |
inputs | | Inputs of the skills could be a column in the source data set, or the output of an upstream skill. |
minimumPrecision | number | A value between 0 and 1 that can be used to include only entities whose confidence score is greater than the value specified. If not set (default), or if explicitly set to null, all entities will be included. |
modelVersion | string | The version of the model to use when calling the Text Analytics API. It will default to the latest available when not specified. We recommend you do not specify this value unless absolutely necessary. |
name | string | The name of the skill which uniquely identifies it within the skillset. A skill with no name defined will be given a default name of its 1-based index in the skills array, prefixed with the character '#'. |
outputs | | The output of a skill is either a field in a search index, or a value that can be consumed as an input by another skill. |
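The `minimumPrecision` rule above (keep only entities whose confidence score is greater than the threshold, and keep everything when the value is unset or null) can be sketched as a simple filter. The function name and the `confidenceScore` field on each entity are hypothetical, since this reference does not describe the shape of the skill output.

```python
def filter_by_minimum_precision(entities, minimum_precision=None):
    """Keep entities whose confidence score is strictly greater than the threshold."""
    if minimum_precision is None:
        # Not set, or explicitly null: all entities are included.
        return list(entities)
    return [e for e in entities if e["confidenceScore"] > minimum_precision]


# Hypothetical entities, as they might look in an enriched document.
found = [
    {"text": "Contoso", "confidenceScore": 0.95},
    {"text": "Fabrikam", "confidenceScore": 0.7},
    {"text": "Tailspin", "confidenceScore": 0.4},
]

print([e["text"] for e in filter_by_minimum_precision(found, 0.7)])
print(len(filter_by_minimum_precision(found)))
```

With the 0.7 threshold used in the sample request, an entity scored exactly 0.7 is dropped under this reading, because the description says "greater than" rather than "greater than or equal to".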
ImageAnalysisSkill
A skill that analyzes image files. It extracts a rich set of visual features based on the image content.
Name | Type | Description |
---|---|---|
@odata.type | string: #Microsoft. | A URI fragment specifying the type of skill. |
context | string | Represents the level at which operations take place, such as the document root or document content (for example, /document or /document/content). The default is /document. |
defaultLanguageCode | | A value indicating which language code to use. Default is |
description | string | The description of the skill which describes the inputs, outputs, and usage of the skill. |
details | | A string indicating which domain-specific details to return. |
inputs | | Inputs of the skills could be a column in the source data set, or the output of an upstream skill. |
name | string | The name of the skill which uniquely identifies it within the skillset. A skill with no name defined will be given a default name of its 1-based index in the skills array, prefixed with the character '#'. |
outputs | | The output of a skill is either a field in a search index, or a value that can be consumed as an input by another skill. |
visualFeatures | | A list of visual features. |
ImageAnalysisSkillLanguage
The language codes supported for input by ImageAnalysisSkill.
Name | Type | Description |
---|---|---|
ar |
string |
Arabic |
az |
string |
Azerbaijani |
bg |
string |
Bulgarian |
bs |
string |
Bosnian Latin |
ca |
string |
Catalan |
cs |
string |
Czech |
cy |
string |
Welsh |
da |
string |
Danish |
de |
string |
German |
el |
string |
Greek |
en |
string |
English |
es |
string |
Spanish |
et |
string |
Estonian |
eu |
string |
Basque |
fi |
string |
Finnish |
fr |
string |
French |
ga |
string |
Irish |
gl |
string |
Galician |
he |
string |
Hebrew |
hi |
string |
Hindi |
hr |
string |
Croatian |
hu |
string |
Hungarian |
id |
string |
Indonesian |
it |
string |
Italian |
ja |
string |
Japanese |
kk |
string |
Kazakh |
ko |
string |
Korean |
lt |
string |
Lithuanian |
lv |
string |
Latvian |
mk |
string |
Macedonian |
ms |
string |
Malay Malaysia |
nb |
string |
Norwegian (Bokmal) |
nl |
string |
Dutch |
pl |
string |
Polish |
prs |
string |
Dari |
pt |
string |
Portuguese-Portugal |
pt-BR |
string |
Portuguese-Brazil |
pt-PT |
string |
Portuguese-Portugal |
ro |
string |
Romanian |
ru |
string |
Russian |
sk |
string |
Slovak |
sl |
string |
Slovenian |
sr-Cyrl |
string |
Serbian - Cyrillic RS |
sr-Latn |
string |
Serbian - Latin RS |
sv |
string |
Swedish |
th |
string |
Thai |
tr |
string |
Turkish |
uk |
string |
Ukrainian |
vi |
string |
Vietnamese |
zh |
string |
Chinese Simplified |
zh-Hans |
string |
Chinese Simplified |
zh-Hant |
string |
Chinese Traditional |
ImageDetail
A string indicating which domain-specific details to return.
Name | Type | Description |
---|---|---|
celebrities |
string |
Details recognized as celebrities. |
landmarks |
string |
Details recognized as landmarks. |
IndexProjectionMode
Defines behavior of the index projections in relation to the rest of the indexer.
Name | Type | Description |
---|---|---|
includeIndexingParentDocuments |
string |
The source document will be written into the indexer's target index. This is the default pattern. |
skipIndexingParentDocuments |
string |
The source document will be skipped from writing into the indexer's target index. |
InputFieldMappingEntry
Input field mapping for a skill.
Name | Type | Description |
---|---|---|
inputs |
The recursive inputs used when creating a complex type. |
|
name |
string |
The name of the input. |
source |
string |
The source of the input. |
sourceContext |
string |
The source context used for selecting recursive inputs. |
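As a rough illustration of how the recursive inputs property nests, the sketch below builds one complex input and walks it to list every source path. The enrichment paths and the helper name `collect_sources` are hypothetical and exist only for this example.

```python
def collect_sources(entry: dict) -> list[str]:
    """Return every `source` path in an input mapping entry, depth first."""
    found = [entry["source"]] if "source" in entry else []
    for child in entry.get("inputs", []):
        found.extend(collect_sources(child))
    return found


# Hypothetical complex input: the parent sets sourceContext, the children set source.
address_input = {
    "name": "address",
    "sourceContext": "/document/locations/*",
    "inputs": [
        {"name": "city", "source": "/document/locations/*/city"},
        {"name": "country", "source": "/document/locations/*/country"},
    ],
}
```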
KeyPhraseExtractionSkill
A skill that uses text analytics for key phrase extraction.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
A URI fragment specifying the type of skill. |
context |
string |
Represents the level at which operations take place, such as the document root or document content (for example, /document or /document/content). The default is /document. |
defaultLanguageCode |
A value indicating which language code to use. Default is |
|
description |
string |
The description of the skill which describes the inputs, outputs, and usage of the skill. |
inputs |
Inputs of the skills could be a column in the source data set, or the output of an upstream skill. |
|
maxKeyPhraseCount |
integer |
A number indicating how many key phrases to return. If absent, all identified key phrases will be returned. |
modelVersion |
string |
The version of the model to use when calling the Text Analytics service. It will default to the latest available when not specified. We recommend you do not specify this value unless absolutely necessary. |
name |
string |
The name of the skill which uniquely identifies it within the skillset. A skill with no name defined will be given a default name of its 1-based index in the skills array, prefixed with the character '#'. |
outputs |
The output of a skill is either a field in a search index, or a value that can be consumed as an input by another skill. |
KeyPhraseExtractionSkillLanguage
The language codes supported for input text by KeyPhraseExtractionSkill.
Name | Type | Description |
---|---|---|
da |
string |
Danish |
de |
string |
German |
en |
string |
English |
es |
string |
Spanish |
fi |
string |
Finnish |
fr |
string |
French |
it |
string |
Italian |
ja |
string |
Japanese |
ko |
string |
Korean |
nl |
string |
Dutch |
no |
string |
Norwegian (Bokmaal) |
pl |
string |
Polish |
pt-BR |
string |
Portuguese (Brazil) |
pt-PT |
string |
Portuguese (Portugal) |
ru |
string |
Russian |
sv |
string |
Swedish |
LanguageDetectionSkill
A skill that detects the language of input text and reports a single language code for every document submitted on the request. The language code is paired with a score indicating the confidence of the analysis.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
A URI fragment specifying the type of skill. |
context |
string |
Represents the level at which operations take place, such as the document root or document content (for example, /document or /document/content). The default is /document. |
defaultCountryHint |
string |
A country code to use as a hint to the language detection model if it cannot disambiguate the language. |
description |
string |
The description of the skill which describes the inputs, outputs, and usage of the skill. |
inputs |
Inputs of the skills could be a column in the source data set, or the output of an upstream skill. |
|
modelVersion |
string |
The version of the model to use when calling the Text Analytics service. It will default to the latest available when not specified. We recommend you do not specify this value unless absolutely necessary. |
name |
string |
The name of the skill which uniquely identifies it within the skillset. A skill with no name defined will be given a default name of its 1-based index in the skills array, prefixed with the character '#'. |
outputs |
The output of a skill is either a field in a search index, or a value that can be consumed as an input by another skill. |
LineEnding
Defines the sequence of characters to use between the lines of text recognized by the OCR skill. The default value is "space".
Name | Type | Description |
---|---|---|
carriageReturn |
string |
Lines are separated by a carriage return ('\r') character. |
carriageReturnLineFeed |
string |
Lines are separated by a carriage return and a line feed ('\r\n') character. |
lineFeed |
string |
Lines are separated by a single line feed ('\n') character. |
space |
string |
Lines are separated by a single space character. |
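The four values map to separator characters as described in the table. A minimal sketch of that mapping, assuming recognized lines are available as a list of strings (the function name `join_lines` is made up for this example):

```python
# Separator for each LineEnding value, as listed in the table above.
LINE_ENDINGS = {
    "space": " ",
    "carriageReturn": "\r",
    "lineFeed": "\n",
    "carriageReturnLineFeed": "\r\n",
}


def join_lines(lines: list[str], line_ending: str = "space") -> str:
    """Join recognized lines using the separator for the given LineEnding value."""
    if line_ending not in LINE_ENDINGS:
        raise ValueError(f"unknown lineEnding: {line_ending!r}")
    return LINE_ENDINGS[line_ending].join(lines)
```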
MergeSkill
A skill for merging two or more strings into a single unified string, with an optional user-defined delimiter separating each component part.
Name | Type | Default value | Description |
---|---|---|---|
@odata.type |
string:
#Microsoft. |
A URI fragment specifying the type of skill. |
|
context |
string |
Represents the level at which operations take place, such as the document root or document content (for example, /document or /document/content). The default is /document. |
|
description |
string |
The description of the skill which describes the inputs, outputs, and usage of the skill. |
|
inputs |
Inputs of the skills could be a column in the source data set, or the output of an upstream skill. |
||
insertPostTag |
string |
The tag indicates the end of the merged text. By default, the tag is an empty space. |
|
insertPreTag |
string |
The tag indicates the start of the merged text. By default, the tag is an empty space. |
|
name |
string |
The name of the skill which uniquely identifies it within the skillset. A skill with no name defined will be given a default name of its 1-based index in the skills array, prefixed with the character '#'. |
|
outputs |
The output of a skill is either a field in a search index, or a value that can be consumed as an input by another skill. |
OcrSkill
A skill that extracts text from image files.
Name | Type | Default value | Description |
---|---|---|---|
@odata.type |
string:
#Microsoft. |
A URI fragment specifying the type of skill. |
|
context |
string |
Represents the level at which operations take place, such as the document root or document content (for example, /document or /document/content). The default is /document. |
|
defaultLanguageCode |
A value indicating which language code to use. Default is |
||
description |
string |
The description of the skill which describes the inputs, outputs, and usage of the skill. |
|
detectOrientation |
boolean |
False |
A value indicating whether to turn orientation detection on. Default is false. |
inputs |
Inputs of the skills could be a column in the source data set, or the output of an upstream skill. |
||
lineEnding |
Defines the sequence of characters to use between the lines of text recognized by the OCR skill. The default value is "space". |
||
name |
string |
The name of the skill which uniquely identifies it within the skillset. A skill with no name defined will be given a default name of its 1-based index in the skills array, prefixed with the character '#'. |
|
outputs |
The output of a skill is either a field in a search index, or a value that can be consumed as an input by another skill. |
OcrSkillLanguage
The language codes supported for input by OcrSkill.
Name | Type | Description |
---|---|---|
Jns |
string |
Jaunsari (Devanagari) |
af |
string |
Afrikaans |
anp |
string |
Angika (Devanagari) |
ar |
string |
Arabic |
ast |
string |
Asturian |
awa |
string |
Awadhi-Hindi (Devanagari) |
az |
string |
Azerbaijani (Latin) |
be |
string |
Belarusian (Cyrillic and Latin) |
be-cyrl |
string |
Belarusian (Cyrillic) |
be-latn |
string |
Belarusian (Latin) |
bfy |
string |
Bagheli |
bfz |
string |
Mahasu Pahari (Devanagari) |
bg |
string |
Bulgarian |
bgc |
string |
Haryanvi |
bho |
string |
Bhojpuri-Hindi (Devanagari) |
bi |
string |
Bislama |
bns |
string |
Bundeli |
br |
string |
Breton |
bra |
string |
Braj |
brx |
string |
Bodo (Devanagari) |
bs |
string |
Bosnian Latin |
bua |
string |
Buryat (Cyrillic) |
ca |
string |
Catalan |
ceb |
string |
Cebuano |
ch |
string |
Chamorro |
cnr-cyrl |
string |
Montenegrin (Cyrillic) |
cnr-latn |
string |
Montenegrin (Latin) |
co |
string |
Corsican |
crh |
string |
Crimean Tatar (Latin) |
cs |
string |
Czech |
csb |
string |
Kashubian |
cy |
string |
Welsh |
da |
string |
Danish |
de |
string |
German |
dhi |
string |
Dhimal (Devanagari) |
doi |
string |
Dogri (Devanagari) |
dsb |
string |
Lower Sorbian |
el |
string |
Greek |
en |
string |
English |
es |
string |
Spanish |
et |
string |
Estonian |
eu |
string |
Basque |
fa |
string |
Persian |
fi |
string |
Finnish |
fil |
string |
Filipino |
fj |
string |
Fijian |
fo |
string |
Faroese |
fr |
string |
French |
fur |
string |
Friulian |
fy |
string |
Western Frisian |
ga |
string |
Irish |
gag |
string |
Gagauz (Latin) |
gd |
string |
Scottish Gaelic |
gil |
string |
Gilbertese |
gl |
string |
Galician |
gon |
string |
Gondi (Devanagari) |
gv |
string |
Manx |
gvr |
string |
Gurung (Devanagari) |
haw |
string |
Hawaiian |
hi |
string |
Hindi |
hlb |
string |
Halbi (Devanagari) |
hne |
string |
Chhattisgarhi (Devanagari) |
hni |
string |
Hani |
hoc |
string |
Ho (Devanagari) |
hr |
string |
Croatian |
hsb |
string |
Upper Sorbian |
ht |
string |
Haitian Creole |
hu |
string |
Hungarian |
ia |
string |
Interlingua |
id |
string |
Indonesian |
is |
string |
Icelandic |
it |
string |
Italian |
iu |
string |
Inuktitut (Latin) |
ja |
string |
Japanese |
jv |
string |
Javanese |
kaa |
string |
Kara-Kalpak (Latin) |
kaa-cyrl |
string |
Kara-Kalpak (Cyrillic) |
kac |
string |
Kachin (Latin) |
kea |
string |
Kabuverdianu |
kfq |
string |
Korku |
kha |
string |
Khasi |
kk-cyrl |
string |
Kazakh (Cyrillic) |
kk-latn |
string |
Kazakh (Latin) |
kl |
string |
Greenlandic |
klr |
string |
Khaling |
kmj |
string |
Malto (Devanagari) |
ko |
string |
Korean |
kos |
string |
Kosraean |
kpy |
string |
Koryak |
krc |
string |
Karachay-Balkar |
kru |
string |
Kurukh (Devanagari) |
ksh |
string |
Ripuarian |
ku-arab |
string |
Kurdish (Arabic) |
ku-latn |
string |
Kurdish (Latin) |
kum |
string |
Kumyk (Cyrillic) |
kw |
string |
Cornish |
ky |
string |
Kyrgyz (Cyrillic) |
la |
string |
Latin |
lb |
string |
Luxembourgish |
lkt |
string |
Lakota |
lt |
string |
Lithuanian |
mi |
string |
Maori |
mn |
string |
Mongolian (Cyrillic) |
mr |
string |
Marathi |
ms |
string |
Malay (Latin) |
mt |
string |
Maltese |
mww |
string |
Hmong Daw (Latin) |
myv |
string |
Erzya (Cyrillic) |
nap |
string |
Neapolitan |
nb |
string |
Norwegian |
ne |
string |
Nepali |
niu |
string |
Niuean |
nl |
string |
Dutch |
no |
string |
Norwegian |
nog |
string |
Nogay |
oc |
string |
Occitan |
os |
string |
Ossetic |
pa |
string |
Punjabi (Arabic) |
pl |
string |
Polish |
prs |
string |
Dari |
ps |
string |
Pashto |
pt |
string |
Portuguese |
quc |
string |
K'iche' |
rab |
string |
Chamling |
rm |
string |
Romansh |
ro |
string |
Romanian |
ru |
string |
Russian |
sa |
string |
Sanskrit (Devanagari) |
sat |
string |
Santali (Devanagari) |
sck |
string |
Sadri (Devanagari) |
sco |
string |
Scots |
sk |
string |
Slovak |
sl |
string |
Slovenian |
sm |
string |
Samoan (Latin) |
sma |
string |
Southern Sami |
sme |
string |
Northern Sami (Latin) |
smj |
string |
Lule Sami |
smn |
string |
Inari Sami |
sms |
string |
Skolt Sami |
so |
string |
Somali (Arabic) |
sq |
string |
Albanian |
sr |
string |
Serbian (Latin) |
sr-Cyrl |
string |
Serbian (Cyrillic) |
sr-Latn |
string |
Serbian (Latin) |
srx |
string |
Sirmauri (Devanagari) |
sv |
string |
Swedish |
sw |
string |
Swahili (Latin) |
tet |
string |
Tetum |
tg |
string |
Tajik (Cyrillic) |
thf |
string |
Thangmi |
tk |
string |
Turkmen (Latin) |
to |
string |
Tongan |
tr |
string |
Turkish |
tt |
string |
Tatar (Latin) |
tyv |
string |
Tuvan |
ug |
string |
Uyghur (Arabic) |
unk |
string |
Unknown (All) |
ur |
string |
Urdu |
uz |
string |
Uzbek (Latin) |
uz-arab |
string |
Uzbek (Arabic) |
uz-cyrl |
string |
Uzbek (Cyrillic) |
vo |
string |
Volapük |
wae |
string |
Walser |
xnr |
string |
Kangri (Devanagari) |
xsr |
string |
Sherpa (Devanagari) |
yua |
string |
Yucatec Maya |
za |
string |
Zhuang |
zh-Hans |
string |
Chinese Simplified |
zh-Hant |
string |
Chinese Traditional |
zu |
string |
Zulu |
OutputFieldMappingEntry
Output field mapping for a skill.
Name | Type | Description |
---|---|---|
name |
string |
The name of the output defined by the skill. |
targetName |
string |
The target name of the output. It is optional and defaults to name. |
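The fallback from targetName to name can be shown with a small helper. This is a sketch of the documented default only; the helper name and the sample output names are hypothetical.

```python
def resolve_target_name(output: dict) -> str:
    """Return targetName when it is set, otherwise fall back to name."""
    return output.get("targetName") or output["name"]


# Hypothetical output mappings.
renamed = {"name": "keyPhrases", "targetName": "phrases"}
unchanged = {"name": "keyPhrases"}
```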
PIIDetectionSkill
Using the Text Analytics API, extracts personal information from an input text and gives you the option of masking it.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
A URI fragment specifying the type of skill. |
context |
string |
Represents the level at which operations take place, such as the document root or document content (for example, /document or /document/content). The default is /document. |
defaultLanguageCode |
string |
A value indicating which language code to use. Default is |
description |
string |
The description of the skill which describes the inputs, outputs, and usage of the skill. |
domain |
string |
If specified, will set the PII domain to include only a subset of the entity categories. Possible values include: 'phi', 'none'. Default is 'none'. |
inputs |
Inputs of the skills could be a column in the source data set, or the output of an upstream skill. |
|
maskingCharacter |
string |
The character used to mask the text if the maskingMode parameter is set to replace. Default is '*'. |
maskingMode |
A parameter that provides various ways to mask the personal information detected in the input text. Default is 'none'. |
|
minimumPrecision |
number |
A value between 0 and 1 that is used to include only entities whose confidence score is greater than the value specified. If not set (default), or if explicitly set to null, all entities will be included. |
modelVersion |
string |
The version of the model to use when calling the Text Analytics service. It will default to the latest available when not specified. We recommend you do not specify this value unless absolutely necessary. |
name |
string |
The name of the skill which uniquely identifies it within the skillset. A skill with no name defined will be given a default name of its 1-based index in the skills array, prefixed with the character '#'. |
outputs |
The output of a skill is either a field in a search index, or a value that can be consumed as an input by another skill. |
|
piiCategories |
string[] |
A list of PII entity categories that should be extracted and masked. |
PIIDetectionSkillMaskingMode
A string indicating what maskingMode to use to mask the personal information detected in the input text.
Name | Type | Description |
---|---|---|
none |
string |
No masking occurs and the maskedText output will not be returned. |
replace |
string |
Replaces the detected entities with the character given in the maskingCharacter parameter. The character will be repeated to the length of the detected entity so that the offsets will correctly correspond to both the input text as well as the output maskedText. |
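To make the offset-preserving behavior of replace concrete, here is a minimal sketch. It assumes each detected entity carries a character `offset` and `length`; those field names, the function name, and the sample text are assumptions for illustration, not the service's output schema.

```python
from typing import Optional


def mask_text(text: str, entities: list[dict], masking_mode: str = "none",
              masking_character: str = "*") -> Optional[str]:
    """Mask detected entities while keeping the text length unchanged."""
    if masking_mode == "none":
        return None  # no maskedText output is produced in this mode
    if masking_mode != "replace":
        raise ValueError(f"unknown maskingMode: {masking_mode!r}")
    if len(masking_character) != 1:
        raise ValueError("maskingCharacter must be a single character")
    chars = list(text)
    for entity in entities:
        start = entity["offset"]
        end = start + entity["length"]
        # Repeat the masking character to the entity's length so offsets stay aligned.
        chars[start:end] = masking_character * (end - start)
    return "".join(chars)


# Hypothetical input text and detections.
text = "Call Ana at 555-0100."
detected = [{"offset": 5, "length": 3}, {"offset": 12, "length": 8}]
masked = mask_text(text, detected, masking_mode="replace")
```

Because each entity is replaced character for character, `masked` has the same length as `text`, so entity offsets apply to both strings.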
SearchError
Describes an error condition for the API.
Name | Type | Description |
---|---|---|
code |
string |
One of a server-defined set of error codes. |
details |
An array of details about specific errors that led to this reported error. |
|
message |
string |
A human-readable representation of the error. |
SearchIndexerDataNoneIdentity
Clears the identity property of a datasource.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
A URI fragment specifying the type of identity. |
SearchIndexerDataUserAssignedIdentity
Specifies the identity for a datasource to use.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
A URI fragment specifying the type of identity. |
userAssignedIdentity |
string |
The fully qualified Azure resource ID of a user-assigned managed identity, typically in the form "/subscriptions/12345678-1234-1234-1234-1234567890ab/resourceGroups/rg/providers/Microsoft.ManagedIdentity/userAssignedIdentities/myId", that should have been assigned to the search service. |
SearchIndexerIndexProjections
Definition of additional projections to secondary search indexes.
Name | Type | Description |
---|---|---|
parameters |
A dictionary of index projection-specific configuration properties. Each name is the name of a specific property. Each value must be of a primitive type. |
|
selectors |
A list of projections to be performed to secondary search indexes. |
SearchIndexerIndexProjectionSelector
Description for what data to store in the designated search index.
Name | Type | Description |
---|---|---|
mappings |
Mappings for the projection, or which source should be mapped to which field in the target index. |
|
parentKeyFieldName |
string |
Name of the field in the search index to map the parent document's key value to. Must be a string field that is filterable and not the key field. |
sourceContext |
string |
Source context for the projections. Represents the cardinality at which the document will be split into multiple sub documents. |
targetIndexName |
string |
Name of the search index to project to. Must have a key field with the 'keyword' analyzer set. |
SearchIndexerIndexProjectionsParameters
A dictionary of index projection-specific configuration properties. Each name is the name of a specific property. Each value must be of a primitive type.
Name | Type | Description |
---|---|---|
projectionMode |
Defines behavior of the index projections in relation to the rest of the indexer. |
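Put together, the three types above might combine into an indexProjections fragment like the following sketch. The index name, field names, and enrichment paths are placeholders, and the shape of each `mappings` entry (a `name` and a `source`) is an assumption, since the table above does not show the element type.

```python
import json

VALID_PROJECTION_MODES = {"includeIndexingParentDocuments", "skipIndexingParentDocuments"}

# Hypothetical indexProjections fragment; every value is a placeholder.
index_projections = {
    "selectors": [
        {
            "targetIndexName": "chunks-index",
            "parentKeyFieldName": "parent_id",
            "sourceContext": "/document/pages/*",
            "mappings": [
                {"name": "chunk", "source": "/document/pages/*"},
            ],
        }
    ],
    "parameters": {"projectionMode": "skipIndexingParentDocuments"},
}


def projection_mode(projections: dict) -> str:
    """Return the configured mode, falling back to the documented default."""
    mode = projections.get("parameters", {}).get(
        "projectionMode", "includeIndexingParentDocuments"
    )
    if mode not in VALID_PROJECTION_MODES:
        raise ValueError(f"unknown projectionMode: {mode!r}")
    return mode


body_fragment = json.dumps({"indexProjections": index_projections}, indent=2)
```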
SearchIndexerKnowledgeStore
Definition of additional projections to Azure blob, table, or files, of enriched data.
Name | Type | Description |
---|---|---|
identity | SearchIndexerDataIdentity: |
The user-assigned managed identity used for connections to Azure Storage when writing knowledge store projections. If the connection string indicates an identity (ResourceId) and it's not specified, the system-assigned managed identity is used. On updates to the indexer, if the identity is unspecified, the value remains unchanged. If set to "none", the value of this property is cleared. |
parameters |
A dictionary of knowledge store-specific configuration properties. Each name is the name of a specific property. Each value must be of a primitive type. |
|
projections |
A list of additional projections to perform during indexing. |
|
storageConnectionString |
string |
The connection string to the storage account projections will be stored in. |
SearchIndexerKnowledgeStoreFileProjectionSelector
Projection definition for what data to store in Azure Files.
Name | Type | Description |
---|---|---|
generatedKeyName |
string |
Name of generated key to store projection under. |
inputs |
Nested inputs for complex projections. |
|
referenceKeyName |
string |
Name of reference key to different projection. |
source |
string |
Source data to project. |
sourceContext |
string |
Source context for complex projections. |
storageContainer |
string |
Blob container to store projections in. |
SearchIndexerKnowledgeStoreObjectProjectionSelector
Projection definition for what data to store in Azure Blob.
Name | Type | Description |
---|---|---|
generatedKeyName |
string |
Name of generated key to store projection under. |
inputs |
Nested inputs for complex projections. |
|
referenceKeyName |
string |
Name of reference key to different projection. |
source |
string |
Source data to project. |
sourceContext |
string |
Source context for complex projections. |
storageContainer |
string |
Blob container to store projections in. |
SearchIndexerKnowledgeStoreParameters
A dictionary of knowledge store-specific configuration properties. Each name is the name of a specific property. Each value must be of a primitive type.
Name | Type | Default value | Description |
---|---|---|---|
synthesizeGeneratedKeyName |
boolean |
False |
Whether or not projections should synthesize a generated key name if one isn't already present. |
SearchIndexerKnowledgeStoreProjection
Container object for various projection selectors.
Name | Type | Description |
---|---|---|
files |
Projections to Azure File storage. |
|
objects |
Projections to Azure Blob storage. |
|
tables |
Projections to Azure Table storage. |
SearchIndexerKnowledgeStoreTableProjectionSelector
Description for what data to store in Azure Tables.
Name | Type | Description |
---|---|---|
generatedKeyName |
string |
Name of generated key to store projection under. |
inputs |
Nested inputs for complex projections. |
|
referenceKeyName |
string |
Name of reference key to different projection. |
source |
string |
Source data to project. |
sourceContext |
string |
Source context for complex projections. |
tableName |
string |
Name of the Azure table to store projected data in. |
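For orientation, a knowledgeStore fragment built from the property names in the tables above might look like the sketch below. The connection string is left as a placeholder, and the table, container, and source names are hypothetical. The helper only counts selectors per projection kind.

```python
# Hypothetical knowledgeStore fragment; all values are placeholders.
knowledge_store = {
    "storageConnectionString": "<storage-connection-string>",
    "projections": [
        {
            "tables": [
                {
                    "tableName": "Documents",
                    "generatedKeyName": "DocumentId",
                    "source": "/document/tableprojection",
                }
            ],
            "objects": [
                {
                    "storageContainer": "enriched-docs",
                    "source": "/document/objectprojection",
                }
            ],
            "files": [],
        }
    ],
    "parameters": {"synthesizeGeneratedKeyName": False},
}


def count_selectors(store: dict) -> dict:
    """Count table, object, and file selectors across all projection groups."""
    totals = {"tables": 0, "objects": 0, "files": 0}
    for group in store.get("projections", []):
        for kind in totals:
            totals[kind] += len(group.get(kind, []))
    return totals
```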
SearchIndexerSkillset
A list of skills.
Name | Type | Description |
---|---|---|
@odata.etag |
string |
The ETag of the skillset. |
cognitiveServices | CognitiveServicesAccount: |
Details about the Azure AI service to be used when running skills. |
description |
string |
The description of the skillset. |
encryptionKey |
A description of an encryption key that you create in Azure Key Vault. This key is used to provide an additional level of encryption-at-rest for your skillset definition when you want full assurance that no one, not even Microsoft, can decrypt your skillset definition. Once you have encrypted your skillset definition, it will always remain encrypted. The search service will ignore attempts to set this property to null. You can change this property as needed if you want to rotate your encryption key; your skillset definition will be unaffected. Encryption with customer-managed keys is not available for free search services, and is only available for paid services created on or after January 1, 2019. |
|
indexProjections |
Definition of additional projections to secondary search index(es). |
|
knowledgeStore |
Definition of additional projections to Azure blob, table, or files, of enriched data. |
|
name |
string |
The name of the skillset. |
skills |
SearchIndexerSkill[]:
|
A list of skills in the skillset. |
SearchResourceEncryptionKey
A customer-managed encryption key in Azure Key Vault. Keys that you create and manage can be used to encrypt or decrypt data-at-rest, such as indexes and synonym maps.
Name | Type | Description |
---|---|---|
accessCredentials |
Optional Azure Active Directory credentials used for accessing your Azure Key Vault. Not required if using managed identity instead. |
|
identity | SearchIndexerDataIdentity: |
An explicit managed identity to use for this encryption key. If not specified and the access credentials property is null, the system-assigned managed identity is used. On update to the resource, if the explicit identity is unspecified, it remains unchanged. If "none" is specified, the value of this property is cleared. |
keyVaultKeyName |
string |
The name of your Azure Key Vault key to be used to encrypt your data at rest. |
keyVaultKeyVersion |
string |
The version of your Azure Key Vault key to be used to encrypt your data at rest. |
keyVaultUri |
string |
The URI of your Azure Key Vault, also referred to as DNS name, that contains the key to be used to encrypt your data at rest. An example URI might be |
SentimentSkill
This skill is deprecated. Use the V3.SentimentSkill instead.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
A URI fragment specifying the type of skill. |
context |
string |
Represents the level at which operations take place, such as the document root or document content (for example, /document or /document/content). The default is /document. |
defaultLanguageCode |
A value indicating which language code to use. Default is |
|
description |
string |
The description of the skill which describes the inputs, outputs, and usage of the skill. |
inputs |
Inputs of the skills could be a column in the source data set, or the output of an upstream skill. |
|
name |
string |
The name of the skill which uniquely identifies it within the skillset. A skill with no name defined will be given a default name of its 1-based index in the skills array, prefixed with the character '#'. |
outputs |
The output of a skill is either a field in a search index, or a value that can be consumed as an input by another skill. |
SentimentSkillLanguage
Deprecated. The language codes supported for input text by SentimentSkill.
Name | Type | Description |
---|---|---|
da |
string |
Danish |
de |
string |
German |
el |
string |
Greek |
en |
string |
English |
es |
string |
Spanish |
fi |
string |
Finnish |
fr |
string |
French |
it |
string |
Italian |
nl |
string |
Dutch |
no |
string |
Norwegian (Bokmaal) |
pl |
string |
Polish |
pt-PT |
string |
Portuguese (Portugal) |
ru |
string |
Russian |
sv |
string |
Swedish |
tr |
string |
Turkish |
SentimentSkillV3
Using the Text Analytics API, evaluates unstructured text and, for each record, provides sentiment labels (such as "negative", "neutral", and "positive") based on the highest confidence score found by the service at the sentence and document level.
Name | Type | Default value | Description |
---|---|---|---|
@odata.type |
string:
#Microsoft. |
A URI fragment specifying the type of skill. |
|
context |
string |
Represents the level at which operations take place, such as the document root or document content (for example, /document or /document/content). The default is /document. |
|
defaultLanguageCode |
string |
A value indicating which language code to use. Default is |
|
description |
string |
The description of the skill which describes the inputs, outputs, and usage of the skill. |
|
includeOpinionMining |
boolean |
False |
If set to true, the skill output will include information from Text Analytics for opinion mining, namely targets (nouns or verbs) and their associated assessment (adjective) in the text. Default is false. |
inputs |
Inputs of the skills could be a column in the source data set, or the output of an upstream skill. |
||
modelVersion |
string |
The version of the model to use when calling the Text Analytics service. It will default to the latest available when not specified. We recommend you do not specify this value unless absolutely necessary. |
|
name |
string |
The name of the skill which uniquely identifies it within the skillset. A skill with no name defined will be given a default name of its 1-based index in the skills array, prefixed with the character '#'. |
|
outputs |
The output of a skill is either a field in a search index, or a value that can be consumed as an input by another skill. |
ShaperSkill
A skill for reshaping the outputs. It creates a complex type to support composite fields (also known as multipart fields).
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
A URI fragment specifying the type of skill. |
context |
string |
Represents the level at which operations take place, such as the document root or document content (for example, /document or /document/content). The default is /document. |
description |
string |
The description of the skill which describes the inputs, outputs, and usage of the skill. |
inputs |
Inputs of the skills could be a column in the source data set, or the output of an upstream skill. |
|
name |
string |
The name of the skill which uniquely identifies it within the skillset. A skill with no name defined will be given a default name of its 1-based index in the skills array, prefixed with the character '#'. |
outputs |
The output of a skill is either a field in a search index, or a value that can be consumed as an input by another skill. |
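The default naming rule that every skill's name property describes (a 1-based index in the skills array, prefixed with '#') can be sketched as follows. The helper name and the sample skills are hypothetical.

```python
def effective_skill_names(skills: list[dict]) -> list[str]:
    """Return each skill's name, or '#<1-based index>' when no name is defined."""
    return [
        skill.get("name") or f"#{position}"
        for position, skill in enumerate(skills, start=1)
    ]


# Hypothetical skills array: only the first entry defines a name.
skills = [
    {"name": "shape-address"},
    {"description": "unnamed skill in second position"},
    {},
]
```

Here `effective_skill_names(skills)` yields `shape-address`, `#2`, and `#3`.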
SplitSkill
A skill to split a string into chunks of text.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
A URI fragment specifying the type of skill. |
context |
string |
Represents the level at which operations take place, such as the document root or document content (for example, /document or /document/content). The default is /document. |
defaultLanguageCode |
A value indicating which language code to use. Default is |
|
description |
string |
The description of the skill which describes the inputs, outputs, and usage of the skill. |
inputs |
Inputs of the skills could be a column in the source data set, or the output of an upstream skill. |
|
maximumPageLength |
integer |
The desired maximum page length. Default is 10000. |
maximumPagesToTake |
integer |
Only applicable when textSplitMode is set to 'pages'. If specified, the SplitSkill will discontinue splitting after processing the first 'maximumPagesToTake' pages, in order to improve performance when only a few initial pages are needed from each document. |
name |
string |
The name of the skill which uniquely identifies it within the skillset. A skill with no name defined will be given a default name of its 1-based index in the skills array, prefixed with the character '#'. |
outputs |
The output of a skill is either a field in a search index, or a value that can be consumed as an input by another skill. |
|
pageOverlapLength |
integer |
Only applicable when textSplitMode is set to 'pages'. If specified, the (n+1)th chunk starts with this number of characters/tokens from the end of the nth chunk. |
textSplitMode |
A value indicating which split mode to perform. |
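To make the interaction between maximumPageLength, pageOverlapLength, and maximumPagesToTake more concrete, here is a minimal sketch. The property names in `split_skill` come from the table above, while the `@odata.type` value and the input and output names are assumptions to verify against the service. The `simulate_pages` function is hypothetical: it slices by fixed character counts, which is a simplification and not the service's actual chunking logic.

```python
from typing import Any, Optional

# Hypothetical SplitSkill definition. The @odata.type value and the
# input/output names ("text", "textItems") are assumptions.
split_skill: dict[str, Any] = {
    "@odata.type": "#Microsoft.Skills.Text.SplitSkill",
    "name": "split-into-pages",
    "context": "/document",
    "textSplitMode": "pages",
    "maximumPageLength": 2000,
    "pageOverlapLength": 200,
    "maximumPagesToTake": 3,
    "inputs": [{"name": "text", "source": "/document/content"}],
    "outputs": [{"name": "textItems", "targetName": "pages"}],
}


def simulate_pages(
    text: str,
    maximum_page_length: int = 10000,
    page_overlap_length: int = 0,
    maximum_pages_to_take: Optional[int] = None,
) -> list[str]:
    """Fixed-width approximation of 'pages' mode, measured in characters."""
    if maximum_page_length <= 0:
        raise ValueError("maximum_page_length must be positive")
    if not 0 <= page_overlap_length < maximum_page_length:
        raise ValueError("page_overlap_length must be in [0, maximum_page_length)")
    step = maximum_page_length - page_overlap_length
    pages: list[str] = []
    start = 0
    while start < len(text):
        if maximum_pages_to_take is not None and len(pages) >= maximum_pages_to_take:
            break  # stop early once enough pages were produced
        pages.append(text[start:start + maximum_page_length])
        if start + maximum_page_length >= len(text):
            break  # this page already reached the end of the text
        start += step
    return pages


all_pages = simulate_pages("abcdefghij", maximum_page_length=4, page_overlap_length=1)
first_two = simulate_pages("abcdefghij", 4, 1, maximum_pages_to_take=2)
print(all_pages, first_two)
```

In the sketch, each page after the first begins with the last `page_overlap_length` characters of the previous page, which is the behavior the pageOverlapLength description refers to.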
SplitSkillLanguage
The language codes supported for input text by SplitSkill.
Name | Type | Description |
---|---|---|
am | string | Amharic |
bs | string | Bosnian |
cs | string | Czech |
da | string | Danish |
de | string | German |
en | string | English |
es | string | Spanish |
et | string | Estonian |
fi | string | Finnish |
fr | string | French |
he | string | Hebrew |
hi | string | Hindi |
hr | string | Croatian |
hu | string | Hungarian |
id | string | Indonesian |
is | string | Icelandic |
it | string | Italian |
ja | string | Japanese |
ko | string | Korean |
lv | string | Latvian |
nb | string | Norwegian |
nl | string | Dutch |
pl | string | Polish |
pt | string | Portuguese (Portugal) |
pt-br | string | Portuguese (Brazil) |
ru | string | Russian |
sk | string | Slovak |
sl | string | Slovenian |
sr | string | Serbian |
sv | string | Swedish |
tr | string | Turkish |
ur | string | Urdu |
zh | string | Chinese (Simplified) |
TextSplitMode
A value indicating which split mode to perform.
Name | Type | Description |
---|---|---|
pages | string | Split the text into individual pages. |
sentences | string | Split the text into individual sentences. |
TextTranslationSkill
A skill to translate text from one language to another.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
A URI fragment specifying the type of skill. |
context |
string |
Represents the level at which operations take place, such as the document root or document content (for example, /document or /document/content). The default is /document. |
defaultFromLanguageCode |
The language code to translate documents from for documents that don't specify the from language explicitly. |
|
defaultToLanguageCode |
The language code to translate documents into for documents that don't specify the to language explicitly. |
|
description |
string |
The description of the skill which describes the inputs, outputs, and usage of the skill. |
inputs |
Inputs of the skills could be a column in the source data set, or the output of an upstream skill. |
|
name |
string |
The name of the skill which uniquely identifies it within the skillset. A skill with no name defined will be given a default name of its 1-based index in the skills array, prefixed with the character '#'. |
outputs |
The output of a skill is either a field in a search index, or a value that can be consumed as an input by another skill. |
|
suggestedFrom |
The language code to translate documents from when neither the fromLanguageCode input nor the defaultFromLanguageCode parameter are provided, and the automatic language detection is unsuccessful. Default is |
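The descriptions of defaultFromLanguageCode and suggestedFrom imply a fallback order for choosing the source language. The sketch below spells that order out. The function `resolve_from_language` and its parameter names are hypothetical, and the sketch takes the suggestedFrom value as an explicit argument rather than assuming its default.

```python
from typing import Optional


def resolve_from_language(
    from_language_code_input: Optional[str],
    default_from_language_code: Optional[str],
    detected_language: Optional[str],
    suggested_from: Optional[str],
) -> Optional[str]:
    """Pick the source language using the fallback order described above.

    1. The fromLanguageCode input supplied with the document.
    2. The skill's defaultFromLanguageCode parameter.
    3. The result of automatic language detection (None if unsuccessful).
    4. The skill's suggestedFrom parameter.
    """
    candidates = (
        from_language_code_input,
        default_from_language_code,
        detected_language,
        suggested_from,
    )
    for candidate in candidates:
        if candidate:
            return candidate
    return None


# Detection failed and nothing else was supplied, so suggestedFrom is used.
fallback = resolve_from_language(None, None, None, "en")
print(fallback)
```

This is only a reading of the property descriptions; the service performs the resolution itself.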
TextTranslationSkillLanguage
The language codes supported for input text by TextTranslationSkill.
Name | Type | Description |
---|---|---|
af | string | Afrikaans |
ar | string | Arabic |
bg | string | Bulgarian |
bn | string | Bangla |
bs | string | Bosnian (Latin) |
ca | string | Catalan |
cs | string | Czech |
cy | string | Welsh |
da | string | Danish |
de | string | German |
el | string | Greek |
en | string | English |
es | string | Spanish |
et | string | Estonian |
fa | string | Persian |
fi | string | Finnish |
fil | string | Filipino |
fj | string | Fijian |
fr | string | French |
ga | string | Irish |
he | string | Hebrew |
hi | string | Hindi |
hr | string | Croatian |
ht | string | Haitian Creole |
hu | string | Hungarian |
id | string | Indonesian |
is | string | Icelandic |
it | string | Italian |
ja | string | Japanese |
kn | string | Kannada |
ko | string | Korean |
lt | string | Lithuanian |
lv | string | Latvian |
mg | string | Malagasy |
mi | string | Maori |
ml | string | Malayalam |
ms | string | Malay |
mt | string | Maltese |
mww | string | Hmong Daw |
nb | string | Norwegian |
nl | string | Dutch |
otq | string | Queretaro Otomi |
pa | string | Punjabi |
pl | string | Polish |
pt | string | Portuguese |
pt-PT | string | Portuguese (Portugal) |
pt-br | string | Portuguese (Brazil) |
ro | string | Romanian |
ru | string | Russian |
sk | string | Slovak |
sl | string | Slovenian |
sm | string | Samoan |
sr-Cyrl | string | Serbian (Cyrillic) |
sr-Latn | string | Serbian (Latin) |
sv | string | Swedish |
sw | string | Kiswahili |
ta | string | Tamil |
te | string | Telugu |
th | string | Thai |
tlh | string | Klingon |
tlh-Latn | string | Klingon (Latin script) |
tlh-Piqd | string | Klingon (Klingon script) |
to | string | Tongan |
tr | string | Turkish |
ty | string | Tahitian |
uk | string | Ukrainian |
ur | string | Urdu |
vi | string | Vietnamese |
yua | string | Yucatec Maya |
yue | string | Cantonese (Traditional) |
zh-Hans | string | Chinese Simplified |
zh-Hant | string | Chinese Traditional |
VisualFeature
The strings indicating what visual feature types to return.
Name | Type | Description |
---|---|---|
adult | string | Visual features recognized as adult persons. |
brands | string | Visual features recognized as commercial brands. |
categories | string | Categories. |
description | string | Description. |
faces | string | Visual features recognized as people's faces. |
objects | string | Visual features recognized as objects. |
tags | string | Tags. |
WebApiSkill
A skill that can call a Web API endpoint, allowing you to extend a skillset by having it call your custom code.
Name | Type | Description |
---|---|---|
@odata.type |
string:
#Microsoft. |
A URI fragment specifying the type of skill. |
authIdentity | SearchIndexerDataIdentity: |
The user-assigned managed identity used for outbound connections. If an authResourceId is provided and this property is not specified, the system-assigned managed identity is used. On updates to the indexer, if the identity is unspecified, the value remains unchanged. If set to "none", the value of this property is cleared. |
authResourceId |
string |
Applies to custom skills that connect to external code in an Azure function or some other application that provides the transformations. This value should be the application ID created for the function or app when it was registered with Azure Active Directory. When specified, the custom skill connects to the function or app using a managed ID (either system or user-assigned) of the search service and the access token of the function or app, using this value as the resource id for creating the scope of the access token. |
batchSize |
integer |
The desired batch size, which indicates the number of documents. |
context |
string |
Represents the level at which operations take place, such as the document root or document content (for example, /document or /document/content). The default is /document. |
degreeOfParallelism |
integer |
If set, the number of parallel calls that can be made to the Web API. |
description |
string |
The description of the skill which describes the inputs, outputs, and usage of the skill. |
httpHeaders |
object |
The headers required to make the HTTP request. |
httpMethod |
string |
The method for the HTTP request. |
inputs |
Inputs of the skills could be a column in the source data set, or the output of an upstream skill. |
|
name |
string |
The name of the skill which uniquely identifies it within the skillset. A skill with no name defined will be given a default name of its 1-based index in the skills array, prefixed with the character '#'. |
outputs |
The output of a skill is either a field in a search index, or a value that can be consumed as an input by another skill. |
|
timeout |
string |
The desired timeout for the request. Default is 30 seconds. |
uri |
string |
The URL for the Web API. |
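As an illustration of how the WebApiSkill properties fit together, the following sketch builds a skill definition as a Python dictionary and serializes it to JSON. The helper `build_web_api_skill` is hypothetical, and several details are assumptions to verify against the service: the `@odata.type` value, the use of an ISO 8601 duration string such as `PT30S` for timeout, the `POST` method, and the example URL and field names.

```python
import json
from typing import Any, Optional


def build_web_api_skill(
    uri: str,
    inputs: list[dict[str, str]],
    outputs: list[dict[str, str]],
    *,
    http_method: str = "POST",  # assumption: the endpoint accepts POST
    timeout_seconds: int = 30,  # matches the documented 30-second default
    batch_size: Optional[int] = None,
    degree_of_parallelism: Optional[int] = None,
    http_headers: Optional[dict[str, str]] = None,
) -> dict[str, Any]:
    """Assemble a WebApiSkill definition, omitting optional values left unset."""
    skill: dict[str, Any] = {
        # Assumption: the full type name is not shown in the table above.
        "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
        "uri": uri,
        "httpMethod": http_method,
        # Assumption: timeout is expressed as an ISO 8601 duration string.
        "timeout": f"PT{timeout_seconds}S",
        "inputs": inputs,
        "outputs": outputs,
    }
    optional = {
        "batchSize": batch_size,
        "degreeOfParallelism": degree_of_parallelism,
        "httpHeaders": http_headers,
    }
    skill.update({key: value for key, value in optional.items() if value is not None})
    return skill


web_api_skill = build_web_api_skill(
    "https://example.com/api/enrich",  # placeholder endpoint
    inputs=[{"name": "text", "source": "/document/content"}],
    outputs=[{"name": "summary", "targetName": "summary"}],
    batch_size=10,
)
print(json.dumps(web_api_skill, indent=2))
```

The resulting dictionary would be one entry in the skills array of the request body. Properties that are not passed to the helper are left out so that the service applies its own defaults.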