DocumentExtractionSkill interface

A skill that extracts content from a file within the enrichment pipeline.

Extends

BaseSearchIndexerSkill

Properties

configuration

A dictionary of configurations for the skill.

dataToExtract

The type of data to be extracted by the skill. Defaults to 'contentAndMetadata' if not defined.

odatatype

Polymorphic discriminator, which specifies the different types this object can be.

parsingMode

The parsing mode for the skill. Defaults to 'default' if not defined.

Inherited Properties

context

Represents the level at which operations take place, such as the document root or document content (for example, /document or /document/content). The default is /document.

description

A description of the skill that covers its inputs, outputs, and usage.

inputs

The inputs of the skill. Each input can be a column in the source data set or the output of an upstream skill.

name

The name of the skill, which uniquely identifies it within the skillset. A skill with no name defined is given a default name: its 1-based index in the skills array, prefixed with the character # (for example, #1 for the first skill).

outputs

The output of a skill is either a field in a search index, or a value that can be consumed as an input by another skill.

Property Details

configuration

A dictionary of configurations for the skill.

configuration?: {[propertyName: string]: any}

Property Value

{[propertyName: string]: any}
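
Because the dictionary values are typed as any, it can help to narrow them before use. The following is a minimal sketch, not part of the SDK: the SkillConfiguration alias and the readNumber helper are hypothetical, and the keys shown (imageAction, normalizedImageMaxWidth, normalizedImageMaxHeight) are assumptions used for illustration rather than values confirmed by this page.

```typescript
// Local stand-in for the documented property type.
type SkillConfiguration = { [propertyName: string]: any };

// Assumed keys, shown for illustration only; check the service
// documentation for the keys your API version accepts.
const configuration: SkillConfiguration = {
  imageAction: "generateNormalizedImages",
  normalizedImageMaxWidth: 2000,
  normalizedImageMaxHeight: 2000,
};

// Hypothetical helper: returns the value only if it is a number,
// since the dictionary itself gives no type guarantees.
function readNumber(
  config: SkillConfiguration,
  key: string
): number | undefined {
  const value = config[key];
  return typeof value === "number" ? value : undefined;
}

const maxWidth = readNumber(configuration, "normalizedImageMaxWidth");
console.log(maxWidth);
```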

dataToExtract

The type of data to be extracted by the skill. Defaults to 'contentAndMetadata' if not defined.

dataToExtract?: string

Property Value

string

odatatype

Polymorphic discriminator, which specifies the different types this object can be.

odatatype: "#Microsoft.Skills.Util.DocumentExtractionSkill"

Property Value

"#Microsoft.Skills.Util.DocumentExtractionSkill"
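
Since odatatype is a string literal type, it can act as the tag of a discriminated union. The sketch below shows one way to narrow a skill by that tag. The types are local stand-ins rather than SDK imports, and OtherSkillLike with its "#Example.OtherSkill" value is a hypothetical placeholder that exists only to form a union.

```typescript
// Local stand-in carrying only the properties needed here.
interface DocumentExtractionSkillLike {
  odatatype: "#Microsoft.Skills.Util.DocumentExtractionSkill";
  dataToExtract?: string;
  parsingMode?: string;
}

// Hypothetical second skill type, used only to build a union.
interface OtherSkillLike {
  odatatype: "#Example.OtherSkill";
}

type SkillLike = DocumentExtractionSkillLike | OtherSkillLike;

// Type guard that checks the discriminator value.
function isDocumentExtractionSkill(
  skill: SkillLike
): skill is DocumentExtractionSkillLike {
  return skill.odatatype === "#Microsoft.Skills.Util.DocumentExtractionSkill";
}

function describeSkill(skill: SkillLike): string {
  if (isDocumentExtractionSkill(skill)) {
    // parsingMode is accessible here because the type was narrowed.
    return `document extraction (parsingMode: ${skill.parsingMode ?? "default"})`;
  }
  return `other skill (${skill.odatatype})`;
}
```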

parsingMode

The parsing mode for the skill. Defaults to 'default' if not defined.

parsingMode?: string

Property Value

string
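
Both dataToExtract and parsingMode are optional and fall back to documented defaults. As a rough illustration, the sketch below resolves the effective values on the client side. The ExtractionOptions type and the effectiveOptions helper are hypothetical and only mirror the defaults stated on this page; the 'text' value in the usage line is an assumed non-default value.

```typescript
// Local stand-in for the two optional string properties.
interface ExtractionOptions {
  dataToExtract?: string;
  parsingMode?: string;
}

// Hypothetical helper: fills in the defaults documented above
// when a property is left undefined.
function effectiveOptions(
  options: ExtractionOptions
): Required<ExtractionOptions> {
  return {
    dataToExtract: options.dataToExtract ?? "contentAndMetadata",
    parsingMode: options.parsingMode ?? "default",
  };
}

// "text" is an assumed example value, not confirmed by this page.
const resolved = effectiveOptions({ parsingMode: "text" });
console.log(resolved.dataToExtract, resolved.parsingMode);
```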

Inherited Property Details

context

Represents the level at which operations take place, such as the document root or document content (for example, /document or /document/content). The default is /document.

context?: string

Property Value

string

Inherited From BaseSearchIndexerSkill.context

description

A description of the skill that covers its inputs, outputs, and usage.

description?: string

Property Value

string

Inherited From BaseSearchIndexerSkill.description

inputs

The inputs of the skill. Each input can be a column in the source data set or the output of an upstream skill.

inputs: InputFieldMappingEntry[]

Property Value

InputFieldMappingEntry[]

Inherited From BaseSearchIndexerSkill.inputs

name

The name of the skill, which uniquely identifies it within the skillset. A skill with no name defined is given a default name: its 1-based index in the skills array, prefixed with the character # (for example, #1 for the first skill).

name?: string

Property Value

string

Inherited From BaseSearchIndexerSkill.name
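
The default naming rule for unnamed skills can be sketched as follows. The NamedSkill type and the effectiveSkillNames helper are hypothetical; they only illustrate the rule described above (1-based index prefixed with #) and are not part of the SDK.

```typescript
// Local stand-in: only the optional name matters for this sketch.
interface NamedSkill {
  name?: string;
}

// Hypothetical helper: returns the explicit name when present,
// otherwise "#" followed by the skill's 1-based array position.
function effectiveSkillNames(skills: NamedSkill[]): string[] {
  return skills.map((skill, index) => skill.name ?? `#${index + 1}`);
}

const names = effectiveSkillNames([{ name: "extract" }, {}, {}]);
console.log(names.join(", "));
```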

outputs

The output of a skill is either a field in a search index, or a value that can be consumed as an input by another skill.

outputs: OutputFieldMappingEntry[]

Property Value

OutputFieldMappingEntry[]

Inherited From BaseSearchIndexerSkill.outputs
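
Putting the properties together, a skill definition might look like the sketch below. To keep the example self-contained, the interfaces are declared locally to mirror the shapes documented on this page; in real code they would normally come from the SDK package. The fields of InputFieldMappingEntry and OutputFieldMappingEntry (name, source, targetName) and the input and output names used (file_data, content, extracted_content) are assumptions for illustration.

```typescript
// Assumed minimal shapes for the mapping entries.
interface InputFieldMappingEntry {
  name: string;
  source?: string;
}

interface OutputFieldMappingEntry {
  name: string;
  targetName?: string;
}

// Mirrors the inherited properties listed above.
interface BaseSearchIndexerSkill {
  name?: string;
  description?: string;
  context?: string;
  inputs: InputFieldMappingEntry[];
  outputs: OutputFieldMappingEntry[];
}

// Mirrors the properties declared on this interface.
interface DocumentExtractionSkill extends BaseSearchIndexerSkill {
  odatatype: "#Microsoft.Skills.Util.DocumentExtractionSkill";
  parsingMode?: string;
  dataToExtract?: string;
  configuration?: { [propertyName: string]: any };
}

const skill: DocumentExtractionSkill = {
  odatatype: "#Microsoft.Skills.Util.DocumentExtractionSkill",
  name: "extract-document",
  description: "Extracts content and metadata from the source file.",
  context: "/document",
  parsingMode: "default",
  dataToExtract: "contentAndMetadata",
  // Assumed input and output names, for illustration only.
  inputs: [{ name: "file_data", source: "/document/file_data" }],
  outputs: [{ name: "content", targetName: "extracted_content" }],
};

console.log(JSON.stringify(skill, null, 2));
```

Only odatatype, inputs, and outputs are required by the signatures above; every other property can be omitted, in which case the documented defaults apply.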