
IndexingParametersConfiguration interface

Package: @azure/search-documents

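A dictionary of indexer-specific configuration properties. Each name is the name of a specific property. Each value must be of a primitive type.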
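As a rough illustration of the dictionary shape described above, the sketch below declares a local stand-in type instead of importing the real interface from `@azure/search-documents`, so it runs without the package installed. The type name `ConfigSketch` is hypothetical; the property names and value types are copied from this page.

```typescript
// Local stand-in for a few members of IndexingParametersConfiguration.
// ConfigSketch is a hypothetical name and is not part of the SDK.
type ConfigSketch = {
  parsingMode?: "text" | "default" | "delimitedText" | "json" | "jsonArray" | "jsonLines";
  dataToExtract?: "storageMetadata" | "allMetadata" | "contentAndMetadata";
  failOnUnsupportedContentType?: boolean;
  queryTimeout?: string;
};

// Every property is optional; only the ones you need are set.
const config: ConfigSketch = {
  parsingMode: "default",
  dataToExtract: "contentAndMetadata",
  failOnUnsupportedContentType: false,
};

// Check that each value that was set is a primitive, as the description requires.
const allPrimitive = (Object.keys(config) as (keyof ConfigSketch)[]).every((key) => {
  const kind = typeof config[key];
  return kind === "string" || kind === "boolean" || kind === "number";
});

console.log(allPrimitive);
```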

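Properties

allowSkillsetToReadFileData	If true, creates a path /document/file_data that is an object representing the original file data downloaded from your blob data source. This allows you to pass the original file data to a custom skill for processing within the enrichment pipeline, or to the Document Extraction skill.
dataToExtract	Specifies the data to extract from Azure blob storage and tells the indexer which data to extract from image content when "imageAction" is set to a value other than "none". This applies to embedded image content in a PDF or other application file, or to image files such as .jpg and .png in Azure blobs.
delimitedTextDelimiter	For CSV blobs, specifies the end-of-line single-character delimiter for CSV files where each line starts a new document (for example, "|").
delimitedTextHeaders	For CSV blobs, specifies a comma-delimited list of column headers, useful for mapping source fields to destination fields in an index.
documentRoot	For JSON arrays, given a structured or semi-structured document, you can use this property to specify the path to the array.
excludedFileNameExtensions	Comma-delimited list of file name extensions to ignore when processing from Azure blob storage. For example, you could exclude ".png, .mp4" to skip those files during indexing.
executionEnvironment	Specifies the environment in which the indexer should execute.
failOnUnprocessableDocument	For Azure blobs, set to false if you want indexing to continue when a document fails indexing.
failOnUnsupportedContentType	For Azure blobs, set to false if you want indexing to continue when an unsupported content type is encountered and you do not know all the content types (file extensions) in advance.
firstLineContainsHeaders	For CSV blobs, indicates that the first (non-blank) line of each blob contains headers.
imageAction	Determines how to process embedded images and image files in Azure blob storage. Setting "imageAction" to any value other than "none" requires that a skillset also be attached to that indexer.
indexedFileNameExtensions	Comma-delimited list of file name extensions to select when processing from Azure blob storage. For example, you could focus indexing on specific application files such as ".docx, .pptx, .msg" to include only those file types.
indexStorageMetadataOnlyForOversizedDocuments	For Azure blobs, set this property to true to still index the storage metadata of blobs whose content is too large to process. Oversized blobs are treated as errors by default. For limits on blob size, see https://docs.microsoft.com/azure/search/search-limits-quotas-capacity.
parsingMode	Represents the parsing mode for indexing from an Azure blob data source.
pdfTextRotationAlgorithm	Determines the algorithm used to extract text from PDF files in Azure blob storage.
queryTimeout	Increases the timeout beyond the 5-minute default for Azure SQL database data sources, specified in the format "hh:mm:ss".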

Property Details

allowSkillsetToReadFileData

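If true, creates a path /document/file_data that is an object representing the original file data downloaded from your blob data source. This allows you to pass the original file data to a custom skill for processing within the enrichment pipeline, or to the Document Extraction skill.

allowSkillsetToReadFileData?: boolean

Property Value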

boolean

dataToExtract

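Specifies the data to extract from Azure blob storage and tells the indexer which data to extract from image content when "imageAction" is set to a value other than "none". This applies to embedded image content in a PDF or other application file, or to image files such as .jpg and .png in Azure blobs.

dataToExtract?: "storageMetadata" | "allMetadata" | "contentAndMetadata"

Property Value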

"storageMetadata" | "allMetadata" | "contentAndMetadata"

delimitedTextDelimiter

For CSV blobs, specifies the end-of-line single-character delimiter for CSV files where each line starts a new document (for example, "|").

delimitedTextDelimiter?: string

Property Value

string
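
Because the delimiter is described as a single character, a caller may want to check the value before building the configuration. The sketch below is one way to do that; `isSingleCharDelimiter` is a hypothetical helper, and pairing the delimiter with the "delimitedText" parsing mode is an assumption about typical CSV usage rather than something this page states.

```typescript
// Hypothetical helper: true when the candidate is exactly one UTF-16 code unit.
// How the service itself validates the value is not covered here.
function isSingleCharDelimiter(candidate: string): boolean {
  return candidate.length === 1;
}

// Assumed pairing of the delimiter with the "delimitedText" parsing mode.
const csvConfig = { parsingMode: "delimitedText", delimitedTextDelimiter: "|" };

console.log(isSingleCharDelimiter(csvConfig.delimitedTextDelimiter)); // true
console.log(isSingleCharDelimiter("||")); // false
```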

delimitedTextHeaders

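For CSV blobs, specifies a comma-delimited list of column headers, useful for mapping source fields to destination fields in an index.

delimitedTextHeaders?: string

Property Value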

string
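
Since the property takes one comma-delimited string rather than an array, it can help to build that string from a list of column names. The following is a minimal sketch; `toDelimitedTextHeaders` and the column names are hypothetical, and rejecting names that contain a comma is a choice of this sketch, not documented service behavior.

```typescript
// Hypothetical helper: joins column names into a comma-delimited string.
// Names containing a comma are rejected because they would be ambiguous once joined.
function toDelimitedTextHeaders(columns: string[]): string {
  for (const column of columns) {
    if (column.indexOf(",") !== -1) {
      throw new Error("Column name must not contain a comma: " + column);
    }
  }
  return columns.join(",");
}

const headerList = toDelimitedTextHeaders(["id", "datePublished", "tags"]);
console.log(headerList); // id,datePublished,tags
```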

documentRoot

For JSON arrays, given a structured or semi-structured document, you can use this property to specify the path to the array.

documentRoot?: string

Property Value

string
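
To make the idea of a path to the array concrete, the sketch below resolves a slash-separated path against a parsed JSON document locally. The path syntax "/level1/level2", the sample document, and the helper `resolveDocumentRoot` are assumptions for illustration; this is not the service's implementation.

```typescript
// Illustrative only: walks a slash-separated path through a parsed JSON value
// and returns whatever sits at the end of it, or undefined if the path breaks.
function resolveDocumentRoot(doc: unknown, documentRoot: string): unknown {
  let current: unknown = doc;
  const segments = documentRoot.split("/").filter((segment) => segment.length > 0);
  for (const segment of segments) {
    if (typeof current !== "object" || current === null) {
      return undefined;
    }
    current = (current as Record<string, unknown>)[segment];
  }
  return current;
}

// Hypothetical blob content with the array nested two levels deep.
const sampleBlob = { level1: { level2: [{ id: "1" }, { id: "2" }] } };
const jsonArrayConfig = { parsingMode: "jsonArray", documentRoot: "/level1/level2" };

const rows = resolveDocumentRoot(sampleBlob, jsonArrayConfig.documentRoot);
console.log(Array.isArray(rows) ? rows.length : 0); // 2
```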

excludedFileNameExtensions

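Comma-delimited list of file name extensions to ignore when processing from Azure blob storage. For example, you could exclude ".png, .mp4" to skip those files during indexing.

excludedFileNameExtensions?: string

Property Value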

string
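
The extension list is a single string, so a small helper can keep it consistent when the extensions come from an array. This is a sketch under assumptions: `toExtensionList` is hypothetical, and trimming whitespace and adding a missing leading dot are conveniences of the sketch. The same string shape applies to indexedFileNameExtensions.

```typescript
// Hypothetical helper: trims each entry, drops empty ones, ensures a leading dot,
// and joins the result with commas.
function toExtensionList(extensions: string[]): string {
  return extensions
    .map((ext) => ext.trim())
    .filter((ext) => ext.length > 0)
    .map((ext) => (ext.charAt(0) === "." ? ext : "." + ext))
    .join(",");
}

const excludedExtensions = toExtensionList([".png", " mp4 "]);
const blobFilterConfig = { excludedFileNameExtensions: excludedExtensions };

console.log(blobFilterConfig.excludedFileNameExtensions); // .png,.mp4
```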

executionEnvironment

Specifies the environment in which the indexer should execute.

executionEnvironment?: "standard" | "private"

Property Value

"standard" | "private"

failOnUnprocessableDocument

For Azure blobs, set to false if you want indexing to continue when a document fails indexing.

failOnUnprocessableDocument?: boolean

Property Value

boolean

failOnUnsupportedContentType

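For Azure blobs, set to false if you want indexing to continue when an unsupported content type is encountered and you do not know all the content types (file extensions) in advance.

failOnUnsupportedContentType?: boolean

Property Value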

boolean

firstLineContainsHeaders

For CSV blobs, indicates that the first (non-blank) line of each blob contains headers.

firstLineContainsHeaders?: boolean

Property Value

boolean

imageAction

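Determines how to process embedded images and image files in Azure blob storage. Setting "imageAction" to any value other than "none" requires that a skillset also be attached to that indexer.

imageAction?: "none" | "generateNormalizedImages" | "generateNormalizedImagePerPage"

Property Value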

"none" | "generateNormalizedImages" | "generateNormalizedImagePerPage"
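
The skillset requirement above can be checked on the client before an indexer definition is submitted. The sketch below uses a hypothetical minimal shape, `IndexerSketch`, holding only the two fields the check needs; treating a missing imageAction as "none" is an assumption of this sketch.

```typescript
type ImageAction = "none" | "generateNormalizedImages" | "generateNormalizedImagePerPage";

// Hypothetical minimal shape; a real indexer definition has many more fields.
type IndexerSketch = {
  skillsetName?: string;
  imageAction?: ImageAction;
};

// Returns true when the imageAction setting is compatible with the skillset setting:
// any value other than "none" needs a non-empty skillset name.
function imageActionIsSatisfied(indexer: IndexerSketch): boolean {
  const action = indexer.imageAction ?? "none"; // assumed default for this sketch
  return action === "none" || Boolean(indexer.skillsetName);
}

console.log(imageActionIsSatisfied({ imageAction: "generateNormalizedImages" })); // false
console.log(
  imageActionIsSatisfied({ imageAction: "generateNormalizedImages", skillsetName: "my-skillset" }),
); // true
```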

indexedFileNameExtensions

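Comma-delimited list of file name extensions to select when processing from Azure blob storage. For example, you could focus indexing on specific application files such as ".docx, .pptx, .msg" to include only those file types.

indexedFileNameExtensions?: string

Property Value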

string

indexStorageMetadataOnlyForOversizedDocuments

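For Azure blobs, set this property to true to still index the storage metadata of blobs whose content is too large to process. Oversized blobs are treated as errors by default. For limits on blob size, see https://docs.microsoft.com/azure/search/search-limits-quotas-capacity.

indexStorageMetadataOnlyForOversizedDocuments?: boolean

Property Value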

boolean
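
The three blob settings that let indexing continue past problem documents (failOnUnprocessableDocument, failOnUnsupportedContentType, and indexStorageMetadataOnlyForOversizedDocuments) are often set together. A minimal sketch of applying them as a group follows; the type `BlobToleranceSketch` and the helper `withTolerantBlobSettings` are hypothetical names, and whether this combination suits a given workload is left to the reader.

```typescript
// Local stand-in type; property names and types are copied from this page.
type BlobToleranceSketch = {
  parsingMode?: string;
  failOnUnprocessableDocument?: boolean;
  failOnUnsupportedContentType?: boolean;
  indexStorageMetadataOnlyForOversizedDocuments?: boolean;
};

// Returns a copy of the base configuration with the three tolerance settings applied.
// The input object is not modified.
function withTolerantBlobSettings(base: BlobToleranceSketch): BlobToleranceSketch {
  return {
    ...base,
    failOnUnprocessableDocument: false,
    failOnUnsupportedContentType: false,
    indexStorageMetadataOnlyForOversizedDocuments: true,
  };
}

const tolerantConfig = withTolerantBlobSettings({ parsingMode: "default" });
console.log(tolerantConfig);
```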

parsingMode

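Represents the parsing mode for indexing from an Azure blob data source.

parsingMode?: "text" | "default" | "delimitedText" | "json" | "jsonArray" | "jsonLines"

Property Value

"text" | "default" | "delimitedText" | "json" | "jsonArray" | "jsonLines"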
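
When the parsing mode comes from user input or a settings file, a type guard can narrow an arbitrary string to the union listed in the signature. The sketch below is one way to write it; `PARSING_MODES`, `ParsingMode`, and `isParsingMode` are hypothetical local names, with the allowed values copied from this page.

```typescript
// Allowed values copied from the parsingMode signature on this page.
const PARSING_MODES = ["text", "default", "delimitedText", "json", "jsonArray", "jsonLines"] as const;
type ParsingMode = (typeof PARSING_MODES)[number];

// Type guard: narrows a plain string to ParsingMode when it is one of the listed values.
function isParsingMode(value: string): value is ParsingMode {
  return (PARSING_MODES as readonly string[]).indexOf(value) !== -1;
}

console.log(isParsingMode("jsonLines")); // true
console.log(isParsingMode("csv")); // false
```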

pdfTextRotationAlgorithm

Determines the algorithm used to extract text from PDF files in Azure blob storage.

pdfTextRotationAlgorithm?: "none" | "detectAngles"

Property Value

"none" | "detectAngles"

queryTimeout

Increases the timeout beyond the 5-minute default for Azure SQL database data sources, specified in the format "hh:mm:ss".

queryTimeout?: string

Property Value

string
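
Because the timeout is a string in "hh:mm:ss" form, converting from a number of seconds avoids formatting mistakes. The helper below is a minimal sketch; `toQueryTimeout` is a hypothetical name, and any upper bound the service may enforce is not checked here.

```typescript
// Hypothetical helper: formats a whole, non-negative number of seconds as "hh:mm:ss".
function toQueryTimeout(totalSeconds: number): string {
  if (totalSeconds < 0 || Math.floor(totalSeconds) !== totalSeconds) {
    throw new RangeError("totalSeconds must be a non-negative integer");
  }
  // Left-pad single-digit parts with a zero.
  const pad = (part: number): string => (part < 10 ? "0" + part : String(part));
  const hours = Math.floor(totalSeconds / 3600);
  const minutes = Math.floor((totalSeconds % 3600) / 60);
  const seconds = totalSeconds % 60;
  return pad(hours) + ":" + pad(minutes) + ":" + pad(seconds);
}

// Ten minutes, double the 5-minute default mentioned above.
const sqlConfig = { queryTimeout: toQueryTimeout(600) };
console.log(sqlConfig.queryTimeout); // 00:10:00
```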