Azure AI Search Index/Indexer error

Question

Charles Lawson 20 Reputation points

When I create an index and indexer for a blob dataset with the Azure AI Search import wizard, I get this error when the indexer runs. I verified that the field is specified when the index is created.

Target field 'metadata_storage_path' is either not present, doesn't have a value set, or no data could be extracted from the document for it. Failed document: 'https://cordstorage.blob.core.windows.net/cordepriblob/3002016583.pdf'

Siva Nair 2,420 Reputation points Microsoft External Staff Moderator

2025-04-14T05:02:40.49+00:00
Hi Charles Lawson,

As the error says, the Azure AI Search indexer couldn't populate the metadata_storage_path field for the blob document you're indexing.

Let's go through the troubleshooting steps below.

The metadata_storage_path field is populated automatically by the blob indexer when the data source is a blob container. You don't need to extract it from the document, but you must include it in your index schema, which you mentioned you have already done. If you used a custom skillset or data source, double-check that the field is part of the final projected output.

If your indexer runs but can't extract the metadata, check:

The blob exists and is either publicly accessible or readable by your data source through a managed identity or a connection string with read access.

The blob is reachable. This field is populated from the blob's path rather than from user-defined metadata, but access issues can still block it.

Try accessing the URL directly from your browser or with curl. If it returns a 403 or 404, it's likely a permission issue.
curl -I "https://cordstorage.blob.core.windows.net/cordepriblob/3002016583.pdf"
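
The same check can be scripted. The following is a minimal Python sketch of the curl call above, not an official diagnostic: the helper names diagnose_status and head_status are made up for illustration, and the mapping from status codes to causes is a rough guide only.

```python
import urllib.error
import urllib.request


def diagnose_status(status: int) -> str:
    """Map the HTTP status of an anonymous request to a likely cause (rough guide)."""
    if 200 <= status < 300:
        return "reachable anonymously"
    if status in (401, 403):
        return "access denied: check the data source credentials or role assignments"
    if status == 404:
        return "not found: check the container and blob names and the access settings"
    return f"unexpected status {status}"


def head_status(url: str, timeout: float = 10.0) -> int:
    """Send a HEAD request (like `curl -I`) and return the HTTP status code."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is still the answer.
        return err.code


# Example (performs a network call, so it is left commented out):
# url = "https://cordstorage.blob.core.windows.net/cordepriblob/3002016583.pdf"
# print(diagnose_status(head_status(url)))
```

A private container will fail this anonymous check even when the indexer's own credentials work, so treat a 403 or 404 here as a prompt to review the data source connection, not as proof of the cause.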

If you're modifying documents in a custom skill, make sure you pass this field through. For example, if your skillset outputs only certain fields, metadata_storage_path might get dropped. Ensure the indexer's outputFieldMappings or final projection includes:
"outputFieldMappings": [ { "sourceFieldName": "/document/metadata_storage_path", "targetFieldName": "metadata_storage_path" } ]

If you've customized data extraction in the indexer configuration (for example, "dataToExtract": "storageMetadata" or "contentAndMetadata"), ensure metadata is still requested. Normally the default ("contentAndMetadata") is fine.

In the Azure Portal or via REST/CLI, re-run the indexer and check the indexer execution history. It should show which document failed and exactly what was missing. You can also enable logging to App Insights for deeper debugging.
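
To pull the failed documents out of the execution history programmatically, something like the sketch below may help. It assumes the Get Indexer Status REST call and its lastResult.errors payload; the API version string, the helper names, and the service and indexer names are placeholders, not values from this thread.

```python
import json
import urllib.request

API_VERSION = "2024-07-01"  # assumption: use any REST API version your service supports


def status_url(service: str, indexer: str) -> str:
    """Build the Get Indexer Status URL for a search service and indexer."""
    return (
        f"https://{service}.search.windows.net/indexers/{indexer}/status"
        f"?api-version={API_VERSION}"
    )


def failed_documents(status: dict) -> list[tuple[str, str]]:
    """Return (document key, error message) pairs from the last indexer run."""
    last = status.get("lastResult") or {}
    errors = last.get("errors") or []
    return [(e.get("key", ""), e.get("errorMessage", "")) for e in errors]


def fetch_status(service: str, indexer: str, api_key: str) -> dict:
    """Fetch the indexer status document (performs a network call)."""
    request = urllib.request.Request(
        status_url(service, indexer), headers={"api-key": api_key}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Example (hypothetical names, network call left commented out):
# for key, message in failed_documents(fetch_status("my-service", "my-indexer", "<admin-key>")):
#     print(key, "->", message)
```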

If you need any further assistance, do let me know.
Siva Nair 2,420 Reputation points Microsoft External Staff Moderator

2025-04-15T04:04:08.2166667+00:00

Thanks, Mario Garcia, for sharing your input on the above case.

Hi Charles Lawson, just checking back to see whether the above comment helped or whether you have a resolution yet. If you do, please share it with the community, as it can be helpful to others. Otherwise, respond with more details and we will try to help.
Charles Lawson 20 Reputation points

2025-04-16T03:17:35.8733333+00:00

Mario's solution fixed my issue. The wizard-generated indexer did not apply a base64Encode transformation to metadata_storage_path, which is required to convert the document path into a valid key. Editing the indexer as he described resolved it. The Azure AI Search team should fix the wizard. Thanks, Mario!
Siva Nair 2,420 Reputation points Microsoft External Staff Moderator

2025-04-16T04:02:21.4366667+00:00

Hi Charles Lawson,

Glad that it fixed your issue, and thanks for confirming. Please accept the answer so that other people who face a similar issue can benefit from it.

I would also like to thank Mario Garcia for his input in resolving this!

Answer 1

In my case:

While using the Data Import Wizard in Azure AI Search to create an indexer from Azure Blob Storage, I encountered the following error:

statusCode: 400  
name: DocumentExtraction.azureblob.mg-demo-ds  
errorMessage: Could not parse document. Invalid document key: 'https://mgdemostorage.blob.core.windows.net/mgdemocontainer/201801.pdf'.  
Keys can only contain letters, digits, underscore (_), dash (-), or equal sign (=).  
documentationLink: https://docs.microsoft.com/azure/search/search-howto-indexing-azure-blob-storage#DocumentKeys
details: Target field 'metadata_storage_path' is either not present, doesn't have a value set, or no data could be extracted from the document for it.

This occurred because the wizard-generated indexer did not apply a base64Encode transformation to the metadata_storage_path, which is required to convert the document path into a valid key.
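
To see why the raw path is rejected and an encoded one is accepted, here is a small Python sketch. It approximates the idea with the standard library's URL-safe Base64; the service's base64Encode function may handle padding differently, so the output is not guaranteed to match the keys stored in an index. The helper names are illustrative only.

```python
import base64
import re

# Characters the error message lists as allowed in a document key.
KEY_PATTERN = re.compile(r"[A-Za-z0-9_\-=]+")


def is_valid_key(value: str) -> bool:
    """Check a candidate key against the allowed character set."""
    return KEY_PATTERN.fullmatch(value) is not None


def url_safe_key(path: str) -> str:
    """Encode a blob path with URL-safe Base64 (an approximation of base64Encode)."""
    return base64.urlsafe_b64encode(path.encode("utf-8")).decode("ascii")


path = "https://mgdemostorage.blob.core.windows.net/mgdemocontainer/201801.pdf"
encoded = url_safe_key(path)

print(is_valid_key(path))     # the raw URL contains ':', '/' and '.'
print(is_valid_key(encoded))  # the encoded form uses only allowed characters
```

The encoding is reversible, so the original path can still be recovered from the key when needed.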

Expected Behavior:

The import wizard should include the following field mapping in the generated indexer:

"fieldMappings": [
  {
    "sourceFieldName": "metadata_storage_path",
    "targetFieldName": "metadata_storage_path",
    "mappingFunction": {
      "name": "base64Encode",
      "parameters": null
    }
  }
]

Workaround / How I Solved It:

To resolve the issue, I manually updated the indexer using the Azure CLI or REST API and added the missing field mapping. After I included the base64Encode function for metadata_storage_path, the documents were indexed without errors.
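
As a rough illustration of that manual edit, the sketch below patches an indexer definition held as a Python dict, for example one retrieved with a GET on the indexer. The function name with_key_mapping and the sample indexer name are hypothetical. Sending the result back to the service, for instance with a PUT to the indexer endpoint, is left out.

```python
import copy

# The field mapping shown in the expected behavior above.
KEY_MAPPING = {
    "sourceFieldName": "metadata_storage_path",
    "targetFieldName": "metadata_storage_path",
    "mappingFunction": {"name": "base64Encode", "parameters": None},
}


def with_key_mapping(indexer: dict) -> dict:
    """Return a copy of the indexer definition that includes the base64Encode mapping."""
    updated = copy.deepcopy(indexer)
    # Drop any existing mapping for the key field so the result has exactly one.
    mappings = [
        m
        for m in (updated.get("fieldMappings") or [])
        if m.get("targetFieldName") != KEY_MAPPING["targetFieldName"]
    ]
    mappings.append(copy.deepcopy(KEY_MAPPING))
    updated["fieldMappings"] = mappings
    return updated


# Hypothetical wizard-generated definition with no field mappings.
indexer = {"name": "my-indexer", "fieldMappings": []}
patched = with_key_mapping(indexer)
print(patched["fieldMappings"][0]["mappingFunction"]["name"])
```

Because the function returns a copy and replaces any earlier mapping for the key field, running it twice yields the same definition.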

Siva Nair 2,420 Reputation points Microsoft External Staff Moderator

2025-04-17T01:26:30.6533333+00:00

Hi Charles Lawson,

If the answer is helpful, please click Accept Answer so that other people who face a similar issue can benefit from it. Thanks.