Indexer does not read the metadata from the blob

Bryś Andrzej 10 Reputation points
2024-08-23T11:09:33.76+00:00

Hi,

I have a few documents uploaded to Blob Storage, and each of them has a metadata entry named "Project".


In Azure AI Search I clicked "Import and vectorize data" and went through the configuration.

Then I added a new field named "Project" to the vector index, with the type "Edm.String", marked as Retrievable, Filterable, and Searchable.


I reset and reran the indexer, and it finished successfully.

When I run a query in the index's Search Explorer, I can see that this new field is always null.


I tried adding a field mapping to the indexer, but the metadata was still not copied to the index.

The indexer is set to "Content and metadata", so the metadata should be read correctly, am I right?

I cannot see what I am doing wrong. Could you help me?

Azure Storage Accounts
Globally unique resources that provide access to data management services and serve as the parent namespace for the services.
Azure AI Search
An Azure search service with built-in artificial intelligence capabilities that enrich information to help identify and explore relevant content at scale.

1 answer

  1. Nehruji R 8,146 Reputation points Microsoft Vendor
    2024-08-26T08:43:56.8333333+00:00

    Hello Bryś Andrzej,

    Greetings! Welcome to Microsoft Q&A Platform.

    I understand that your Azure Blob Storage indexer is not reading metadata as expected: the custom metadata field that you want to bring into the search index as a retrievable field is always null, and adding a field mapping to the indexer did not change that. Please check the logs for more details about the error. Logging can be enabled under Diagnostic settings.

    Troubleshooting common indexer errors and warnings in Azure AI Search
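As a rough illustration of what to look for in those logs, the sketch below summarizes the JSON that the Get Indexer Status REST call returns. A run can report success and still carry warnings, which matches the symptom described in the question. The `sample_status` payload and its warning text are made up for illustration; only the field names (`lastResult`, `itemsProcessed`, `itemsFailed`, `errors`, `warnings`) follow the REST response shape.

```python
from typing import Any


def summarize_indexer_status(status: dict[str, Any]) -> dict[str, Any]:
    """Pull the fields most useful for debugging out of an indexer status payload."""
    last = status.get("lastResult") or {}
    return {
        "status": last.get("status"),
        "items_processed": last.get("itemsProcessed", 0),
        "items_failed": last.get("itemsFailed", 0),
        # Errors carry "errorMessage", warnings carry "message".
        "errors": [e.get("errorMessage") for e in (last.get("errors") or [])],
        "warnings": [w.get("message") for w in (last.get("warnings") or [])],
    }


# Hypothetical payload, shaped like a Get Indexer Status response.
sample_status = {
    "status": "running",
    "lastResult": {
        "status": "success",
        "itemsProcessed": 3,
        "itemsFailed": 0,
        "errors": [],
        "warnings": [
            {"key": "doc-1", "message": "Example warning text (made up)"}
        ],
    },
}

summary = summarize_indexer_status(sample_status)
print(summary)
```

If `warnings` is non-empty while `status` is `success`, the warning messages are usually the first place to look.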

    In Azure AI Search a vectorizer is software that performs vectorization, such as a deployed embedding model on Azure OpenAI, that converts text (or images) to vectors during query execution.

    It's defined in a search index, it applies to searchable vector fields, and it's used at query time to generate an embedding for a text or image query input. If instead you need to vectorize content as part of the indexing process, refer to Integrated Vectorization (Preview). For built-in vectorization during indexing, you can configure an indexer and skillset that calls an embedding model for your raw text content.

    Refer to:

    https://learn.microsoft.com/en-us/azure/search/vector-search-how-to-configure-vectorizer

    https://learn.microsoft.com/en-us/azure/search/search-get-started-portal-import-vectors?tabs=sample-data-storage%2Cmodel-aoai%2Cconnect-data-storage

    https://learn.microsoft.com/en-us/azure/architecture/ai-ml/architecture/search-blob-metadata
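One thing worth checking, offered as a possibility rather than a confirmed diagnosis for this thread: when a skillset splits documents into chunks and writes them to the index through index projections, fields on the chunk documents are filled from the projection `mappings` in the skillset, so an indexer field mapping alone may not reach them. The sketch below adds such a mapping to a skillset definition held as a Python dict. The skillset fragment, the names `example-skillset` and `example-index`, and the source path `/document/Project` are all assumptions for illustration; check them against the skillset JSON the wizard actually generated.

```python
import json


def add_parent_field_mapping(skillset: dict, field_name: str, source_path: str) -> dict:
    """Return a copy of the skillset with one extra mapping in each projection selector."""
    updated = json.loads(json.dumps(skillset))  # deep copy via JSON round-trip
    projections = updated.get("indexProjections") or {}
    for selector in projections.get("selectors", []):
        mappings = selector.setdefault("mappings", [])
        # Skip selectors that already map this index field.
        if not any(m.get("name") == field_name for m in mappings):
            mappings.append({"name": field_name, "source": source_path})
    return updated


# Hypothetical skillset fragment; a real one also contains the skills themselves.
sample_skillset = {
    "name": "example-skillset",
    "indexProjections": {
        "selectors": [
            {
                "targetIndexName": "example-index",
                "parentKeyFieldName": "parent_id",
                "sourceContext": "/document/pages/*",
                "mappings": [
                    {"name": "chunk", "source": "/document/pages/*"},
                    {"name": "title", "source": "/document/metadata_storage_name"},
                ],
            }
        ]
    },
}

# "/document/Project" is an assumed path for the custom blob metadata value.
patched = add_parent_field_mapping(sample_skillset, "Project", "/document/Project")
print(json.dumps(patched["indexProjections"]["selectors"][0]["mappings"], indent=2))
```

The patched definition would then be sent back with an update to the skillset, followed by an indexer reset and rerun so existing documents are reprocessed.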

    Ensure that the indexer has the necessary read permissions for the blob storage: the managed identity of the search service should have the Storage Blob Data Reader role. Ensure that the data source configuration is correct and that the blobs contain the "Project" metadata. You can manually inspect a few blobs to confirm that the metadata is present and correctly formatted. Also check the indexer logs for any errors or warnings that might indicate why the "Project" field is not being populated, since they record the issues encountered during the indexing process.
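To make the manual blob inspection concrete: Blob Storage returns user-defined metadata as `x-ms-meta-<name>` response headers, for example on a Get Blob Properties (HEAD) request. The minimal sketch below extracts those entries from a headers dictionary so you can confirm that "Project" is present and spelled as expected. The `sample_headers` values are invented for illustration, and fetching the headers (with a SAS URL or an SDK) is left out.

```python
def extract_blob_metadata(headers: dict[str, str]) -> dict[str, str]:
    """Collect user-defined metadata from blob response headers."""
    prefix = "x-ms-meta-"
    return {
        name[len(prefix):]: value
        for name, value in headers.items()
        if name.lower().startswith(prefix)
    }


# Hypothetical headers, shaped like a Get Blob Properties response.
sample_headers = {
    "Content-Type": "application/pdf",
    "x-ms-blob-type": "BlockBlob",
    "x-ms-meta-Project": "Alpha",
    "x-ms-meta-owner": "team-a",
}

metadata = extract_blob_metadata(sample_headers)
print(metadata)
print("Project" in metadata)
```

Running the same check on several blobs shows quickly whether any of them is missing the entry or uses a different key name.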

    Refer to:

    https://learn.microsoft.com/en-us/azure/search/search-howto-indexing-azure-blob-storage

    https://learn.microsoft.com/en-us/azure/search/search-howto-index-one-to-many-blobs

    https://github.com/Azure/azure-search-vector-samples/issues/71

    Hope this answer helps! Please let us know if you have any further queries. I’m happy to assist you further.


    Please do not forget to "Accept the answer" and "up-vote" wherever the information provided helps you, as this can be beneficial to other community members.

