Azure Search Index with both text and images

Maja Ru 10 Reputation points
2024-05-06T13:56:30.5666667+00:00

Hi,
I want to implement a RAG architecture where the input data is PDFs containing text, images, and tables. Working only with text is straightforward and I've built such applications before. However, the addition of images in the new use case makes me uncertain which approach I should use.
I have some PDFs containing technical documentation: raw text, images, and tables. Ideally, the solution would work like this (although I am not certain it makes sense with Azure): I do not need any advanced algorithms that analyze the pictures; I just want the pictures and tables to be returned from the search index as a kind of attachment to the text.
So, for example, a user asks a question about an error in the machine, and I retrieve the most relevant content chunk plus the picture associated with it. Is this doable? Do I need to extract the images and tables from the PDF myself first, or are there Azure services that help me achieve this?

Second question: how do I return the pictures together with the text to the user? Which model should I use?

Tags: Azure AI Search, Azure OpenAI Service, Azure AI Document Intelligence, Azure AI services

1 answer
  1. Grmacjon-MSFT 16,446 Reputation points
    2024-05-08T23:27:06.11+00:00

    Hello @Maja Ru, here is a way to achieve your scenario; it involves tweaking the code in the custom skill mentioned below to fit your needs:

    With built-in indexers (Indexer overview - Azure AI Search | Microsoft Learn) you can use a skillset (https://learn.microsoft.com/en-us/azure/search/cognitive-search-concept-intro, https://learn.microsoft.com/en-us/azure/search/cognitive-search-defining-skillset).

    The skillset can be built to provide this functionality:

    Extracting images and text, then chunking the content (and vectorizing it if needed).
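    A minimal sketch of such a skillset, defined through the REST API from Python. The service name, keys, deployment name, and skillset name are placeholders, and the indexer that runs this skillset would need "imageAction": "generateNormalizedImages" in its configuration so that the /document/normalized_images/* path exists:

    ```python
    import requests

    # Placeholders: replace with your own service, keys and deployment.
    search_endpoint = "https://<your-search-service>.search.windows.net"
    api_key = "<admin-api-key>"
    api_version = "2024-07-01"

    skillset = {
        "name": "pdf-text-image-skillset",
        "skills": [
            {   # OCR every image the indexer extracts from the PDF
                "@odata.type": "#Microsoft.Skills.Vision.OcrSkill",
                "context": "/document/normalized_images/*",
                "inputs": [{"name": "image", "source": "/document/normalized_images/*"}],
                "outputs": [{"name": "text", "targetName": "text"}],
            },
            {   # Merge the OCR text back into the document text
                "@odata.type": "#Microsoft.Skills.Text.MergeSkill",
                "context": "/document",
                "inputs": [
                    {"name": "text", "source": "/document/content"},
                    {"name": "itemsToInsert", "source": "/document/normalized_images/*/text"},
                    {"name": "offsets", "source": "/document/normalized_images/*/contentOffset"},
                ],
                "outputs": [{"name": "mergedText", "targetName": "mergedText"}],
            },
            {   # Chunk the merged text
                "@odata.type": "#Microsoft.Skills.Text.SplitSkill",
                "context": "/document",
                "textSplitMode": "pages",
                "maximumPageLength": 2000,
                "inputs": [{"name": "text", "source": "/document/mergedText"}],
                "outputs": [{"name": "textItems", "targetName": "pages"}],
            },
            {   # Vectorize each chunk (only needed for vector/hybrid search)
                "@odata.type": "#Microsoft.Skills.Text.AzureOpenAIEmbeddingSkill",
                "context": "/document/pages/*",
                "resourceUri": "https://<your-aoai-resource>.openai.azure.com",
                "apiKey": "<aoai-api-key>",
                "deploymentId": "text-embedding-ada-002",
                "modelName": "text-embedding-ada-002",
                "inputs": [{"name": "text", "source": "/document/pages/*"}],
                "outputs": [{"name": "embedding", "targetName": "vector"}],
            },
        ],
        # For the billable OCR skill you may also need to attach an
        # Azure AI services resource via the "cognitiveServices" property.
    }

    resp = requests.put(
        f"{search_endpoint}/skillsets/{skillset['name']}",
        params={"api-version": api_version},
        headers={"api-key": api_key, "Content-Type": "application/json"},
        json=skillset,
    )
    resp.raise_for_status()
    ```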

    Extracting tabular data from the documents: use a custom skill with Azure AI Document Intelligence to extract the tables (https://learn.microsoft.com/en-us/training/modules/build-form-recognizer-custom-skill-for-azure-cognitive-search/).
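    As a rough sketch, the core of that custom skill could look like the function below: a web API (for example an Azure Function) that implements the custom skill request/response contract and calls the Document Intelligence prebuilt-layout model to pull out tables. The endpoint, key, and the formUrl/formSasToken input names are assumptions you would adapt to your own setup:

    ```python
    from azure.ai.formrecognizer import DocumentAnalysisClient
    from azure.core.credentials import AzureKeyCredential

    # Placeholders: your Document Intelligence resource endpoint and key.
    di_client = DocumentAnalysisClient(
        endpoint="https://<your-doc-intelligence>.cognitiveservices.azure.com",
        credential=AzureKeyCredential("<doc-intelligence-key>"),
    )

    def table_to_text(table):
        """Flatten a Document Intelligence table into tab-separated rows."""
        rows = [["" for _ in range(table.column_count)] for _ in range(table.row_count)]
        for cell in table.cells:
            rows[cell.row_index][cell.column_index] = cell.content
        return "\n".join("\t".join(row) for row in rows)

    def run(skill_request: dict) -> dict:
        """Handle the custom skill payload: {"values": [{"recordId", "data"}, ...]}."""
        results = []
        for record in skill_request.get("values", []):
            data = record.get("data", {})
            # The indexer can pass the blob URL plus a SAS token as skill inputs.
            url = data.get("formUrl", "") + data.get("formSasToken", "")
            poller = di_client.begin_analyze_document_from_url("prebuilt-layout", url)
            tables = [table_to_text(t) for t in poller.result().tables]
            results.append({
                "recordId": record["recordId"],
                "data": {"tables": tables},
                "errors": [],
                "warnings": [],
            })
        return {"values": results}
    ```

    You would then register the hosted endpoint in the skillset as a #Microsoft.Skills.Custom.WebApiSkill and map its "tables" output to an index field.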

     

    Write the enriched data to the index: map the respective skill outputs to index fields, or, if you are chunking with the split skill, use index projections (https://learn.microsoft.com/en-us/azure/search/index-projections-concept-intro?tabs=kstore-rest).
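    A sketch of the index projections block you could add to the skillset above; the target index name, parent key field, and mapped field names are placeholders:

    ```python
    # Add this under the skillset's "indexProjections" property and PUT it again.
    index_projections = {
        "selectors": [
            {
                "targetIndexName": "docs-chunks",        # chunk-level index (placeholder)
                "parentKeyFieldName": "parent_id",       # holds the key of the parent document
                "sourceContext": "/document/pages/*",    # one projected document per chunk
                "mappings": [
                    {"name": "chunk", "source": "/document/pages/*"},
                    {"name": "vector", "source": "/document/pages/*/vector"},
                    {"name": "title", "source": "/document/metadata_storage_name"},
                ],
            }
        ],
        # Index only the chunk documents, not the parent documents.
        "parameters": {"projectionMode": "skipIndexingParentDocuments"},
    }
    ```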

    There is no sample that covers these steps exactly as-is, but you can run the "Import and vectorize data" wizard from the portal: https://learn.microsoft.com/en-us/azure/search/search-get-started-portal-import-vectors

    This will create the following configurations: a data source, an indexer, a skillset (you can choose OCR, which gives you the image-and-text extraction described above, and it includes the chunking and vectorization), and an initial version of the index. Once they are created, you can change the index fields to hold the tabular data coming out of the custom skill and add the custom skill configuration to the skillset, as described above.
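    For example, adding a field to hold the table text coming out of the custom skill (index name and field name are placeholders; adding new fields is a non-breaking index update):

    ```python
    import requests

    # Placeholders: your search service, key, and the wizard-created index name.
    search_endpoint = "https://<your-search-service>.search.windows.net"
    api_key = "<admin-api-key>"
    api_version = "2024-07-01"
    index_name = "docs-chunks"

    headers = {"api-key": api_key, "Content-Type": "application/json"}

    # Fetch the current index definition, append the new field, and push it back.
    index = requests.get(
        f"{search_endpoint}/indexes/{index_name}",
        params={"api-version": api_version}, headers=headers,
    ).json()
    index = {k: v for k, v in index.items() if not k.startswith("@odata")}

    index["fields"].append({
        "name": "tables",
        "type": "Collection(Edm.String)",
        "searchable": True,
        "retrievable": True,
    })

    resp = requests.put(
        f"{search_endpoint}/indexes/{index_name}",
        params={"api-version": api_version}, headers=headers, json=index,
    )
    resp.raise_for_status()
    ```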

     

    Hope that helps. Let us know if you have further questions.

    Best,

    Grace
