Hi @Sushant Shelake, apologies for the delayed response.
Here's how to properly crawl your PDF document and retrieve relevant answers:
1. Enable Text Extraction with Blob Indexer:
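- Since you've already imported your data, the next thing to check is that text extraction is enabled on the blob indexer. For PDFs, the indexer extracts embedded text on its own (the `dataToExtract` setting, which is `contentAndMetadata` by default) and stores it in the `content` field; Text Analytics is not involved at this stage. If your PDFs are scanned images rather than text, also enable OCR, which runs through Azure AI services as part of a skillset.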
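To make the settings in step 1 concrete, here is a minimal sketch of an indexer definition as it could be sent to the Create Indexer REST API. The indexer, data source, and index names are placeholders I made up, and the helper function is hypothetical; the `parsingMode`, `dataToExtract`, and `imageAction` values are the configuration settings involved.

```python
import json


def build_indexer_definition(name, data_source_name, index_name, extract_images=False):
    """Build the JSON body for a blob indexer that extracts text from PDFs.

    All three names are placeholders; substitute the ones from your service.
    """
    configuration = {
        # Default parsing mode: one search document per blob (one per PDF).
        "parsingMode": "default",
        # Extract both the text content and the blob metadata.
        "dataToExtract": "contentAndMetadata",
    }
    if extract_images:
        # Needed only for scanned PDFs: exposes embedded images to an OCR skill.
        configuration["imageAction"] = "generateNormalizedImages"
    return {
        "name": name,
        "dataSourceName": data_source_name,
        "targetIndexName": index_name,
        "parameters": {"configuration": configuration},
    }


if __name__ == "__main__":
    definition = build_indexer_definition("pdf-indexer", "pdf-blob-datasource", "pdf-index")
    print(json.dumps(definition, indent=2))
```

If your PDFs already contain selectable text, the default call should be enough. Pass `extract_images=True` only when you also attach a skillset with OCR.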
2. Analyze Text Extraction Settings:
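- In your blob indexer configuration, review the settings that control text extraction. You can also attach a skillset to the indexer, made up of built-in or custom skills, to process the content further.
- By default, Azure AI Search extracts text from PDFs with the indexer's built-in document parsing, so no skill is required for plain text. If your PDFs need more advanced processing (e.g., scanned pages, complex layouts, or tables), consider adding the built-in OCR skill, or a custom skill that calls a layout-aware service such as Azure AI Document Intelligence, for more granular control. Text Analytics works on text that has already been extracted; it does not parse PDF layouts.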
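As an illustration of step 2 for scanned PDFs, the sketch below builds a skillset that runs the built-in OCR skill over the images the indexer extracts and then merges the recognized text back into the document text. The skillset name, the `merged_content` target name, and the helper function are my own choices for the example, not something taken from your setup.

```python
import json


def build_ocr_skillset(name):
    """Build the JSON body for a skillset that OCRs images and merges the text.

    Assumes the indexer sets imageAction to generateNormalizedImages, which is
    what populates /document/normalized_images.
    """
    ocr_skill = {
        "@odata.type": "#Microsoft.Skills.Vision.OcrSkill",
        "context": "/document/normalized_images/*",
        "defaultLanguageCode": "en",
        "detectOrientation": True,
        "inputs": [{"name": "image", "source": "/document/normalized_images/*"}],
        "outputs": [{"name": "text", "targetName": "text"}],
    }
    merge_skill = {
        "@odata.type": "#Microsoft.Skills.Text.MergeSkill",
        "context": "/document",
        "inputs": [
            # Text the indexer extracted directly from the PDF.
            {"name": "text", "source": "/document/content"},
            # OCR output for each image, inserted at the image's position.
            {"name": "itemsToInsert", "source": "/document/normalized_images/*/text"},
            {"name": "offsets", "source": "/document/normalized_images/*/contentOffset"},
        ],
        "outputs": [{"name": "mergedText", "targetName": "merged_content"}],
    }
    return {
        "name": name,
        "description": "OCR scanned PDF pages and merge the result with extracted text",
        "skills": [ocr_skill, merge_skill],
    }


if __name__ == "__main__":
    print(json.dumps(build_ocr_skillset("pdf-ocr-skillset"), indent=2))
```

To use the output, the indexer would also need a `skillsetName` pointing at this skillset and an output field mapping from `/document/merged_content` to a searchable field in your index.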
3. Search with Relevant Fields:
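- When formulating your search query, target the specific index fields that were populated from the PDF document. These might include the extracted text content, blob metadata, or custom fields defined during indexing. A field must be marked searchable in the index for this to work.
- For example, instead of searching every searchable field, search for keywords only within the extracted text content field. This fielded form requires the full Lucene query syntax (`queryType=full`):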
content:"your search term"
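Here is a rough sketch of how the fielded query from step 3 could be sent to the Search Documents REST API using only the Python standard library. The service name, index name, and API key are placeholders, the function names are hypothetical, and the `2023-11-01` API version is an assumption you may need to adjust for your service.

```python
import json
import urllib.request

API_VERSION = "2023-11-01"  # assumption: replace with the version your service uses


def build_search_body(term, field="content", top=5, select=None):
    """Build a request body that searches for an exact phrase in one field."""
    # Escape backslashes and double quotes so the phrase stays intact.
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    body = {
        "search": f'{field}:"{escaped}"',
        # Fielded search (field:term) needs the full Lucene parser.
        "queryType": "full",
        "top": top,
    }
    if select:
        # Comma-separated list of retrievable fields to return.
        body["select"] = select
    return body


def run_search(service, index, api_key, body):
    """POST the query to the index and return the list of matching documents."""
    url = (
        f"https://{service}.search.windows.net/indexes/{index}"
        f"/docs/search?api-version={API_VERSION}"
    )
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["value"]


if __name__ == "__main__":
    # Prints the request body only; call run_search with real credentials to query.
    print(json.dumps(build_search_body("your search term"), indent=2))
```

If you prefer the simple query syntax, an alternative is to keep the plain search term and restrict it with the `searchFields` parameter instead of the `content:` prefix.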
Hope that helps.
-Grace