How to search PDF files using natural language

TPMeehan 1 Reputation point
2021-03-07T15:53:57.503+00:00

I'm new to all the Azure services and have been tasked with finding a solution to search a collection of PDF files using natural language queries. One use case would be to search résumés by saying (or entering) something like: "Find me all the candidates who have worked for XYZ company within the last 5 years". It looks like Azure Cognitive Search could be very useful for this. I've created a search service, a resource, a data source, etc., and have successfully done a simple text search of my file. The resulting JSON looks like it did a good job of identifying all the various text concepts (names, places, etc.). I'm overwhelmed by all the various services that are available. If anyone can give a little newbie guidance as to which direction to go, it would be greatly appreciated and would save me insane amounts of time running down rabbit holes looking for the correct services to use for my purposes. Thanks.

Azure AI Search
An Azure search service with built-in artificial intelligence capabilities that enrich information to help identify and explore relevant content at scale.

1 answer

  1. Grmacjon-MSFT 17,366 Reputation points
    2021-03-09T07:45:49.097+00:00

    Hi @TPMeehan ,

    Have you tried AI enrichment in Azure Cognitive Search? AI enrichment is an extension of indexers that can be used to extract text from images, blobs, and other unstructured data sources. Enrichment and extraction make your content more searchable in indexer output objects, either a search index or a knowledge store. Extraction and enrichment are implemented using cognitive skills attached to the indexer-driven pipeline.

    Please check out the tutorial "How to create a skillset in an AI enrichment pipeline in Azure Cognitive Search" for more info.

    Thanks,

    Grace
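
To make the skillset idea from the answer a bit more concrete, here is a minimal sketch in Python of the two JSON bodies involved: a skillset definition with a single entity recognition skill, and a search request that filters on the organizations that skill extracts. It only builds the payloads and does not call the service. The skillset name, the index field names (`people`, `organizations`, `locations`), and the helper functions are hypothetical, and the filter assumes the index defines `organizations` as a filterable collection of strings. The skill type string is the V3 entity recognition skill as I understand the documentation; please verify it against the API version you are using.

```python
import json


def build_skillset(name: str) -> dict:
    """Build a skillset definition (REST request body) with one entity recognition skill.

    Assumption: the indexer exposes the extracted PDF text at /document/content,
    which is the usual path for blob indexers.
    """
    return {
        "name": name,
        "description": "Extract people, organizations and locations from PDF text",
        "skills": [
            {
                "@odata.type": "#Microsoft.Skills.Text.V3.EntityRecognitionSkill",
                "context": "/document",
                "categories": ["Person", "Organization", "Location"],
                "defaultLanguageCode": "en",
                "inputs": [{"name": "text", "source": "/document/content"}],
                # targetName values are hypothetical; they must match the
                # output field mappings defined on the indexer.
                "outputs": [
                    {"name": "persons", "targetName": "people"},
                    {"name": "organizations", "targetName": "organizations"},
                    {"name": "locations", "targetName": "locations"},
                ],
            }
        ],
    }


def build_company_query(company: str) -> dict:
    """Build a search request body that keeps only documents mentioning a company.

    Assumption: the index has a filterable Collection(Edm.String) field
    named "organizations" populated by the skill above.
    """
    escaped = company.replace("'", "''")  # OData escapes a single quote by doubling it
    return {
        "search": "*",
        "filter": f"organizations/any(o: o eq '{escaped}')",
        "select": "metadata_storage_name,people,organizations",
    }


if __name__ == "__main__":
    print(json.dumps(build_skillset("resume-skillset"), indent=2))
    print(json.dumps(build_company_query("XYZ"), indent=2))
```

This covers only the "worked for XYZ company" part of the example question. The "within the last 5 years" part would need employment dates extracted into their own field, and turning a spoken or typed sentence into a filter like this is a separate step that this sketch does not attempt.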