Language service - Entity extraction for files stored in file storage

Harish A 50 Reputation points


I am trying to use the Language service to extract entities.

I have seen the Python code samples, where the program takes the text content from which entities are extracted.

However, I have a dump of text files from which entities have to be extracted. Using the code above, I cannot keep sending the content of each file individually to get a response.

My expectation is that I upload these files to file storage and provide the link to the folder of documents to the Python code.

That way, I can rely on the folder to pick up any new documents uploaded to file storage.

How can I achieve this? Can you help me with this?

I read through the following documentation, but did not find the answer I needed.


Harish Alwala

Azure AI Language
An Azure service that provides natural language capabilities including sentiment analysis, entity extraction, and automated question answering.

1 answer

  1. Amira Bedhiafi 16,146 Reputation points

    You can use Azure AI Language together with Azure Functions or Azure Logic Apps to create a serverless pipeline. The idea is that files are processed automatically as they are uploaded to storage.

    Here is the recipe:

    • Set up an Azure Blob Storage container where you upload your text files (think of it as the folder where all your documents are kept).
    • An Azure Function is triggered whenever a new file is uploaded to that Blob Storage container.
    • Within the Azure Function, use the Azure AI Language SDK to perform entity extraction on each file's content, specifically the Named Entity Recognition (NER) feature.
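    The blob trigger in the second step is configured through the function's binding definition. A minimal sketch of a `function.json` for the Python v1 programming model, assuming a container named `documents` (the container name and the connection setting name are placeholders to adapt to your storage account):

    ```json
    {
      "scriptFile": "__init__.py",
      "bindings": [
        {
          "name": "blob",
          "type": "blobTrigger",
          "direction": "in",
          "path": "documents/{name}",
          "connection": "AzureWebJobsStorage"
        }
      ]
    }
    ```

    With this binding in place, the `blob` parameter of the function below receives the content of each newly uploaded file.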

    Here is an example Azure Function in Python that uses the Azure AI Language service for entity extraction:

    import azure.functions as func
    from azure.ai.textanalytics import TextAnalyticsClient
    from azure.core.credentials import AzureKeyCredential

    def main(blob: func.InputStream):
        key = "your_language_service_key"
        endpoint = "your_language_service_endpoint"
        text_analytics_client = TextAnalyticsClient(endpoint=endpoint, credential=AzureKeyCredential(key))
        # Read the uploaded blob and decode it to text
        document = blob.read().decode('utf-8')
        # Extract entities with the Named Entity Recognition (NER) feature
        result = text_analytics_client.recognize_entities([document])
        for doc in result:
            if not doc.is_error:
                for entity in doc.entities:
                    print(f"Entity: {entity.text}, Category: {entity.category}, Confidence Score: {entity.confidence_score}")
            else:
                print(f"Error: {doc.error}")

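    The function above sends one document per call, but `recognize_entities` accepts a list of documents, and the service caps how many documents a single request may contain. If you process a backlog of files, a simple batching helper keeps each request within that limit. A minimal sketch (the batch size of 5 is an assumption; check the current service limits for your tier):

    ```python
    def batch(items, size=5):
        """Yield successive fixed-size chunks from a list of documents."""
        for i in range(0, len(items), size):
            yield items[i:i + size]

    # Hypothetical usage with the client from the function above:
    # for docs in batch(all_documents):
    #     for doc in text_analytics_client.recognize_entities(docs):
    #         ...
    ```
    
    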
    Try and tell us :)