Document Intelligence Studio does not provide Python sample code for the contract model

Harsh Khewal 150 Reputation points
2024-04-22T04:44:28+00:00

Hi, Document Intelligence Studio provides sample code for making API calls to many models, but not for the prebuilt contract model. I could not find any documentation online on how to use the Document Intelligence API with the contract model. Where can I find wrapper code for the contract model, or online sources that outline how to write the script? Any help will be appreciated.

Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.
2,111 questions

Accepted answer
  1. dupammi 8,615 Reputation points Microsoft External Staff
    2024-04-22T08:13:56.4433333+00:00

    Hi @Harsh Khewal

    Thank you for using the Microsoft Q&A forum.

    It seems that you are trying to use Azure AI Document Intelligence to analyze a contract document. Unfortunately, I couldn't find any documentation containing wrapper code for the prebuilt contract model. However, you can use the Azure Document Intelligence SDK for Python to analyze the contract document. The SDK's FormRecognizerClient provides a begin_recognize_content method that extracts text and layout information from the document; note that it returns page content only, not contract-specific fields. You can find information about the SDK and REST API and their usage in the Azure documentation.

    Here's a sample snippet that I used to analyze a contract document:

    from azure.ai.formrecognizer import FormRecognizerClient
    from azure.core.credentials import AzureKeyCredential

    endpoint = "YOUR_ENDPOINT"
    key = "YOUR_KEY"

    def analyze_contract():
        try:
            form_recognizer_client = FormRecognizerClient(endpoint, AzureKeyCredential(key))
            # Submit the file for content (text and layout) recognition
            with open("YOUR_PATH_TO_CONTRACT_FILE", "rb") as f:
                poller = form_recognizer_client.begin_recognize_content(form=f)
            # Wait for the long-running operation to finish
            result = poller.result()
            if not isinstance(result, list):
                print("Unexpected result format. Expected a list.")
            else:
                for page_idx, page in enumerate(result):
                    print("Page #{}:".format(page_idx))
                    # Dump every non-dunder attribute of the page object
                    for attr_name in dir(page):
                        if not attr_name.startswith("__"):
                            attr_value = getattr(page, attr_name)
                            print("Attribute: {}, Value: {}".format(attr_name, attr_value))
        except Exception as e:
            print("An error occurred:", e)

    # Call the function to analyze the contract
    analyze_contract()
    

    Please modify the code as per your use-case.

    For more details, please refer to this GitHub repo.
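
The snippet above prints every attribute of each page object, which can be noisy. As a narrower alternative, here is a minimal sketch that collects only the recognized text lines per page. It assumes the pages returned by begin_recognize_content expose page_number and lines (each line having a text attribute), as FormPage objects in azure-ai-formrecognizer do. The helper name collect_page_text and the stand-in sample data are hypothetical and only there so the sketch runs without an Azure resource.

```python
from types import SimpleNamespace


def collect_page_text(pages):
    """Return {page_number: [line text, ...]} for FormPage-like objects."""
    text_by_page = {}
    for page in pages:
        # Treat a missing lines list as empty.
        lines = page.lines or []
        text_by_page[page.page_number] = [line.text for line in lines]
    return text_by_page


# Hypothetical stand-in objects so the sketch runs offline;
# with the real SDK, pass poller.result() instead.
sample_pages = [
    SimpleNamespace(
        page_number=1,
        lines=[
            SimpleNamespace(text="SERVICE AGREEMENT"),
            SimpleNamespace(text="This agreement is made between ..."),
        ],
    ),
    SimpleNamespace(page_number=2, lines=None),
]

for page_number, lines in collect_page_text(sample_pages).items():
    print("Page {}: {} line(s)".format(page_number, len(lines)))
    for text in lines:
        print("  " + text)
```

With the real client, the call would be collect_page_text(poller.result()). Keeping the extraction in a plain function makes it easy to check the logic without calling the service.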

    I hope you understand. Thank you.


    If this answers your query, do click Accept Answer and Yes for was this answer helpful.

    1 person found this answer helpful.

1 additional answer

  1. romungi-MSFT 48,911 Reputation points Microsoft Employee Moderator
    2024-04-22T08:14:29.9666667+00:00

    @Harsh Khewal You can use the same sample reference from GitHub for the other new prebuilt models. The model ID in this case changes to prebuilt-contract.

    The fields to be extracted change according to the list mentioned here.

    The sample should look like the following for the contract model.

        import os

        from azure.core.credentials import AzureKeyCredential
        from azure.ai.formrecognizer import DocumentAnalysisClient

        endpoint = os.environ["AZURE_FORM_RECOGNIZER_ENDPOINT"]
        key = os.environ["AZURE_FORM_RECOGNIZER_KEY"]

        document_analysis_client = DocumentAnalysisClient(
            endpoint=endpoint, credential=AzureKeyCredential(key)
        )

        # path_to_sample_documents is the path to the contract file to analyze
        with open(path_to_sample_documents, "rb") as f:
            poller = document_analysis_client.begin_analyze_document(
                "prebuilt-contract", document=f, locale="en-US"
            )

        contracts = poller.result()

        for idx, contract in enumerate(contracts.documents):
            print("--------Recognizing contract #{}--------".format(idx + 1))
            # The field name is cut off in the original post; take the exact
            # name from the contract model's field list.
            contract_id = contract.fields.get("...")

            if contract_id:
                print(
                    "Contract ID: {} has confidence: {}".format(
                        contract_id.value, contract_id.confidence
                    )
                )
    ...
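
Since the extracted fields depend on the model's field list, one option is to loop over whatever fields come back instead of hardcoding names. The following is a minimal sketch under that assumption: it relies on the result exposing documents, each with doc_type and a fields dictionary whose values carry value and confidence, as the AnalyzeResult returned by DocumentAnalysisClient does. The helper name summarize_fields and the stand-in data (including the field name "ExampleField") are hypothetical placeholders, not the contract model's actual schema.

```python
from types import SimpleNamespace


def summarize_fields(result):
    """Return one (doc_type, field_name, value, confidence) tuple per field."""
    rows = []
    for document in result.documents:
        for name, field in document.fields.items():
            rows.append((document.doc_type, name, field.value, field.confidence))
    return rows


# Hypothetical stand-in for poller.result(); the field name and value are
# placeholders, not the contract model's real schema.
sample_result = SimpleNamespace(
    documents=[
        SimpleNamespace(
            doc_type="contract",
            fields={
                "ExampleField": SimpleNamespace(value="example value", confidence=0.9),
            },
        )
    ]
)

for doc_type, name, value, confidence in summarize_fields(sample_result):
    print("[{}] {}: {} (confidence: {})".format(doc_type, name, value, confidence))
```

With the real client, summarize_fields(poller.result()) would list every field the service returned, which is a quick way to discover the names before writing field-specific handling.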
    
    
    

    If this answers your query, do click Accept Answer and Yes for was this answer helpful. And, if you have any further query do let us know.
