Discrepancy Between Document Intelligence Python SDK and UI Labels

Syed Umair Hasan 90 Reputation points
2024-04-30T22:32:45.5133333+00:00

Hello, this is quite urgent: I'm encountering an issue with Document Intelligence. I'm using API version 2024-02-29-preview in both the Document Intelligence UI and the Python SDK. I'm on the latest SDK version, which supports this API version.

SDK version    Supported API service version
1.0.0b1        2023-10-31-preview
1.0.0b2        2024-02-29-preview

The problem arises when I run the same file through a trained custom extraction model using both the UI and the Python SDK. The output JSON from the UI detects the table cells correctly, whereas the Python SDK output mis-detects one cell, so that cell is missing from the custom table. This discrepancy is breaking my code logic.

Why is there a difference between the Document Intelligence UI and the Python SDK? How can I ensure that the Python SDK returns the same labels as the Document Intelligence UI? I have attached a screenshot for reference: the output on the left is from the Python SDK, and the output on the right is from the UI. The label 'Related Substances' occurs fewer times in the Python SDK output and is not labeled in the custom table, while it occurs more times in the UI JSON and is labeled correctly there. Please note that I used the same file for both, and this is the first time this has happened.
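
To pin down exactly which cells differ, a comparison along these lines may help. This is only a minimal sketch: it assumes both results are available as plain dictionaries (the JSON downloaded from the UI, and the SDK result dumped to a dict), that table cells sit under `tables[].cells[].content`, and that the UI JSON wraps its payload in an `analyzeResult` key. The helper names `cell_contents` and `diff_cells` are hypothetical, not part of the SDK.

```python
from collections import Counter


def cell_contents(result: dict) -> Counter:
    """Count how often each cell text appears across all tables in a result."""
    # Assumption: the UI JSON nests the payload under "analyzeResult",
    # while a dict dumped from the SDK result does not.
    payload = result.get("analyzeResult", result)
    counts = Counter()
    for table in payload.get("tables", []):
        for cell in table.get("cells", []):
            counts[cell.get("content", "").strip()] += 1
    return counts


def diff_cells(ui_result: dict, sdk_result: dict) -> dict:
    """Return {cell text: (count in UI, count in SDK)} wherever the counts differ."""
    ui = cell_contents(ui_result)
    sdk = cell_contents(sdk_result)
    return {
        text: (ui[text], sdk[text])
        for text in ui.keys() | sdk.keys()
        if ui[text] != sdk[text]
    }


if __name__ == "__main__":
    # Tiny made-up inputs, only to show the shape of the comparison.
    ui_json = {"analyzeResult": {"tables": [{"cells": [
        {"content": "Related Substances"},
        {"content": "Related Substances"},
        {"content": "Assay"},
    ]}]}}
    sdk_dict = {"tables": [{"cells": [
        {"content": "Related Substances"},
        {"content": "Assay"},
    ]}]}
    print(diff_cells(ui_json, sdk_dict))
```

With the real outputs, the two dictionaries would come from `json.load` on the saved files. An empty result would mean the cell text matches and the difference lies elsewhere, for example in the labeled fields rather than in the detected tables.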

[Three attached images]

Thank you.

Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.