Document Intelligence Studio AI training failing

George R 20 Reputation points
2025-11-19T13:14:30.5366667+00:00

I am using Document Intelligence Studio and my SDS_Recogniser_V2 model keeps failing, even though I gave it a 30-hour limit and the dataset has only about 200 documents and about 2000 pages. I retrained the TDS_Recogniser yesterday and that worked, but the SDS_Recogniser_V2 keeps failing and the error message is the generic "InternalServerError: An unexpected error occurred." Can you tell me what is wrong?
Azure AI Document Intelligence

Answer accepted by question author
  1. SRILAKSHMI C 10,640 Reputation points Microsoft External Staff Moderator
    2025-11-19T14:59:44.12+00:00

    Hello George R,

    Welcome to Microsoft Q&A, and thank you for sharing the additional details and the screenshot.

    Based on the behavior you’re seeing, where SDS_Recogniser_V2 fails repeatedly with “InternalServerError: An unexpected error occurred” while other models such as TDS_Recogniser train without issues, the problem appears to be specific to the SDS_Recogniser_V2 model version or its training dataset, rather than anything related to training limits or document volume.

    Below is a consolidated breakdown of what’s going on and how you can proceed.

    What we can confirm from your screenshot

    TDS_Recogniser trained successfully on Nov 17.

    SDS_Recogniser_V3 is currently running.

    Multiple SDS_Recogniser_V2 training attempts failed instantly (within seconds/minutes).

    This indicates the failure is not caused by:

    30-hour timeout

    Number of documents/pages

    API version

    Region or performance issues

    The failure occurs before the training pipeline even begins, which strongly suggests an issue with the dataset or the backend state of that specific model version.

    Most likely causes of repeated early failures

    Based on similar cases, the most common reasons are:

    1. A corrupted or unsupported document in the SDS V2 dataset

    Even a single file that is:

    • partially corrupted
    • password-protected
    • malformed PDF
    • extremely large
    • unreadable OCR image

    can cause an instant generic error with no detailed message.

    2. Label schema mismatches

    If some fields were renamed or removed between versions, or certain documents are missing labels that are not marked optional, the training pipeline will fail during validation.
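
    A rough way to look for such mismatches, assuming you have downloaded the labelled dataset to a local folder: the sketch below compares the labels used in each labels file against the field list. The file names and JSON keys used here (fields.json with "fields"/"fieldKey", and *.labels.json with "labels"/"label") are assumptions about how the Studio stores its labelling data, so please check them against your own container before relying on the output. The function name find_label_mismatches is only illustrative.

```python
import json
from pathlib import Path


def find_label_mismatches(dataset_dir):
    """Compare labels used in each *.labels.json against fields.json.

    Assumed layout (not confirmed in this thread): fields.json holds
    {"fields": [{"fieldKey": ...}, ...]} and each labels file holds
    {"labels": [{"label": ...}, ...]}.
    """
    root = Path(dataset_dir)
    schema = json.loads((root / "fields.json").read_text(encoding="utf-8"))
    defined = {field["fieldKey"] for field in schema.get("fields", [])}

    report = {}
    for path in sorted(root.glob("*.labels.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        # Keep only the part before the first "/" so that nested labels
        # such as "Table/0/Column" map back to their top-level field.
        used = {item["label"].split("/")[0] for item in data.get("labels", [])}
        unknown = sorted(used - defined)  # labelled, but not in the schema
        missing = sorted(defined - used)  # in the schema, but not labelled here
        if unknown or missing:
            report[path.name] = {"unknown": unknown, "missing": missing}
    return report
```

    Calling find_label_mismatches("path/to/dataset") returns a dictionary keyed by labels file. An "unknown" entry points to a renamed or removed field; a "missing" entry is only a problem if that field is expected in every document.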

    3. Model version metadata corruption

    Sometimes a specific version (V2) becomes “stuck” on the backend. This kind of corruption explains why:

    • V2 fails repeatedly
    • V3, a new version, is able to start training

    4. A backend service issue

    InternalServerError is a generic fallback when the training service can’t generate a detailed error message. This can happen if there is a pipeline failure in the ingestion/indexing stage.

    Troubleshooting Steps:

    1. Continue with SDS_Recogniser_V3

    V3 is running, which suggests V2’s metadata or internal state is corrupted. If V3 succeeds, that confirms the issue is isolated to SDS_Recogniser_V2.

    2. Validate the training dataset

    Please check that:

    • No PDF/image is password-protected
    • No file is 0 bytes
    • No file has unusually large or damaged pages
    • All labels exist across every sample (or are marked optional)
    • All samples use the same schema version
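
    The file-level checks above can be partly automated on a local copy of the dataset. The sketch below uses only the Python standard library and simple byte-level heuristics: it does not parse PDFs, so an /Encrypt hit or a missing %%EOF marker is a hint to inspect the file, not proof that it is broken. The names check_file and scan_dataset and the MAX_BYTES threshold are illustrative assumptions, not service values.

```python
from pathlib import Path

# Hypothetical size ceiling for "unusually large"; adjust to the current service limits.
MAX_BYTES = 500 * 1024 * 1024


def check_file(path, max_bytes=MAX_BYTES):
    """Return a list of human-readable problems found in one training file."""
    path = Path(path)
    size = path.stat().st_size
    if size == 0:
        return ["file is 0 bytes"]
    problems = []
    if size > max_bytes:
        problems.append(f"file is larger than {max_bytes} bytes")
    if path.suffix.lower() == ".pdf":
        data = path.read_bytes()
        # A well-formed PDF starts with a %PDF- header ...
        if not data[:1024].lstrip().startswith(b"%PDF-"):
            problems.append("missing %PDF- header (malformed or not a PDF)")
        # ... and ends with an %%EOF marker near the end of the file.
        if b"%%EOF" not in data[-2048:]:
            problems.append("missing %%EOF trailer (possibly truncated)")
        # Heuristic only: encrypted PDFs reference an /Encrypt dictionary.
        if b"/Encrypt" in data:
            problems.append("contains /Encrypt (possibly password-protected)")
    return problems


def scan_dataset(dataset_dir):
    """Map each suspicious file name to its list of problems."""
    report = {}
    for path in sorted(Path(dataset_dir).iterdir()):
        # Skip the JSON label/OCR sidecar files; only check the documents.
        if path.is_file() and not path.name.endswith(".json"):
            problems = check_file(path)
            if problems:
                report[path.name] = problems
    return report
```

    Running scan_dataset("path/to/dataset") gives a short list of files worth opening by hand before retraining.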

    Also try isolating documents that were added recently or that are known to have quality issues.
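
    If no file stands out, one way to isolate a bad document is to bisect the dataset: train on half of the files, keep whichever half fails, and repeat. The sketch below shows only the search logic. The trains_ok callback is hypothetical (you would implement it yourself, for example by copying the subset to a test container and starting a training run), and the sketch assumes there is exactly one bad file and that any subset without it trains.

```python
def find_bad_file(files, trains_ok):
    """Bisect `files` to find the one file that makes training fail.

    `trains_ok(subset)` is a hypothetical callback that returns True when
    a training run on `subset` succeeds. Returns None if the full set trains.
    """
    candidates = list(files)
    if not candidates or trains_ok(candidates):
        return None
    while len(candidates) > 1:
        middle = len(candidates) // 2
        first_half = candidates[:middle]
        if trains_ok(first_half):
            # First half is clean, so the bad file is in the second half.
            candidates = candidates[middle:]
        else:
            candidates = first_half
    return candidates[0]
```

    For about 200 documents this needs roughly eight training runs after the initial check, instead of testing files one by one. The service may require a minimum number of labelled documents per run, so small subsets may need to be padded with files already known to be good.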

    The repeated SDS_Recogniser_V2 failures are not caused by training limits or the number of documents. They are most likely due to:

    A corrupted file

    A schema mismatch

    Or corruption in the V2 model version metadata on the backend

    Since SDS_Recogniser_V3 has started training without the immediate failure, I recommend continuing with that version while we validate the dataset.

    I hope this helps. Do let me know if you have any further queries.

    Thank you!

1 additional answer

  1. George R 20 Reputation points
    2025-11-19T16:13:19.7633333+00:00

    I will wait and see if SDS_Recogniser_V3 works. But are you able to tell me why the old version fails? If it is because of a bad file, I need to know which file it is, since there are 100+ files in there.

