BuildDocumentClassifierRequest from python SDK resulting in TrainingContentMissing: Training data is missing: Could not find any training data at the given path

Rony Tayoun 20 Reputation points
2025-10-27T10:26:01.3966667+00:00

I am trying to create a custom classification model.
I have 2 classes '2058-a' and '2058-b'.

I made sure that the containerURL works by using it to print the existing files.



from azure.storage.blob import ContainerClient

# Check that the SAS URL can list the training files for each class.
container_client = ContainerClient.from_container_url(container_sas_url)
for prefix in ("examples-de-chaque-document/2058-a/", "examples-de-chaque-document/2058-b/"):
    blobs = container_client.list_blobs(name_starts_with=prefix)
    print(f"Found {len(list(blobs))} files")


The files I am using here are the same files I use in Document Intelligence Studio (6 PDFs for each class).

doc_types

In the next section, I call begin_build_classifier. Note that in the latest SDK, BuildDocumentClassifierRequest does not accept build_mode as a parameter (contrary to what was suggested), and begin_build_classifier accepts a single parameter, body. So the code is as follows:




import uuid

from azure.ai.documentintelligence.models import BuildDocumentClassifierRequest

# Optional: Add model_id if building on a prebuilt classifier
classifier_id = f"new-test-{uuid.uuid4()}"

build_request = BuildDocumentClassifierRequest(
    classifier_id=classifier_id,
    description="Example classifier",
    doc_types=doc_types,  # Now using AzureBlobContentSource objects
    allow_overwrite=True,
)

try:
    poller = admin_client.begin_build_classifier(
        build_request
    )
    print(f"Training started! Classifier ID: {classifier_id}")

    # Poll for result with details
    result = poller.result()
    if result.status == "failed":
        print(f"Error details: {result.errors}")
    else:
        print(f"Classifier built successfully: {result.model_id}")

except Exception as e:
    print(f"Build failed: {e}")
Training started! Classifier ID: new-test-62c4dfb9-2765-46f2-86fc-609cc8603672
Build failed: (InvalidRequest) Invalid request.
Code: InvalidRequest
Message: Invalid request.
Exception Details: (TrainingContentMissing) Training data is missing: Could not find any training data at the given path.
Code: TrainingContentMissing
Message: Training data is missing: Could not find any training data at the given path.

I tried different things, such as adding ClassifierDocumentTypeDetails to the doc_types and removing the trailing "/" from the prefix, but still no luck.

It is still failing.

Azure AI Document Intelligence

1 answer

  1. Sina Salam 26,661 Reputation points Volunteer Moderator
    2025-11-26T13:38:39.9066667+00:00

    Hello Rony Tayoun,

    Welcome to the Microsoft Q&A and thank you for posting your questions here.

    I understand that calling BuildDocumentClassifierRequest from the Python SDK results in a TrainingContentMissing error.

    Nikhil Jha (on Oct 28) identified a breaking change in v4.0: removing prefix and embedding the folder path in container_url fixed it. - https://learn.microsoft.com/en-gb/answers/questions/5598410/builddocumentclassifierrequest-from-python-sdk-res You confirmed that the workaround of uploading the files into a separate folder and generating a .jsonl worked, but reported continuing issues when using the SDK directly. However, both approaches were attempted. The final accepted answer clarifies that removing the prefix and embedding the folder in the SAS URL solved it.
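For reference, the .jsonl workaround mentioned above could look roughly like the sketch below. It assumes the file-list format of one JSON object per line with a "file" key, which is how I understand the service expects it; the helper name build_file_list_jsonl and the blob names are hypothetical and only there for illustration.

```python
import json


def build_file_list_jsonl(blob_names):
    """Return JSONL text with one {"file": ...} entry per PDF blob name.

    Hypothetical helper: the "file" key is an assumption about the
    file-list format, not something confirmed in this thread.
    """
    pdf_names = [name for name in blob_names if name.lower().endswith(".pdf")]
    return "".join(json.dumps({"file": name}) + "\n" for name in pdf_names)


# Illustrative blob names; in practice these would come from list_blobs().
blob_names = ["2058-a/doc1.pdf", "2058-a/doc2.pdf", "2058-a/notes.txt"]
jsonl_text = build_file_list_jsonl(blob_names)
print(jsonl_text, end="")
```

The resulting text would then be uploaded to the container as a .jsonl blob next to the training files.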

    The final advice is to remove the prefix and include the folder path directly in the container_url. This step is crucial for SDK version 4.0. Previous instructions that used prefix led to confusion. While the official documentation still needs updates, the solution provided has been tested and works. It addresses all major issues such as SAS permissions, file format requirements, and SDK changes, so you can successfully train your classifiers.
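As a minimal sketch of that advice, the folder path can be spliced into the container SAS URL before the query string, so that no prefix is needed. The helper name embed_folder_in_container_url and the example SAS URL are hypothetical; only the folder names come from the question. How the resulting URL is passed on (for example as the container_url of an AzureBlobContentSource with no prefix) is an assumption based on the accepted answer.

```python
from urllib.parse import urlsplit, urlunsplit


def embed_folder_in_container_url(container_sas_url: str, folder: str) -> str:
    """Append `folder` to the container path, keeping the SAS query string intact."""
    parts = urlsplit(container_sas_url)
    path = parts.path.rstrip("/") + "/" + folder.strip("/")
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


# Placeholder SAS URL (not a real account or signature).
container_sas_url = "https://example.blob.core.windows.net/training?sv=2024-01-01&sig=REDACTED"

# Folder names taken from the question; one URL per document class.
class_folders = {
    "2058-a": "examples-de-chaque-document/2058-a/",
    "2058-b": "examples-de-chaque-document/2058-b/",
}
class_urls = {
    name: embed_folder_in_container_url(container_sas_url, folder)
    for name, folder in class_folders.items()
}

for name, url in class_urls.items():
    # Each URL would be used as container_url for that class, with prefix omitted.
    print(name, url)
```

Splitting the URL first matters because a SAS URL carries its token in the query string; naively appending the folder to the end of the string would put it after the token and produce an invalid URL.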

    I hope this is helpful! Do not hesitate to let me know if you have any other questions or clarifications.


    Please don't forget to close the thread by upvoting and accepting this as an answer if it is helpful.

