Hello Rony Tayoun,
Thank you for the detailed follow-up, including your new code and the persistent error message. Your thorough testing helps us pinpoint the exact problem. Based on the code you've shared, the issue stems from a specific breaking change in the v4.0 SDK (azure-ai-documentintelligence) that you are using.
The root cause of the TrainingContentMissing error is that the v4.0 AzureBlobContentSource model does not have a prefix parameter. I see you are still passing a prefix parameter, just like in your original dictionary.
The Python SDK is simply ignoring this unknown prefix argument. As a result, it is using your container_sas_url (which points to the root of your container) and looking for your training files there. Since your files are not at the root—they are in the "folder" specified by your prefix—the service correctly reports that it cannot find any training data at the given path.
The solution is to remove the prefix parameter and instead append the folder path directly to the container_url string before the SAS token.
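Building these URLs by hand makes it easy to drop the trailing slash or put the folder after the '?'. Here is a minimal sketch of a helper that inserts the folder path between the container name and the SAS token, using only the Python standard library. The function name folder_sas_url and the placeholder account, container, and token values are hypothetical and used for illustration only. Whether the service accepts a folder path inside container_url is taken from the explanation above and is not checked by this code.

```python
from urllib.parse import urlsplit, urlunsplit


def folder_sas_url(container_sas_url: str, folder: str) -> str:
    """Insert a folder path between the container name and the SAS query string.

    Hypothetical helper: it only rewrites the URL string and does not
    contact Azure or validate the SAS token.
    """
    parts = urlsplit(container_sas_url)
    # Normalise slashes so there is exactly one '/' between segments
    # and a trailing '/' after the folder name.
    path = parts.path.rstrip("/") + "/" + folder.strip("/") + "/"
    # The query string (the SAS token) is carried over unchanged.
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


# Placeholder values; replace with your own container-level SAS URL.
base = "https://myaccount.blob.core.windows.net/mycontainer?sv=PLACEHOLDER&sig=PLACEHOLDER"
url_2058a = folder_sas_url(base, "examples-de-chaque-document/2058-a")
url_2058b = folder_sas_url(base, "examples-de-chaque-document/2058-b")
print(url_2058a)
print(url_2058b)
```

Printing the results before training is a quick way to confirm that the folder sits before the '?' and ends with a slash.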
Recommended Steps:
1. Your SAS URL must point directly to the specific "folder" containing the files for that class.
container_sas_url = "[YOUR_BASE_SAS_URL_WITH_TOKEN]"
# e.g., "https://myaccount.blob.core.windows.net/mycontainer?sv=..."
# Append the prefix (folder path) to the container name
url_2058a = "https://[STORAGE_NAME].blob.core.windows.net/[CONTAINER_NAME]/examples-de-chaque-document/2058-a/?[SAS_TOKEN]"
url_2058b = "https://[STORAGE_NAME].blob.core.windows.net/[CONTAINER_NAME]/examples-de-chaque-document/2058-b/?[SAS_TOKEN]"
Note: You must generate the SAS token at the container level, not the blob level, for this to work. Also, make sure there is a trailing slash '/' after the folder name, before the '?'.
2. Now, build your doc_types object using these new, complete URLs and no prefix parameter.
from azure.ai.documentintelligence.models import AzureBlobContentSource
doc_types = {
    '2058a': {
        'azureBlobSource': AzureBlobContentSource(
            container_url=url_2058a  # Use the full path with the folder
            # NO 'prefix' parameter here
        )
    },
    '2058b': {
        'azureBlobSource': AzureBlobContentSource(
            container_url=url_2058b
        )
    }
}
3. Run your existing training code.
Your BuildDocumentClassifierRequest and begin_build_classifier code is already correct. You do not need to change it. Simply run it again using the corrected doc_types object from Step 2, and the service will now find your files.
For more information, please refer to the official Microsoft documentation:
AzureBlobContentSource (v4.0 SDK): Note this model only has container_url.
ClassifierDocumentTypeDetails (v3 SDK - for comparison): This is the old model that used container_url and prefix separately. This shows the change.
Please let us know if this helps. If yes, kindly "Accept the answer" and/or upvote, so it will be beneficial to others in the community as well.