python sdk can't find custom classification model, can find custom extraction model

Question

Sarah Cummings 45

I created both an Azure Custom Classification Model and an Azure Custom Extraction Model in Form Recognizer Studio, and I am satisfied with the results. Now I want to generate these results in Python.

The Python SDK's document_analysis_client.begin_analyze_document() function works great for my extraction model, but when I swap in the model_id of my classification model, I get:

ResourceNotFoundError: (NotFound) Resource not found.
Code: NotFound
Message: Resource not found.
Inner error: {
    "code": "ModelNotFound",
    "message": "The requested model was not found."
}

I have tried both SDK versions 3.2.1 and 3.3.0b1 and get the same error with each. Why can't I find and use my classification model?

YutongTie-MSFT 53,971 Reputation points Moderator

2023-07-13T01:44:07.6566667+00:00

Hello @Sarah Cummings

Thanks for reaching out to us. There are some pre-check items for your issue:

The "Resource not found" error message that you are seeing when trying to use your Azure Custom Classification Model in the Python SDK may be caused by a few different issues. Here are a few things you can try to resolve the issue:

Check the model ID: Make sure that you are using the correct model ID for your Azure Custom Classification Model. You can find the model ID in the Azure portal or in the Form Recognizer Studio. Double-check that the model ID you are using in your Python code matches the ID of your classification model.

Check the model status: Make sure that your Azure Custom Classification Model is in a "ready" state before trying to use it in the Python SDK. You can check the status of your model in the Azure portal or in the Form Recognizer Studio. If the model is still training or has encountered errors, you may need to wait until it is ready before using it in the Python SDK.

Check your authentication credentials: Make sure that you are using the correct authentication credentials in your Python code. You may need to check that your Azure subscription and Form Recognizer resource are properly configured and that you have the correct credentials.

Check your API version: Make sure that you are using the correct API version in your Python code. You can check the supported API versions in the Azure portal or in the Form Recognizer documentation. If you are using an outdated API version, you may need to update your code to use a newer version.
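The first two checks can also be run from code by listing the classifiers the resource actually knows about. The following is a minimal sketch, assuming `azure-ai-formrecognizer` 3.3.0 or later, where `DocumentModelAdministrationClient.list_document_classifiers()` is available. The helper names `classifier_exists` and `make_admin_client` and the environment variable `FORM_RECOGNIZER_ENDPOINT` are illustrative assumptions, not part of the SDK or of the original posts.

```python
import os


def classifier_exists(admin_client, classifier_id: str) -> bool:
    """Return True if classifier_id is among the classifiers on the resource.

    admin_client is expected to be a DocumentModelAdministrationClient
    (assumed: azure-ai-formrecognizer >= 3.3.0).
    """
    return any(
        details.classifier_id == classifier_id
        for details in admin_client.list_document_classifiers()
    )


def make_admin_client():
    """Build the real client. The SDK is imported lazily so the helper above
    can be exercised without the package installed."""
    from azure.ai.formrecognizer import DocumentModelAdministrationClient
    from azure.core.credentials import AzureKeyCredential

    # FORM_RECOGNIZER_ENDPOINT is an assumed variable name.
    return DocumentModelAdministrationClient(
        endpoint=os.environ["FORM_RECOGNIZER_ENDPOINT"],
        credential=AzureKeyCredential(os.environ["FORM_RECOGNIZER_KEY"]),
    )
```

Calling `classifier_exists(make_admin_client(), "<your classifier ID>")` should return True when the ID, endpoint, and key all refer to the same resource. A False result points at the ID or the resource rather than at the analyze call.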

If possible, could you please check the items above for any inconsistencies? Please let me know how it goes. Thanks.

Regards, Yutong
Sarah Cummings 45 Reputation points

2023-07-13T14:19:14.65+00:00

Hi YutongTie-MSFT

Can you confirm that for the classification models, the model_id is actually called the classifier ID? I'm using a classifier ID that corresponds to a model that is in "Succeeded" status in https://formrecognizer.appliedai.azure.com/studio/document-classifier/projects/

Where do I see if it is "ready"?

I'm fairly confident that my authentication credentials are correct and that my resource is configured correctly, since I am able to use my custom extraction model via Python. Swapping in the classification model ID should not affect the credentials.

How do I confirm I'm using the right API version? My Python SDK is the newest version, and my model's API version says 2023-02-28-preview.

Thanks for all your help.
Dittrich, Florian 0 Reputation points

2023-12-22T15:47:17.2566667+00:00

I am running into the same issue, using API version 2023-07-31. My Model ID is the same as the Classifier ID in my Models list in Document Intelligence Studio. The endpoint and key are definitely correct as well. However, whether I run my tests through Python or through Azure's own testing page for the API, I receive the same error message telling me that the model was not found (404).

Sarah Cummings 45

Hi @Dittrich, Florian, I worked with a support person on this and ended up using custom Python code to call my classification model:

import json
import logging
import os
import time
from typing import Optional

from requests import Response, get, post

logger = logging.getLogger(__name__)

# ENDPOINT, API_TYPE, FACEPAGE_CLASSIFICATION_MODEL_ID and API_VERSION are read
# from config (not shown here). For a classification model, API_TYPE must be
# "documentClassifiers"; "documentModels" only resolves extraction models.


def _post_to_classification_model(pdf_bytes: bytes) -> Response:
    """
    Using the configured Form Recognizer key and the model specifications from
    config, post the PDF to the Azure AI classification model for prediction.
    Returns the POST response.
    """

    form_recognizer_key = os.getenv("FORM_RECOGNIZER_KEY")

    post_url = (
        ENDPOINT
        + f"/formrecognizer/{API_TYPE}/{FACEPAGE_CLASSIFICATION_MODEL_ID}:analyze?api-version={API_VERSION}"
    )
    params = {"includeTextDetails": True}

    headers = {
        # Request headers
        "Content-Type": "application/pdf",
        "Ocp-Apim-Subscription-Key": form_recognizer_key,
    }
    # Never log the key itself; only record whether it is set.
    logger.debug("Form Recognizer key is set: %s", bool(form_recognizer_key))
    try:
        resp = post(
            url=post_url, data=pdf_bytes, headers=headers, params=params
        )
    except Exception as e:
        logger.warning("POST analyze failed:\n%s", e)
        raise
    if resp.status_code != 202:
        # The error body is not guaranteed to be JSON, so log the raw text.
        logger.warning("POST analyze failed:\n%s", resp.text)
        # Raise instead of calling quit() so the caller decides what to do.
        raise RuntimeError(f"POST analyze returned HTTP {resp.status_code}")
    logger.info("POST analyze succeeded:\n%s", resp.headers)

    return resp



def _get_classification_results(post_response: Response) -> Optional[dict]:
    """
    Given the response from our POST request for classification, poll the
    operation URL and retrieve the classification results. Returns the JSON
    body of the last GET response, or None if no response was received.
    """

    get_url = post_response.headers["operation-location"]

    n_tries = 15
    n_try = 0
    wait_sec = 5
    max_wait_sec = 60
    resp_json = None

    while n_try < n_tries:
        try:
            resp = get(
                url=get_url,
                headers={
                    "Ocp-Apim-Subscription-Key": os.getenv(
                        "FORM_RECOGNIZER_KEY"
                    )
                },
            )
            resp_json = resp.json()
            if resp.status_code != 200:
                logger.warning(
                    "GET analyze results failed:\n%s", json.dumps(resp_json)
                )
                break
            status = resp_json["status"]
            if status == "succeeded":
                logger.info("Analysis succeeded:\n%s", json.dumps(resp_json))
                break
            if status == "failed":
                logger.warning("Analysis failed:\n%s", json.dumps(resp_json))
                break
            # Analysis still running. Wait, then retry with a doubled delay
            # (capped at max_wait_sec).
            time.sleep(wait_sec)
            n_try += 1
            wait_sec = min(2 * wait_sec, max_wait_sec)

        except Exception as e:
            logger.warning("GET analyze results failed:\n%s", e)
            break

    return resp_json

It is a bit weird to me, though, that you can't call your model from Azure's own testing page. I did not experience that problem, so this might not solve your issue.
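Once `_get_classification_results` returns a payload whose status is "succeeded", the predicted classes still have to be read out of it. The sketch below assumes the v3 REST response shape, in which `analyzeResult.documents` is a list of objects carrying `docType` and `confidence`. The helper name `summarize_classification` is hypothetical, and the field names are worth checking against the response your API version actually returns.

```python
from typing import List, Optional, Tuple


def summarize_classification(
    result_json: Optional[dict],
) -> List[Tuple[Optional[str], Optional[float]]]:
    """Return (docType, confidence) pairs from a finished analyze result.

    Returns an empty list when there is no payload or the operation did not
    succeed, so callers can treat "nothing classified" uniformly.
    """
    if not result_json or result_json.get("status") != "succeeded":
        return []
    # Assumed shape: {"status": ..., "analyzeResult": {"documents": [...]}}
    documents = result_json.get("analyzeResult", {}).get("documents", [])
    return [(doc.get("docType"), doc.get("confidence")) for doc in documents]
```

A typical call would be `summarize_classification(_get_classification_results(post_response))`.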

Dittrich, Florian 0 Reputation points

2024-01-02T08:15:45.5366667+00:00

Correct, sadly it did not solve my issue.
Are you using the Azure AI Services resource or the Form Recognizer resource to train the model?

Edit:
I tried using the Form Recognizer Resource, however I am still running into the same issue.
Dittrich, Florian 0 Reputation points

2024-01-04T17:12:51.0133333+00:00

Okay, I found what caused my issues - I was using "documentModels" as my API Type instead of "documentClassifiers". Rookie mistake. Easy to make.
Rasmus Kromann 20 Reputation points

2024-04-12T07:14:00.7+00:00

@Sarah Cummings, did you find a better solution than the custom Python scripts? I have experienced exactly the same thing as you: I trained both a custom extraction model and a custom classification model, and I can call the extraction model in the "standard" way through the Python SDK but cannot call the classification model at all. In my view this is a bug on Microsoft's part, and the custom scripts, while they may work, are not something you would base a production system on. Just my two cents.
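On the question of an SDK alternative to the custom scripts: classification has its own method on the client and is not reached through `begin_analyze_document`. The sketch below assumes `azure-ai-formrecognizer` 3.3.0 or later, where `DocumentAnalysisClient.begin_classify_document(classifier_id, document)` is available. The helper names `classify_document` and `make_client` and the environment variable `FORM_RECOGNIZER_ENDPOINT` are illustrative assumptions, and nobody in this thread has confirmed this route on their own resource.

```python
import os


def classify_document(client, classifier_id: str, pdf_bytes: bytes) -> list:
    """Classify a PDF with a custom classifier.

    client is expected to be a DocumentAnalysisClient
    (assumed: azure-ai-formrecognizer >= 3.3.0).
    Returns a list of (doc_type, confidence) pairs.
    """
    poller = client.begin_classify_document(classifier_id, document=pdf_bytes)
    result = poller.result()  # blocks until the long-running operation ends
    return [(doc.doc_type, doc.confidence) for doc in (result.documents or [])]


def make_client():
    """Build the real client. The SDK is imported lazily so the helper above
    can be exercised without the package installed."""
    from azure.ai.formrecognizer import DocumentAnalysisClient
    from azure.core.credentials import AzureKeyCredential

    # FORM_RECOGNIZER_ENDPOINT is an assumed variable name.
    return DocumentAnalysisClient(
        endpoint=os.environ["FORM_RECOGNIZER_ENDPOINT"],
        credential=AzureKeyCredential(os.environ["FORM_RECOGNIZER_KEY"]),
    )
```

With a PDF read as bytes, `classify_document(make_client(), "<your classifier ID>", pdf_bytes)` would replace both the POST and the polling loop from the custom scripts.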
Rasmus Kromann 20 Reputation points

2024-04-12T07:17:23.04+00:00

@Dittrich, Florian could you expand a bit on your comment about "documentModels" versus "documentClassifiers"? Are you specifying the API Type somewhere in your code?
Dittrich, Florian 0 Reputation points

2024-04-14T20:15:47.7366667+00:00

@Rasmus Kromann Yes. If you look at the code that Sarah Cummings posted as an answer, she is using

f"/formrecognizer/{API_TYPE}/{FACEPAGE_CLASSIFICATION_MODEL_ID}:analyze?api-version={API_VERSION}"

I simply defined API_TYPE as:

API_TYPE: str = "documentModels"  # for extraction models
# or
API_TYPE: str = "documentClassifiers"  # for classification models

and used it to build my POST endpoint URL.

For more information, and to test out the API, look here:

DocumentModels

DocumentClassifiers
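The difference between the two routes is easiest to see when the URL is built in one place. A minimal sketch follows; the function name `build_analyze_url`, the example endpoint, and the placeholder IDs are assumptions for illustration, while the path layout and the two API types come from the posts above.

```python
VALID_API_TYPES = ("documentModels", "documentClassifiers")


def build_analyze_url(
    endpoint: str, api_type: str, model_id: str, api_version: str
) -> str:
    """Build the POST URL for an analyze call.

    api_type selects the route: "documentModels" for extraction models,
    "documentClassifiers" for classification models.
    """
    if api_type not in VALID_API_TYPES:
        raise ValueError(f"api_type must be one of {VALID_API_TYPES}")
    # Strip a trailing slash so the path never contains "//".
    base = endpoint.rstrip("/")
    return (
        f"{base}/formrecognizer/{api_type}/{model_id}"
        f":analyze?api-version={api_version}"
    )


# Hypothetical endpoint and IDs, for illustration only.
endpoint = "https://example.cognitiveservices.azure.com/"
extraction_url = build_analyze_url(
    endpoint, "documentModels", "my-extractor", "2023-07-31"
)
classifier_url = build_analyze_url(
    endpoint, "documentClassifiers", "my-classifier", "2023-07-31"
)
```

Sending a classifier ID to the "documentModels" route is the mistake described above, and it matches the ModelNotFound (404) response reported earlier in the thread.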
