AzureMLCompute job failed with Misdirected Error. Not sure why

Uzzi Emuchay 30 Reputation points
2024-06-23T00:05:13.2066667+00:00
Failed to execute command group with error Docker container `b4ba5e60ccd14697ad826a07d3d6baed-lifecycler` failed with status code `1`. Service 'DATA_CAPABILITY' returned capability start response with code: Response { code: "500", error: Some(Error { code: "data-capability.UriMountSession.HTTPError.None", message: "421 Client Error: Misdirected Request for url: https://[1609ab072b7b462be9fe579eb9719fcf]/[54818b05d116eadc7f67517a3a6e4b33]/[8944f8348afe2edd651ea99be857e3f8]/[3e22d4496725dcfe224039c4eb7f6769]", target: "UriMountSession:INPUT_job_data_path", node_info: Some(NodeInfo { node_id: "tvmps_e165a15d5543aecba2910072504dd72e23b17abf300c6b27689fa4855c613d9e_d", vm_id: "b50fa253-d221-43e3-83b8-d4f60996f617" }), category: SystemError, error_details: [ErrorDetail { key: "StackTrace", value: "  File \"/opt/miniconda/envs/data-capability/lib/python3.9/site-packages/data_capability/capability_session.py\", line 70, in start\n    (data_path, sub_data_path) = session.start()\n\n  File \"/opt/miniconda/envs/data-capability/lib/python3.9/site-packages/data_capability/data_sessions.py\", line 630, in start\n    self._get_mount_uri(),\n\n  File \"/opt/miniconda/envs/data-capability/lib/python3.9/site-packages/data_capability/data_sessions.py\", line 606, in _get_mount_uri\n    workspace = self._get_workspace()\n\n  File \"/opt/miniconda/envs/data-capability/lib/python3.9/site-packages/data_capability/cr_utilities.py\", line 66, in get_azureml_workspace\n    return self.get_azureml_run().experiment.workspace\n\n  File \"/opt/miniconda/envs/data-capability/lib/python3.9/site-packages/data_capability/cr_utilities.py\", line 72, in get_azureml_run\n    self._run = Run.get_context(allow_offline=False)\n\n  File \"/opt/miniconda/envs/data-capability/lib/python3.9/site-packages/azureml/core/run.py\", line 377, in get_context\n    experiment, run_id = cls._load_scope()\n\n  File \"/opt/miniconda/envs/data-capability/lib/python3.9/site-packages/azureml/core/run.py\", line 263, in 
_load_scope\n    service_context = ServiceContext(subscription_id,\n\n  File \"/opt/miniconda/envs/data-capability/lib/python3.9/site-packages/azureml/_restclient/service_context.py\", line 168, in __init__\n    self._endpoints = self._fetch_endpoints()\n\n  File \"/opt/miniconda/envs/data-capability/lib/python3.9/site-packages/azureml/_restclient/service_context.py\", line 293, in _fetch_endpoints\n    url = get_service_url(self._authentication, scope, self._workspace_id,\n\n  File \"/opt/miniconda/envs/data-capability/lib/python3.9/site-packages/azureml/_base_sdk_common/service_discovery.py\", line 120, in get_service_url\n    return cached_service_object.get_cached_service_url(workspace_scope, service_name,\n\n  File \"/opt/miniconda/envs/data-capability/lib/python3.9/site-packages/azureml/_base_sdk_common/service_discovery.py\", line 282, in get_cached_service_url\n    return self.get_cached_services_uris(arm_scope, service_name, unique_id=unique_id,\n\n  File \"/opt/miniconda/envs/da
Azure Machine Learning
An Azure machine learning service for building and deploying models.

1 answer

  1. Amira Bedhiafi 18,501 Reputation points
    2024-06-24T08:22:08.1+00:00

    The error message shows a "421 Client Error: Misdirected Request" when accessing a certain URL, which suggests that the request was sent to a server that is not able to produce a response for it.

    https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/421

    https://github.com/traefik/traefik/issues/7070

    The stack trace provides details about where the error occurred in the code. It starts in the capability_session.py file at line 70, then traces through data_sessions.py, cr_utilities.py, run.py, and service_discovery.py.
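To make that call chain easier to read, the escaped stack trace can be reduced to one entry per frame. This is only a rough sketch: `summarize_frames` and `SAMPLE` are hypothetical names introduced here, and the sample contains just two frames copied from the question rather than the full (truncated) trace.

```python
import re

# Matches frames of the form: File "<path>", line <n>, in <function>
FRAME_RE = re.compile(
    r'File "(?P<path>[^"]+)", line (?P<line>\d+), in\s+(?P<func>\w+)'
)


def summarize_frames(trace: str):
    """Return (file name, line number, function) per frame, outermost first."""
    # The service escapes quotes and newlines inside the error string,
    # so undo that before matching.
    text = trace.replace('\\"', '"').replace('\\n', '\n')
    frames = []
    for match in FRAME_RE.finditer(text):
        filename = match.group("path").rsplit("/", 1)[-1]
        frames.append((filename, int(match.group("line")), match.group("func")))
    return frames


# Two frames copied from the error in the question (hypothetical sample name).
SAMPLE = (
    r'  File \"/opt/miniconda/envs/data-capability/lib/python3.9/site-packages/'
    r'data_capability/capability_session.py\", line 70, in start\n'
    r'    (data_path, sub_data_path) = session.start()\n\n'
    r'  File \"/opt/miniconda/envs/data-capability/lib/python3.9/site-packages/'
    r'azureml/core/run.py\", line 377, in get_context\n'
    r'    experiment, run_id = cls._load_scope()\n'
)

if __name__ == "__main__":
    for filename, line, func in summarize_frames(SAMPLE):
        print(f"{filename}:{line} in {func}")
```

Read top to bottom, the frames show that the failure happens while the data capability is still resolving the workspace endpoints, before any job data is actually mounted.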

    I am not an expert on this subject, but after checking several forums, these are the possible causes:

    • The URL used in the request might be misconfigured or pointing to the wrong server.
    • The request is misrouted due to incorrect DNS settings or load balancer configurations.
    • The service doesn't have the necessary permissions, or its authentication tokens are not set up correctly.
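
As a rough way to narrow down the causes listed above, the status code in the error message can be used as a first filter. The sketch below is an illustrative heuristic, not part of any Azure ML API: `status_from_message` and `triage` are hypothetical helpers, and the mapping assumes that a 421 points at URL or routing problems while permission or token problems would more commonly surface as 401 or 403.

```python
import re
from http import HTTPStatus
from typing import Optional


def status_from_message(message: str) -> Optional[int]:
    """Extract the leading status code from a message such as
    '421 Client Error: Misdirected Request for url: ...'."""
    match = re.match(r"\s*(\d{3}) (?:Client|Server) Error", message)
    return int(match.group(1)) if match else None


def triage(status_code: Optional[int]) -> str:
    """Map a status code to the broad cause it most likely indicates
    (a heuristic, labeled as an assumption in the text above)."""
    if status_code == HTTPStatus.MISDIRECTED_REQUEST:  # 421
        return "routing"  # wrong server: check URL, DNS, load balancer, proxy
    if status_code in (HTTPStatus.UNAUTHORIZED, HTTPStatus.FORBIDDEN):
        return "auth"  # check permissions, identities, and tokens
    return "other"


if __name__ == "__main__":
    # Placeholder URL; the real one is redacted in the error message.
    msg = "421 Client Error: Misdirected Request for url: https://example.invalid/"
    print(triage(status_from_message(msg)))
```

For the error in this question the result is "routing", so the first two causes (URL, DNS, or load balancer configuration) are the ones to check first.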