Cannot set instance count when using Batch Endpoint (from job)

Hongbo Jiao (CSI Interfusion Inc) 0 Reputation points Microsoft External Staff
2025-06-05T09:47:22.7133333+00:00

Describe your suggestion

https://learn.microsoft.com/en-us/azure/machine-learning/how-to-use-batch-pipeline-from-job?view=azureml-api-2&tabs=cli

I have a pipeline job with 2 steps. The first step is configured with an instance count of 2. However, when I publish this job to a Batch Endpoint, the instance count becomes 1, and there is no way to set the instance count when invoking the batch endpoint.

When the job is published to a Batch Endpoint, instance_count and process_count_per_node do not take effect, but run_invocation_timeout, set through the environment variables, does.

"Jobs": {
  "flow_step": {
    "type": "parallel",
    "resources": {
      "instance_count": 2
    },
    "error_threshold": -1,
    "environment_variables": {
      "AZUREML_PARAMETER_aml_run_invocation_timeout": "13800",
      "AZUREML_PARAMETER_aml_run_max_try": "3",
      "AZUREML_PARAMETER_aml_process_count_per_node": "8",
      "AZUREML_PARAMETER_aml_max_concurrency_per_instance": "8",
      "AZUREML_PARAMETER_aml_error_threshold": "-1"
    },
    "max_concurrency_per_instance": 8,

Additional details

My step is a PromptFlow step and is a parallel job step.
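
To compare what the original pipeline declares with what the job started by the endpoint reports, it can help to pull the scaling fields out of each parallel step in a job definition dump. The following is only a rough sketch using the Python standard library. The function name `parallel_step_scaling`, the `JOB_JSON` sample, and the second step `post_step` are made up for illustration; the sample is a closed-off, shortened version of the fragment above, not the real job definition.

```python
import json

# Hypothetical, shortened sample modeled on the fragment in the question.
# "post_step" is an invented placeholder for the second pipeline step.
JOB_JSON = """
{
  "Jobs": {
    "flow_step": {
      "type": "parallel",
      "resources": {"instance_count": 2},
      "environment_variables": {
        "AZUREML_PARAMETER_aml_process_count_per_node": "8"
      },
      "max_concurrency_per_instance": 8
    },
    "post_step": {"type": "command"}
  }
}
"""


def parallel_step_scaling(job: dict) -> dict:
    """Map each parallel step name to the scaling settings it declares."""
    # The dump in the question uses "Jobs"; pipeline YAML uses "jobs". Accept both.
    steps = job.get("jobs") or job.get("Jobs") or {}
    report = {}
    for name, step in steps.items():
        if step.get("type") != "parallel":
            continue  # only parallel steps carry these settings
        resources = step.get("resources") or {}
        env = step.get("environment_variables") or {}
        report[name] = {
            "instance_count": resources.get("instance_count"),
            "max_concurrency_per_instance": step.get("max_concurrency_per_instance"),
            "process_count_per_node": env.get(
                "AZUREML_PARAMETER_aml_process_count_per_node"
            ),
        }
    return report


if __name__ == "__main__":
    print(parallel_step_scaling(json.loads(JOB_JSON)))
```

Running the same helper on the definition of the job created by the endpoint and on the original pipeline job makes it easy to see which of the values were dropped.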

Also, in PromptFlow, the instance count is a run setting that is applied to the step after the flow component has been created.

https://microsoft.github.io/promptflow/cloud/azureai/use-flow-in-azure-ml-pipeline.html#component-ports-and-run-settings

My BatchEndpoint: https://ml.azure.com/endpoints/batch/magazine-quality-endpoint/detail?wsid=%2Fsubscriptions%2Fcdad8da4-3993-4639-abd6-61ae2ca998c2%2FresourceGroups%2Falgoblock-llm-westus2%2Fproviders%2FMicrosoft.MachineLearningServices%2Fworkspaces%2FAlgoBlock_LLM_westus2&tid=72f988bf-86f1-41af-91ab-2d7cd011db47&reloadCount=1

Azure Machine Learning

2 answers

  1. Pavankumar Purilla 8,335 Reputation points Microsoft External Staff Moderator
    2025-06-09T07:12:06.88+00:00

    Hi Hongbo Jiao (CSI Interfusion Inc),

    Currently, when publishing a pipeline job to a Batch Endpoint, the instance_count and process_count_per_node parameters specified inside a step (such as a parallel step using a PromptFlow component) are not respected at runtime. Instead, Batch Endpoints run with a default instance_count of 1, and there is no supported way to override this directly when invoking the endpoint.

    This behavior is by design. Unlike standalone parallel jobs or pipeline jobs executed directly, Batch Endpoints do not currently support configuring compute resource scaling (e.g., instance_count) from the job definition or via endpoint invocation.

    If scaling is required for your workload (e.g., increasing the number of nodes to speed up processing), we recommend running the parallel job or pipeline job directly, outside of the Batch Endpoint context, where these resource parameters will be honored.
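
As a rough sketch of the "run it directly" route, the pipeline job can be submitted with the Azure ML Python SDK v2 (`azure-ai-ml` plus `azure-identity`) instead of invoking the endpoint. The function name `submit_pipeline_directly` and the file name `pipeline.yml` are placeholders, and the sketch assumes the pipeline YAML already sets `resources.instance_count: 2` on the parallel step, as in the question.

```python
def submit_pipeline_directly(pipeline_yaml, subscription_id, resource_group, workspace_name):
    """Submit the pipeline job itself instead of invoking the batch endpoint."""
    # SDK imports are kept inside the function so this file can be loaded
    # even where azure-ai-ml and azure-identity are not installed.
    from azure.ai.ml import MLClient, load_job
    from azure.identity import DefaultAzureCredential

    ml_client = MLClient(
        DefaultAzureCredential(),
        subscription_id=subscription_id,
        resource_group_name=resource_group,
        workspace_name=workspace_name,
    )
    # Load the pipeline definition, including the per-step resources block.
    pipeline_job = load_job(source=pipeline_yaml)
    # Submitting the job directly is the path on which, per the answer above,
    # the step-level resource settings are honored.
    return ml_client.jobs.create_or_update(pipeline_job)


if __name__ == "__main__":
    # Placeholder values; replace them with your own workspace details.
    job = submit_pipeline_directly(
        "pipeline.yml", "<subscription-id>", "<resource-group>", "<workspace-name>"
    )
    print(job.name)
```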

    We understand this limitation may impact certain use cases, and we encourage you to submit feedback through Azure feedback channels for support of custom scaling in Batch Endpoints.


  2. Alex Burlachenko 9,780 Reputation points
    2025-06-09T08:31:15.8066667+00:00

    Hi Hongbo Jiao, and thanks for posting this. It's a tricky one, but we'll figure it out.

    Microsoft's batch endpoint has its own way of handling instance counts, and yes, it can be confusing when pipeline settings don't carry over.

    For a batch endpoint, you need to specify the instance count directly in the deployment config. Look for the 'scale_settings' section; that's where you set the instance count for batch runs. It doesn't pull from the pipeline job settings, which is why you see it defaulting to 1.

    For PromptFlow specifically, since it runs as a parallel step, you might want to double-check the component settings in the pipeline. The instance count there applies only to pipeline execution, not to the batch deployment. Kind of weird, I know.

    When dealing with parallel jobs across different platforms, always check where the scaling controls live. Some systems put them in deployment configs, others in runtime parameters, so it's worth checking the specific docs for each tool.

    Also try this: environment variables sometimes get ignored if the syntax is slightly off, so make sure you're using the exact parameter names the service expects. This can help in other tools too where settings seem to vanish.

    Let me know if this helps.

    Best regards,

    Alex


