How to deploy a prompt flow with az ml CLI without enabling streaming response?

Tibor Fabian 0 Reputation points
2025-03-20T11:30:51.5266667+00:00

Hi, I deployed a prompt flow as an online endpoint via the az ml CLI (as part of a CI/CD pipeline). The deployment succeeded, but the default output mode is apparently streaming. I get the following error during testing:

... execution.flow     INFO     Node generate_answer completes.
... - Flow run result: <REDACTED>
... - Flow does not enable streaming response.

Which makes sense, because my flow is not a streaming one.

Is there a way to create an endpoint/deployment via YAML configuration so that streaming is not activated?

Thanks

Azure Machine Learning

2 answers

Sort by: Most helpful
  1. Azar 27,670 Reputation points MVP
    2025-03-20T17:10:21.7566667+00:00

    Hi there Tibor Fabian,

    Thanks for using the Q&A platform.

    You need to explicitly set the output mode in your YAML config.

    In your deployment YAML file, add the following:

    output:
      mode: raw
    

    This sets the output mode to non-streaming (raw) instead of the default streaming mode.

    Then, update your deployment with:

    az ml online-endpoint update --name <your-endpoint-name> --file <your-config-file>.yml
    

    This should prevent Azure ML from using streaming responses when deploying your Prompt Flow.
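    For reference, a minimal managed online deployment YAML for a prompt flow might look like the sketch below. All names and paths are placeholders, and the output block is simply the setting suggested above; check the schema docs for your CLI version before relying on it:

    ```yaml
    # Sketch of a prompt flow online deployment config (placeholders throughout).
    $schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
    name: blue
    endpoint_name: <your-endpoint-name>
    model:
      path: <path-to-your-flow-folder>
    instance_type: Standard_DS3_v2
    instance_count: 1
    output:
      mode: raw
    ```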

    If this helps, kindly accept the answer. Thanks much.


  2. Tibor Fabian 0 Reputation points
    2025-03-21T07:37:21.1366667+00:00

    If you've come across this message, don't worry—it's just an informational log, not an actual issue. It was a bit misleading during debugging.
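    Following up on this: with prompt flow online endpoints, whether the response actually streams is generally negotiated per request via the Accept header (text/event-stream requests streaming; application/json requests a single JSON payload), so this informational log is expected for plain JSON calls. A rough sketch of a scoring call, assuming an endpoint URL and key obtained from `az ml online-endpoint show` / `az ml online-endpoint get-credentials` (all names are placeholders):

    ```python
    import json
    import urllib.request


    def build_headers(api_key: str, stream: bool = False) -> dict:
        """Headers for a prompt flow scoring request.

        The Accept header is what requests (or suppresses) a streaming
        response; the deployment itself does not need a special setting.
        """
        return {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "Accept": "text/event-stream" if stream else "application/json",
        }


    def invoke(endpoint_url: str, api_key: str, payload: dict,
               stream: bool = False) -> str:
        # Hypothetical scoring call; endpoint_url and api_key are placeholders.
        req = urllib.request.Request(
            endpoint_url,
            data=json.dumps(payload).encode("utf-8"),
            headers=build_headers(api_key, stream),
            method="POST",
        )
        with urllib.request.urlopen(req) as resp:
            return resp.read().decode("utf-8")
    ```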

