Hi Tibor Fabian,
Thanks for using the Q&A platform.
You need to explicitly set the output mode in your YAML config.
In your deployment YAML file, add mode: raw under the output key:
output:
  mode: raw
This sets the output mode to non-streaming (raw) instead of the default streaming mode.
Then, update your deployment with:
az ml online-endpoint update --name <your-endpoint-name> --file <your-config-file>.yml
This should prevent Azure ML from using streaming responses when deploying your Prompt Flow.
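To check whether the change took effect, one option is to invoke the endpoint and inspect the response. This is only a sketch: the request file placeholder below is an assumption and should point to a JSON file containing a sample input for your flow.

```shell
# Send a test request to the endpoint after the update.
# <your-request-file>.json is a placeholder for a sample input payload (assumed, not from the original setup).
az ml online-endpoint invoke --name <your-endpoint-name> --request-file <your-request-file>.json
```

If the response comes back as a single complete payload rather than in chunks, the non-streaming setting is likely being applied.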
If this helps, kindly accept the answer. Thanks!