Hi Dileep Ponna,
The issue you're facing is a silent failure: your request to the Cohere model via Azure AI returns a 200 OK HTTP status code, but the response body is empty (Content-Length: 0). Because there is nothing to parse, your code crashes when it tries to decode the JSON response.
This usually happens in Azure AI Services when:
· The model is not fully ready or is deployed incorrectly.
· The request is hitting an endpoint that doesn't properly route to the model's /embeddings operation.
· The API version, model name, or capability in the request is wrong.
· The Cohere model is not compatible with the embeddings operation as invoked (especially if it's not an Azure-native model).
Steps to Fix:
1. Double-check the endpoint and route
Ensure the full endpoint includes /embeddings, like:
https://<resource-name>.services.azure.com/openai/deployments/<deployment-name>/embeddings?api-version=2024-02-15-preview
If you're calling just /models/embeddings, that might not route correctly for Cohere.
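To rule out a typo in the route, it can help to assemble the URL from its parts and print it before sending anything. This is only a minimal sketch: the function name build_embeddings_url and the sample values are hypothetical, and the URL pattern is simply the one shown above.

```shell
#!/bin/sh
# Hypothetical placeholder values; substitute your own resource and deployment.
RESOURCE_NAME="my-resource"
DEPLOYMENT_NAME="my-deployment"
API_VERSION="2024-02-15-preview"

# Hypothetical helper: prints the full embeddings URL.
# $1 = resource name, $2 = deployment name, $3 = api-version
build_embeddings_url() {
  printf 'https://%s.services.azure.com/openai/deployments/%s/embeddings?api-version=%s\n' \
    "$1" "$2" "$3"
}

# Print the URL so you can compare it with what your code actually calls.
build_embeddings_url "$RESOURCE_NAME" "$DEPLOYMENT_NAME" "$API_VERSION"
```

Comparing the printed URL with the one your application logs is a quick way to spot a missing /embeddings segment or a mismatched api-version.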
2. Verify model capabilities
Check whether the Cohere-embed-v3-english model is actually registered to serve embedding tasks. Use the Azure CLI or the REST API to list models and their capabilities:
az ai model list --resource-group <rg> --workspace-name <workspace>
Look for the task: embedding capability under the model metadata.
3. Use the latest supported api-version
Use a supported preview version like 2024-02-15-preview instead of an old or redacted one.
4. Test with a simple curl or Postman request
Try sending a simple POST request manually:
curl -X POST "https://<endpoint>/openai/deployments/<deployment>/embeddings?api-version=2024-02-15-preview" \
  -H "api-key: <your-key>" \
  -H "Content-Type: application/json" \
  -d '{"input": ["Hello world"]}'
If this still returns 200 with an empty body, then either:
· The model deployment is broken, or
· You're hitting the wrong route.
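To make the "200 but empty" case easy to spot, you could have curl report the status code and body size and classify the result. The sketch below is one way to do that, not the only one: classify_response and probe_embeddings are hypothetical helper names, response.json is an arbitrary output file, and the api-key header is the one used in the request above. The %{http_code} and %{size_download} variables are standard curl --write-out variables.

```shell
#!/bin/sh
# Hypothetical helper: classify a response by status code and body size in bytes.
# $1 = HTTP status code, $2 = number of body bytes received
classify_response() {
  status="$1"
  size="$2"
  if [ "$status" = "200" ] && [ "$size" -eq 0 ]; then
    echo "empty-200"    # the silent failure described above
  elif [ "$status" = "200" ]; then
    echo "ok"           # a body was returned; safe to try parsing it
  else
    echo "http-error"   # a non-200 status; inspect the saved body for details
  fi
}

# Hypothetical helper: send the test request and classify the result.
# $1 = full embeddings URL, $2 = API key
probe_embeddings() {
  result=$(curl -sS -o response.json -w '%{http_code} %{size_download}' \
    -X POST "$1" \
    -H "api-key: $2" \
    -H "Content-Type: application/json" \
    -d '{"input": ["Hello world"]}')
  # Intentionally unquoted: splits "<status> <size>" into two arguments.
  classify_response $result
}
```

Calling probe_embeddings with your URL and key prints empty-200 when the service answers 200 with no body, which is the signal to recheck the deployment and the route. The same check (status is 200 and the body is non-empty) is worth adding to your own code before it parses the JSON.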
Hope this helps. Do let me know if you have any further queries.
Thank you!