I am facing an issue while using the Cohere Rerank model: 500 Internal Server Error

Gohil, Shubh 0 Reputation points
2026-02-18T17:24:11.0766667+00:00

    filtered_data = [
        {
            "pageText": "This is test1",
            "metadata": "Source1"
        },
        {
            "pageText": "This is test2",
            "metadata": "Source2"
        },
        {
            "pageText": "This is test3",
            "metadata": "Source3"
        },
        {
            "pageText": "This is test4",
            "metadata": "Source4"
        },
    ]

    results = reranker.rerank(
        model="rerank-v4.0-fast",
        documents=filtered_data,
        query=state["question"],
        top_n=3,
    )

I get a 500 Internal Server Error. I am calling the right endpoint and using the correct key.

Azure OpenAI in Foundry Models

1 answer

  1. SRILAKSHMI C 19,100 Reputation points Microsoft External Staff Moderator
    2026-03-10T12:16:32.84+00:00

    Hello Gohil, Shubh,

    Welcome to Microsoft Q&A, and thank you for sharing the code snippet and details.

    A 500 Internal Server Error when calling the Cohere Rerank v4.0 Fast model generally means that the request reached the service but something failed during processing. Possible causes include request formatting issues, temporary service conditions, and regional capacity constraints.

    Based on the code you shared, one potential issue is the document format passed to the rerank API. The documents parameter is typically expected to be a list of text strings, but your example sends objects containing pageText and metadata. You may want to pass only the text field, for example:

    # Pass only the text of each document, not the full objects with metadata
    documents = [doc["pageText"] for doc in filtered_data]

    results = reranker.rerank(
        model="rerank-v4.0-fast",
        documents=documents,
        query=state["question"],
        top_n=3,
    )

    In addition to adjusting the document format, you may want to check the following:

    Sometimes 500 errors are temporary. Waiting briefly and retrying the request can help determine whether it was a transient service issue.

    Ensure the request payload matches the expected schema and that all JSON keys and values are correctly formatted. Also verify that the query (state["question"]) is not empty or null.
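
    The payload checks described above can be run locally before the request is sent. The following is a minimal sketch, assuming the filtered_data shape from the question; prepare_rerank_inputs and its text_key parameter are hypothetical names used for illustration and are not part of the Cohere SDK.

```python
def prepare_rerank_inputs(filtered_data, question, text_key="pageText"):
    """Return (documents, query) as plain strings, or raise ValueError.

    Hypothetical helper: the name and the text_key default are assumptions
    based on the data shape shown in the question.
    """
    # Reject a missing or whitespace-only query before calling the service.
    if not isinstance(question, str) or not question.strip():
        raise ValueError("query must be a non-empty string")

    documents = []
    for index, doc in enumerate(filtered_data):
        # Accept either a dict carrying the text under text_key or a bare string.
        text = doc.get(text_key) if isinstance(doc, dict) else doc
        if not isinstance(text, str) or not text.strip():
            raise ValueError(f"document {index} has no usable text")
        documents.append(text)

    if not documents:
        raise ValueError("documents must not be empty")
    return documents, question.strip()


sample = [
    {"pageText": "This is test1", "metadata": "Source1"},
    {"pageText": "This is test2", "metadata": "Source2"},
]
documents, query = prepare_rerank_inputs(sample, "Which test is first?")
print(documents)
```

    If the helper raises, the problem is in the input data rather than in the service, which narrows down where the 500 might be coming from.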

    Double-check that you are calling the correct endpoint and API version for the model within Azure AI Foundry.

    Verify that your resource is deployed in a region where the rerank model is available and that there are no service health issues affecting the region.

    If your application sends many requests, review usage in Azure Monitor to confirm you are not hitting capacity limits or throttling conditions.

    Adding retry logic and proper error handling in your application can help manage intermittent 500 responses more gracefully. Please refer to the Cohere Rerank models documentation.
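
    The retry logic mentioned above could look roughly like the sketch below. It assumes the exception raised by your client exposes an integer status_code attribute; that is an assumption, so adjust the check to whatever your SDK actually raises. call_with_retries and its parameters are hypothetical names.

```python
import random
import time


def call_with_retries(call, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Run call() and retry on 5xx errors with exponential backoff.

    Assumption: the raised exception exposes an integer status_code
    attribute. Errors without one, or with a non-5xx code, are re-raised
    immediately.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception as exc:
            status = getattr(exc, "status_code", None)
            retryable = isinstance(status, int) and 500 <= status < 600
            if not retryable or attempt == max_attempts:
                raise
            # The delay doubles on each attempt (base, 2x base, 4x base),
            # plus up to 10% random jitter so clients do not retry in lockstep.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 10))
```

    With the code from the question, the call would be wrapped as call_with_retries(lambda: reranker.rerank(model="rerank-v4.0-fast", documents=documents, query=state["question"], top_n=3)). If the error still occurs after every attempt, it is probably not transient, and the request format or the deployment is the more likely cause.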

    I hope this helps. Do let me know if you have any further queries.

    Thank you!
