Clarification on Chat Completion Roles, Capabilities, and Exposing Chain-of-Thought for o4-mini in Azure OpenAI

Harinath J 285 Reputation points
2025-05-21T07:32:06.5033333+00:00

Hi everyone,

I’m working with the Azure OpenAI o4-mini reasoning model via the Chat Completions API and I have a few questions about best practices, limitations, and how to surface the model’s internal reasoning:

  1. Roles
    • I see that “developer” messages are supported for o-series models, but normal models use “system.” Should I always use role: "developer" with o4-mini (and never system), or is mixing them acceptable?
    • Are there any other hidden or unsupported roles I should know about?
  2. Parameter Support & Limits
    • Which parameters are supported for o4-mini? I know that max_completion_tokens and reasoning_effort work, but parameters like temperature, top_p, presence_penalty, etc., seem to be ignored. Can you confirm?
    • What are the hard limits for token counts and throughput on o4-mini in Azure?
  3. Prompting Guidance
    • Do you have any official or community-recommended templates for structuring system/developer instructions vs. user content when using o-series models?
    • Any tips on controlling “chain-of-thought” depth or output format (e.g., JSON, bullet lists) via the prompt?
  4. Exposing Chain-of-Thought / “Thought Summaries”
    • In the Azure docs I saw a mention of a summary (or “thoughts”) parameter to explicitly surface the model’s reasoning steps, similar to the GPT UI “Show reasoning” feature. When I include that parameter in my payload, I get an “invalid argument” error. Is there a supported way to retrieve the model’s internal chain-of-thought or “thought summaries” in Azure’s o-series API?
    • If it isn’t yet supported, are there any recommended workarounds or upcoming feature flags?
  5. Documentation & Examples
    • Are there any up-to-date Azure docs, sample code repos, or forum threads that dive deep into these topics?

Thanks in advance for your help! Any pointers, example payloads, or timelines for feature support would be greatly appreciated.

Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.

Accepted answer
  JAYA SHANKAR G S 4,035 Reputation points · Microsoft External Staff Moderator
    2025-05-27T08:37:44.36+00:00

    Hello @Harinath J ,

    Below are the answers to your questions.

    1. Roles: Either a developer message or a system message alone is supported, but not both in the same request. This is also mentioned in the documentation:

    When you use a system message with o4-mini, o3, o3-mini, and o1, it will be treated as a developer message. You should not use both a developer message and a system message in the same API request.

    There are no hidden roles mentioned in the documentation; all we have are the developer, system, and user roles.
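    For illustration, a minimal Chat Completions request body using the developer role could look like the sketch below. The deployment name and instruction text are placeholders, not values from the documentation; the point is that the first message uses either `developer` or `system`, never both.

    ```python
    # Sketch of a Chat Completions payload for an o4-mini deployment.
    # "o4-mini" here is a placeholder deployment name.
    payload = {
        "model": "o4-mini",
        "messages": [
            {"role": "developer", "content": "Answer concisely in formal English."},
            {"role": "user", "content": "Summarize the benefits of unit testing."},
        ],
        "max_completion_tokens": 500,
    }

    # Use either "developer" or "system" for the instruction message, never both:
    roles = [m["role"] for m in payload["messages"]]
    assert not ({"developer", "system"} <= set(roles))
    ```
    
    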

    2. Parameter Support & Limits: Yes, you are right that max_completion_tokens and reasoning_effort work; the following parameters are not supported with reasoning models:
    • temperature, top_p, presence_penalty, frequency_penalty, logprobs, top_logprobs, logit_bias, max_tokens

    Regarding token usage, you can refer to the limits table for reasoning models.
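    One practical pattern, sketched below, is to strip the unsupported parameters before building a request so the same calling code can serve both reasoning and non-reasoning deployments. The `UNSUPPORTED` set mirrors the list above; the helper name is my own, not part of any SDK.

    ```python
    # Parameters that reasoning models such as o4-mini ignore or reject,
    # per the list above.
    UNSUPPORTED = {
        "temperature", "top_p", "presence_penalty", "frequency_penalty",
        "logprobs", "top_logprobs", "logit_bias", "max_tokens",
    }

    def build_request(**params):
        """Drop parameters that reasoning models do not support."""
        return {k: v for k, v in params.items() if k not in UNSUPPORTED}

    req = build_request(model="o4-mini", temperature=0.7,
                        max_completion_tokens=256, reasoning_effort="medium")
    # req now contains only model, max_completion_tokens, and reasoning_effort.
    ```
    
    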

    3. Prompting Guidance: Yes, here is the documentation for designing system messages. You can also control the output format; below is a sample system message and the output it produces.

    System message

    You're an assistant designed to extract entities from text. Users will paste in a string of text and you'll respond with entities you've extracted from the text as a JSON object.
    

    Output sample

    {  
       "name": "",
       "company": "",
       "phone_number": ""
    }
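    Since the system message above asks for a JSON object, it is worth validating the model's reply before using it. A minimal sketch, with a hypothetical sample reply standing in for real model output:

    ```python
    import json

    # Hypothetical model reply for the entity-extraction system message above.
    sample_output = '{"name": "Ada Lovelace", "company": "Analytical Engines", "phone_number": ""}'

    # Parse and check that the expected keys came back.
    entities = json.loads(sample_output)
    expected_keys = {"name", "company", "phone_number"}
    assert set(entities) == expected_keys
    ```
    
    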
    
    4. Exposing Chain-of-Thought / “Thought Summaries”: The Responses API has a reasoning parameter that returns summaries of the model's chain-of-thought reasoning.

    Below is the sample request.

    from openai import AzureOpenAI  # pip install openai

    # Endpoint, key, and API version are placeholders; the Responses API
    # requires a recent preview api_version.
    client = AzureOpenAI(
        azure_endpoint="https://<your-resource>.openai.azure.com/",
        api_key="<your-api-key>",
        api_version="2025-03-01-preview",
    )

    response = client.responses.create(
        input="Tell me about the curious case of neural text degeneration",
        model="o4-mini",  # replace with your model deployment name
        reasoning={
            "effort": "medium",
            "summary": "detailed"  # auto, concise, or detailed (currently only supported with o4-mini and o3)
        }
    )
    

    You can check more about this here.
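    Once the response comes back, the reasoning summaries arrive as items in the response output rather than in the main message text. The sketch below shows one way to collect them; the field names mirror the Responses API output items but may differ across SDK versions, so treat them as assumptions and check your own payloads.

    ```python
    # Hedged sketch: pull reasoning-summary text out of a Responses API result.
    # Field names ("type", "summary", "text") are assumptions based on the
    # Responses API output-item shape.
    def extract_summaries(response_output):
        texts = []
        for item in response_output:
            if item.get("type") == "reasoning":
                for part in item.get("summary", []):
                    texts.append(part.get("text", ""))
        return texts

    # Hypothetical output list illustrating the expected shape.
    sample = [
        {"type": "reasoning",
         "summary": [{"type": "summary_text", "text": "Model considered X."}]},
        {"type": "message",
         "content": [{"type": "output_text", "text": "Answer."}]},
    ]
    assert extract_summaries(sample) == ["Model considered X."]
    ```
    
    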

    5. Documentation & Examples: You can check the documentation below for reasoning models, where supported features and usage demos are documented: Azure OpenAI reasoning models - o3-mini, o1, o1-mini - Azure OpenAI | Microsoft Learn

    Please check all of this and let us know in the comments if you have any queries.

    Thank you

