Clarification on Chat Completion Roles, Capabilities, and Exposing Chain-of-Thought for o4-mini in Azure OpenAI

Harinath J 285 Reputation points
2025-05-21T07:32:06.5033333+00:00

Hi everyone,

I’m working with the Azure OpenAI o4-mini reasoning model via the Chat Completions API and I have a few questions about best practices, limitations, and how to surface the model’s internal reasoning:

  1. Roles
    • I see that “developer” messages are supported for o-series models, but normal models use “system.” Should I always use role: "developer" with o4-mini (and never system), or is mixing them acceptable?
    • Are there any other hidden or unsupported roles I should know about?
  2. Parameter Support & Limits
    • Which parameters are supported for o4-mini? I know that max_completion_tokens and reasoning_effort work, but parameters like temperature, top_p, presence_penalty, etc., seem to be ignored. Can you confirm?
    • What are the hard limits for token counts and throughput on o4-mini in Azure?
  3. Prompting Guidance
    • Do you have any official or community-recommended templates for structuring system/developer instructions vs. user content when using o-series models?
    • Any tips on controlling “chain-of-thought” depth or output format (e.g., JSON, bullet lists) via the prompt?
  4. Exposing Chain-of-Thought / “Thought Summaries”
    • In the Azure docs I saw a mention of a summary (or “thoughts”) parameter to explicitly surface the model’s reasoning steps, similar to the GPT UI “Show reasoning” feature. When I include that parameter in my payload, I get an “invalid argument” error. Is there a supported way to retrieve the model’s internal chain-of-thought or “thought summaries” in Azure’s o-series API?
    • If it isn’t yet supported, are there any recommended workarounds or upcoming feature flags?
  5. Documentation & Examples
    • Are there any up-to-date Azure docs, sample code repos, or forum threads that dive deep into these topics?

Thanks in advance for your help! Any pointers, example payloads, or timelines for feature support would be greatly appreciated.

Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.

Accepted answer
  JAYA SHANKAR G S 4,035 Reputation points · Microsoft External Staff Moderator
    2025-05-27T08:37:44.36+00:00

    Hello @Harinath J ,

    Below are the answers to your questions.

    1. Roles: Either a developer message or a system message alone is supported, but not both in the same request. This is also mentioned in the documentation:

    When you use a system message with o4-mini, o3, o3-mini, and o1, it will be treated as a developer message. You should not use both a developer message and a system message in the same API request.

    There are no hidden roles mentioned in the documentation; all we have are the developer, system, and user roles.
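    For illustration, a minimal Chat Completions request body using the developer role could look like the sketch below. The deployment name and instruction text are placeholders, not values from the documentation; the point is that the first message uses either `developer` or `system`, never both.

    ```python
    # Sketch of a Chat Completions payload for an o4-mini deployment.
    # "o4-mini" here is a placeholder deployment name.
    payload = {
        "model": "o4-mini",
        "messages": [
            {"role": "developer", "content": "Answer concisely in formal English."},
            {"role": "user", "content": "Summarize the benefits of unit testing."},
        ],
        "max_completion_tokens": 500,
    }

    # Use either "developer" or "system" for the instruction message, never both:
    roles = [m["role"] for m in payload["messages"]]
    assert not ({"developer", "system"} <= set(roles))
    ```
    
    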

    2. Parameter Support & Limits: Yes, you are right that max_completion_tokens and reasoning_effort work; the following parameters are not supported with reasoning models:
    • temperature, top_p, presence_penalty, frequency_penalty, logprobs, top_logprobs, logit_bias, max_tokens

    Regarding token usage, you can refer to the limits table for reasoning models.
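    One practical pattern, sketched below, is to strip the unsupported parameters before building a request so the same calling code can serve both reasoning and non-reasoning deployments. The `UNSUPPORTED` set mirrors the list above; the helper name is my own, not part of any SDK.

    ```python
    # Parameters that reasoning models such as o4-mini ignore or reject,
    # per the list above.
    UNSUPPORTED = {
        "temperature", "top_p", "presence_penalty", "frequency_penalty",
        "logprobs", "top_logprobs", "logit_bias", "max_tokens",
    }

    def build_request(**params):
        """Drop parameters that reasoning models do not support."""
        return {k: v for k, v in params.items() if k not in UNSUPPORTED}

    req = build_request(model="o4-mini", temperature=0.7,
                        max_completion_tokens=256, reasoning_effort="medium")
    # req now contains only model, max_completion_tokens, and reasoning_effort.
    ```
    
    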

    3. Prompting Guidance: Yes, here is the documentation for designing system messages. You can also control the output format; below is a sample system message and the output it produces.

    System message

    You're an assistant designed to extract entities from text. Users will paste in a string of text and you'll respond with entities you've extracted from the text as a JSON object.
    

    Output sample

    {  
       "name": "",
       "company": "",
       "phone_number": ""
    }
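    Since the system message above asks for a JSON object, it is worth validating the model's reply before using it. A minimal sketch, with a hypothetical sample reply standing in for real model output:

    ```python
    import json

    # Hypothetical model reply for the entity-extraction system message above.
    sample_output = '{"name": "Ada Lovelace", "company": "Analytical Engines", "phone_number": ""}'

    # Parse and check that the expected keys came back.
    entities = json.loads(sample_output)
    expected_keys = {"name", "company", "phone_number"}
    assert set(entities) == expected_keys
    ```
    
    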
    
    4. Exposing Chain-of-Thought / “Thought Summaries”: The Responses API has a reasoning parameter that returns summaries of the model's chain-of-thought reasoning.

    Below is the sample request.

    from openai import AzureOpenAI  # pip install openai

    # Endpoint, key, and API version are placeholders; the Responses API
    # requires a recent preview api_version.
    client = AzureOpenAI(
        azure_endpoint="https://<your-resource>.openai.azure.com/",
        api_key="<your-api-key>",
        api_version="2025-03-01-preview",
    )

    response = client.responses.create(
        input="Tell me about the curious case of neural text degeneration",
        model="o4-mini",  # replace with your model deployment name
        reasoning={
            "effort": "medium",
            "summary": "detailed"  # auto, concise, or detailed (currently only supported with o4-mini and o3)
        }
    )
    

    You can check more about this here.
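    Once the response comes back, the reasoning summaries arrive as items in the response output rather than in the main message text. The sketch below shows one way to collect them; the field names mirror the Responses API output items but may differ across SDK versions, so treat them as assumptions and check your own payloads.

    ```python
    # Hedged sketch: pull reasoning-summary text out of a Responses API result.
    # Field names ("type", "summary", "text") are assumptions based on the
    # Responses API output-item shape.
    def extract_summaries(response_output):
        texts = []
        for item in response_output:
            if item.get("type") == "reasoning":
                for part in item.get("summary", []):
                    texts.append(part.get("text", ""))
        return texts

    # Hypothetical output list illustrating the expected shape.
    sample = [
        {"type": "reasoning",
         "summary": [{"type": "summary_text", "text": "Model considered X."}]},
        {"type": "message",
         "content": [{"type": "output_text", "text": "Answer."}]},
    ]
    assert extract_summaries(sample) == ["Model considered X."]
    ```
    
    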

    5. Documentation & Examples: You can check the documentation below for reasoning models, where supported features and usage demos are documented: Azure OpenAI reasoning models - o3-mini, o1, o1-mini - Azure OpenAI | Microsoft Learn

    Please check all of this and let us know in the comments if you have any queries.

    Thank you

