Hi Siry, Gaetan
The issue you're experiencing with the o4-mini model generating hallucinations, such as providing a workout plan in response to a math problem, could be attributed to several factors. Since the Responses API is currently in preview, it may not yet be fully optimized for all types of queries, especially complex reasoning tasks.
The hallucinations could be a result of the model's limitations in understanding the context or the specific nature of the task when reasoning is enabled. In contrast, the ChatCompletions API might be more robust for certain types of queries, as you noted that it performs well with reasoning.
It's important to consider that the Responses API may have different performance characteristics compared to the ChatCompletions API.
Kindly refer below link: responses
Thank You.