Hello Marvin Garcia,
Thank you for sharing the details. Let me clarify what is happening with your batch job.
The garbled or blank-looking output you observed (for example, "rrrr /n \r\n\n\n") typically occurs when the model is allowed to generate a very large number of tokens (here, max_tokens=10000). In such cases, the model may “drift” into repetitive filler text and keep producing it until it reaches the generation limit. This behavior is not treated as an error at the API level, which is why you see status_code: 200.
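In your example, prompt_tokens was 309 but total_tokens was 33,077. This is because total_tokens includes both the prompt and all generated output tokens, even if those outputs are repetitive or appear blank, which leaves 32,768 completion tokens here. Thousands of newline or whitespace tokens still count toward billing, which explains why the token usage seems unexpectedly high. If these figures come from a single response, note that 32,768 is above the max_tokens value of 10,000, so it is also worth confirming that the limit was actually applied to that row of the batch input.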
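As a quick check of the figures above, the completion token count can be derived from the usage values. This is only a minimal sketch; the function name is illustrative and not part of any SDK.

```python
def completion_tokens(prompt_tokens: int, total_tokens: int) -> int:
    """Return the number of generated tokens implied by a usage record."""
    if total_tokens < prompt_tokens:
        raise ValueError("total_tokens cannot be smaller than prompt_tokens")
    return total_tokens - prompt_tokens


# Values reported in the question.
generated = completion_tokens(prompt_tokens=309, total_tokens=33_077)
print(generated)  # 32768
```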
Regarding billing: yes, you are charged for all tokens that the model processes or generates, even if the output is not useful. Azure OpenAI billing is based solely on token usage (prompt + completion), with no distinction between “good” and “bad” tokens. So, if a request consumes 33k tokens, those tokens are billable. The batch system returns a 200 response as long as the request was successfully processed by the model and produced a response payload. Errors are returned only if the input exceeds limits, the job fails internally, or the service is unavailable. A malformed or repetitive response is still considered a valid completion.
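Because a 200 status does not tell you whether the completion is usable, one option is to scan the batch output file for rows that succeeded but stopped only because they hit the token limit. The sketch below assumes each output line is a JSON object with custom_id, response.status_code, and a chat-completions style response.body (choices[].finish_reason and usage.completion_tokens); please verify those field names against your own output file. The function name is hypothetical.

```python
import json


def flag_truncated_rows(jsonl_text: str) -> list[dict]:
    """Return rows that returned 200 but ended with finish_reason == "length"."""
    flagged = []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        row = json.loads(line)
        response = row.get("response") or {}
        if response.get("status_code") != 200:
            continue  # failed rows are handled elsewhere
        body = response.get("body") or {}
        usage = body.get("usage") or {}
        for choice in body.get("choices", []):
            # "length" means generation stopped at the token limit.
            if choice.get("finish_reason") == "length":
                flagged.append({
                    "custom_id": row.get("custom_id"),
                    "completion_tokens": usage.get("completion_tokens"),
                })
                break
    return flagged


# Made-up sample rows in the assumed layout.
sample = "\n".join([
    json.dumps({"custom_id": "row-1", "response": {"status_code": 200, "body": {
        "choices": [{"finish_reason": "length", "message": {"content": "\n\n\n"}}],
        "usage": {"prompt_tokens": 309, "completion_tokens": 32768}}}}),
    json.dumps({"custom_id": "row-2", "response": {"status_code": 200, "body": {
        "choices": [{"finish_reason": "stop", "message": {"content": "Done."}}],
        "usage": {"prompt_tokens": 300, "completion_tokens": 5}}}}),
])
print(flag_truncated_rows(sample))
```

Rows flagged this way are still billed, but the list gives you the custom_id values to review or retry.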
To reduce this issue, it’s best to set a more realistic max_tokens limit rather than always using 10,000. Start with smaller values such as 1,000-2,000 and increase only if needed. You can also add post-processing validation to detect malformed or empty outputs and retry those rows with adjusted parameters. Prompt engineering techniques, such as instructing the model to “stop when the answer is complete” or using stop sequences, may help prevent drifting. Finally, make sure to monitor token usage in the Azure portal and configure cost alerts to avoid unexpected charges.
Please find the attached document for your reference:
I hope this helps. Do let me know if you have any further queries.
Thank you!