An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.
GPT-4.1 and similar LLMs can occasionally ignore or override language instructions in the prompt because responses are probabilistic and the model can be influenced by subtle signals in the conversation history or training data. Even when all visible messages are in English and the system prompt says “Preserve original language… Never translate,” the model can still:
- Misinterpret the “original language” when there are prior multilingual tokens, names, or examples in the thread.
- Drift to another language due to stochastic behavior, especially across multiple turns.
- Be affected by other hidden or upstream instructions in the calling application or orchestration layer.
To reduce this behavior for a query condenser prompt:
- Strengthen and repeat the language constraint
- Add explicit, high‑priority instructions such as:
- “All input messages are in English. Always respond in English only. Do not use any other language under any circumstances.”
- Repeat this instruction near the end of the system message so it is more salient.
- Add explicit, high‑priority instructions such as:
- Make the task and language requirement more specific
- Instead of only “Preserve original language,” use something like:
- “Reformulate the user’s latest query in English, preserving meaning and intent. Do not translate to or from any other language. If any non‑English text appears in the input, still respond in English.”
- Instead of only “Preserve original language,” use something like:
- Avoid ambiguous or multilingual concatenation
- Ensure the application does not concatenate user data or prior messages in a way that mixes languages in a single instruction block. Guidance for multilingual prompts recommends avoiding concatenating user data directly into the prompt because it can create a multilingual direct instruction and confuse the model.
- Validate and monitor
- Because model behavior is probabilistic and can vary for the same prompt, test the prompt across multiple runs and conversation histories.
- Implement guardrails in the application layer (for example, checking the output language and re‑prompting or correcting if it is not English).
These steps align with general prompt‑engineering and localization guidance: be explicit about language, avoid mixed‑language instructions, and tailor prompts for the specific task and locale behavior.
References: