Hello Tarandeep Singh Khurana,
Welcome to the Microsoft Q&A and thank you for posting your questions here.
I understand your issue: the GPT tool output is being ignored, and the model uses prices from its training data instead of the function call results.
The following steps should help resolve the issue:
- Use structured outputs so the model must emit the exact tool values; pin both the string and numeric price via `enum` (or `const`, if supported) and set `strict` to `true`. See Structured outputs and the JSON-schema class notes on supported keywords in the SDKs (e.g., `JsonSchemaFormat`) for limits and strict-validation behavior: https://learn.microsoft.com/en-us/azure/foundry/openai/how-to/structured-outputs

  ```json
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "price_payload",
      "schema": {
        "type": "object",
        "properties": {
          "ticker": {"type": "string"},
          "price_text": {"type": "string", "enum": ["$248.35"]},
          "price_value": {"type": "number", "enum": [248.35]},
          "as_of": {"type": "string"}
        },
        "required": ["ticker", "price_text", "price_value", "as_of"],
        "additionalProperties": false
      },
      "strict": true
    }
  }
  ```

- Fetch live data via function/tool calling, then have the model write prose with placeholders and let your server inject the numbers before rendering; this mirrors Microsoft's recommended function-calling orchestration. See the Function calling how-to and, if you are using agents, Agents + function tools.
  ```python
  template = "AAPL trades at {PRICE} as of {AS_OF}."
  rendered = template.replace("{PRICE}", price_text).replace("{AS_OF}", as_of)
  ```

- Add a post-generation guardrail (block drift before users see it) by running the final text against Azure AI Content Safety – Groundedness, using the tool payload as the context; regenerate or block if the text is ungrounded, especially for finance. See Groundedness (concepts) and the Groundedness quickstart.

  ```python
  grounded = groundedness.check(response_text, context=tool_context)
  if not grounded:
      regenerate_or_block()
  ```

- Keep temperature low, but don't rely on it for correctness: the schema wins. Set a low temperature (e.g., 0–0.2) and a `seed`, if your stack supports it, to steady style; accuracy still comes from the schema and binding, not from sampling knobs. See the temperature semantics in the .NET inference SDK (doc), and prefer structured outputs over plain JSON mode for strict adherence (guidance).

  ```json
  {
    "model": "gpt-4.1",
    "temperature": 0.1,
    "seed": 42,
    "response_format": { "...": "json_schema as above" }
  }
  ```

- (Optional) Fine-tune for tool-use habits, not for truth. If you want fewer validation failures, fine-tune on traces where the assistant calls the tool and returns schema-shaped results, but keep the schema/binding gates; those enforce truth. See: Fine-tuning for tool calling.

  ```json
  {
    "messages": [
      {"role": "user", "content": "Apple price?"},
      {"role": "assistant", "tool_calls": [{"type": "function", "function": {"name": "get_price", "arguments": "{\"ticker\": \"AAPL\"}"}}]}
    ],
    "tools": [{"type": "function", "function": {"name": "get_price", "parameters": {"type": "object", "properties": {"ticker": {"type": "string"}}}}}]
  }
  ```

- Validate, retry, and fail closed, with the tool data as the source of truth. On 400s from strict schema checks, retry once with a simplified schema; if it still fails, render the tool values directly and log for triage. Use the Responses/Chat APIs as the transport and keep schemas within the supported subsets. See the Responses API guide and the schema-strictness notes from the platform guidance on structured outputs (how-to).

  ```python
  try:
      result = client.responses.create(payload)
  except HTTPError as e:
      if e.status == 400:
          show(tool_price)
          log(e)
  ```
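Putting the steps above together, here is a minimal end-to-end sketch of the pattern: pin the schema `enum` to the live tool value, have the model write only placeholders, inject the numbers server-side, and fail closed to the tool data on any model error. The `call_model` and `fetch_price` callables are hypothetical placeholders for your own function-calling stack, not real SDK APIs:

```python
def build_price_schema(price_text: str, price_value: float) -> dict:
    # Build a strict response_format that pins the model to the live tool values.
    return {
        "type": "json_schema",
        "json_schema": {
            "name": "price_payload",
            "schema": {
                "type": "object",
                "properties": {
                    "ticker": {"type": "string"},
                    "price_text": {"type": "string", "enum": [price_text]},
                    "price_value": {"type": "number", "enum": [price_value]},
                    "as_of": {"type": "string"},
                },
                "required": ["ticker", "price_text", "price_value", "as_of"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    }


def render(template: str, price_text: str, as_of: str) -> str:
    # Server-side injection: the model only ever writes placeholders.
    return template.replace("{PRICE}", price_text).replace("{AS_OF}", as_of)


def answer_price_question(call_model, fetch_price, ticker: str) -> str:
    # Fail closed: if the model call raises, render the tool values directly.
    tool = fetch_price(ticker)  # e.g. {"price_text": "$248.35", ...}
    schema = build_price_schema(tool["price_text"], tool["price_value"])
    try:
        prose = call_model(schema)  # model returns prose with placeholders
    except Exception:
        prose = "{TICKER} trades at {PRICE} as of {AS_OF}.".replace("{TICKER}", ticker)
    return render(prose, tool["price_text"], tool["as_of"])
```

Because the final text is always rendered from the tool payload, a hallucinated price can never reach the user: either the schema rejects it or the placeholder injection overwrites it.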
I hope this is helpful! Do not hesitate to let me know if you have any other questions or need further clarification.
Please don't forget to close the thread by upvoting and accepting this as the answer if it was helpful.