Tutorial: Moderate a model service's content with guardrails and service policies

Important

This feature is in Beta. Account admins can manage access to this feature from the account console Previews page. See Manage Azure Databricks previews.

This tutorial walks through moderating the content of a Model Service's interactions in two complementary ways. A model service can front an Azure Databricks-hosted model or an external provider such as OpenAI, Anthropic, or Google, and you govern both the same way:

Built-in guardrails: managed, Azure Databricks-provided checks for common risks such as PII, unsafe content, jailbreak attempts, and hallucinations. You attach one by selecting it in the UI, with no code to write.
Custom service policies: SQL functions you write for rules specific to your organization, such as a confidential project codename or a banned response pattern.

You attach both the same way, from a model service's Policies tab in the Unity AI Gateway UI, and you can mix them. Azure Databricks evaluates each at two points: ON CALL (before it invokes the model) and ON RESULT (after the model responds). In the UI, you select the phase when you attach the policy. A custom policy can also scope itself to one phase by branching on event:type.

Scenario: Your team exposes an LLM through a Model Service (main.default.team_chat) that apps and agents call. You want to block personally identifiable information (PII) with a managed guardrail, block any prompt that mentions a confidential project, and block responses that contain an insecure link, all without changing application code.

By the end of this tutorial, you have:

A built-in PII guardrail on the service.
A custom request policy that blocks a confidential codename at ON CALL.
A custom response policy that blocks insecure links at ON RESULT.
All three attached and verified from the model service's playground.

Prerequisites

A workspace enabled for Unity Catalog. See Get started with Unity Catalog.
The Unity AI Gateway preview enabled for your account. See Manage Azure Databricks previews.
A Model Service to govern, and EXECUTE on it so you can test. To create one, see Discover foundation models. This tutorial uses main.default.team_chat.
MANAGE on the Model Service, to attach policies.
CREATE FUNCTION on the schema where you create the custom policy functions (main.governance in this tutorial).
For the built-in guardrail: the evaluator model that runs the guardrail's check (the LLM judge) is preselected, so no setup is needed. If you select a different evaluator under Advanced options, you need CAN_QUERY on it.
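
The schema-level privilege in the list above can be granted with standard Unity Catalog SQL. The following is a minimal sketch, assuming a hypothetical group named policy-authors and that main.governance might not exist yet. The Model Service privileges (EXECUTE, MANAGE) are not shown, because this tutorial does not state a SQL syntax for granting them.

```sql
-- Assumption: `policy-authors` is a hypothetical group; substitute your own principal.
CREATE SCHEMA IF NOT EXISTS main.governance;

-- CREATE FUNCTION on the schema is only usable together with
-- USE CATALOG and USE SCHEMA on its parents.
GRANT USE CATALOG ON CATALOG main TO `policy-authors`;
GRANT USE SCHEMA ON SCHEMA main.governance TO `policy-authors`;
GRANT CREATE FUNCTION ON SCHEMA main.governance TO `policy-authors`;
```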

Step 1: Apply a built-in guardrail

Built-in guardrails are managed LLM-judge checks. You select one from the Guardrail type menu; the evaluator model service that runs the check (the LLM judge) is preselected for you. The available guardrails are:

PII Blocking (system.ai.block_pii): denies content that contains PII.
Unsafe Content (system.ai.block_unsafe_content): denies unsafe or harmful content.
Jailbreak (system.ai.block_jailbreak): denies prompt-injection and jailbreak attempts (requests only).
Hallucination (system.ai.block_hallucination): denies hallucinated responses (responses only).

Attach the PII guardrail to your model service:

In the workspace sidebar, click AI Gateway.
On the Models tab, select your model service (main.default.team_chat).
Open the Policies tab, then click New policy.
Enter a Name, such as block-pii.
Under Applied to, keep All account users, or scope the policy to specific principals.
In Guardrail type, select PII Blocking.
Set Rank to 1. Rank sets the evaluation order: the lowest rank runs first on the request and last on the response.
Under Phase, select both Input guardrails (Before the Model) and Output guardrails (After the Model), so the guardrail runs on requests and responses.
Click Create policy.

The guardrail uses a preselected Evaluator model service (the LLM judge that runs the check). To use a different model, expand Advanced options before creating the policy; you need CAN_QUERY on the model you select.

Note

After you attach or change a policy on a Model Service, allow a short time for the change to take effect before testing. During the Beta, propagation can take a couple of minutes.

Step 2: Add a custom request policy

Guardrails cover common risks. For a rule specific to your organization, write a custom policy. A custom policy is a SQL UDF that takes a single argument, event VARIANT, and returns its decision as a VARIANT. In this tutorial, the decision is an object with a result of ALLOW or DENY and a reason string. Read the message text from event:context.message, an API-agnostic projection of the last user or assistant message.

This policy denies any request that mentions a confidential project codename. The event:type::string = 'request' check confines it to ON CALL:

CREATE OR REPLACE FUNCTION main.governance.block_confidential_codename(
  event VARIANT
)
RETURNS VARIANT
LANGUAGE SQL
RETURN
  CASE
    WHEN event:type::string = 'request'
      AND contains(lower(event:context.message::string), 'project aurora')
    THEN to_variant_object(named_struct('result', 'DENY', 'reason', 'Requests about confidential projects are not permitted.'))
    ELSE to_variant_object(named_struct('result', 'ALLOW', 'reason', ''))
  END;

contains and lower are part of the SQL subset supported in policy bodies. For the full list and the rules for writing a policy function, see Service policy function reference.
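
Before attaching the function, you can exercise it directly in a SQL editor. This is a sketch under one loud assumption: the events below are hand-built with parse_json and contain only the two fields the function reads (type and context.message). The event that the gateway actually passes may carry additional fields.

```sql
-- Assumption: hand-built events with only the fields this policy reads.
WITH probes AS (
  SELECT
    label,
    main.governance.block_confidential_codename(parse_json(event_json)) AS decision
  FROM VALUES
    ('codename in request',  '{"type": "request",  "context": {"message": "Tell me about Project Aurora"}}'),
    ('ordinary request',     '{"type": "request",  "context": {"message": "Summarize this meeting"}}'),
    ('codename in response', '{"type": "response", "context": {"message": "Project Aurora ships in May"}}')
  AS t(label, event_json)
)
SELECT
  label,
  decision:result::string AS result,
  decision:reason::string AS reason
FROM probes;
```

Given the CASE logic above, only the first probe should come back as DENY. The third probe mentions the codename but has type response, so it falls through to ALLOW.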

Attach the function as a custom policy, scoped to the request phase:

On the model service's Policies tab, click New policy and enter a Name, such as block-codename.
In Guardrail type, select Custom.
Click Custom function, then Select function, and select main.governance.block_confidential_codename.
Under Phase, select only Input guardrails (Before the Model), because this is a request policy.
Set Rank to 10, then click Create policy.

Step 3: Add a custom response policy

A model can also return content you don't want to pass through. This policy denies a response that contains an insecure link (an http:// URL or a javascript: URI). The event:type::string = 'response' check confines it to ON RESULT, so a user who merely mentions those schemes in a prompt doesn't trip it on the way in:

CREATE OR REPLACE FUNCTION main.governance.block_unsafe_links(
  event VARIANT
)
RETURNS VARIANT
LANGUAGE SQL
RETURN
  CASE
    WHEN event:type::string = 'response'
      AND (contains(lower(event:context.message::string), 'http://')
        OR contains(lower(event:context.message::string), 'javascript:'))
    THEN to_variant_object(named_struct('result', 'DENY', 'reason', 'Response contained an insecure link and was blocked by policy.'))
    ELSE to_variant_object(named_struct('result', 'ALLOW', 'reason', ''))
  END;
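
The same kind of direct check works for the response policy. As before, this is a sketch that assumes hand-built events containing only type and context.message; example.com and the message texts are placeholders.

```sql
-- Assumption: hand-built events with only the fields this policy reads.
WITH probes AS (
  SELECT
    label,
    main.governance.block_unsafe_links(parse_json(event_json)) AS decision
  FROM VALUES
    ('http link in response',      '{"type": "response", "context": {"message": "See http://example.com/docs"}}'),
    ('javascript URI in response', '{"type": "response", "context": {"message": "Click JAVASCRIPT:alert(1)"}}'),
    ('https link in response',     '{"type": "response", "context": {"message": "See https://example.com/docs"}}'),
    ('http link in request',       '{"type": "request",  "context": {"message": "Is http://example.com safe?"}}')
  AS t(label, event_json)
)
SELECT
  label,
  decision:result::string AS result,
  decision:reason::string AS reason
FROM probes;
```

The first two probes should be denied (the second because the function lowercases the message before matching). The https:// response and the http:// request should both be allowed: the substring http:// does not occur in https://, and the function only acts on events of type response.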

Attach the function as a custom policy, scoped to the response phase:

On the Policies tab, click New policy and enter a Name, such as block-unsafe-links.
In Guardrail type, select Custom, then select main.governance.block_unsafe_links under Custom function.
Under Phase, select only Output guardrails (After the Model), because this is a response policy.
Set Rank to 20, then click Create policy.
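
Because a custom policy can branch on event:type, the two rules could also live in a single function attached with both phases selected. The sketch below is an alternative, not part of the tutorial's steps; the function name block_codename_and_links is hypothetical, and if you create it you need to drop it separately during cleanup.

```sql
-- Assumption: hypothetical combined policy; not required by this tutorial.
CREATE OR REPLACE FUNCTION main.governance.block_codename_and_links(
  event VARIANT
)
RETURNS VARIANT
LANGUAGE SQL
RETURN
  CASE
    -- Request phase: block the confidential codename.
    WHEN event:type::string = 'request'
      AND contains(lower(event:context.message::string), 'project aurora')
    THEN to_variant_object(named_struct('result', 'DENY', 'reason', 'Requests about confidential projects are not permitted.'))
    -- Response phase: block insecure links.
    WHEN event:type::string = 'response'
      AND (contains(lower(event:context.message::string), 'http://')
        OR contains(lower(event:context.message::string), 'javascript:'))
    THEN to_variant_object(named_struct('result', 'DENY', 'reason', 'Response contained an insecure link and was blocked by policy.'))
    ELSE to_variant_object(named_struct('result', 'ALLOW', 'reason', ''))
  END;
```

Keeping the rules in separate functions, as the steps above do, lets you rank, scope, and remove each rule independently. A combined function trades that flexibility for a single policy to manage.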

Step 4: Verify

All three policies now appear on the service's Policies tab. Use the playground to confirm each one fires:

On the model service page, click Chat in playground.
Send a prompt that contains PII, such as "My SSN is 123-45-6789, store it." The PII guardrail blocks the request and you receive a structured error.
Send "Tell me about Project Aurora." The request policy blocks it with the reason "Requests about confidential projects are not permitted."
Send a prompt that makes the model return an http:// link. The response policy blocks it at ON RESULT with the reason "Response contained an insecure link and was blocked by policy."
Send an ordinary prompt. It returns a normal completion.

To test the service from your own apps or scripts instead, see Use model services.

Clean up

When you're done, remove the policies on the Policies tab: open each policy you created and delete it. Then optionally drop the custom functions:

DROP FUNCTION IF EXISTS main.governance.block_confidential_codename;
DROP FUNCTION IF EXISTS main.governance.block_unsafe_links;
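
To confirm the functions are gone, you can list the user-defined functions that remain in the schema. This is a small sketch; it assumes you still have USE SCHEMA on main.governance.

```sql
-- Lists the remaining user-defined functions in the schema.
-- After the DROP statements above, neither tutorial function should appear.
SHOW USER FUNCTIONS IN main.governance;
```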

Last updated on 2026-06-29