Guardrail annotations (classic)

Note

This document refers to the Microsoft Foundry (classic) portal.

View the Microsoft Foundry (new) documentation to learn about the new portal.

Microsoft Foundry Models provide annotations to help you understand the Guardrail (previously content filtering) results for your requests. Annotations can be enabled even for filters and severity levels that have been disabled from blocking content.

Standard guardrail annotations

When annotations are enabled as shown in the code snippets below, the following information is returned via the API for the categories hate and fairness, sexual, violence, and self-harm:

  • The risk category (hate, sexual, violence, self_harm)
  • The severity level (safe, low, medium, or high) within each content category
  • The filtering status (true or false)
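Once a response has been parsed, these standard annotations can be read from each choice's content_filter_results. A minimal sketch, assuming the response JSON has already been loaded into a plain dict; the helper name `summarize_standard_annotations` is illustrative, not part of any SDK:

```python
# Illustrative helper (not part of the OpenAI SDK): summarize the standard
# guardrail annotations from one parsed choice of a response.
STANDARD_CATEGORIES = ("hate", "sexual", "violence", "self_harm")

def summarize_standard_annotations(choice: dict) -> dict:
    """Return {category: (severity, filtered)} for the standard categories."""
    results = choice.get("content_filter_results", {})
    return {
        category: (results[category]["severity"], results[category]["filtered"])
        for category in STANDARD_CATEGORIES
        if category in results
    }
```

Categories that are absent from the response are simply omitted from the summary rather than raising an error.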


Optional model annotations

Optional models can be set to annotate mode (returns information when content is flagged, but not filtered) or filter mode (returns information when content is flagged and filtered).

When annotations are enabled as shown in the code snippets below, the following information is returned by the API for each optional model:

  • User prompt attacks (Prompt Shield): attack detected (true or false), filtered (true or false)
  • Indirect attacks (Prompt Shield): detected (true or false), filtered (true or false)
  • Protected material text: detected (true or false), filtered (true or false)
  • Protected material code: detected (true or false), filtered (true or false), an example citation of the public GitHub repository where the code snippet was found, and the license of that repository
  • Personally identifiable information (PII): detected (true or false), filtered (true or false), redacted (true or false)
  • Groundedness: detected (true or false), filtered (true or false, with details); in annotate mode only, details include completion_start_offset and completion_end_offset
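Because the optional-model entries carry a detected flag (the standard severity categories expose severity instead), a parsed choice can be scanned for every optional model that flagged content. A hedged sketch; the helper name is an assumption:

```python
# Illustrative helper: list every optional-model annotation in one parsed
# choice whose "detected" flag is true. Standard severity categories use
# "severity" rather than "detected", so they are skipped automatically.
def flagged_optional_models(choice: dict) -> list:
    results = choice.get("content_filter_results", {})
    return sorted(name for name, entry in results.items() if entry.get("detected"))
```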

When displaying code in your application, we strongly recommend that the application also displays the example citation from the annotations. Compliance with the cited license may also be required for Customer Copyright Commitment coverage.
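For example, the citation fields in the annotation can be turned into a short notice for display next to the generated code. A sketch; `format_citation_notice` is an illustrative name, not an SDK function:

```python
# Illustrative helper: build a display string from the protected material code
# citation annotation so the application can surface it alongside the code.
def format_citation_notice(choice: dict):
    entry = choice.get("content_filter_results", {}).get("protected_material_code", {})
    if not entry.get("detected"):
        return None  # no protected code detected, nothing to cite
    citation = entry.get("citation", {})
    return "Source: {} (license: {})".format(citation.get("URL"), citation.get("license"))
```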

Annotation mode availability depends on the API version. The filter categories covered are Hate, Violence, Sexual, Self-harm, Prompt Shield for user prompt attacks, Prompt Shield for indirect attacks, Protected material text, Protected material code, Personally identifiable information (PII), Profanity blocklist, Custom blocklist, and Groundedness1, across the API versions 2023-06-01-preview, 2023-10-01-preview, 2024-02-01 (GA), 2024-04-01-preview, 2024-10-01-preview, and 2025-01-01-preview.

1 Groundedness detection is available only in streaming scenarios, not in non-streaming scenarios. The following regions support Groundedness detection: Central US, East US, France Central, and Canada East.
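In annotate mode, the groundedness details include character offsets into the completion text; assuming the offsets are character indices into the completion string (an assumption of this sketch), the flagged span can be recovered like this:

```python
# Illustrative sketch: recover the ungrounded span of a completion from the
# groundedness annotation's offset details (annotate mode). Assumes the
# offsets are character indices into the completion text.
def ungrounded_span(completion_text: str, details: dict) -> str:
    start = details["completion_start_offset"]
    end = details["completion_end_offset"]
    return completion_text[start:end]
```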

Code examples

The following code snippet shows how to view content filter annotations using the OpenAI Python SDK.

import os
from openai import AzureOpenAI

# os.getenv() for the endpoint and key assumes that you are using environment variables.
client = AzureOpenAI(
    api_key=os.getenv("AZURE_OPENAI_API_KEY"),
    api_version="2024-03-01-preview",
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
)

response = client.completions.create(
    model="gpt-35-turbo-instruct",  # model = "deployment_name"
    prompt="{Example prompt where a severity level of low is detected}",
    # Content detected at severity level medium or high is filtered,
    # while content detected at severity level low isn't filtered.
)

print(response.model_dump_json(indent=2))

Output

{ 
  "choices": [ 
    { 
      "content_filter_results": { 
        "hate": { 
          "filtered": false, 
          "severity": "safe" 
        }, 
        "protected_material_code": { 
          "citation": { 
            "URL": " https://github.com/username/repository-name/path/to/file-example.txt", 
            "license": "EXAMPLE-LICENSE" 
          }, 
          "detected": true,
          "filtered": false 
        }, 
        "protected_material_text": { 
          "detected": false, 
          "filtered": false 
        }, 
        "self_harm": { 
          "filtered": false, 
          "severity": "safe" 
        }, 
        "sexual": { 
          "filtered": false, 
          "severity": "safe" 
        }, 
        "violence": { 
          "filtered": false, 
          "severity": "safe" 
        } 
      }, 
      "finish_reason": "stop", 
      "index": 0, 
      "message": { 
        "content": "Example model response will be returned ", 
        "role": "assistant" 
      } 
    } 
  ], 
  "created": 1699386280, 
  "id": "chatcmpl-8IMI4HzcmcK6I77vpOJCPt0Vcf8zJ", 
  "model": "gpt-35-turbo-instruct", 
  "object": "text.completion",
  "usage": { 
    "completion_tokens": 40, 
    "prompt_tokens": 11, 
    "total_tokens": 417 
  },  
  "prompt_filter_results": [ 
    { 
      "content_filter_results": { 
        "hate": { 
          "filtered": false, 
          "severity": "safe" 
        }, 
        "jailbreak": { 
          "detected": false, 
          "filtered": false 
        }, 
        "profanity": { 
          "detected": false, 
          "filtered": false 
        }, 
        "self_harm": { 
          "filtered": false, 
          "severity": "safe" 
        }, 
        "sexual": { 
          "filtered": false, 
          "severity": "safe" 
        }, 
        "violence": { 
          "filtered": false, 
          "severity": "safe" 
        } 
      }, 
      "prompt_index": 0 
    } 
  ]
} 

For details on the inference REST API endpoints for Azure OpenAI and how to create chat and completions requests, see the Azure OpenAI REST API reference. Annotations are returned for all scenarios when you use any preview API version starting from 2023-06-01-preview, as well as the GA API version 2024-02-01.
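When the completion itself is filtered (rather than the prompt), the affected choice reports finish_reason "content_filter" instead of "stop", so a parsed response can be checked programmatically. A minimal sketch; the helper name is illustrative:

```python
# Illustrative check: a choice whose completion was cut off by the content
# filters reports finish_reason "content_filter" instead of "stop".
def completion_was_filtered(response: dict) -> bool:
    return any(
        choice.get("finish_reason") == "content_filter"
        for choice in response.get("choices", [])
    )
```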