Quickstart: Custom categories (standard mode)

Follow this guide to use the Azure AI Content Safety custom categories REST API to create content categories for your own use case and train Azure AI Content Safety to detect them in new text content.

Important

This feature is only available in certain Azure regions. See Region availability.

Important

Allow enough time for model training

End-to-end custom category training can take roughly five to ten hours. Plan your moderation pipeline accordingly.

Prerequisites

  • An Azure subscription - Create one for free
  • Once you have your Azure subscription, create a Content Safety resource in the Azure portal to get your key and endpoint. Enter a unique name for your resource, select your subscription, and select a resource group, supported region, and supported pricing tier. Then select Create.
    • The resource takes a few minutes to deploy. After it finishes, select Go to resource. In the left pane, under Resource Management, select Subscription Key and Endpoint. Copy the endpoint and either of the key values to a temporary location for later use.
  • Also create an Azure Blob Storage container to hold your training annotation file.
  • cURL installed, to run the REST API commands in this guide.

Prepare your training data

To train a custom category, you need example text data that represents the category you want to detect. In this guide, you can use sample data. The provided annotation file contains text prompts about survival advice in camping/wilderness situations. The trained model will learn to detect this type of content in new text data.

Tip

For tips on creating your own data set, see the How-to guide.

  1. Download the sample text data file from the GitHub repository.
  2. Upload the .jsonl file to your Azure Storage account blob container. Then copy the blob URL to a temporary location for later use.
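If you prefer the command line to the portal, the upload step could look roughly like the sketch below. It assumes the Azure CLI (`az`) is installed and signed in, and that the storage account lives in the public Azure cloud, where blob URLs follow the `https://<account>.blob.core.windows.net/<container>/<blob>` pattern. The function names and the account and container arguments are illustrative placeholders, not part of this quickstart.

```shell
# Build the blob URL for a file in a storage container.
# Assumption: public Azure cloud endpoint suffix (blob.core.windows.net).
blob_url() {
    account="$1"; container="$2"; blob="$3"
    printf 'https://%s.blob.core.windows.net/%s/%s\n' "$account" "$container" "$blob"
}

# Upload the annotation file with the Azure CLI, then print its blob URL.
# Assumption: you ran `az login` and have write access to the container.
upload_sample() {
    account="$1"; container="$2"; file="$3"
    blob="$(basename "$file")"
    az storage blob upload \
        --account-name "$account" \
        --container-name "$container" \
        --name "$blob" \
        --file "$file" \
        --auth-mode login &&
    blob_url "$account" "$container" "$blob"
}
```

For example, `upload_sample <your-storage-account> example-container survival-advice.jsonl` would upload the file and print the URL to use later as `sampleBlobUrl`.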

Grant storage access

Next, give your Content Safety resource permission to read from the Azure Storage resource. Enable a system-assigned managed identity for the Azure AI Content Safety instance, then assign the Storage Blob Data Contributor, Owner, or Reader role to that identity:

  1. Enable managed identity for the Azure AI Content Safety instance.

    Screenshot of Azure portal enabling managed identity.

  2. Assign one of the Storage Blob Data roles (Contributor, Owner, or Reader) to the managed identity. Any of the roles highlighted below should work.

    Screenshot of the Add role assignment screen in Azure portal.

    Screenshot of assigned roles in the Azure portal.

    Screenshot of the managed identity role.
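The same two portal steps can be approximated with the Azure CLI. This is a sketch under stated assumptions: `az` is installed and signed in with permission to create role assignments, and the resource names, resource group, and subscription ID passed in are your own. The helper names are hypothetical, and the default role is Storage Blob Data Contributor, one of the roles named above.

```shell
# Build the resource ID of a storage account, used as the role assignment scope.
storage_scope() {
    printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Storage/storageAccounts/%s\n' \
        "$1" "$2" "$3"
}

# Enable the system-assigned identity on the Content Safety resource and
# grant it a blob data role on the storage account.
# $1: subscription ID   $2: resource group   $3: Content Safety resource name
# $4: storage account   $5: role name (optional)
grant_storage_access() {
    role="${5:-Storage Blob Data Contributor}"
    principal_id="$(az cognitiveservices account identity assign \
        --name "$3" --resource-group "$2" \
        --query principalId --output tsv)" || return 1
    az role assignment create \
        --assignee-object-id "$principal_id" \
        --assignee-principal-type ServicePrincipal \
        --role "$role" \
        --scope "$(storage_scope "$1" "$2" "$4")"
}
```

The sketch assumes both resources sit in the same resource group. If they do not, build the scope with the storage account's own resource group instead.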

Create and train a custom category

In the commands below, replace <your_api_key>, <your_endpoint>, and the other placeholder parameters with your own values. Then enter each command in a terminal window and run it.
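Because every command in this section shares the same endpoint prefix and API version, a small helper can keep the URLs consistent. The sketch below is an optional convenience, not part of the API. The function name `cs_url` is hypothetical, and it trims one trailing slash from the endpoint on the assumption that the value you copied may end with one.

```shell
# API version used by every request in this guide.
API_VERSION="2024-02-15-preview"

# Build a Content Safety request URL from the endpoint and a resource path.
# A single trailing slash on the endpoint is removed to avoid "//" in the URL.
cs_url() {
    endpoint="${1%/}"
    printf '%s/contentsafety/%s?api-version=%s\n' "$endpoint" "$2" "$API_VERSION"
}
```

For instance, `cs_url "<your_endpoint>" "text/categories/survival-advice"` prints the URL used to create the category version.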

Create a new category version:

curl -X PUT "<your_endpoint>/contentsafety/text/categories/survival-advice?api-version=2024-02-15-preview" \
     -H "Ocp-Apim-Subscription-Key: <your_api_key>" \
     -H "Content-Type: application/json" \
     -d "{
            \"categoryName\": \"survival-advice\",
            \"definition\": \"text prompts about survival advice in camping/wilderness situations\",
            \"sampleBlobUrl\": \"https://<your-azure-storage-url>/example-container/survival-advice.jsonl\"
        }"

Start the category build process:

Replace <your_api_key> and <your_endpoint> with your own values. After you receive the response, store the operation ID (the id field) in a temporary location. You need this ID to retrieve the build status with the Get status API in the next section.

curl -X POST "<your_endpoint>/contentsafety/text/categories/survival-advice:build?api-version=2024-02-15-preview" \
     -H "Ocp-Apim-Subscription-Key: <your_api_key>" \
     -H "Content-Type: application/json"
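To avoid copying the operation ID by hand, you could capture it from the response. The sketch below assumes only what the text above states: that the response is JSON containing a string field named `id`. The `extract_id` name is hypothetical, and the `sed` pattern is a lightweight stand-in for a real JSON parser such as `jq`.

```shell
# Read a JSON response on stdin and print the value of its "id" field.
# Assumption: the response has one string field named exactly "id".
extract_id() {
    sed -n 's/.*"id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Example usage (placeholders as in the command above):
# operation_id="$(curl -s -X POST \
#     "<your_endpoint>/contentsafety/text/categories/survival-advice:build?api-version=2024-02-15-preview" \
#     -H "Ocp-Apim-Subscription-Key: <your_api_key>" \
#     -H "Content-Type: application/json" | extract_id)"
```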

Get the category build status:

To retrieve the status, take the id returned by the previous API call and put it in the path of the request below.

curl -X GET "<your_endpoint>/contentsafety/text/categories/operations/<id>?api-version=2024-02-15-preview" \
     -H "Ocp-Apim-Subscription-Key: <your_api_key>" \
     -H "Content-Type: application/json"
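Since training can run for hours, you may want to poll the status rather than check it manually. The loop below is a sketch, not a documented client. It assumes the status response is JSON with a string field `status` whose terminal values are `Succeeded` and `Failed`. This guide does not confirm those names, so adjust them to match the responses you actually receive. The fetch command is passed in by name so the loop stays independent of how the request is made.

```shell
# Poll a status command until the build reaches a terminal state.
# Assumptions (not confirmed by this guide): the response is JSON with a
# string field "status"; terminal values are "Succeeded" and "Failed".
# $1: name of a command or function that prints the status response
# $2: seconds to wait between polls
# $3: maximum number of polls
wait_for_build() {
    fetch="$1"; interval="$2"; max_polls="$3"
    i=0
    while [ "$i" -lt "$max_polls" ]; do
        state="$("$fetch" | sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1)"
        echo "poll $((i + 1)): status=${state:-unknown}"
        case "$state" in
            Succeeded) return 0 ;;
            Failed)    return 1 ;;
        esac
        i=$((i + 1))
        sleep "$interval"
    done
    return 2  # gave up before reaching a terminal state
}
```

A possible usage: wrap the curl command above in a function such as `fetch_status`, then call `wait_for_build fetch_status 600 72` to check every ten minutes for up to twelve hours.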

Analyze text with a customized category

Run the following command to analyze text with your custom category. Replace <your_api_key> and <your_endpoint> with your own values, and replace <Example text to analyze> with the text you want to check.

curl -X POST "<your_endpoint>/contentsafety/text:analyzeCustomCategory?api-version=2024-02-15-preview" \
     -H "Ocp-Apim-Subscription-Key: <your_api_key>" \
     -H "Content-Type: application/json" \
     -d "{
            \"text\": \"<Example text to analyze>\",
            \"categoryName\": \"survival-advice\", 
            \"version\": 1
        }"
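Text that contains double quotes or backslashes breaks the inline JSON in the command above unless it is escaped. One way to handle that is to build the body with a helper, as in this sketch. The function names are hypothetical, and the escaping covers only backslashes and double quotes. For text with newlines or other control characters, a JSON-aware tool such as `jq` is a safer choice.

```shell
# Escape backslashes and double quotes so a string can sit inside a JSON string.
# Assumption: the text contains no newlines or other control characters.
json_escape() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the request body for the analyze call.
# $1: text to analyze   $2: category name   $3: category version (number)
analyze_body() {
    printf '{"text": "%s", "categoryName": "%s", "version": %s}\n' \
        "$(json_escape "$1")" "$2" "$3"
}
```

You could then pass the result to curl with `-d "$(analyze_body "<Example text to analyze>" survival-advice 1)"`.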