Analyze text content with Docker container (preview)

09/24/2024

The analyze text container scans text generated by foundation models or people for sexual content, violence, hate, and self-harm with multi-severity levels. This guide shows you how to download, install, and run a content safety analyze text container.

For more information about prerequisites, validating that a container is running, running multiple containers on the same host, and running disconnected containers, see Install and run content safety containers with Docker.

Specify a container image

The content safety analyze text container image for all supported versions can be found on the Microsoft Container Registry (MCR) syndicate. It resides within the azure-cognitive-services/contentsafety repository and is named text-analyze.

Screenshot of text container on registry website.

The fully qualified container image name is mcr.microsoft.com/azure-cognitive-services/contentsafety/text-analyze. Append a specific container version, or append :latest to get the most recent version. For example:

Version	Path
Latest	`mcr.microsoft.com/azure-cognitive-services/contentsafety/text-analyze:latest` The `latest` tag pulls the latest image.
1.0.0-amd64-preview	`mcr.microsoft.com/azure-cognitive-services/contentsafety/text-analyze:1.0.0-amd64-preview`

Get the container image

Make sure you meet the prerequisites including required hardware. Also see the recommended allocation of resources section for each content safety container.

Use the docker pull command to download a container image from Microsoft Container Registry:

docker pull mcr.microsoft.com/azure-cognitive-services/contentsafety/text-analyze:latest
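Because the latest tag moves whenever a new image is published, pinning the version tag listed in the version table makes a deployment easier to reproduce. The following is a minimal sketch that composes the pinned image reference; the variable names are illustrative only, and the pull itself is left commented out so the snippet can be tried without Docker installed.

```shell
#!/bin/sh
# Compose a pinned image reference from the values in the version table.
# Variable names are illustrative, not part of the container's interface.
REGISTRY="mcr.microsoft.com"
REPOSITORY="azure-cognitive-services/contentsafety/text-analyze"
TAG="1.0.0-amd64-preview"   # use "latest" instead to track the newest image

IMAGE="${REGISTRY}/${REPOSITORY}:${TAG}"
echo "${IMAGE}"

# Uncomment to download the pinned image:
# docker pull "${IMAGE}"
```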

Run the container

Use the docker run command to run the container.

Standard container

The following table represents the various docker run parameters and their corresponding descriptions:

Parameter	Description
`{ENDPOINT_URI}`	The endpoint is required for metering and billing. For more information, see billing arguments.
`{API_KEY}`	The API key is required. For more information, see billing arguments.

When you run the content safety analyze text container, configure the port and GPU according to the content safety container requirements and recommendations.

Here's a sample docker run command with placeholder values. You must specify the ENDPOINT_URI and API_KEY values:

docker run --rm -it -p 5000:5000 --gpus all \
mcr.microsoft.com/azure-cognitive-services/contentsafety/text-analyze:latest \
Eula=accept \
Billing={ENDPOINT_URI} \
ApiKey={API_KEY}

This command:

Runs a content safety container from the container image.
Uses all available GPU resources (by specifying --gpus all). The content safety container requires CUDA for optimal performance; see the host requirements and recommendations. Also make sure the NVIDIA Container Toolkit is installed on your host.
Exposes TCP port 5000 and allocates a pseudo-TTY for the container.
Automatically removes the container after it exits. The container image is still available on the host computer.
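Because both billing values are required, a wrapper script can check for them before it starts the container. The following is a minimal sketch, assuming the values arrive in environment variables named ENDPOINT_URI and API_KEY. check_billing_args is a hypothetical helper, not part of Docker or the container, and the docker run line is left commented out so the check can be tried on its own.

```shell
#!/bin/sh
# Sketch: refuse to start the container when a billing value is missing.
# check_billing_args is a hypothetical helper written for this example.
check_billing_args() {
  # $1 = endpoint URI, $2 = API key
  if [ -z "$1" ] || [ -z "$2" ]; then
    echo "ENDPOINT_URI and API_KEY must both be set" >&2
    return 1
  fi
  return 0
}

if check_billing_args "${ENDPOINT_URI:-}" "${API_KEY:-}"; then
  echo "billing arguments present"
  # docker run --rm -it -p 5000:5000 --gpus all \
  #   mcr.microsoft.com/azure-cognitive-services/contentsafety/text-analyze:latest \
  #   Eula=accept Billing="${ENDPOINT_URI}" ApiKey="${API_KEY}"
fi
```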

If you're testing the container on a machine without CUDA, the container exits when it tries to use CUDA. To continue testing, disable CUDA by adding the following option to your docker run command:

docker run -e CUDA_ENABLED=false
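Combined with the sample command from earlier in this section, a CPU-only test run might look like the sketch below. cpu_test_command is a hypothetical helper that only prints the command for review. Leaving out --gpus all is an assumption made here because the test machine has no GPU to pass through; the source command only shows the CUDA_ENABLED option.

```shell
#!/bin/sh
# Sketch: print the sample run command adapted for a host without CUDA.
# cpu_test_command is a hypothetical helper; --gpus all is left out on the
# assumption that the test machine has no GPU to pass through.
cpu_test_command() {
  # $1 = endpoint URI, $2 = API key
  echo "docker run --rm -it -p 5000:5000 -e CUDA_ENABLED=false" \
    "mcr.microsoft.com/azure-cognitive-services/contentsafety/text-analyze:latest" \
    "Eula=accept Billing=$1 ApiKey=$2"
}

# Placeholders are printed literally; substitute your own values.
cpu_test_command "{ENDPOINT_URI}" "{API_KEY}"
```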

Run the container with blocklist

The analyze text container supports a blocklist feature, which lets you block custom terms. You manage blocklists as CSV files, and you can use multiple CSV files to define multiple blocklists.

To run the container with a blocklist, add the following options to your docker run command:

docker run -e BLOCKLIST_DIR=/tmp/blocklist -v {/path/on/host}:/tmp/blocklist

In the command above, replace {/path/on/host} with the path to the blocklist folder on your host machine. The -v option mounts that folder at /tmp/blocklist inside the container, and the BLOCKLIST_DIR environment variable tells the container where to find it.

Note, the analyze text container uses an exact match method for the blocklist. All items in the blocklist will be converted to lowercase before the matching process. This means, for instance, if you have Contoso in your blocklist, both "Contoso" and "contoso" from your input are considered a match.
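The matching rule can be pictured as a case-insensitive string comparison. The sketch below is an illustration only: is_blocked and to_lower are hypothetical helpers, not the container's implementation, and the way the container splits input text into terms isn't described here, so the example compares single terms.

```shell
#!/bin/sh
# Illustration only: lowercase both sides, then compare for exact equality.
# to_lower and is_blocked are hypothetical helpers for this example.
to_lower() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]'
}

is_blocked() {
  # $1 = term from the input text, $2 = blocklist item
  [ "$(to_lower "$1")" = "$(to_lower "$2")" ]
}

# "Contosso" is a misspelling, so an exact match should not fire for it.
for term in "Contoso" "contoso" "Contosso"; do
  if is_blocked "$term" "Contoso"; then
    echo "$term: match"
  else
    echo "$term: no match"
  fi
done
```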

Disconnected container

To run disconnected containers (not connected to the internet), you must submit a request form and wait for approval. For more information about applying and purchasing a commitment plan to use containers in disconnected environments, see Use containers in disconnected environments in the Azure AI services documentation.

If you're approved to run the disconnected container, the following example shows the formatting of the docker run command to use, with placeholder values. Replace these values with your own values.

Placeholder	Description
`{IMAGE}`	The container image you want to use. For example: `mcr.microsoft.com/azure-cognitive-services/contentsafety/text-analyze:latest`
`{LICENSE_MOUNT}`	The path where the license is downloaded, and mounted. For example: `/host/license:/path/to/license/directory`
`{ENDPOINT_URI}`	The endpoint for authenticating your service request. You can find it on your resource's Key and endpoint page, on the Azure portal. For example: `https://<your-resource-name>.cognitiveservices.azure.com`
`{API_KEY}`	The key for your content safety resource. You can find it on your resource's Key and endpoint page, on the Azure portal.
`{CONTAINER_LICENSE_DIRECTORY}`	Location of the license folder on the container's local filesystem. For example: `/path/to/license/directory`

docker run --rm -it -p 5000:5000 \
-v {LICENSE_MOUNT} \
{IMAGE} \
eula=accept \
billing={ENDPOINT_URI} \
apikey={API_KEY} \
DownloadLicense=True \
Mounts:License={CONTAINER_LICENSE_DIRECTORY}

The DownloadLicense=True parameter in your docker run command downloads a license file that enables your Docker container to run when it isn't connected to the internet. The license file also contains an expiration date, after which it can no longer be used to run the container. You can only use a license file with the container that you're approved for. For example, you can't use a license file for a text-analyze container with an image-analyze container.

Once the license file is downloaded, you can run the container in a disconnected environment. The following example shows the formatting of the docker run command you use, with placeholder values. Replace these values with your own values.

Wherever the container is run, the license file must be mounted to the container and the location of the license folder on the container's local filesystem must be specified with Mounts:License=. An output mount must also be specified so that billing usage records can be written.

Placeholder	Description
`{IMAGE}`	The container image you want to use. For example: `mcr.microsoft.com/azure-cognitive-services/contentsafety/text-analyze:latest`
`{LICENSE_MOUNT}`	The path where the license is located and mounted. For example: `/host/license:/path/to/license/directory`
`{OUTPUT_PATH}`	The output path for logging. For example: `/host/output:/path/to/output/directory` For more information, see usage records in the Azure AI services documentation.
`{CONTAINER_LICENSE_DIRECTORY}`	Location of the license folder on the container's local filesystem. For example: `/path/to/license/directory`
`{CONTAINER_OUTPUT_DIRECTORY}`	Location of the output folder on the container's local filesystem. For example: `/path/to/output/directory`

docker run --rm -it -p 5000:5000 --gpus all \
-v {LICENSE_MOUNT} \
-v {OUTPUT_PATH} \
{IMAGE} \
eula=accept \
Mounts:License={CONTAINER_LICENSE_DIRECTORY} \
Mounts:Output={CONTAINER_OUTPUT_DIRECTORY}

Content safety containers provide a default directory for writing the license file and billing log at runtime. The default directories are /license and /output respectively.
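If you use those default directories as the mount targets, the placeholders in the disconnected run command collapse to a few concrete values. The sketch below prints such a command for review. disconnected_run_command is a hypothetical helper, and the host paths /host/license and /host/output are example values only.

```shell
#!/bin/sh
# Sketch: print a disconnected-mode run command that mounts host folders
# at the container's default /license and /output directories.
# disconnected_run_command is a hypothetical helper for illustration.
disconnected_run_command() {
  # $1 = host license folder, $2 = host output folder, $3 = image
  echo "docker run --rm -it -p 5000:5000 --gpus all" \
    "-v $1:/license -v $2:/output" \
    "$3" \
    "eula=accept Mounts:License=/license Mounts:Output=/output"
}

# Example host paths; replace them with your own folders.
disconnected_run_command /host/license /host/output \
  mcr.microsoft.com/azure-cognitive-services/contentsafety/text-analyze:latest
```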

When you mount these directories into the container with the docker run -v command, make sure the corresponding host directories are owned by user:group nonroot:nonroot before you run the container.

Here's a sample command to set file/directory ownership:

sudo chown -R nonroot:nonroot <YOUR_LOCAL_MACHINE_PATH_1> <YOUR_LOCAL_MACHINE_PATH_2> ...

Run the container with blocklist

To run the container with a blocklist, add the following options to your docker run command:

docker run -e BLOCKLIST_DIR=/tmp/blocklist -v {/path/on/host}:/tmp/blocklist

Note, the Analyze text container uses an exact match method for the blocklist. All items in the blocklist are converted to lowercase before the matching process. This means, for instance, if you have Contoso in your blocklist, both "Contoso" and "contoso" from your input are considered a match.

Test the container

Once the container is up and running, you can validate its operation by sending a request to the REST endpoint deployed within the container. To do this, follow the steps in the quickstart. Note, you need to replace the endpoint URL with the Docker URL specific to your container deployment. Also, ensure that you're using host authentication, rather than key-based authentication.
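As a rough sketch of what such a request could look like against a container published on port 5000: the request path, the api-version value, and the JSON body shape below are assumptions carried over from the cloud analyze text REST API, so confirm them against the quickstart. No key header is included, and the curl call is left commented out so the snippet only prints the request target.

```shell
#!/bin/sh
# Sketch: build a test request for a container listening on localhost:5000.
# The path, api-version, and body shape are assumptions taken from the cloud
# analyze text REST API; check the quickstart for the values that apply to you.
CONTAINER_URL="http://localhost:5000"
API_VERSION="2023-10-01"
REQUEST_URL="${CONTAINER_URL}/contentsafety/text:analyze?api-version=${API_VERSION}"
REQUEST_BODY='{"text": "Sample text to analyze"}'

echo "POST ${REQUEST_URL}"

# Uncomment to send the request once the container is running:
# curl -sS -X POST "${REQUEST_URL}" \
#   -H "Content-Type: application/json" \
#   -d "${REQUEST_BODY}"
```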

Analyze text quickstart

Next steps

See the content safety containers overview
Use more Azure AI containers
