Training data for Azure AI text moderation?

Shelby Joyner 0 Reputation points
2024-03-27T20:43:46.11+00:00

Was curious if there was any information about the training data used to power the text moderation tool - the one that classifies the text based on three categories. Also was wondering if the word lists for what is considered profanity were available anywhere. I know you can create custom word lists but I was curious about the default list. Thanks!

Azure AI Language
An Azure service that provides natural language capabilities including sentiment analysis, entity extraction, and automated question answering.
355 questions

1 answer

  1. santoshkc 4,185 Reputation points Microsoft Vendor
    2024-03-28T07:40:24.9166667+00:00

    Hi @Shelby Joyner,

    Thank you for your question.

    Azure Content Moderator is an AI service that lets you handle content that is potentially offensive, risky, or otherwise undesirable. Its text moderation API scans text for offensive content in these categories: sexually explicit or suggestive content, profanity, and personal data. The service uses machine learning models to classify text into potentially offensive categories. The models are trained on a large dataset of labeled text examples, which includes a mix of public and private data.

    The Content Moderator service includes a built-in list of profane terms in various languages, which it uses to detect profanity in text. However, that list is not publicly available. If the API detects profane terms in any of the supported languages, it includes those terms in the response, along with their location (Index) in the original text. In the following sample JSON, ListId refers to terms found in custom term lists, if available.

    "Terms": [
    	{
    		"Index": 118,
    		"OriginalIndex": 118,
    		"ListId": 0,
    		"Term": "<offensive word>"
    	}
    ]

    For more info see: content-moderator and text moderation concepts.
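
As a rough illustration of reading the `Terms` array from the sample above, here is a minimal Python sketch. It assumes the response body has already been fetched and is available as a JSON string; the sample fragment is wrapped in braces so that it parses on its own. The helper names `extract_terms` and `describe` are hypothetical and not part of any Azure SDK, and treating a missing or null `Terms` value as "no matches" is an assumption rather than documented behavior.

```python
import json

# Sample response fragment from the answer, wrapped in braces so it is valid JSON.
SAMPLE = """
{
  "Terms": [
    {"Index": 118, "OriginalIndex": 118, "ListId": 0, "Term": "<offensive word>"}
  ]
}
"""


def extract_terms(response: dict) -> list:
    """Return the matched terms; an absent or null "Terms" key yields an empty list (assumption)."""
    return response.get("Terms") or []


def describe(term: dict) -> str:
    """Format one match with its position in the original text and its list id."""
    return f'{term["Term"]!r} at index {term["Index"]} (ListId {term["ListId"]})'


terms = extract_terms(json.loads(SAMPLE))
for term in terms:
    print(describe(term))
```

The `Index` value can then be used to highlight or mask the matched term in the submitted text.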

    I hope this helps. If you have any further questions, please let us know.


    If this answers your query, please click Accept Answer and Yes for "Was this answer helpful".
