@Jimstacy2 Welcome to Microsoft Q&A Forum, Thank you for posting your query here!
The Azure OpenAI Service and OpenAI’s Moderation API use different systems for content filtering and moderation.
OpenAI’s Moderation API returns a dictionary of per-category raw scores output by the model, each denoting the model’s confidence that the input violates OpenAI’s policy for that category. Values range from 0 to 1, where higher values denote higher confidence; these scores should not be interpreted as probabilities.
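As a minimal sketch of what those per-category scores look like and how you might inspect them (the score values below are made up for illustration, not real model output):

```python
# Illustrative shape of the category_scores dictionary from a Moderation API
# result. The numbers are hypothetical, not actual model output.
sample_category_scores = {
    "hate": 0.0002,
    "self-harm": 0.0001,
    "sexual": 0.0003,
    "violence": 0.71,
}

# Pick the category the model is most confident the input violates.
top_category = max(sample_category_scores, key=sample_category_scores.get)
print(top_category, sample_category_scores[top_category])
```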
On the other hand, the Azure OpenAI Service uses a content filtering system that works alongside the core models. This system detects four categories of harmful content (violence, hate, sexual, and self-harm) at four severity levels (safe, low, medium, and high). The default content filtering configuration filters at the medium severity threshold for all four harm categories, for both prompts and completions: content detected at severity level medium or high is filtered, while content detected at severity level safe or low isn’t filtered by the content filters.
Below is a sample streaming response from Azure OpenAI showing the content filtering annotations:

data: {"id":"","object":"","created":0,"model":"","choices":[{"index":0,"finish_reason":null,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}},"content_filter_offsets":{"check_offset":65,"start_offset":65,"end_offset":1056}}],"usage":null}

data: {"id":"","object":"","created":0,"model":"","choices":[{"index":0,"finish_reason":"content_filter","content_filter_results":{"protected_material_text":{"detected":true,"filtered":true}},"content_filter_offsets":{"check_offset":65,"start_offset":65,"end_offset":1056}}],"usage":null}
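As a minimal sketch of how you might read those annotations in Python, assuming the payload shape shown above (one streamed chunk with the `data: ` prefix already stripped):

```python
import json

# One streamed Azure OpenAI chunk, as in the sample above ("data: " stripped).
chunk = json.loads(
    '{"id":"","object":"","created":0,"model":"","choices":[{"index":0,'
    '"finish_reason":null,"content_filter_results":{"hate":{"filtered":false,'
    '"severity":"safe"},"self_harm":{"filtered":false,"severity":"safe"},'
    '"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,'
    '"severity":"safe"}},"content_filter_offsets":{"check_offset":65,'
    '"start_offset":65,"end_offset":1056}}],"usage":null}'
)

# Collect the severity reported for each harm category in the first choice,
# and check whether anything in this chunk was actually filtered.
results = chunk["choices"][0]["content_filter_results"]
severities = {category: info["severity"] for category, info in results.items()}
filtered_any = any(info["filtered"] for info in results.values())
print(severities, filtered_any)
```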
So, there isn’t a direct mapping between the scores from OpenAI’s Moderation API and the low/medium/high thresholds in Azure’s content filtering system.
However, you can still create a rough mapping based on typical score ranges and their corresponding severity levels in Azure's system.
Here's a possible approach for creating a mapping:
- Low confidence (0–0.3): could roughly correspond to "safe" or "low" severity in Azure's system.
- Medium confidence (0.3–0.7): could correspond to "medium" severity in Azure's system.
- High confidence (0.7–1): could correspond to "high" severity in Azure's system.
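The rough mapping above can be sketched as a small helper function. The 0.3 and 0.7 thresholds are illustrative starting points, not an official mapping, so treat them as parameters to calibrate:

```python
def score_to_azure_severity(score: float) -> str:
    """Map a raw Moderation API score (0-1) to an approximate Azure severity.

    The 0.3 and 0.7 cut-offs are illustrative assumptions, not an official
    mapping; calibrate them against your own test data.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score < 0.3:
        return "safe/low"
    if score < 0.7:
        return "medium"
    return "high"

print(score_to_azure_severity(0.12))  # a low-confidence score lands in safe/low
```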
Keep in mind that this mapping is based on approximate score ranges. Therefore, you might need to do some testing and calibration to understand how the scores from OpenAI’s Moderation API correspond to the thresholds in Azure’s system.
Hope this helps. If you have any follow-up questions, please let me know. I would be happy to help.