An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.
Hello Garrick White,
Welcome to Microsoft Q&A .Thank you for reaching out.
The observed behaviour - successful calls using an API key while calls using Entra ID return 401 Unauthorized can be expected because of the difference between the two authentication mechanisms. API key authentication directly validates the key against the Azure OpenAI resource, while Entra ID authentication relies on a bearer token that must be issued for the correct audience, validated against the exact resource endpoint and authorized through Azure RBAC. When any of these elements do not align precisely, the request is rejected with a 401 response.
The most common cause in this scenario is a token audience or scope mismatch. For Azure OpenAI data‑plane operations, the access token must be requested with the following scope https://cognitiveservices.azure.com/.default
Tokens issued for other audiences, such as Azure Resource Manager or unrelated services, are considered valid by Entra ID but are rejected by the Azure OpenAI endpoint during validation. Ensuring the token is minted for this exact scope is critical for successful calls.
Closely related to the token scope is the resource endpoint being called. The Azure OpenAI endpoint must be the resource‑specific endpoint in the following format https://<resource-name>.openai.azure.com/
Using a generic endpoint or an incorrect region or embedding additional path segments in the base endpoint can result in the token audience not matching the service expectation, even when the token itself appears correct.
Authorization is enforced through Azure role‑based access control (RBAC). The identity used to obtain the token - whether a user account, service principal, or managed identity must have one of the following roles assigned at the Azure OpenAI resource scope (or a broader scope that includes it):
- Cognitive Services OpenAI User
- Cognitive Services OpenAI Contributor
After a role assignment is added or updated, a short propagation delay can occur before access is fully effective.
In addition, the token provider implementation must return a valid, non‑expired bearer token issued by the Entra ID v2.0 endpoint. Tokens cached beyond their lifetime or requested with incorrect claims can lead to intermittent or persistent authorization failures. Using supported SDK helpers to acquire and refresh tokens helps avoid these issues.
Please consider the following troubleshooting steps to resolve the error:
- Please verify that the access token is requested with the scope
https://cognitiveservices.azure.com/.default. - Then confirm that the Azure OpenAI endpoint matches the resource name exactly and does not include additional path segments.
- Try to validate that the calling identity has the Cognitive Services OpenAI User (or Contributor) role assigned at the Azure OpenAI resource scope.
- Proceed to confirm that the token provider is issuing Entra ID v2.0 tokens and refreshing them as needed.
- As a diagnostic step, please obtain a token using Azure CLI and test the call outside the application to isolate configuration issues from SDK behavior.
The following references might be helpful , please check them out
- How to configure Azure OpenAI in Microsoft Foundry Models with Microsoft Entra ID authentication (classic) - Microsoft Foundry (classic) portal | Microsoft Learn
- Use the Azure OpenAI Responses API - Microsoft Foundry | Microsoft Learn
- Role-based access control for Azure OpenAI (classic) - Microsoft Foundry (classic) portal | Microsoft Learn
- Authentication and authorization in Microsoft Foundry - Microsoft Foundry | Microsoft Learn
Thank you
Please 'Upvote'(Thumbs-up) and 'Accept' as answer if the response was helpful. This will be benefitting other community members who face the same issue.