Quickstart: Speech to text with the Azure OpenAI Whisper model
სტატია
This quickstart explains how to use the Azure OpenAI Whisper model for speech to text conversion. The Whisper model can transcribe human speech in numerous languages, and it can also translate other languages into English.
The file size limit for the Whisper model is 25 MB. If you need to transcribe a file larger than 25 MB, you can use the Azure AI Speech batch transcription API.
Download the example data from GitHub if you don't have your own data.
Set up
Retrieve key and endpoint
To successfully make a call against Azure OpenAI, you need an endpoint and a key.
Variable name
Value
AZURE_OPENAI_ENDPOINT
The service endpoint can be found in the Keys & Endpoint section when examining your resource from the Azure portal. Alternatively, you can find the endpoint via the Deployments page in Azure AI Foundry portal. An example endpoint is: https://docs-test-001.openai.azure.com/.
AZURE_OPENAI_API_KEY
This value can be found in the Keys & Endpoint section when examining your resource from the Azure portal. You can use either KEY1 or KEY2.
Go to your resource in the Azure portal. The Endpoint and Keys can be found in the Resource Management section. Copy your endpoint and access key as you'll need both for authenticating your API calls. You can use either KEY1 or KEY2. Always having two keys allows you to securely rotate and regenerate keys without causing a service disruption.
Environment variables
Create and assign persistent environment variables for your key and endpoint.
მნიშვნელოვანი
Use API keys with caution. Don't include the API key directly in your code, and never post it publicly. If you use an API key, store it securely in Azure Key Vault. For more information about using API keys securely in your apps, see API keys with Azure Key Vault.
In a bash shell, run the following command. You need to replace YourDeploymentName with the deployment name you chose when you deployed the Whisper model. The deployment name isn't necessarily the same as the model name. Entering the model name results in an error unless you chose a deployment name that is identical to the underlying model name.
For production, store and access your credentials using a secure method, such as Azure Key Vault. For more information about credential security, see Azure AI services security.
Output
Bash
{"text":"The ocelot, Lepardus paradalis, is a small wild cat native to the southwestern United States, Mexico, and Central and South America. This medium-sized cat is characterized by solid black spots and streaks on its coat, round ears, and white neck and undersides. It weighs between 8 and 15.5 kilograms, 18 and 34 pounds, and reaches 40 to 50 centimeters 16 to 20 inches at the shoulders. It was first described by Carl Linnaeus in 1758. Two subspecies are recognized, L. p. paradalis and L. p. mitis. Typically active during twilight and at night, the ocelot tends to be solitary and territorial. It is efficient at climbing, leaping, and swimming. It preys on small terrestrial mammals such as armadillo, opossum, and lagomorphs."}
To successfully make a call against Azure OpenAI, you need an endpoint and a key.
Variable name
Value
AZURE_OPENAI_ENDPOINT
The service endpoint can be found in the Keys & Endpoint section when examining your resource from the Azure portal. Alternatively, you can find the endpoint via the Deployments page in Azure AI Foundry portal. An example endpoint is: https://docs-test-001.openai.azure.com/.
AZURE_OPENAI_API_KEY
This value can be found in the Keys & Endpoint section when examining your resource from the Azure portal. You can use either KEY1 or KEY2.
Go to your resource in the Azure portal. The Endpoint and Keys can be found in the Resource Management section. Copy your endpoint and access key as you'll need both for authenticating your API calls. You can use either KEY1 or KEY2. Always having two keys allows you to securely rotate and regenerate keys without causing a service disruption.
Environment variables
Create and assign persistent environment variables for your key and endpoint.
მნიშვნელოვანი
Use API keys with caution. Don't include the API key directly in your code, and never post it publicly. If you use an API key, store it securely in Azure Key Vault. For more information about using API keys securely in your apps, see API keys with Azure Key Vault.
The OpenAI Python library version 0.28.1 is deprecated. We recommend using 1.x. Consult our migration guide for information on moving from 0.28.1 to 1.x.
Console
pip install openai==0.28.1
Create the Python app
Create a new Python file called quickstart.py. Then open it up in your preferred editor or IDE.
Replace the contents of quickstart.py with the following code. Modify the code to add your deployment name:
import os
from openai import AzureOpenAI
client = AzureOpenAI(
api_key=os.getenv("AZURE_OPENAI_API_KEY"),
api_version="2024-02-01",
azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT")
)
deployment_id = "YOUR-DEPLOYMENT-NAME-HERE"#This will correspond to the custom name you chose for your deployment when you deployed a model."
audio_test_file = "./wikipediaOcelot.wav"
result = client.audio.transcriptions.create(
file=open(audio_test_file, "rb"),
model=deployment_id
)
print(result)
Python
import openai
import time
import os
openai.api_key = os.getenv("AZURE_OPENAI_API_KEY")
openai.api_base = os.getenv("AZURE_OPENAI_ENDPOINT") # your endpoint should look like the following https://YOUR_RESOURCE_NAME.openai.azure.com/
openai.api_type = "azure"
openai.api_version = "2024-02-01"
model_name = "whisper"
deployment_id = "YOUR-DEPLOYMENT-NAME-HERE"#This will correspond to the custom name you chose for your deployment when you deployed a model."
audio_language="en"
audio_test_file = "./wikipediaOcelot.wav"
result = openai.Audio.transcribe(
file=open(audio_test_file, "rb"),
model=model_name,
deployment_id=deployment_id
)
print(result)
Run the application using the python command on your quickstart file:
For production, store and access your credentials using a secure method, such as Azure Key Vault. For more information about credential security, see Azure AI services security.
Output
Python
{"text":"The ocelot, Lepardus paradalis, is a small wild cat native to the southwestern United States, Mexico, and Central and South America. This medium-sized cat is characterized by solid black spots and streaks on its coat, round ears, and white neck and undersides. It weighs between 8 and 15.5 kilograms, 18 and 34 pounds, and reaches 40 to 50 centimeters 16 to 20 inches at the shoulders. It was first described by Carl Linnaeus in 1758. Two subspecies are recognized, L. p. paradalis and L. p. mitis. Typically active during twilight and at night, the ocelot tends to be solitary and territorial. It is efficient at climbing, leaping, and swimming. It preys on small terrestrial mammals such as armadillo, opossum, and lagomorphs."}
To successfully make a call against Azure OpenAI, you need an endpoint and a key.
Variable name
Value
AZURE_OPENAI_ENDPOINT
The service endpoint can be found in the Keys & Endpoint section when examining your resource from the Azure portal. Alternatively, you can find the endpoint via the Deployments page in Azure AI Foundry portal. An example endpoint is: https://docs-test-001.openai.azure.com/.
AZURE_OPENAI_API_KEY
This value can be found in the Keys & Endpoint section when examining your resource from the Azure portal. You can use either KEY1 or KEY2.
Go to your resource in the Azure portal. The Endpoint and Keys can be found in the Resource Management section. Copy your endpoint and access key as you'll need both for authenticating your API calls. You can use either KEY1 or KEY2. Always having two keys allows you to securely rotate and regenerate keys without causing a service disruption.
Passwordless authentication is more secure than key-based alternatives and is the recommended approach for connecting to Azure services. If you choose to use Passwordless authentication, you'll need to complete the following:
Assign the Cognitive Services User role to your user account. This can be done in the Azure portal on your OpenAI resource under Access control (IAM) > Add role assignment.
Sign-in to Azure using Visual Studio or the Azure CLI via az login.
Update the app code
Replace the contents of program.cs with the following code and update the placeholder values with your own.
using Azure;
using Azure.AI.OpenAI;
using Azure.Identity; // Required for Passwordless authvar endpoint = new Uri("YOUR_OPENAI_ENDPOINT");
var credentials = new AzureKeyCredential("YOUR_OPENAI_KEY");
// var credentials = new DefaultAzureCredential(); // Use this line for Passwordless authvar deploymentName = "whisper"; // Default deployment name, update with your own if necessaryvar audioFilePath = "YOUR_AUDIO_FILE_PATH";
var openAIClient = new AzureOpenAIClient(endpoint, credentials);
var audioClient = openAIClient.GetAudioClient(deploymentName);
var result = await audioClient.TranscribeAudioAsync(audioFilePath);
Console.WriteLine("Transcribed text:");
foreach (var item in result.Value.Text)
{
Console.Write(item);
}
მნიშვნელოვანი
Use API keys with caution. Don't include the API key directly in your code, and never post it publicly. If you use an API key, store it securely in Azure Key Vault. For more information about using API keys securely in your apps, see API keys with Azure Key Vault.
Run the application using the dotnet run command or the run button at the top of Visual Studio:
.NET CLI
dotnetrun
If you are using the sample audio file, you should see the following text printed out in the console:
text
The ocelot, Lepardus paradalis, is a small wild cat native to the southwestern United States,
Mexico, and Central and South America. This medium-sized cat is characterized by solid
black spots and streaks on its coat, round ears...
For the recommended keyless authentication with Microsoft Entra ID, you need to:
Install the Azure CLI used for keyless authentication with Microsoft Entra ID.
Assign the Cognitive Services User role to your user account. You can assign roles in the Azure portal under Access control (IAM) > Add role assignment.
Set up
Create a new folder synthesis-quickstart to contain the application and open Visual Studio Code in that folder with the following command:
shell
mkdir synthesis-quickstart && cd synthesis-quickstart
Create the package.json with the following command:
shell
npm init -y
Install the OpenAI client library for JavaScript with:
Console
npm install openai
For the recommended passwordless authentication:
Console
npm install @azure/identity
Retrieve resource information
You need to retrieve the following information to authenticate your application with your Azure OpenAI resource:
This value can be found in the Keys and Endpoint section when examining your resource from the Azure portal.
AZURE_OPENAI_DEPLOYMENT_NAME
This value will correspond to the custom name you chose for your deployment when you deployed a model. This value can be found under Resource Management > Model Deployments in the Azure portal.
This value can be found in the Keys and Endpoint section when examining your resource from the Azure portal.
AZURE_OPENAI_API_KEY
This value can be found in the Keys and Endpoint section when examining your resource from the Azure portal. You can use either KEY1 or KEY2.
AZURE_OPENAI_DEPLOYMENT_NAME
This value will correspond to the custom name you chose for your deployment when you deployed a model. This value can be found under Resource Management > Model Deployments in the Azure portal.
Use API keys with caution. Don't include the API key directly in your code, and never post it publicly. If you use an API key, store it securely in Azure Key Vault. For more information about using API keys securely in your apps, see API keys with Azure Key Vault.
{"text":"The ocelot, Lepardus paradalis, is a small wild cat native to the southwestern United States, Mexico, and Central and South America. This medium-sized cat is characterized by solid black spots and streaks on its coat, round ears, and white neck and undersides. It weighs between 8 and 15.5 kilograms, 18 and 34 pounds, and reaches 40 to 50 centimeters 16 to 20 inches at the shoulders. It was first described by Carl Linnaeus in 1758. Two subspecies are recognized, L. p. paradalis and L. p. mitis. Typically active during twilight and at night, the ocelot tends to be solitary and territorial. It is efficient at climbing, leaping, and swimming. It preys on small terrestrial mammals such as armadillo, opossum, and lagomorphs."}
For the recommended keyless authentication with Microsoft Entra ID, you need to:
Install the Azure CLI used for keyless authentication with Microsoft Entra ID.
Assign the Cognitive Services User role to your user account. You can assign roles in the Azure portal under Access control (IAM) > Add role assignment.
Set up
Create a new folder whisper-quickstart to contain the application and open Visual Studio Code in that folder with the following command:
shell
mkdir whisper-quickstart && cd whisper-quickstart
Create the package.json with the following command:
shell
npm init -y
Update the package.json to ECMAScript with the following command:
shell
npm pkg set type=module
Install the OpenAI client library for JavaScript with:
Console
npm install openai
For the recommended passwordless authentication:
Console
npm install @azure/identity
Retrieve resource information
You need to retrieve the following information to authenticate your application with your Azure OpenAI resource:
This value can be found in the Keys and Endpoint section when examining your resource from the Azure portal.
AZURE_OPENAI_DEPLOYMENT_NAME
This value will correspond to the custom name you chose for your deployment when you deployed a model. This value can be found under Resource Management > Model Deployments in the Azure portal.
This value can be found in the Keys and Endpoint section when examining your resource from the Azure portal.
AZURE_OPENAI_API_KEY
This value can be found in the Keys and Endpoint section when examining your resource from the Azure portal. You can use either KEY1 or KEY2.
AZURE_OPENAI_DEPLOYMENT_NAME
This value will correspond to the custom name you chose for your deployment when you deployed a model. This value can be found under Resource Management > Model Deployments in the Azure portal.
Use API keys with caution. Don't include the API key directly in your code, and never post it publicly. If you use an API key, store it securely in Azure Key Vault. For more information about using API keys securely in your apps, see API keys with Azure Key Vault.
Use API keys with caution. Don't include the API key directly in your code, and never post it publicly. If you use an API key, store it securely in Azure Key Vault. For more information about using API keys securely in your apps, see API keys with Azure Key Vault.
{"text":"The ocelot, Lepardus paradalis, is a small wild cat native to the southwestern United States, Mexico, and Central and South America. This medium-sized cat is characterized by solid black spots and streaks on its coat, round ears, and white neck and undersides. It weighs between 8 and 15.5 kilograms, 18 and 34 pounds, and reaches 40 to 50 centimeters 16 to 20 inches at the shoulders. It was first described by Carl Linnaeus in 1758. Two subspecies are recognized, L. p. paradalis and L. p. mitis. Typically active during twilight and at night, the ocelot tends to be solitary and territorial. It is efficient at climbing, leaping, and swimming. It preys on small terrestrial mammals such as armadillo, opossum, and lagomorphs."}
An Azure OpenAI Service resource with a model deployed. For more information about model deployment, see the resource deployment guide.
An Azure OpenAI Service resource with either the gpt-35-turbo or the gpt-4 models deployed. For more information about model deployment, see the resource deployment guide.
Set up
Retrieve key and endpoint
To successfully make a call against Azure OpenAI, you need an endpoint and a key.
Variable name
Value
AZURE_OPENAI_ENDPOINT
The service endpoint can be found in the Keys & Endpoint section when examining your resource from the Azure portal. Alternatively, you can find the endpoint via the Deployments page in Azure AI Foundry portal. An example endpoint is: https://docs-test-001.openai.azure.com/.
AZURE_OPENAI_API_KEY
This value can be found in the Keys & Endpoint section when examining your resource from the Azure portal. You can use either KEY1 or KEY2.
Go to your resource in the Azure portal. The Endpoint and Keys can be found in the Resource Management section. Copy your endpoint and access key as you'll need both for authenticating your API calls. You can use either KEY1 or KEY2. Always having two keys allows you to securely rotate and regenerate keys without causing a service disruption.
Environment variables
Create and assign persistent environment variables for your key and endpoint.
მნიშვნელოვანი
Use API keys with caution. Don't include the API key directly in your code, and never post it publicly. If you use an API key, store it securely in Azure Key Vault. For more information about using API keys securely in your apps, see API keys with Azure Key Vault.
Run the following command. You need to replace YourDeploymentName with the deployment name you chose when you deployed the Whisper model. The deployment name isn't necessarily the same as the model name. Entering the model name results in an error unless you chose a deployment name that is identical to the underlying model name.
PowerShell
# Azure OpenAI metadata variables$openai = @{
api_key = $Env:AZURE_OPENAI_API_KEY
api_base = $Env:AZURE_OPENAI_ENDPOINT# your endpoint should look like the following https://YOUR_RESOURCE_NAME.openai.azure.com/
api_version = '2024-02-01'# this may change in the future
name = 'YourDeploymentName'#This will correspond to the custom name you chose for your deployment when you deployed a model.
}
# Header for authentication$headers = [ordered]@{
'api-key' = $openai.api_key
}
$form = @{ file = get-item -path'./wikipediaOcelot.wav' }
# Send a completion call to generate an answer$url = "$($openai.api_base)/openai/deployments/$($openai.name)/audio/transcriptions?api-version=$($openai.api_version)"$response = Invoke-RestMethod -Uri$url -Headers$headers -Form$form -Method Post -ContentType'multipart/form-data'return$response.text
The ocelot, Lepardus paradalis, is a small wild cat native to the southwestern United States, Mexico, and Central and South America. This medium-sized cat is characterized by solid black spots and streaks on its coat, round ears, and white neck and undersides. It weighs between 8 and 15.5 kilograms, 18 and 34 pounds, and reaches 40 to 50 centimeters 16 to 20 inches at the shoulders. It was first described by Carl Linnaeus in 1758. Two subspecies are recognized, L. p. paradalis and L. p. mitis. Typically active during twilight and at night, the ocelot tends to be solitary and territorial. It is efficient at climbing, leaping, and swimming. It preys on small terrestrial mammals such as armadillo, opossum, and lagomorphs.
Clean up resources
If you want to clean up and remove an Azure OpenAI resource, you can delete the resource. Before deleting the resource, you must first delete any deployed models.
შემოუერთდით Meetup სერიას, რათა შექმნათ მასშტაბური AI გადაწყვეტილებები რეალურ სამყაროში გამოყენების შემთხვევებზე დაყრდნობით თანამემამულე დეველოპერებთან და ექსპერტებთან.