Quickstart: Speech to text with the Azure OpenAI Whisper model

2025-07-02

This quickstart explains how to use the Azure OpenAI Whisper model for speech to text conversion. The Whisper model can transcribe human speech in numerous languages, and it can also translate other languages into English.

Note

For information about other audio models that you can use with Azure OpenAI, see Audio models.

The file size limit for the Whisper model is 25 MB. If you need to transcribe a file larger than 25 MB, you can use the Azure AI Speech batch transcription API.
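
As a rough illustration of the size limit, the following sketch checks a local audio file before you upload it. The helper name `fits_whisper_limit` is hypothetical, and interpreting "25 MB" as 25 * 1024 * 1024 bytes is an assumption, because this article doesn't state the exact byte threshold.

```python
import os

# Assumption: "25 MB" is treated as 25 * 1024 * 1024 bytes. The exact
# threshold enforced by the service is not stated in this article.
WHISPER_MAX_BYTES = 25 * 1024 * 1024


def fits_whisper_limit(path: str, max_bytes: int = WHISPER_MAX_BYTES) -> bool:
    """Return True if the file at `path` is no larger than `max_bytes`."""
    return os.path.getsize(path) <= max_bytes
```

If the check returns False, the file is a candidate for the batch transcription API mentioned above rather than the Whisper deployment.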

Prerequisites

An Azure subscription - Create one for free.
An Azure OpenAI resource with a speech to text model deployed in a supported region. For more information, see Create a resource and deploy a model with Azure OpenAI.
Be sure that you are assigned at least the Cognitive Services Contributor role for the Azure OpenAI resource.
Download the example data from GitHub if you don't have your own data.

Set up

Retrieve key and endpoint

To successfully make a call against Azure OpenAI, you need an endpoint and a key.

Variable name	Value
`AZURE_OPENAI_ENDPOINT`	The service endpoint can be found in the Keys & Endpoint section when examining your resource from the Azure portal. Alternatively, you can find the endpoint via the Deployments page in Azure AI Foundry portal. An example endpoint is: `https://docs-test-001.openai.azure.com/`.
`AZURE_OPENAI_API_KEY`	This value can be found in the Keys & Endpoint section when examining your resource from the Azure portal. You can use either `KEY1` or `KEY2`.

Go to your resource in the Azure portal. The Endpoint and Keys can be found in the Resource Management section. Copy your endpoint and access key as you'll need both for authenticating your API calls. You can use either KEY1 or KEY2. Always having two keys allows you to securely rotate and regenerate keys without causing a service disruption.

Environment variables

Create and assign persistent environment variables for your key and endpoint.

Important

Use API keys with caution. Don't include the API key directly in your code, and never post it publicly. If you use an API key, store it securely in Azure Key Vault. For more information about using API keys securely in your apps, see API keys with Azure Key Vault.

For more information about AI services security, see Authenticate requests to Azure AI services.

Command Line

setx AZURE_OPENAI_API_KEY "REPLACE_WITH_YOUR_KEY_VALUE_HERE"

setx AZURE_OPENAI_ENDPOINT "REPLACE_WITH_YOUR_ENDPOINT_HERE"

PowerShell

[System.Environment]::SetEnvironmentVariable('AZURE_OPENAI_API_KEY', 'REPLACE_WITH_YOUR_KEY_VALUE_HERE', 'User')

[System.Environment]::SetEnvironmentVariable('AZURE_OPENAI_ENDPOINT', 'REPLACE_WITH_YOUR_ENDPOINT_HERE', 'User')

Bash

echo export AZURE_OPENAI_API_KEY="REPLACE_WITH_YOUR_KEY_VALUE_HERE" >> /etc/environment && source /etc/environment

echo export AZURE_OPENAI_ENDPOINT="REPLACE_WITH_YOUR_ENDPOINT_HERE" >> /etc/environment && source /etc/environment

Create a REST API request and response

In a bash shell, run the following command. You need to replace YourDeploymentName with the deployment name you chose when you deployed the Whisper model. The deployment name isn't necessarily the same as the model name. Entering the model name results in an error unless you chose a deployment name that is identical to the underlying model name.

curl $AZURE_OPENAI_ENDPOINT/openai/deployments/YourDeploymentName/audio/transcriptions?api-version=2024-02-01 \
 -H "api-key: $AZURE_OPENAI_API_KEY" \
 -H "Content-Type: multipart/form-data" \
 -F file="@./wikipediaOcelot.wav"

The first line of the preceding command with an example endpoint would appear as follows:

curl https://aoai-docs.openai.azure.com/openai/deployments/YourDeploymentName/audio/transcriptions?api-version=2024-02-01 \

You can get sample audio files, such as wikipediaOcelot.wav, from the Azure AI Speech SDK repository at GitHub.
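
To make the structure of the request URL explicit, here is a minimal sketch that assembles it from its parts. The function name `build_audio_url` is hypothetical. The `translations` operation is an assumption based on the translation capability mentioned in the introduction; only the `transcriptions` path appears in this article.

```python
from urllib.parse import quote


def build_audio_url(endpoint: str, deployment: str,
                    api_version: str = "2024-02-01",
                    operation: str = "transcriptions") -> str:
    """Assemble the REST URL for an audio request against a deployment."""
    # Assumption: "translations" mirrors the "transcriptions" path.
    if operation not in ("transcriptions", "translations"):
        raise ValueError(f"Unsupported operation: {operation}")
    # Drop a trailing slash so the path doesn't contain "//".
    base = endpoint.rstrip("/")
    # The path segment is the deployment name, not the model name.
    name = quote(deployment, safe="")
    return (f"{base}/openai/deployments/{name}"
            f"/audio/{operation}?api-version={api_version}")
```

Stripping the trailing slash matters because the example endpoint ends with `/`, and concatenating it directly with `/openai/...` would otherwise produce a double slash.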

Important

For production, store and access your credentials using a secure method, such as Azure Key Vault. For more information, see credential security.

Output

{"text":"The ocelot, Lepardus paradalis, is a small wild cat native to the southwestern United States, Mexico, and Central and South America. This medium-sized cat is characterized by solid black spots and streaks on its coat, round ears, and white neck and undersides. It weighs between 8 and 15.5 kilograms, 18 and 34 pounds, and reaches 40 to 50 centimeters 16 to 20 inches at the shoulders. It was first described by Carl Linnaeus in 1758. Two subspecies are recognized, L. p. paradalis and L. p. mitis. Typically active during twilight and at night, the ocelot tends to be solitary and territorial. It is efficient at climbing, leaping, and swimming. It preys on small terrestrial mammals such as armadillo, opossum, and lagomorphs."}
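
The response body is a JSON object with a single `text` field. As a small sketch, assuming you captured the body as a string, you could extract the transcript like this. The function name `extract_transcript` is hypothetical.

```python
import json


def extract_transcript(response_body: str) -> str:
    """Return the value of the "text" field from a transcription response."""
    payload = json.loads(response_body)
    if not isinstance(payload, dict) or "text" not in payload:
        # Anything without a "text" field is treated as unexpected here.
        raise ValueError("Response does not contain a 'text' field")
    return payload["text"]
```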

Prerequisites

An Azure subscription. You can create one for free.
An Azure OpenAI resource with a speech to text model deployed in a supported region. For more information, see Create a resource and deploy a model with Azure OpenAI.
Python 3.8 or later
The following Python libraries: os (included in the Python standard library) and openai (installed in a later step)

Set up

Retrieve key and endpoint

To successfully make a call against Azure OpenAI, you need an endpoint and a key.

Variable name	Value
`AZURE_OPENAI_ENDPOINT`	The service endpoint can be found in the Keys & Endpoint section when examining your resource from the Azure portal. Alternatively, you can find the endpoint via the Deployments page in Azure AI Foundry portal. An example endpoint is: `https://docs-test-001.openai.azure.com/`.
`AZURE_OPENAI_API_KEY`	This value can be found in the Keys & Endpoint section when examining your resource from the Azure portal. You can use either `KEY1` or `KEY2`.

Environment variables

Create and assign persistent environment variables for your key and endpoint.

Important

For more information about AI services security, see Authenticate requests to Azure AI services.

Command Line

setx AZURE_OPENAI_API_KEY "REPLACE_WITH_YOUR_KEY_VALUE_HERE"

setx AZURE_OPENAI_ENDPOINT "REPLACE_WITH_YOUR_ENDPOINT_HERE"

PowerShell

[System.Environment]::SetEnvironmentVariable('AZURE_OPENAI_API_KEY', 'REPLACE_WITH_YOUR_KEY_VALUE_HERE', 'User')

[System.Environment]::SetEnvironmentVariable('AZURE_OPENAI_ENDPOINT', 'REPLACE_WITH_YOUR_ENDPOINT_HERE', 'User')

Bash

echo export AZURE_OPENAI_API_KEY="REPLACE_WITH_YOUR_KEY_VALUE_HERE" >> /etc/environment && source /etc/environment

echo export AZURE_OPENAI_ENDPOINT="REPLACE_WITH_YOUR_ENDPOINT_HERE" >> /etc/environment && source /etc/environment
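
Before you run the Python app, it can help to confirm that both variables are visible to the Python process. The following is a minimal sketch. The function name `load_azure_openai_settings` and the returned dictionary keys are hypothetical and aren't part of any SDK.

```python
import os
from typing import Mapping


def load_azure_openai_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Read the endpoint and key from the environment; fail early if missing."""
    required = ("AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        # Normalize the endpoint so later path joins don't produce "//".
        "endpoint": env["AZURE_OPENAI_ENDPOINT"].rstrip("/"),
        "api_key": env["AZURE_OPENAI_API_KEY"],
    }
```

If the check fails right after you set the variables, open a new terminal first. Values set with `setx` are only visible to processes started afterward.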

Passwordless authentication is recommended

For passwordless authentication, you need to:

Use the azure-identity package.
Assign the Cognitive Services User role to your user account. You can assign roles in the Azure portal under Access control (IAM) > Add role assignment.
Sign in with the Azure CLI, for example by running az login.
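
As an illustration of the passwordless steps above, the following sketch builds a client that authenticates with Microsoft Entra ID instead of an API key. It assumes the `openai` and `azure-identity` packages are installed and that `AZURE_OPENAI_ENDPOINT` is set. The function name `create_keyless_client` is hypothetical, and the scope value is the one used by the JavaScript samples in this article.

```python
import os

# Token scope for Azure AI services, as used by the JavaScript samples
# in this article.
COGNITIVE_SERVICES_SCOPE = "https://cognitiveservices.azure.com/.default"


def create_keyless_client(api_version: str = "2024-02-01"):
    """Build an AzureOpenAI client that authenticates with Microsoft Entra ID."""
    # Imported inside the function so this module loads even before the
    # packages are installed (pip install openai azure-identity).
    from azure.identity import DefaultAzureCredential, get_bearer_token_provider
    from openai import AzureOpenAI

    # DefaultAzureCredential can use the identity from `az login`.
    token_provider = get_bearer_token_provider(
        DefaultAzureCredential(), COGNITIVE_SERVICES_SCOPE
    )
    return AzureOpenAI(
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        azure_ad_token_provider=token_provider,
        api_version=api_version,
    )
```

A client created this way can be used in place of the key-based client in quickstart.py. The call to `client.audio.transcriptions.create` stays the same.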

Create a Python environment

Install the OpenAI Python client library with:

OpenAI Python 1.x
OpenAI Python 0.28.1

pip install openai

Note

The OpenAI Python library version 0.28.1 is deprecated. We recommend using 1.x. Consult our migration guide for information on moving from 0.28.1 to 1.x.

pip install openai==0.28.1
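
Because the two library versions expose different interfaces, it can be useful to check which one is installed before you copy a sample. The following is a small sketch that uses only the standard library. The helper names are hypothetical.

```python
from importlib import metadata
from typing import Optional


def major_version(version: str) -> int:
    """Return the leading major number of a version string such as '0.28.1'."""
    return int(version.split(".", 1)[0])


def installed_openai_major() -> Optional[int]:
    """Return the major version of the installed openai package, or None."""
    try:
        return major_version(metadata.version("openai"))
    except metadata.PackageNotFoundError:
        return None
```

A result of 1 or higher means the `AzureOpenAI` client sample applies. A result of 0 means the deprecated module-level `openai.Audio` sample applies.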

Create the Python app

Create a new Python file called quickstart.py. Then open it up in your preferred editor or IDE.
Replace the contents of quickstart.py with the following code. Modify the code to add your deployment name:

OpenAI Python 1.x
OpenAI Python 0.28.1

    import os
    from openai import AzureOpenAI
        
    client = AzureOpenAI(
        api_key=os.getenv("AZURE_OPENAI_API_KEY"),  
        api_version="2024-02-01",
        azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT")
    )
    
    deployment_id = "YOUR-DEPLOYMENT-NAME-HERE"  # The custom name you chose for your deployment when you deployed the model.
    audio_test_file = "./wikipediaOcelot.wav"
    
    # Open the audio file in binary mode and close it once the request completes.
    with open(audio_test_file, "rb") as audio_file:
        result = client.audio.transcriptions.create(
            file=audio_file,
            model=deployment_id
        )
    
    print(result)

    import openai
    import time
    import os
    
    openai.api_key = os.getenv("AZURE_OPENAI_API_KEY")
    openai.api_base = os.getenv("AZURE_OPENAI_ENDPOINT")  # your endpoint should look like the following https://YOUR_RESOURCE_NAME.openai.azure.com/
    openai.api_type = "azure"
    openai.api_version = "2024-02-01"
    
    model_name = "whisper"
    deployment_id = "YOUR-DEPLOYMENT-NAME-HERE"  # The custom name you chose for your deployment when you deployed the model.
    audio_language="en"
    
    audio_test_file = "./wikipediaOcelot.wav"
    
    result = openai.Audio.transcribe(
                file=open(audio_test_file, "rb"),            
                model=model_name,
                deployment_id=deployment_id
            )
    
    print(result)

Run the application using the python command on your quickstart file:

python quickstart.py

You can get sample audio files, such as wikipediaOcelot.wav, from the Azure AI Speech SDK repository at GitHub.

Important

For production, store and access your credentials using a secure method, such as Azure Key Vault. For more information, see credential security.

Output

{"text":"The ocelot, Lepardus paradalis, is a small wild cat native to the southwestern United States, Mexico, and Central and South America. This medium-sized cat is characterized by solid black spots and streaks on its coat, round ears, and white neck and undersides. It weighs between 8 and 15.5 kilograms, 18 and 34 pounds, and reaches 40 to 50 centimeters 16 to 20 inches at the shoulders. It was first described by Carl Linnaeus in 1758. Two subspecies are recognized, L. p. paradalis and L. p. mitis. Typically active during twilight and at night, the ocelot tends to be solitary and territorial. It is efficient at climbing, leaping, and swimming. It preys on small terrestrial mammals such as armadillo, opossum, and lagomorphs."}

Prerequisites

An Azure subscription. You can create one for free.
An Azure OpenAI resource with a speech to text model deployed in a supported region. For more information, see Create a resource and deploy a model with Azure OpenAI.
The .NET 8.0 SDK

Microsoft Entra ID prerequisites

For the recommended keyless authentication with Microsoft Entra ID, you need to:

Install the Azure CLI used for keyless authentication with Microsoft Entra ID.
Assign the Cognitive Services User role to your user account. You can assign roles in the Azure portal under Access control (IAM) > Add role assignment.

Set up

Create a new folder whisper-quickstart and go to the quickstart folder with the following command:
```
mkdir whisper-quickstart && cd whisper-quickstart
```
Create a new console application with the following command:
```
dotnet new console
```
Install the OpenAI .NET client library with the dotnet add package command:
```
dotnet add package Azure.AI.OpenAI
```
For the recommended keyless authentication with Microsoft Entra ID, install the Azure.Identity package with:
```
dotnet add package Azure.Identity
```
For the recommended keyless authentication with Microsoft Entra ID, sign in to Azure with the following command:
```
az login
```

Retrieve resource information

You need to retrieve the following information to authenticate your application with your Azure OpenAI resource:

Microsoft Entra ID
API key

Variable name	Value
`AZURE_OPENAI_ENDPOINT`	This value can be found in the Keys and Endpoint section when examining your resource from the Azure portal.
`AZURE_OPENAI_DEPLOYMENT_NAME`	This value will correspond to the custom name you chose for your deployment when you deployed a model. This value can be found under Resource Management > Model Deployments in the Azure portal.
`OPENAI_API_VERSION`	Learn more about API Versions. You can change the version in code or use an environment variable.

Learn more about keyless authentication and setting environment variables.

Variable name	Value
`AZURE_OPENAI_ENDPOINT`	This value can be found in the Keys and Endpoint section when examining your resource from the Azure portal.
`AZURE_OPENAI_API_KEY`	This value can be found in the Keys and Endpoint section when examining your resource from the Azure portal. You can use either `KEY1` or `KEY2`.
`AZURE_OPENAI_DEPLOYMENT_NAME`	This value will correspond to the custom name you chose for your deployment when you deployed a model. This value can be found under Resource Management > Model Deployments in the Azure portal.
`OPENAI_API_VERSION`	Learn more about API Versions.

Learn more about finding API keys and setting environment variables.

Important

For more information about AI services security, see Authenticate requests to Azure AI services.

Run the quickstart

The sample code in this quickstart uses Microsoft Entra ID for the recommended keyless authentication. If you prefer to use an API key, you can replace the DefaultAzureCredential object with an AzureKeyCredential object.

Microsoft Entra ID
API key

AzureOpenAIClient openAIClient = new AzureOpenAIClient(new Uri(endpoint), new DefaultAzureCredential());

AzureOpenAIClient openAIClient = new AzureOpenAIClient(new Uri(endpoint), new AzureKeyCredential(key));

Note

You can get sample audio files, such as wikipediaOcelot.wav, from the Azure AI Speech SDK repository at GitHub.

To run the quickstart, follow these steps:

Replace the contents of Program.cs with the following code and update the placeholder values with your own.

using Azure;
using Azure.AI.OpenAI;
using Azure.Identity; // Required for Passwordless auth


string deploymentName = "whisper";

string endpoint = Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT") ?? "https://<your-resource-name>.openai.azure.com/";
string key = Environment.GetEnvironmentVariable("AZURE_OPENAI_API_KEY") ?? "<your-key>";

// Use the recommended keyless credential instead of the AzureKeyCredential credential.
AzureOpenAIClient openAIClient = new AzureOpenAIClient(new Uri(endpoint), new DefaultAzureCredential()); 
//AzureOpenAIClient openAIClient = new AzureOpenAIClient(new Uri(endpoint), new AzureKeyCredential(key));

var audioFilePath = "<audio file path>";

var audioClient = openAIClient.GetAudioClient(deploymentName);

var result = await audioClient.TranscribeAudioAsync(audioFilePath);

Console.WriteLine("Transcribed text:");
// Text is a single string, so print it directly instead of one character at a time.
Console.WriteLine(result.Value.Text);

Run the application using the dotnet run command or the run button at the top of Visual Studio:
```
dotnet run
```

Output

If you are using the sample audio file, you should see the following text printed out in the console:

The ocelot, Lepardus paradalis, is a small wild cat native to the southwestern United States, 
Mexico, and Central and South America. This medium-sized cat is characterized by solid 
black spots and streaks on its coat, round ears...

Source code | Package (npm) | Samples

Prerequisites

An Azure subscription - Create one for free
LTS versions of Node.js
Azure CLI, used for passwordless authentication in a local development environment. Create the necessary context by signing in with the Azure CLI.
An Azure OpenAI resource with a speech to text model deployed in a supported region. For more information, see Create a resource and deploy a model with Azure OpenAI.

Microsoft Entra ID prerequisites

For the recommended keyless authentication with Microsoft Entra ID, you need to:

Install the Azure CLI used for keyless authentication with Microsoft Entra ID.
Assign the Cognitive Services User role to your user account. You can assign roles in the Azure portal under Access control (IAM) > Add role assignment.

Set up

Create a new folder whisper-quickstart and go to the quickstart folder with the following command:
```
mkdir whisper-quickstart && cd whisper-quickstart
```
Create the package.json with the following command:
```
npm init -y
```
Install the OpenAI client library for JavaScript with:
```
npm install openai
```
For the recommended passwordless authentication:
```
npm install @azure/identity
```

Retrieve resource information

You need to retrieve the following information to authenticate your application with your Azure OpenAI resource:

Microsoft Entra ID
API key

Variable name	Value
`AZURE_OPENAI_ENDPOINT`	This value can be found in the Keys and Endpoint section when examining your resource from the Azure portal.
`AZURE_OPENAI_DEPLOYMENT_NAME`	This value will correspond to the custom name you chose for your deployment when you deployed a model. This value can be found under Resource Management > Model Deployments in the Azure portal.
`OPENAI_API_VERSION`	Learn more about API Versions. You can change the version in code or use an environment variable.

Learn more about keyless authentication and setting environment variables.

Variable name	Value
`AZURE_OPENAI_ENDPOINT`	This value can be found in the Keys and Endpoint section when examining your resource from the Azure portal.
`AZURE_OPENAI_API_KEY`	This value can be found in the Keys and Endpoint section when examining your resource from the Azure portal. You can use either `KEY1` or `KEY2`.
`AZURE_OPENAI_DEPLOYMENT_NAME`	This value will correspond to the custom name you chose for your deployment when you deployed a model. This value can be found under Resource Management > Model Deployments in the Azure portal.
`OPENAI_API_VERSION`	Learn more about API Versions.

Learn more about finding API keys and setting environment variables.

Important

For more information about AI services security, see Authenticate requests to Azure AI services.

Caution

To use the recommended keyless authentication with the SDK, make sure that the AZURE_OPENAI_API_KEY environment variable isn't set.

Create a sample application

Microsoft Entra ID
API key

Create the index.js file with the following code:

const { createReadStream } = require("fs");
const { AzureOpenAI } = require("openai");
const { DefaultAzureCredential, getBearerTokenProvider } = require("@azure/identity");

// You will need to set these environment variables or edit the following values
const audioFilePath = "<audio file path>";
const endpoint = process.env.AZURE_OPENAI_ENDPOINT || "Your endpoint";

// Required Azure OpenAI deployment name and API version
const apiVersion = process.env.OPENAI_API_VERSION || "2024-08-01-preview";
const deploymentName = process.env.AZURE_OPENAI_DEPLOYMENT_NAME || "whisper";

// keyless authentication    
const credential = new DefaultAzureCredential();
const scope = "https://cognitiveservices.azure.com/.default";
const azureADTokenProvider = getBearerTokenProvider(credential, scope);

function getClient() {
  return new AzureOpenAI({
    endpoint,
    azureADTokenProvider,
    apiVersion,
    deployment: deploymentName,
  });
}

async function main() {
  console.log("== Transcribe Audio Sample ==");

  const client = getClient();
  const result = await client.audio.transcriptions.create({
    model: "",
    file: createReadStream(audioFilePath),
  });

  console.log(`Transcription: ${result.text}`);
}

main().catch((err) => {
  console.error("The sample encountered an error:", err);
});

Sign in to Azure with the following command:
```
az login
```
Run the JavaScript file.
```
node index.js
```

Create the index.js file with the following code:

const { createReadStream } = require("fs");
const { AzureOpenAI } = require("openai");

// You will need to set these environment variables or edit the following values
const audioFilePath = "<audio file path>";
const endpoint = process.env.AZURE_OPENAI_ENDPOINT || "Your endpoint";
const apiKey = process.env.AZURE_OPENAI_API_KEY || "Your API key";

// Required Azure OpenAI deployment name and API version
const apiVersion = process.env.OPENAI_API_VERSION || "2024-08-01-preview";
const deploymentName = process.env.AZURE_OPENAI_DEPLOYMENT_NAME || "whisper";

function getClient() {
  return new AzureOpenAI({
    endpoint,
    apiKey,
    apiVersion,
    deployment: deploymentName,
  });
}

async function main() {
  console.log("== Transcribe Audio Sample ==");

  const client = getClient();
  const result = await client.audio.transcriptions.create({
    model: "",
    file: createReadStream(audioFilePath),
  });

  console.log(`Transcription: ${result.text}`);
}

main().catch((err) => {
  console.error("The sample encountered an error:", err);
});

Run the JavaScript file.
```
node index.js
```

You can get sample audio files, such as wikipediaOcelot.wav, from the Azure AI Speech SDK repository at GitHub.

Output

{"text":"The ocelot, Lepardus paradalis, is a small wild cat native to the southwestern United States, Mexico, and Central and South America. This medium-sized cat is characterized by solid black spots and streaks on its coat, round ears, and white neck and undersides. It weighs between 8 and 15.5 kilograms, 18 and 34 pounds, and reaches 40 to 50 centimeters 16 to 20 inches at the shoulders. It was first described by Carl Linnaeus in 1758. Two subspecies are recognized, L. p. paradalis and L. p. mitis. Typically active during twilight and at night, the ocelot tends to be solitary and territorial. It is efficient at climbing, leaping, and swimming. It preys on small terrestrial mammals such as armadillo, opossum, and lagomorphs."}

Source code | Package (npm) | Samples

Prerequisites

An Azure subscription - Create one for free
LTS versions of Node.js
TypeScript
Azure CLI, used for passwordless authentication in a local development environment. Create the necessary context by signing in with the Azure CLI.
An Azure OpenAI resource with a speech to text model deployed in a supported region. For more information, see Create a resource and deploy a model with Azure OpenAI.

Microsoft Entra ID prerequisites

For the recommended keyless authentication with Microsoft Entra ID, you need to:

Install the Azure CLI used for keyless authentication with Microsoft Entra ID.
Assign the Cognitive Services User role to your user account. You can assign roles in the Azure portal under Access control (IAM) > Add role assignment.

Set up

Create a new folder whisper-quickstart and go to the quickstart folder with the following command:
```
mkdir whisper-quickstart && cd whisper-quickstart
```
Create the package.json with the following command:
```
npm init -y
```
Update the package.json to ECMAScript with the following command:
```
npm pkg set type=module
```
Install the OpenAI client library for JavaScript with:
```
npm install openai
```
For the recommended passwordless authentication:
```
npm install @azure/identity
```

Retrieve resource information

You need to retrieve the following information to authenticate your application with your Azure OpenAI resource:

Microsoft Entra ID
API key

Variable name	Value
`AZURE_OPENAI_ENDPOINT`	This value can be found in the Keys and Endpoint section when examining your resource from the Azure portal.
`AZURE_OPENAI_DEPLOYMENT_NAME`	This value will correspond to the custom name you chose for your deployment when you deployed a model. This value can be found under Resource Management > Model Deployments in the Azure portal.
`OPENAI_API_VERSION`	Learn more about API Versions. You can change the version in code or use an environment variable.

Learn more about keyless authentication and setting environment variables.

Variable name	Value
`AZURE_OPENAI_ENDPOINT`	This value can be found in the Keys and Endpoint section when examining your resource from the Azure portal.
`AZURE_OPENAI_API_KEY`	This value can be found in the Keys and Endpoint section when examining your resource from the Azure portal. You can use either `KEY1` or `KEY2`.
`AZURE_OPENAI_DEPLOYMENT_NAME`	This value will correspond to the custom name you chose for your deployment when you deployed a model. This value can be found under Resource Management > Model Deployments in the Azure portal.
`OPENAI_API_VERSION`	Learn more about API Versions.

Learn more about finding API keys and setting environment variables.

Important

For more information about AI services security, see Authenticate requests to Azure AI services.

Caution

To use the recommended keyless authentication with the SDK, make sure that the AZURE_OPENAI_API_KEY environment variable isn't set.

Create a sample application

Microsoft Entra ID
API key

Create the index.ts file with the following code:

import { createReadStream } from "fs";
import { AzureOpenAI } from "openai";
import { DefaultAzureCredential, getBearerTokenProvider } from "@azure/identity";

// You will need to set these environment variables or edit the following values
const audioFilePath = "<audio file path>";
const endpoint = process.env.AZURE_OPENAI_ENDPOINT || "Your endpoint";

// Required Azure OpenAI deployment name and API version
const apiVersion = process.env.OPENAI_API_VERSION || "2024-08-01-preview";
const deploymentName = process.env.AZURE_OPENAI_DEPLOYMENT_NAME || "whisper";

// keyless authentication    
const credential = new DefaultAzureCredential();
const scope = "https://cognitiveservices.azure.com/.default";
const azureADTokenProvider = getBearerTokenProvider(credential, scope);

function getClient(): AzureOpenAI {
  return new AzureOpenAI({
    endpoint,
    azureADTokenProvider,
    apiVersion,
    deployment: deploymentName,
  });
}

export async function main() {
  console.log("== Transcribe Audio Sample ==");

  const client = getClient();
  const result = await client.audio.transcriptions.create({
    model: "",
    file: createReadStream(audioFilePath),
  });

  console.log(`Transcription: ${result.text}`);
}

main().catch((err) => {
  console.error("The sample encountered an error:", err);
});

Create the tsconfig.json file to transpile the TypeScript code and copy the following code for ECMAScript.

{
    "compilerOptions": {
      "module": "NodeNext",
      "target": "ES2022", // Supports top-level await
      "moduleResolution": "NodeNext",
      "skipLibCheck": true, // Avoid type errors from node_modules
      "strict": true // Enable strict type-checking options
    },
    "include": ["*.ts"]
}

Transpile from TypeScript to JavaScript.
```
tsc
```
Sign in to Azure with the following command:
```
az login
```
Run the code with the following command:
```
node index.js
```

Create the index.ts file with the following code:

import { createReadStream } from "fs";
import { AzureOpenAI } from "openai";

// You will need to set these environment variables or edit the following values
const audioFilePath = "<audio file path>";
const endpoint = process.env.AZURE_OPENAI_ENDPOINT || "Your endpoint";
const apiKey = process.env.AZURE_OPENAI_API_KEY || "Your API key";

// Required Azure OpenAI deployment name and API version
const apiVersion = process.env.OPENAI_API_VERSION || "2024-08-01-preview";
const deploymentName = process.env.AZURE_OPENAI_DEPLOYMENT_NAME || "whisper";

function getClient(): AzureOpenAI {
  return new AzureOpenAI({
    endpoint,
    apiKey,
    apiVersion,
    deployment: deploymentName,
  });
}

export async function main() {
  console.log("== Transcribe Audio Sample ==");

  const client = getClient();
  const result = await client.audio.transcriptions.create({
    model: "",
    file: createReadStream(audioFilePath),
  });

  console.log(`Transcription: ${result.text}`);
}

main().catch((err) => {
  console.error("The sample encountered an error:", err);
});

Create the tsconfig.json file to transpile the TypeScript code and copy the following code for ECMAScript.

{
    "compilerOptions": {
      "module": "NodeNext",
      "target": "ES2022", // Supports top-level await
      "moduleResolution": "NodeNext",
      "skipLibCheck": true, // Avoid type errors from node_modules
      "strict": true // Enable strict type-checking options
    },
    "include": ["*.ts"]
}

Transpile from TypeScript to JavaScript.
```
tsc
```
Run the code with the following command:
```
node index.js
```

You can get sample audio files, such as wikipediaOcelot.wav, from the Azure AI Speech SDK repository at GitHub.

Important

For more information about AI services security, see Authenticate requests to Azure AI services.

Output

{"text":"The ocelot, Lepardus paradalis, is a small wild cat native to the southwestern United States, Mexico, and Central and South America. This medium-sized cat is characterized by solid black spots and streaks on its coat, round ears, and white neck and undersides. It weighs between 8 and 15.5 kilograms, 18 and 34 pounds, and reaches 40 to 50 centimeters 16 to 20 inches at the shoulders. It was first described by Carl Linnaeus in 1758. Two subspecies are recognized, L. p. paradalis and L. p. mitis. Typically active during twilight and at night, the ocelot tends to be solitary and territorial. It is efficient at climbing, leaping, and swimming. It preys on small terrestrial mammals such as armadillo, opossum, and lagomorphs."}

Prerequisites

An Azure subscription - Create one for free
PowerShell 7, the latest version, is recommended. The sample in this section uses the -Form parameter of Invoke-RestMethod, which isn't available in Windows PowerShell 5.1.
An Azure OpenAI resource with a speech to text model deployed in a supported region. For more information, see Create a resource and deploy a model with Azure OpenAI.

Set up

Retrieve key and endpoint

To successfully make a call against Azure OpenAI, you need an endpoint and a key.

Variable name	Value
`AZURE_OPENAI_ENDPOINT`	The service endpoint can be found in the Keys & Endpoint section when examining your resource from the Azure portal. Alternatively, you can find the endpoint via the Deployments page in Azure AI Foundry portal. An example endpoint is: `https://docs-test-001.openai.azure.com/`.
`AZURE_OPENAI_API_KEY`	This value can be found in the Keys & Endpoint section when examining your resource from the Azure portal. You can use either `KEY1` or `KEY2`.

Environment variables

Create and assign persistent environment variables for your key and endpoint.

Important

For more information about AI services security, see Authenticate requests to Azure AI services.

Command Line

setx AZURE_OPENAI_API_KEY "REPLACE_WITH_YOUR_KEY_VALUE_HERE"

setx AZURE_OPENAI_ENDPOINT "REPLACE_WITH_YOUR_ENDPOINT_HERE"

PowerShell

[System.Environment]::SetEnvironmentVariable('AZURE_OPENAI_API_KEY', 'REPLACE_WITH_YOUR_KEY_VALUE_HERE', 'User')

[System.Environment]::SetEnvironmentVariable('AZURE_OPENAI_ENDPOINT', 'REPLACE_WITH_YOUR_ENDPOINT_HERE', 'User')

Bash

echo export AZURE_OPENAI_API_KEY="REPLACE_WITH_YOUR_KEY_VALUE_HERE" >> /etc/environment && source /etc/environment

echo export AZURE_OPENAI_ENDPOINT="REPLACE_WITH_YOUR_ENDPOINT_HERE" >> /etc/environment && source /etc/environment

Create a PowerShell app

Run the following command. You need to replace YourDeploymentName with the deployment name you chose when you deployed the Whisper model. The deployment name isn't necessarily the same as the model name. Entering the model name results in an error unless you chose a deployment name that is identical to the underlying model name.

# Azure OpenAI metadata variables
$openai = @{
    api_key     = $Env:AZURE_OPENAI_API_KEY
    api_base    = $Env:AZURE_OPENAI_ENDPOINT # your endpoint should look like the following https://YOUR_RESOURCE_NAME.openai.azure.com/
    api_version = '2024-02-01' # this may change in the future
    name        = 'YourDeploymentName' #This will correspond to the custom name you chose for your deployment when you deployed a model.
}

# Header for authentication
$headers = [ordered]@{
    'api-key' = $openai.api_key
}

$form = @{ file = get-item -path './wikipediaOcelot.wav' }

# Send a transcription request for the audio file
$url = "$($openai.api_base)/openai/deployments/$($openai.name)/audio/transcriptions?api-version=$($openai.api_version)"

$response = Invoke-RestMethod -Uri $url -Headers $headers -Form $form -Method Post -ContentType 'multipart/form-data'
return $response.text

You can get sample audio files, such as wikipediaOcelot.wav, from the Azure AI Speech SDK repository at GitHub.

Important

For production, store and access your credentials using a secure method, such as PowerShell Secret Management with Azure Key Vault. For more information, see credential security.

Output

The ocelot, Lepardus paradalis, is a small wild cat native to the southwestern United States, Mexico, and Central and South America. This medium-sized cat is characterized by solid black spots and streaks on its coat, round ears, and white neck and undersides. It weighs between 8 and 15.5 kilograms, 18 and 34 pounds, and reaches 40 to 50 centimeters 16 to 20 inches at the shoulders. It was first described by Carl Linnaeus in 1758. Two subspecies are recognized, L. p. paradalis and L. p. mitis. Typically active during twilight and at night, the ocelot tends to be solitary and territorial. It is efficient at climbing, leaping, and swimming. It preys on small terrestrial mammals such as armadillo, opossum, and lagomorphs.

Clean up resources

If you want to clean up and remove an Azure OpenAI resource, you can delete the resource. Before deleting the resource, you must first delete any deployed models.

Next steps

To learn how to convert audio data to text in batches, see Create a batch transcription.
For more examples, check out the Azure OpenAI Samples GitHub repository.
