Ingestion Client with Azure AI services

The Ingestion Client is a tool released by Microsoft on GitHub that helps you quickly deploy a call center transcription solution to Azure with a no-code approach.

Tip

You can use the tool and resulting solution in production to process a high volume of audio.

The Ingestion Client uses Azure AI Language, Azure AI Speech, Azure Storage, and Azure Functions.

Get started with the Ingestion Client

To run the Ingestion Client, you need an Azure account and a multi-service Azure AI services resource.

See the Getting Started Guide for the Ingestion Client on GitHub to learn how to set up and use the tool.

Ingestion Client features

The Ingestion Client connects a dedicated Azure Storage account to custom Azure Functions in a serverless fashion, and those functions pass transcription requests to the Speech service. The resulting transcripts land in the dedicated Azure Storage container.

Important

Pricing varies depending on the mode of operation (batch versus real-time) and on the Azure Functions SKU selected. By default, the tool creates a Premium Azure Functions SKU to handle large volumes. Visit the Pricing page for more information.

Internally, the tool uses the Speech and Language services, and follows best practices to handle scale-up, retries, and failover. The following schematic describes the resources and connections.

Diagram that shows the Ingestion Client Architecture.

The following Speech service feature is used by the Ingestion Client:

  • Batch speech to text: Transcribes large amounts of audio files asynchronously, including speaker diarization. It's typically used in post-call analytics scenarios. Diarization is the process of recognizing and separating speakers in mono channel audio data.
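
To make diarization concrete, here is a minimal Python sketch that reads speaker labels from a transcript and prints the conversation turn by turn. It is not code from the Ingestion Client. The sample data is hand-written for illustration, and the field names (recognizedPhrases, speaker, offsetInTicks, nBest, display) are assumed to match the batch transcription result file; check them against the output of your own deployment.

```python
import json

# Hypothetical, hand-written sample shaped like a diarized batch transcription result.
SAMPLE_RESULT = json.loads("""
{
  "recognizedPhrases": [
    {"speaker": 2, "offsetInTicks": 21000000,
     "nBest": [{"display": "Hi, I have a billing question."}]},
    {"speaker": 1, "offsetInTicks": 0,
     "nBest": [{"display": "Thanks for calling."}]},
    {"speaker": 1, "offsetInTicks": 52000000,
     "nBest": [{"display": "Sure, I can help."}]}
  ]
}
""")


def speaker_turns(result):
    """Return (speaker, text) pairs in time order, using each phrase's top hypothesis."""
    phrases = sorted(
        result.get("recognizedPhrases", []),
        key=lambda phrase: phrase.get("offsetInTicks", 0),
    )
    turns = []
    for phrase in phrases:
        candidates = phrase.get("nBest") or []
        if not candidates:
            continue  # skip phrases that carry no recognized text
        turns.append((phrase.get("speaker"), candidates[0].get("display", "")))
    return turns


if __name__ == "__main__":
    for speaker, text in speaker_turns(SAMPLE_RESULT):
        print(f"Speaker {speaker}: {text}")
```

Sorting by offset before reading the speaker label keeps the turns in call order even if the phrases arrive unordered.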

Here are some Language service features that are used by the Ingestion Client:

Besides Azure AI services, these Azure products are used to complete the solution:

  • Azure Storage: Used for storing telephony data and the transcripts that the Batch Transcription API returns. This storage account should use notifications, specifically for when new files are added. These notifications are used to trigger the transcription process.
  • Azure Functions: Used for creating the shared access signature (SAS) URI for each recording, and triggering the HTTP POST request to start a transcription. Additionally, you use Azure Functions to create requests to retrieve and delete transcriptions by using the Batch Transcription API.
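
The flow described in the two items above can be sketched in Python as follows. This is an illustration under stated assumptions, not the Ingestion Client's own code, which is available on GitHub. It assumes the storage notification follows the Event Grid blob-created schema, that a SAS token for the recording has already been generated elsewhere, and that the Speech service REST API version v3.1 is the target. The function names, the displayName value, and the placeholder strings are hypothetical. The request is only built, not sent.

```python
import json
import urllib.request

API_VERSION = "v3.1"  # assumption: adjust to the API version your deployment targets


def blob_url_from_event(event):
    """Return the blob URL for a blob-created notification, or None for other events."""
    if event.get("eventType") != "Microsoft.Storage.BlobCreated":
        return None
    return event.get("data", {}).get("url")


def build_transcription_request(region, key, sas_urls, locale="en-US"):
    """Build (but do not send) the HTTP POST request that starts a batch transcription."""
    endpoint = (
        f"https://{region}.api.cognitive.microsoft.com"
        f"/speechtotext/{API_VERSION}/transcriptions"
    )
    body = {
        "displayName": "ingestion-sketch",  # hypothetical name for illustration
        "locale": locale,
        "contentUrls": list(sas_urls),
        "properties": {"diarizationEnabled": True},
    }
    return urllib.request.Request(
        endpoint,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical notification and placeholder credentials.
    event = {
        "eventType": "Microsoft.Storage.BlobCreated",
        "data": {"url": "https://example.blob.core.windows.net/audio-input/call1.wav"},
    }
    blob_url = blob_url_from_event(event)
    if blob_url is not None:
        sas_url = f"{blob_url}?<sas-token>"  # the SAS token is produced elsewhere
        request = build_transcription_request("westus", "<speech-key>", [sas_url])
        print(request.get_method(), request.full_url)
```

Keeping request construction separate from sending makes the payload easy to inspect and test before any call is billed.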

Tool customization

The tool is built to show customers results quickly. You can customize the tool to your preferred SKUs and setup. The SKUs can be edited from the Azure portal and the code itself is available on GitHub.

Note

We suggest creating the resources in the same dedicated resource group to understand and track costs more easily.
