This article describes how to extract insights from customer conversations at a call center by using Azure AI services and Azure OpenAI Service. Use these services to improve your customer interactions and satisfaction by analyzing call intent and sentiment, extracting key entities, and summarizing call content.
Architecture
Download a PowerPoint file of this architecture.
Dataflow
A phone call between an agent and a customer is recorded and stored in Azure Blob Storage. Audio files are uploaded to an Azure Storage account via a supported method, such as the UI-based tool, Azure Storage Explorer, or a Storage SDK or API.
An Azure function is configured with one of the following triggers to start the intelligent transcription process:
Timer trigger: Configure a time-based trigger to process a batch of audio files accumulated over a specified time period.
Blob trigger: Configure a blob trigger to initiate intelligent transcription as soon as an audio file is uploaded to the blob container.
The Azure function will trigger an Azure App Service which will execute the following steps in sequence:
Call the Azure AI Speech to transcribe the files.
Optionally, save this raw file in Azure blob storage for future reference.
Pass the raw data to the Azure AI Language service to detect and redact personal data in the transcript.
Send the redacted data to the Azure OpenAI service to perform various post call analytics like understand the intent and sentiment of the call, extract entities, or summarize the conversation to evaluate the effectiveness of the call.
Store the processed output in Azure Storage for visualization or consumption by downstream applications for further processing.
Power BI can be used to visualize the post call analytics on different criteria as required by the business use case. You can also store this output in a customer relationship management (CRM), so agents have contextual information about why the customer called and can quickly solve potential problems. This process is fully automated, which saves the agents time and effort.
Components
Blob Storage is the object storage solution for raw files in this scenario. Blob Storage supports libraries for languages like .NET, Node.js, and Python. Applications can access files on Blob Storage via HTTP or HTTPS. Blob Storage has hot, cool, and archive access tiers for storing large amounts of data, which optimizes cost.
Azure OpenAI provides access to the Azure OpenAI language models, including GPT-3, Codex, and the embeddings model series, for content generation, summarization, semantic search, and natural language-to-code translation. You can access the service through REST APIs, Python SDK, or the web-based interface in the Azure OpenAI Studio.
Azure AI Speech is an AI-based API that provides speech capabilities like speech-to-text, text-to-speech, speech translation, and speaker recognition. This architecture uses the Azure AI Speech batch transcription functionality.
Azure AI Language consolidates the Azure natural-language processing services. For information about prebuilt and customizable options, see Azure AI Language available features.
Language Studio provides a UI for exploring and analyzing AI services for language features. Language Studio provides options for building, tagging, training, and deploying custom models.
Power BI is a software-as-a-service (SaaS) that provides visual and interactive insights for business analytics. It provides transformation capabilities and connects to other data sources.
Alternatives
Depending on your scenario, you can add the following workflows.
- Perform conversation summarization by using the prebuilt model in Azure AI Language.
- Azure also offers Speech Analytics which provides the entire orchestration for post call analytics in batch.
Scenario details
This solution uses Azure AI Speech to Text to convert call-center audio into written text. Azure AI Language redacts sensitive information in the conversation transcription. Azure OpenAI extracts insights from customer conversation to improve call center efficiency and customer satisfaction. Use this solution to process transcribed text, recognize and remove sensitive information, and perform analytics on the extractions like reason for the call, resolution provided or not, sentiment of the call, listing product /service offering based on the number of queries/customer complaints, and so on. Scale the services and the pipeline to accommodate any volume of recorded data.
Potential use cases
This solution provides value to organizations across multiple industries that have customer support agents. The post call analytics can help improve the company's products and services, and the effectiveness of the customer support systems. The solution applies to any organization that records conversations, including customer-facing agents, internal call centers, or support desks.
Considerations
These considerations implement the pillars of the Azure Well-Architected Framework, which is a set of guiding tenets that can be used to improve the quality of a workload. For more information, see Microsoft Azure Well-Architected Framework.
Reliability
Reliability ensures your application can meet the commitments you make to your customers. For more information, see Design review checklist for Reliability.
- Find the availability service-level agreement (SLA) for each component in SLAs for online services.
- To design high-availability applications with Storage accounts, see the configuration options.
- To ensure resiliency of the compute services and datastores in this scenario, use failure mode for services like Azure Functions and Storage. For more information, see the resiliency checklist for Azure services.
Security
Security provides assurances against deliberate attacks and the abuse of your valuable data and systems. For more information, see Design review checklist for Security.
- Implement data protection, identity and access management, and network security recommendations for Blob Storage, AI services, and Azure OpenAI.
- Configure AI services virtual networks.
Cost Optimization
Cost Optimization is about looking at ways to reduce unnecessary expenses and improve operational efficiencies. For more information, see Design review checklist for Cost Optimization.
The total cost of this solution depends on the pricing tier of your services. Factors that can affect the price of each component are:
- The number of documents that you process.
- The number of concurrent requests that your application receives.
- The size of the data that you store after processing.
- Your deployment region.
For more information, see the following resources:
Use the Azure pricing calculator to estimate your solution cost.
Performance Efficiency
Performance Efficiency is the ability of your workload to meet the demands placed on it by users in an efficient manner. For more information, see Design review checklist for Performance Efficiency.
When high volumes of data are processed, it can expose performance bottlenecks. To ensure proper performance efficiency, understand and plan for the scaling options to use with the AI services autoscale feature.
The batch speech API is designed for high volumes, but other AI services APIs might have request limits, depending on the subscription tier. Consider containerizing AI services APIs to avoid slowing down large-volume processing. Containers provide deployment flexibility in the cloud and on-premises. Mitigate side effects of new version rollouts by using containers. For more information, see Container support in AI services.
Contributors
This article is maintained by Microsoft. It was originally written by the following contributors.
Principal authors:
- Dixit Arora | Senior Customer Engineer, ISV DN CoE
- Jyotsna Ravi | Principal Customer Engineer, ISV DN CoE
To see non-public LinkedIn profiles, sign in to LinkedIn.
Next steps
- What is Azure AI Speech?
- What is Azure OpenAI?
- What is Azure Machine Learning?
- Introduction to Blob Storage
- What is Azure AI Language?
- Introduction to Azure Data Lake Storage Gen2
- What is Power BI?
- Ingestion Client with AI services
- Post-call transcription and analytics