Hello Domenico Zurlo,
Thank you for your question. I understand the confusion here: both Real-time Speech Translation and Live Interpreter are built on top of the same underlying Speech Translation engine (via TranslationRecognizer), but they are designed for different use cases and levels of abstraction.
Let me clarify the differences in a practical way:
1. Real-time Speech Translation (via TranslationRecognizer)
This is a developer-focused, low-level SDK capability that gives you full control.
You explicitly configure:
- Source (spoken) language
- Target translation language(s)
Output:
- Translated text (Speech → Text → Translation)
- Optionally, you can connect Text-to-Speech to generate audio
You handle:
- Start/stop recognition
- Event processing
- Application logic and UI
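
As a rough illustration of the configuration and event handling listed above, here is a minimal Python sketch using the Speech SDK (`azure-cognitiveservices-speech`). `TranslationSettings`, `build_recognizer`, `format_translations` and `run` are hypothetical helper names introduced only for this example, the key and region are values you supply yourself, and the 30-second listening window is an arbitrary choice. Treat it as a starting point rather than a reference implementation.

```python
from __future__ import annotations

import time
from dataclasses import dataclass


@dataclass
class TranslationSettings:
    """Hypothetical container for the values you configure explicitly."""
    source_language: str          # spoken language, e.g. "en-US"
    target_languages: list[str]   # translation targets, e.g. ["it", "de"]

    def validate(self) -> None:
        if not self.source_language:
            raise ValueError("source_language must be set explicitly")
        if not self.target_languages:
            raise ValueError("at least one target language is required")


def format_translations(translations: dict[str, str]) -> list[str]:
    """Render a {language: text} mapping as sorted 'lang: text' lines."""
    return [f"{lang}: {text}" for lang, text in sorted(translations.items())]


def build_recognizer(settings: TranslationSettings, key: str, region: str):
    """Create a TranslationRecognizer listening on the default microphone."""
    # Imported lazily so the helpers above work without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    settings.validate()
    config = speechsdk.translation.SpeechTranslationConfig(
        subscription=key, region=region
    )
    config.speech_recognition_language = settings.source_language
    for lang in settings.target_languages:
        config.add_target_language(lang)
    audio = speechsdk.audio.AudioConfig(use_default_microphone=True)
    return speechsdk.translation.TranslationRecognizer(
        translation_config=config, audio_config=audio
    )


def run(settings: TranslationSettings, key: str, region: str, seconds: float = 30.0):
    """Start recognition, print translated text as it arrives, then stop."""
    import azure.cognitiveservices.speech as speechsdk

    recognizer = build_recognizer(settings, key, region)

    def on_recognized(evt):
        # Event processing: only act on results that carry translations.
        if evt.result.reason == speechsdk.ResultReason.TranslatedSpeech:
            for line in format_translations(evt.result.translations):
                print(line)

    recognizer.recognized.connect(on_recognized)
    recognizer.start_continuous_recognition()   # you start recognition...
    try:
        time.sleep(seconds)
    finally:
        recognizer.stop_continuous_recognition()  # ...and you stop it
```

Calling `run(TranslationSettings("en-US", ["it", "de"]), key, region)` would listen on the microphone and print one line per target language for each recognized utterance. Everything around it (UI, storage, optional Text-to-Speech) remains your application's responsibility.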
Typical use cases:
- Custom-built applications (chat apps, transcription tools)
- Backend processing pipelines
- Scenarios where you need full control over flow and integration
Pricing: flat rate based on audio duration (e.g., per audio hour)
2. Live Interpreter
This is a higher-level, managed experience designed for real-time conversations.
Automatically detects the spoken language (no need to preconfigure source language)
Provides speech-to-speech translation with:
- Low latency
- Natural voice output
- Ability to preserve tone/style (including Personal Voice if enabled)
Handles:
- Conversation flow
- Continuous streaming
- Speaker interaction
Typical use cases:
- Live meetings (e.g., Teams scenarios)
- Customer support calls
- Classrooms, events, or multilingual conversations
Additional considerations:
- Requires special access/whitelisting (not always generally available)
- More “plug-and-play” compared to SDK-based implementation
Pricing (more granular):
- Input audio (per hour)
- Output text (per character)
- Output audio (standard/custom voice rates)
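
To see how the two pricing models differ in shape, the sketch below compares a single flat meter with the three Live Interpreter meters. All rates and quantities in it are made-up placeholders, not actual Azure prices, and the billing unit for output audio is an assumption, since it depends on the voice type. Check the pricing page for real figures. `Meter`, `real_time_translation_cost` and `live_interpreter_cost` are hypothetical names used only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Meter:
    """One billed dimension: a quantity multiplied by a rate per unit."""
    quantity: float
    unit_rate: float

    def cost(self) -> float:
        return self.quantity * self.unit_rate


def real_time_translation_cost(audio_hours: float, rate_per_hour: float) -> float:
    """Flat model: a single meter on audio duration."""
    return Meter(audio_hours, rate_per_hour).cost()


def live_interpreter_cost(input_audio: Meter, output_text: Meter, output_audio: Meter) -> float:
    """Granular model: input audio, output text and output audio are summed."""
    return sum(m.cost() for m in (input_audio, output_text, output_audio))


# Placeholder numbers only, chosen to keep the arithmetic easy to follow.
flat = real_time_translation_cost(audio_hours=10, rate_per_hour=2.0)
granular = live_interpreter_cost(
    input_audio=Meter(quantity=10, unit_rate=1.5),    # hours of input audio
    output_text=Meter(quantity=40, unit_rate=0.5),    # assumed unit: thousands of characters
    output_audio=Meter(quantity=40, unit_rate=0.25),  # assumed unit: thousands of characters
)
print(f"flat: {flat:.2f}, granular: {granular:.2f}")
```

The point is the structure rather than the totals: with Live Interpreter the bill depends on how much text and audio is produced, not only on how long the session runs.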
In summary:
Real-time Speech Translation (TranslationRecognizer) → Best when you want full control and are building a custom solution
Live Interpreter → Best when you need a ready-to-use, real-time conversational experience with minimal setup
Please refer to the following documentation:
- Speech Translation overview (incl. Real-time Translation & pricing): https://learn.microsoft.com/azure/ai-services/speech-service/speech-translation
- Live Interpreter details (auto-detect, personal voice): https://learn.microsoft.com/azure/ai-services/speech-service/speech-translation#live-interpreter
- How to use Live Interpreter (C# sample + Personal Voice steps): https://learn.microsoft.com/azure/ai-services/speech-service/how-to-translate-speech#using-live-interpreter-for-real-time-speech-to-speech-translation-with-personal-voice
I hope this helps. Do let me know if you have any further queries.
If this answers your query, please click "Accept Answer" and select "Yes" for "Was this answer helpful".
Thank you!