The Speech service provides speech to text and text to speech capabilities with a Speech resource. You can transcribe speech to text with high accuracy, produce natural-sounding text to speech voices, translate spoken audio, and use speaker recognition during conversations.
Create custom voices, add specific words to your base vocabulary, or build your own models. Run Speech anywhere, in the cloud or at the edge in containers. You can speech-enable your applications, tools, and devices with the Speech CLI, the Speech SDK, and the REST APIs.
Speech is available for many languages, regions, and price points.
Common scenarios for Speech include captioning, dictation, and reading text aloud. Microsoft itself uses Speech for captioning in Teams, dictation in Office 365, and Read Aloud in the Microsoft Edge browser.
The following sections summarize Speech features, with links for more information.
Use speech to text to transcribe audio into text, either in real-time or asynchronously with batch transcription.
Note
You can try real-time speech to text in Speech Studio without signing up or writing any code.
Convert audio to text from a range of sources, including microphones, audio files, and blob storage. Use speaker diarization to determine who said what and when. Get readable transcripts with automatic formatting and punctuation.
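As a rough illustration of what "who said what and when" can look like once diarized results are in hand, the sketch below merges consecutive phrases from the same speaker into timestamped turns. The `Phrase` record, its field names, and the `Guest-1` style labels are hypothetical stand-ins for illustration, not the service's actual response schema.

```python
from dataclasses import dataclass


@dataclass
class Phrase:
    """Hypothetical diarized phrase: who spoke, when, and what was said."""
    speaker: str
    offset_seconds: float
    text: str


def render_transcript(phrases):
    """Merge consecutive phrases from the same speaker into timestamped turns."""
    turns = []
    for phrase in sorted(phrases, key=lambda p: p.offset_seconds):
        if turns and turns[-1][0] == phrase.speaker:
            # Same speaker is still talking: extend the current turn.
            speaker, start, text = turns[-1]
            turns[-1] = (speaker, start, f"{text} {phrase.text}")
        else:
            turns.append((phrase.speaker, phrase.offset_seconds, phrase.text))
    return [
        f"[{int(start) // 60:02d}:{int(start) % 60:02d}] {speaker}: {text}"
        for speaker, start, text in turns
    ]


phrases = [
    Phrase("Guest-1", 0.5, "Good morning."),
    Phrase("Guest-1", 2.1, "Shall we start?"),
    Phrase("Guest-2", 4.2, "Yes, go ahead."),
]
lines = render_transcript(phrases)
for line in lines:
    print(line)
```

In a real application, the phrase records would be populated from the recognition results rather than hard-coded.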
The base model might not be sufficient if the audio contains ambient noise or a lot of industry-specific and domain-specific jargon. In these cases, you can create and train custom speech models with acoustic, language, and pronunciation data. Custom speech models are private and can offer a competitive advantage.
With real-time speech to text, the audio is transcribed as speech is recognized from a microphone or file. Use real-time speech to text for applications that need to transcribe audio as it is captured.
The fast transcription API transcribes audio files synchronously and returns results much faster than real time. Use fast transcription when you need the transcript of an audio recording as quickly as possible with predictable latency.
To get started with fast transcription, see the guide on using the fast transcription API.
Batch transcription is used to transcribe a large amount of audio in storage. You can point to audio files with a shared access signature (SAS) URI and asynchronously receive transcription results. Use batch transcription for applications that need to transcribe audio in bulk.
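A minimal sketch of how a batch job might be requested follows. It only builds the HTTP request and does not send it. The endpoint path, the `v3.1` version segment, the body field names, the `eastus` region, and the placeholder key and SAS URI are all assumptions for illustration, so check them against the current batch transcription REST reference before relying on them.

```python
import json
import urllib.request


def build_batch_request(region, key, content_urls, locale="en-US",
                        display_name="Bulk transcription"):
    """Build (but do not send) a request that creates a batch transcription job."""
    # Assumed endpoint shape for the speech to text REST API.
    url = f"https://{region}.api.cognitive.microsoft.com/speechtotext/v3.1/transcriptions"
    body = {
        "contentUrls": list(content_urls),  # SAS URIs pointing at audio in storage
        "locale": locale,
        "displayName": display_name,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_batch_request(
    region="eastus",                      # placeholder region
    key="YOUR_SPEECH_KEY",                # placeholder key
    content_urls=[
        "https://example.blob.core.windows.net/audio/call1.wav?sv=PLACEHOLDER_SAS",
    ],
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would submit the job. Because the work is asynchronous, the results would be retrieved later rather than from this first response.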
With text to speech, you can convert input text into humanlike synthesized speech. Use neural voices, which are humanlike voices powered by deep neural networks. Use the Speech Synthesis Markup Language (SSML) to fine-tune the pitch, pronunciation, speaking rate, volume, and more.
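To make the SSML controls concrete, here is a small sketch that assembles an SSML document with a `prosody` element setting rate, pitch, and volume. The helper name `build_ssml` is hypothetical, and the voice name `en-US-JennyNeural` is an assumed example, so substitute a voice available to your resource.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text, voice="en-US-JennyNeural", lang="en-US",
               rate="medium", pitch="medium", volume="medium"):
    """Return an SSML string that wraps the text in voice and prosody elements."""
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(lang)}>"
        f"<voice name={quoteattr(voice)}>"
        f"<prosody rate={quoteattr(rate)} pitch={quoteattr(pitch)} "
        f"volume={quoteattr(volume)}>"
        f"{escape(text)}"  # escape &, < and > so the markup stays well-formed
        "</prosody></voice></speak>"
    )


ssml = build_ssml("Terms & conditions apply.", rate="slow", pitch="low", volume="loud")
ET.fromstring(ssml)  # raises ParseError if the markup is not well-formed XML
print(ssml)
```

Building the markup through a helper keeps escaping in one place, which matters when the input text comes from users.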
Speech translation enables real-time, multilingual translation of speech to your applications, tools, and devices. Use this feature for speech to speech and speech to text translation.
Language identification is used to identify languages spoken in audio when compared against a list of supported languages. Use language identification by itself, with speech to text recognition, or with speech translation.
Speaker recognition provides algorithms that verify and identify speakers by their unique voice characteristics. Speaker recognition answers the question "Who is speaking?"
Pronunciation assessment evaluates speech pronunciation and gives speakers feedback on the accuracy and fluency of spoken audio. With pronunciation assessment, language learners can practice, get instant feedback, and improve their pronunciation so that they can speak and present with confidence.
Intent recognition: Use speech to text with conversational language understanding to derive user intents from transcribed speech and act on voice commands.
You can deploy Azure AI Speech features in the cloud or on-premises.
With containers, you can bring the service closer to your data for compliance, security, or other operational reasons.
Speech service deployment in sovereign clouds is available for some government entities and their partners. For example, the Azure Government cloud is available to US government entities and their partners. Microsoft Azure operated by 21Vianet cloud is available to organizations with a business presence in China. For more information, see sovereign clouds.
Speech Studio is a set of UI-based tools for building and integrating features from the Azure AI Speech service in your applications. You create projects in Speech Studio by using a no-code approach, and then reference those assets in your applications by using the Speech SDK, the Speech CLI, or the REST APIs.
The Speech CLI is a command-line tool for using Speech service without having to write any code. Most features in the Speech SDK are available in the Speech CLI, and some advanced features and customizations are simplified in the Speech CLI.
The Speech SDK exposes many of the Speech service capabilities you can use to develop speech-enabled applications. The Speech SDK is available in many programming languages and across all platforms.
In some cases, you can't or shouldn't use the Speech SDK. In those cases, you can use REST APIs to access the Speech service. For example, use the REST APIs for batch transcription and speaker recognition.
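As one illustration of reaching the service without the SDK, the sketch below prepares a plain HTTP request for text to speech using only the Python standard library. It builds the request without sending it. The endpoint host pattern, the header names, the output format string, the `eastus` region, and the voice name are assumptions to verify against the REST reference, and `build_tts_request` is a hypothetical helper name.

```python
import urllib.request


def build_tts_request(region, key, ssml,
                      output_format="riff-24khz-16bit-mono-pcm"):
    """Build (but do not send) a text to speech REST request carrying SSML."""
    # Assumed regional endpoint for the text to speech REST API.
    url = f"https://{region}.tts.speech.microsoft.com/cognitiveservices/v1"
    return urllib.request.Request(
        url,
        data=ssml.encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/ssml+xml",
            "X-Microsoft-OutputFormat": output_format,
            "User-Agent": "speech-overview-sketch",
        },
        method="POST",
    )


sample_ssml = (
    '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">'
    '<voice name="en-US-JennyNeural">Hello from the REST API.</voice>'
    "</speak>"
)
tts_request = build_tts_request("eastus", "YOUR_SPEECH_KEY", sample_ssml)
print(tts_request.get_method(), tts_request.full_url)
```

Sending the request with `urllib.request.urlopen` would be expected to return audio bytes in the requested format, which could then be written to a file.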
We offer quickstarts in many popular programming languages. Each quickstart is designed to teach you basic design patterns and have you running code in less than 10 minutes. A quickstart is available for each feature.
Sample code for the Speech service is available on GitHub. These samples cover common scenarios like reading audio from a file or stream, continuous and single-shot recognition, and working with custom models. Both SDK and REST samples are available.
An AI system includes not only the technology, but also the people who use it, the people who are affected by it, and the environment in which it's deployed. Read the transparency notes to learn about responsible AI use and deployment in your systems.