Use Speech service through a Virtual Network service endpoint

Azure Virtual Network service endpoints help to provide secure and direct connectivity to Azure services over an optimized route on the Azure backbone network. Endpoints help you secure your critical Azure service resources to only your virtual networks. Service endpoints enable private IP addresses in the virtual network to reach the endpoint of an Azure service without needing a public IP address on the virtual network.

This article explains how to set up and use Virtual Network service endpoints with Speech service in Azure AI services.

This article also describes how to remove Virtual Network service endpoints later but still use the Speech resource.

To set up a Speech resource for Virtual Network service endpoint scenarios, you need to:

  1. Create a custom domain name for the Speech resource.
  2. Configure virtual networks and networking settings for the Speech resource.
  3. Adjust existing applications and solutions.

Note

Setting up and using Virtual Network service endpoints for Speech service is similar to setting up and using private endpoints. In this article, we refer to the corresponding sections of the article on using private endpoints when the procedures are the same.

Private endpoints and Virtual Network service endpoints

Azure provides private endpoints and Virtual Network service endpoints for traffic that tunnels via the private Azure backbone network. The purpose and underlying technologies of these endpoint types are similar, but the two differ in ways that affect your network design. We recommend that you learn about the pros and cons of both before you design your network.

There are a few things to consider when you decide which technology to use:

  • Both technologies ensure that traffic between the virtual network and the Speech resource doesn't travel over the public internet.
  • A private endpoint provides a dedicated private IP address for your Speech resource. This IP address is accessible only within a specific virtual network and subnet. You have full control of the access to this IP address within your network infrastructure.
  • Virtual Network service endpoints don't provide a dedicated private IP address for the Speech resource. Instead, they encapsulate all packets sent to the Speech resource and deliver them directly over the Azure backbone network.
  • Both technologies support on-premises scenarios. By default, Azure service resources that are secured to virtual networks through Virtual Network service endpoints can't be reached from on-premises networks, but you can change that behavior.
  • Virtual Network service endpoints are often used to restrict the access for a Speech resource based on the virtual networks from which the traffic originates.
  • For Azure AI services, enabling the Virtual Network service endpoint forces the traffic for all Azure AI services resources to go through the private backbone network. That requires explicit network access configuration. (For more information, see Configure virtual networks and the Speech resource networking settings.) Private endpoints don't have this limitation and provide more flexibility for your network configuration. You can access one resource through the private backbone and another through the public internet by using the same subnet of the same virtual network.
  • Private endpoints incur extra costs. Virtual Network service endpoints are free.
  • Private endpoints require extra DNS configuration.
  • One Speech resource can work simultaneously with both private endpoints and Virtual Network service endpoints.

We recommend that you try both endpoint types before you make a decision about your production design.

This article describes how to use Virtual Network service endpoints with Speech service. For information about private endpoints, see Use Speech service through a private endpoint.

Create a custom domain name

Virtual Network service endpoints require a custom subdomain name for Azure AI services. Create a custom domain by following the guidance in the private endpoint article. All warnings in that section also apply to Virtual Network service endpoints.
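If you prefer to script this step instead of using the portal, the following minimal sketch builds the Azure CLI arguments that assign a custom subdomain. It only constructs the command and doesn't run it. The resource group, resource name, and domain values are hypothetical placeholders, and the endpoint format returned by `custom_domain_endpoint` is an assumption based on the usual Azure AI services custom domain pattern. Verify both against the private endpoint article before you use them.

```python
from typing import List


def custom_domain_command(resource_group: str, resource_name: str,
                          custom_domain: str) -> List[str]:
    """Build (but don't run) the Azure CLI call that sets a custom subdomain."""
    return [
        "az", "cognitiveservices", "account", "update",
        "--resource-group", resource_group,
        "--name", resource_name,
        "--custom-domain", custom_domain,
    ]


def custom_domain_endpoint(custom_domain: str) -> str:
    """Assumed endpoint format for a resource that has a custom subdomain."""
    return f"https://{custom_domain}.cognitiveservices.azure.com/"


# Hypothetical names, for illustration only.
command = custom_domain_command("my-rg", "my-speech", "my-speech-domain")
print(" ".join(command))
print(custom_domain_endpoint("my-speech-domain"))
```

Review the warnings in the private endpoint article before you run the generated command against a production resource.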

Configure virtual networks and the Speech resource networking settings

You need to add all virtual networks that are allowed access via the service endpoint to the Speech resource networking properties.

Note

To access a Speech resource via the Virtual Network service endpoint, you need to enable the Microsoft.CognitiveServices service endpoint type for the required subnets of your virtual network. Doing so will route all subnet traffic related to Azure AI services through the private backbone network. If you intend to access any other Azure AI services resources from the same subnet, make sure these resources are configured to allow your virtual network.

If a virtual network isn't added as allowed in the Speech resource networking properties, it can't access the Speech resource via the service endpoint, even if the Microsoft.CognitiveServices service endpoint is enabled for the virtual network. And if the service endpoint is enabled but the virtual network isn't allowed, the virtual network can't reach the Speech resource through a public IP address either, no matter what the Speech resource's other network security settings are. That's because enabling the Microsoft.CognitiveServices endpoint routes all traffic related to Azure AI services through the private backbone network, so the virtual network must be explicitly allowed to access the resource. This guidance applies to all Azure AI services resources, not just to Speech resources.
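The access rules in the preceding paragraph can be summarized as a small decision function. This is an illustrative model of the described behavior, not an Azure API. The names `Access` and `speech_access` are hypothetical, and the case where the service endpoint isn't enabled is modeled simply as "other network settings decide."

```python
from enum import Enum


class Access(Enum):
    VIA_SERVICE_ENDPOINT = "reachable over the service endpoint"
    BLOCKED = "not reachable from this virtual network"
    PUBLIC_RULES_APPLY = "service endpoint not in use; other network settings decide"


def speech_access(service_endpoint_enabled: bool, vnet_allowed: bool) -> Access:
    """Model how a subnet reaches a Speech resource.

    service_endpoint_enabled: Microsoft.CognitiveServices is enabled on the subnet.
    vnet_allowed: the virtual network is listed as allowed on the Speech resource.
    """
    if not service_endpoint_enabled:
        # Traffic isn't forced onto the backbone route (modeling assumption).
        return Access.PUBLIC_RULES_APPLY
    if vnet_allowed:
        return Access.VIA_SERVICE_ENDPOINT
    # Endpoint enabled but network not allowed: blocked, including via public IP.
    return Access.BLOCKED


for enabled in (True, False):
    for allowed in (True, False):
        print(enabled, allowed, speech_access(enabled, allowed).value)
```

The case to watch for is `speech_access(True, False)`: enabling the service endpoint without allowing the virtual network blocks access from that network entirely.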

  1. Go to the Azure portal and sign in to your Azure account.

  2. Select the Speech resource.

  3. In the Resource Management group in the left pane, select Networking.

  4. On the Firewalls and virtual networks tab, select Selected Networks and Private Endpoints.

    Note

    To use Virtual Network service endpoints, you need to select the Selected Networks and Private Endpoints network security option. No other options are supported. If your scenario requires the All networks option, consider using private endpoints, which support all three network security options.

  5. Select Add existing virtual network or Add new virtual network and provide the required parameters. Select Add for an existing virtual network or Create for a new one. If you add an existing virtual network, the Microsoft.CognitiveServices service endpoint is automatically enabled for the selected subnets. This operation can take up to 15 minutes. Also, see the note at the beginning of this section.
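As an alternative to the portal steps above, you can add an allowed subnet with the Azure CLI. The sketch below only builds the command line for `az cognitiveservices account network-rule add`. All resource names are hypothetical placeholders. It also assumes that the Selected Networks and Private Endpoints option is already set on the resource, and it doesn't enable the service endpoint on the subnet.

```python
import shlex
from typing import List


def allow_subnet_command(resource_group: str, resource_name: str,
                         vnet_name: str, subnet_name: str) -> List[str]:
    """Build (but don't run) the CLI call that allows a subnet on the Speech resource."""
    return [
        "az", "cognitiveservices", "account", "network-rule", "add",
        "--resource-group", resource_group,
        "--name", resource_name,
        "--vnet-name", vnet_name,
        "--subnet", subnet_name,
    ]


# Hypothetical names, for illustration only.
command = allow_subnet_command("my-rg", "my-speech", "my-vnet", "default")
print(shlex.join(command))
```

Building the command as a list of arguments keeps each value separate, so you can pass it to `subprocess.run` without shell quoting problems.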

Enabling the service endpoint for an existing virtual network

As described in the previous section, when you configure a virtual network as allowed for the Speech resource, the Microsoft.CognitiveServices service endpoint is automatically enabled. If you later disable it, you need to re-enable it manually to restore the service endpoint access to the Speech resource (and to other Azure AI services resources):

  1. Go to the Azure portal and sign in to your Azure account.
  2. Select the virtual network.
  3. In the Settings group in the left pane, select Subnets.
  4. Select the required subnet.
  5. A new panel appears on the right side of the window. In this panel, in the Service Endpoints section, select Microsoft.CognitiveServices in the Services list.
  6. Select Save.
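The same re-enabling step can be sketched for the Azure CLI. The function below only builds the arguments for `az network vnet subnet update`, and the resource names are hypothetical placeholders. The sketch passes a single service name. A subnet that already has other service endpoints enabled would presumably need them listed as well, so check the current subnet configuration first.

```python
from typing import List

SERVICE_ENDPOINT = "Microsoft.CognitiveServices"


def enable_service_endpoint_command(resource_group: str, vnet_name: str,
                                    subnet_name: str) -> List[str]:
    """Build (but don't run) the CLI call that enables the service endpoint on a subnet."""
    return [
        "az", "network", "vnet", "subnet", "update",
        "--resource-group", resource_group,
        "--vnet-name", vnet_name,
        "--name", subnet_name,
        # Assumption: only this one service endpoint is wanted on the subnet.
        "--service-endpoints", SERVICE_ENDPOINT,
    ]


# Hypothetical names, for illustration only.
print(" ".join(enable_service_endpoint_command("my-rg", "my-vnet", "default")))
```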

Adjust existing applications and solutions

A Speech resource that has a custom domain enabled interacts with the Speech service in a different way. This is true for a custom-domain-enabled Speech resource regardless of whether service endpoints are configured. Information in this section applies to both scenarios.

Use a Speech resource that has a custom domain name and allowed virtual networks

In this scenario, the Selected Networks and Private Endpoints option is selected in the networking settings of the Speech resource and at least one virtual network is allowed. This scenario is equivalent to using a Speech resource that has a custom domain name and a private endpoint enabled.

Use a Speech resource that has a custom domain name but that doesn't have allowed virtual networks

In this scenario, private endpoints aren't enabled and one of these statements is true:

  • The Selected Networks and Private Endpoints option is selected in the networking settings of the Speech resource, but no allowed virtual networks are configured.
  • The All networks option is selected in the networking settings of the Speech resource.

This scenario is equivalent to using a Speech resource that has a custom domain name and that doesn't have private endpoints.

Use of Speech Studio

Speech Studio is a web portal with tools for building and integrating Azure AI Speech service in your application. When you work in Speech Studio projects, network connections and API calls to the corresponding Speech resource are made on your behalf. Working with private endpoints, Virtual Network service endpoints, and other network security options can limit the availability of Speech Studio features. You normally use Speech Studio when you work with features like custom speech, custom neural voice, and Audio Content Creation.

Reaching the Speech Studio web portal from a virtual network

To use Speech Studio from a virtual machine within an Azure virtual network, you must allow outgoing connections to the required set of service tags for this virtual network. See details here.

Access to the Speech resource endpoint isn't the same as access to the Speech Studio web portal. Access to the Speech Studio web portal via private endpoints or Virtual Network service endpoints isn't supported.

Working with Speech Studio projects

This section describes working with the different kinds of Speech Studio projects under the different network security options of the Speech resource. It assumes that the web browser connection to Speech Studio is already established. You set the Speech resource network security settings in the Azure portal:

  1. Go to the Azure portal and sign in to your Azure account.
  2. Select the Speech resource.
  3. In the Resource Management group in the left pane, select Networking > Firewalls and virtual networks.
  4. Select one option from All networks, Selected Networks and Private Endpoints, or Disabled.

Custom speech

The following table describes custom speech project accessibility per Speech resource Networking > Firewalls and virtual networks security setting.

Note

If you allow only private endpoints via the Networking > Private endpoint connections tab, then you can't use Speech Studio with the Speech resource. You can still use the Speech resource outside of Speech Studio.

| Speech resource network security setting | Speech Studio project accessibility |
| --- | --- |
| All networks | No restrictions |
| Selected Networks and Private Endpoints | Accessible from allowed public IP addresses |
| Disabled | Not accessible |

If you select Selected Networks and Private Endpoints, you see a tab with Virtual networks and Firewall access configuration options. In the Firewall section, you must allow at least one public IP address and use this address for the browser connection with Speech Studio.

If you allow access only via a virtual network, then in effect you don't allow access to the Speech resource through Speech Studio. You can still use the Speech resource outside of Speech Studio.
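The accessibility rules for custom speech projects can be expressed as a short lookup. This is an illustrative model of the table and the two paragraphs above, not a Speech Studio API. The function name and the `browser_ip_allowed` parameter are hypothetical.

```python
def custom_speech_studio_accessible(setting: str,
                                    browser_ip_allowed: bool = False) -> bool:
    """Return True if a custom speech project is reachable from Speech Studio.

    setting: the Speech resource option on Networking > Firewalls and virtual networks.
    browser_ip_allowed: the browser's public IP address is allowed in the Firewall section.
    """
    if setting == "All networks":
        return True
    if setting == "Selected Networks and Private Endpoints":
        # Virtual network rules alone aren't enough; an allowed public IP is required.
        return browser_ip_allowed
    if setting == "Disabled":
        return False
    raise ValueError(f"Unknown network security setting: {setting!r}")


print(custom_speech_studio_accessible("All networks"))
print(custom_speech_studio_accessible("Selected Networks and Private Endpoints"))
print(custom_speech_studio_accessible("Selected Networks and Private Endpoints",
                                      browser_ip_allowed=True))
```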

To use custom speech without relaxing network access restrictions on your production Speech resource, consider one of these workarounds.

  • Create another Speech resource for development that can be used on a public network. Prepare your custom model in Speech Studio on the development resource, and then copy the model to your production resource. See the Models_CopyTo REST request with Speech to text REST API.
  • You have the option to not use Speech Studio for custom speech. Use the Speech to text REST API for all custom speech operations.

Custom voice and Audio Content Creation

You can use custom voice and Audio Content Creation Speech Studio projects only when the Speech resource network security setting is All networks.

Simultaneous use of private endpoints and Virtual Network service endpoints

You can use private endpoints and Virtual Network service endpoints to access the same Speech resource simultaneously. To enable this simultaneous use, you need to select the Selected Networks and Private Endpoints option in the networking settings of the Speech resource in the Azure portal. Other options aren't supported for this scenario.
