Quickstart: Subscribing to audio streams from an ongoing call

Important

The functionality described in this document is currently in private preview. Private preview includes access to SDKs and documentation, for testing purposes, that are not yet publicly available. Apply to become an early adopter by filling out the form for preview access to Azure Communication Services.

Get started using audio streams through the Azure Communication Services Media Streaming API. This quickstart assumes you're already familiar with using the Call Automation APIs to build an automated call routing solution.

Prerequisites

Set up a websocket server

Azure Communication Services requires your server application to set up a WebSocket server to stream audio in real time. WebSocket is a standardized protocol that provides a full-duplex communication channel over a single TCP connection. You can optionally use the Azure service Azure WebApps, which lets you create an application that receives audio streams over a WebSocket connection. Follow this quickstart.

Establish a call

In this quickstart we assume that you're already familiar with starting calls. If you need to learn more about starting and establishing calls, you can follow our quickstart. For the purposes of this quickstart, we'll be going through the process of starting media streaming for both incoming calls and outbound calls.

Start media streaming - incoming call

Your application starts receiving media streams once you answer the call and provide Azure Communication Services with the WebSocket information. The following example is in C#.

// Describe where and how Azure Communication Services should stream the audio.
var mediaStreamingOptions = new MediaStreamingOptions(
    new Uri("wss://testwebsocket.webpubsub.azure.com/client/hubs/media?accesstoken={access_token}"),
    MediaStreamingTransport.WebSocket,
    MediaStreamingContent.Audio,
    MediaStreamingAudioChannel.Mixed);
// Attach the streaming options when answering the incoming call.
var answerCallOptions = new AnswerCallOptions(incomingCallContext, callbackUri: new Uri(callConfiguration.AppCallbackUrl))
{
    MediaStreamingOptions = mediaStreamingOptions
};
var response = await callingServerClient.AnswerCallAsync(answerCallOptions);

Start media streaming - outbound call

Your application will start receiving media streams once you create the call and provide Azure Communication Services with the WebSocket information.

// Describe where and how Azure Communication Services should stream the audio.
var mediaStreamingOptions = new MediaStreamingOptions(
    new Uri("wss://{yourwebsocketurl}"),
    MediaStreamingTransport.WebSocket,
    MediaStreamingContent.Audio,
    MediaStreamingAudioChannel.Mixed);
// Attach the streaming options when creating the outbound call.
var createCallOptions = new CreateCallOptions(
    callSource,
    new List<PhoneNumberIdentifier> { target },
    new Uri(callConfiguration.AppCallbackUrl))
{
    MediaStreamingOptions = mediaStreamingOptions
};
var createCallResult = await client.CreateCallAsync(createCallOptions);

Handling media streams in your websocket server

The C# sample below demonstrates how to listen to the media stream using your WebSocket server.

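using System;
using System.Net;
using System.Net.WebSockets;
using System.Text;
using System.Threading;
using Newtonsoft.Json;

HttpListener httpListener = new HttpListener();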
httpListener.Prefixes.Add("http://localhost:80/");
httpListener.Start();
while (true)
{
    HttpListenerContext httpListenerContext = await httpListener.GetContextAsync();
    if (httpListenerContext.Request.IsWebSocketRequest)
    {
        WebSocketContext websocketContext;
        try
        {
            websocketContext = await httpListenerContext.AcceptWebSocketAsync(subProtocol: null);
        }
        catch (Exception)
        {
            // The WebSocket upgrade failed; skip this request and keep listening.
            continue;
        }
        WebSocket webSocket = websocketContext.WebSocket;
        try
        {
            while (webSocket.State == WebSocketState.Open || webSocket.State == WebSocketState.CloseSent)
            {
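                byte[] receiveBuffer = new byte[2048];
                // Give up on the receive if nothing arrives within 60 seconds.
                using var timeout = new CancellationTokenSource(TimeSpan.FromSeconds(60));
                WebSocketReceiveResult receiveResult = await webSocket.ReceiveAsync(new ArraySegment<byte>(receiveBuffer), timeout.Token);
                if (receiveResult.MessageType != WebSocketMessageType.Close)
                {
                    // Decode only the bytes actually received. This assumes each message fits in the buffer;
                    // check receiveResult.EndOfMessage if messages can be larger.
                    var data = Encoding.UTF8.GetString(receiveBuffer, 0, receiveResult.Count);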
                    try
                    {
                        var eventData = JsonConvert.DeserializeObject<AudioBaseClass>(data);
                        if (eventData != null)
                        {
                            if(eventData.kind == "AudioMetadata")
                            {
                                //Process audio metadata
                            }
                            else if(eventData.kind == "AudioData") 
                            {
                                //Process audio data
                                var byteArray = eventData.audioData.data;
                                // Use the audio byteArray as needed
                            }
                        }
                    }
                    catch { }
                }
            }
        }
        catch (Exception ex) { }
    }
}

Start media streaming - incoming call

Your application starts receiving media streams once you answer the call and provide Azure Communication Services with the WebSocket information. The following example is in Java.

var mediaStreamingOptions = new MediaStreamingOptions(
    "wss://{yourwebsocketurl}",
    MediaStreamingTransport.WebSocket,
    MediaStreamingContent.Audio,
    MediaStreamingAudioChannel.Mixed
);
var answerCallOptions = new AnswerCallOptions("<incomingCallContext>", callConfiguration.AppCallbackUrl).setMediaStreamingConfiguration(mediaStreamingOptions);

var answerCallResponse = callAutomationAsyncClient.answerCallWithResponse(answerCallOptions).block();

Start media streaming - outbound call

Your application will start receiving media streams once you create the call and provide Azure Communication Services with the WebSocket information.

var mediaStreamingOptions = new MediaStreamingOptions(
    "wss://{yourwebsocketurl}",
    MediaStreamingTransportType.WebSocket,
    MediaStreamingContentType.Audio,
    MediaStreamingAudioChannelType.Mixed
);
var createCallOptions = new CreateCallOptions(
    callSource,
    Collections.singletonList(target),
    callConfiguration.AppCallbackUrl 
);
createCallOptions.setMediaStreamingConfiguration(mediaStreamingOptions);
var createCallResponse = callAutomationAsyncClient.createCallWithResponse(
    createCallOptions
).block();

Handling media streams in your websocket server

The Java sample below demonstrates how to listen to the media stream using your WebSocket server.

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class WebsocketServer {
    public static void main(String[] args) throws IOException {
        // Note: ServerSocket accepts plain TCP connections. A WebSocket endpoint must also
        // complete the HTTP upgrade handshake and decode WebSocket frames, which this sample omits.
        try (ServerSocket serverSocket = new ServerSocket(1234)) {
            while (true) {
                // try-with-resources closes the socket and its streams when the client disconnects
                try (Socket socket = serverSocket.accept();
                     BufferedReader bufferedReader = new BufferedReader(
                             new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
                     BufferedWriter bufferedWriter = new BufferedWriter(
                             new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8))) {
                    String msgFromClient;
                    // readLine() returns null once the client closes the connection
                    while ((msgFromClient = bufferedReader.readLine()) != null) {
                        // You can process the message however you want
                        System.out.println("Client:" + msgFromClient);
                        bufferedWriter.write("MSG Received");
                        bufferedWriter.newLine();
                        bufferedWriter.flush();
                    }
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            }
        }
    }
}
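
The Java sample above reads from a plain ServerSocket, so it does not perform the WebSocket opening handshake itself. As a minimal sketch of one part of that handshake, the snippet below computes the Sec-WebSocket-Accept response header value defined in RFC 6455 from the client's Sec-WebSocket-Key. The class and method names (WebSocketHandshake, acceptKey) are hypothetical and not part of any Azure SDK; in practice you would more likely rely on a WebSocket server library that handles the handshake and framing for you.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper, for illustration only.
final class WebSocketHandshake {
    // Fixed GUID that RFC 6455 requires servers to append to the client key.
    private static final String WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

    // Returns Base64(SHA-1(clientKey + GUID)), the value of the Sec-WebSocket-Accept header.
    static String acceptKey(String secWebSocketKey) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            byte[] digest = sha1.digest(
                    (secWebSocketKey + WEBSOCKET_GUID).getBytes(StandardCharsets.US_ASCII));
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is a required algorithm on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Sample key taken from RFC 6455, section 1.3.
        System.out.println(acceptKey("dGhlIHNhbXBsZSBub25jZQ=="));
        // prints "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
    }
}
```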

Message schema

When Azure Communication Services receives the URL for your WebSocket server, it creates a connection to it. Once connected, it sends the first data packet, which contains metadata about the incoming media packets.

{
    "kind": <string>, // What kind of data this is, e.g. AudioMetadata, AudioData.
    "audioMetadata": {
        "subscriptionId": <string>, // Unique identifier for a subscription request
        "encoding": <string>, // Only PCM is supported
        "sampleRate": <int>, // 16000 by default
        "channels": <int>, // 1 by default
        "length": <int> // 640 by default
    }
}
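
To see how the metadata fields relate to each other, here is a rough sketch that works out how much audio one packet carries. It assumes that "length" is the packet size in bytes and that PCM samples are 16 bits (2 bytes) wide; the schema above states neither, so treat both as assumptions. The class and method names are hypothetical.

```java
// Hypothetical helper, for illustration only.
final class AudioChunkMath {
    // Duration of one audio packet in milliseconds.
    // bytesPerSample is an assumption (2 for 16-bit PCM); it is not part of the metadata.
    static double chunkDurationMillis(int lengthBytes, int sampleRate, int channels, int bytesPerSample) {
        int bytesPerFrame = channels * bytesPerSample;       // one sample for every channel
        double frames = (double) lengthBytes / bytesPerFrame; // frames carried by the packet
        return frames * 1000.0 / sampleRate;
    }

    public static void main(String[] args) {
        // Default metadata values: length 640, sampleRate 16000, channels 1.
        System.out.println(chunkDurationMillis(640, 16000, 1, 2)); // prints 20.0
    }
}
```

Under these assumptions, the default values correspond to 320 samples, or 20 ms of audio per packet.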

Audio streaming schema

After sending the metadata packet, Azure Communication Services starts streaming audio media to your WebSocket server. Below is an example of the media object your server receives.

{
    "kind": <string>, // What kind of data this is, e.g. AudioMetadata, AudioData.
    "audioData":{
        "data": <string>, // Base64 Encoded audio buffer data
        "timestamp": <string>, // In ISO 8601 format (yyyy-mm-ddThh:mm:ssZ) 
        "participantRawID": <string>, 
        "silent": <boolean> // Indicates if the received audio buffer contains only silence.
    }
}
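
Because the timestamp field is an ISO 8601 string, it can be parsed with the standard java.time API. The sketch below measures the gap between two packets, which may help you spot missing or delayed audio. The class and method names are hypothetical, and the second timestamp in the usage example is made up for illustration.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper, for illustration only.
final class PacketTiming {
    // Milliseconds elapsed between two ISO 8601 UTC timestamps.
    static long gapMillis(String earlierTimestamp, String laterTimestamp) {
        Instant earlier = Instant.parse(earlierTimestamp);
        Instant later = Instant.parse(laterTimestamp);
        return Duration.between(earlier, later).toMillis();
    }

    public static void main(String[] args) {
        // The second timestamp is an invented value, 20 ms after the first.
        System.out.println(gapMillis("2022-10-03T19:16:12.925Z", "2022-10-03T19:16:12.945Z")); // prints 20
    }
}
```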

Example of audio data being streamed

{
  "kind": "AudioData",
  "audioData": {
    "timestamp": "2022-10-03T19:16:12.925Z",
    "participantRawID": "8:acs:3d20e1de-0f28-41c5-84a0-4960fde5f411_0000000b-faeb-c708-99bf-a43a0d0036b0",
    "data": "5ADwAOMA6AD0AOIA4ADkAN8AzwDUANEAywC+ALQArgC0AKYAnACJAIoAlACWAJ8ApwCiAKkAqgCqALUA0wDWANAA3QDVAN0A8wDzAPAA7wDkANkA1QDPAPIA6QDmAOcA0wDYAPMA8QD8AP0AAwH+AAAB/QAAAREBEQEDAQoB9wD3APsA7gDxAPMA7wDpAN0A6gD5APsAAgEHAQ4BEAETARsBMAFHAUABPgE2AS8BKAErATEBLwE7ASYBGQEAAQcBBQH5AAIBBwEMAQ4BAAH+APYA6gDzAPgA7gDkAOUA3wDcANQA2gDWAN8A3wDcAMcAxwDIAMsA1wDfAO4A3wDUANQA3wDvAOUA4QDpAOAA4ADhAOYA5wDkAOUA1gDxAOcA4wDpAOEA4gD0APoA7wD9APkA6ADwAPIA7ADrAPEA6ADfANQAzQDLANIAzwDaANcA3QDZAOQA4wDXANwA1ADbAOsA7ADyAPkA7wDiAOIA6gDtAOsA7gDeAOIA4ADeANUA6gD1APAA8ADgAOQA5wDgAPgA8ADnAN8A5gDgAOoA6wDcAOgA2gDZANUAyQDPANwA3gDgAO4A8QDyAAQBEwEDAewA+gDpAN4A6wDeAO8A8QDwAO8ABAEKAQUB/gD5AAMBAwEIARoBFAEeARkBDgH8AP0A+gD8APcA+gDrAO0A5wDcANEA0QDHAM4A0wDUAM4A0wDZANQAxgDSAM4A1ADVAOMA4QDhANUA2gDjAOYA5wDrANQA5wDrAMsAxQDWANsA5wDpAOEA4QDFAMoA0QDKAMgAwgDNAMsAwgCwAKkAtwCrAKoAsACgAJ4AlQCeAKAAoQCmAKwApwCsAK0AnQCVAA==",
    "silent": false
  }
}
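
As a minimal sketch of what you might do with the "data" field, the snippet below decodes the Base64 string into PCM samples. It assumes 16-bit little-endian samples, which is consistent with the example payload above but is not stated by the schema. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper, for illustration only.
final class PcmDecoder {
    // Decodes a Base64 audio buffer into 16-bit samples (assumed little-endian).
    static short[] decodeSamples(String base64Data) {
        byte[] raw = Base64.getDecoder().decode(base64Data);
        short[] samples = new short[raw.length / 2];
        ByteBuffer.wrap(raw)
                .order(ByteOrder.LITTLE_ENDIAN)
                .asShortBuffer()
                .get(samples);
        return samples;
    }

    public static void main(String[] args) {
        // First eight Base64 characters of the example payload (six bytes, three samples).
        System.out.println(Arrays.toString(decodeSamples("5ADwAOMA"))); // prints [228, 240, 227]
    }
}
```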

Stop audio streaming

Audio streaming will automatically stop when the call ends or is canceled.

Clean up resources

If you want to clean up and remove a Communication Services subscription, you can delete the resource or resource group. Deleting the resource group also deletes any other resources associated with it. Learn more about cleaning up resources.

Next steps