Quickstart: Subscribing to audio streams from an ongoing call
Important
Functionality described in this document is currently in private preview. Private preview includes access to SDKs and documentation for testing purposes that are not yet available publicly. Apply to become an early adopter by filling out the form for preview access to Azure Communication Services.
Get started with audio streams through the Azure Communication Services Media Streaming API. This quickstart assumes you're already familiar with the Call Automation APIs used to build an automated call routing solution.
Prerequisites
- An Azure account with an active subscription. For details, see Create an account for free.
- An Azure Communication Services resource. See Create an Azure Communication Services resource.
- A new web service application created using the Call Automation SDK.
- The latest .NET library for your operating system.
- Apache Maven.
- A websocket server that can receive media streams.
Set up a websocket server
Azure Communication Services requires your server application to set up a WebSocket server to stream audio in real time. WebSocket is a standardized protocol that provides a full-duplex communication channel over a single TCP connection. You can optionally use Azure Web Apps to create an application that receives audio streams over a WebSocket connection. Follow this quickstart.
Establish a call
This quickstart assumes that you're already familiar with starting calls. If you need to learn more about starting and establishing calls, you can follow our quickstart. The sections that follow walk through starting media streaming for both incoming and outbound calls.
Start media streaming - incoming call
Your application will start receiving media streams once you answer the call and provide ACS with the WebSocket information.
var mediaStreamingOptions = new MediaStreamingOptions(
    new Uri("wss://testwebsocket.webpubsub.azure.com/client/hubs/media?accesstoken={access_token}"),
    MediaStreamingTransport.WebSocket,
    MediaStreamingContent.Audio,
    MediaStreamingAudioChannel.Mixed);

// Attach the streaming options when answering the incoming call.
var answerCallOptions = new AnswerCallOptions(incomingCallContext, callbackUri: new Uri(callConfiguration.AppCallbackUrl))
{
    MediaStreamingOptions = mediaStreamingOptions
};
var response = await callingServerClient.AnswerCallAsync(answerCallOptions);
Start media streaming - outbound call
Your application will start receiving media streams once you create the call and provide ACS with the WebSocket information.
var mediaStreamingOptions = new MediaStreamingOptions(
    new Uri("wss://{yourwebsocketurl}"),
    MediaStreamingTransport.WebSocket,
    MediaStreamingContent.Audio,
    MediaStreamingAudioChannel.Mixed);

// Attach the streaming options when creating the outbound call.
var createCallOptions = new CreateCallOptions(
    callSource,
    new List<PhoneNumberIdentifier> { target },
    new Uri(callConfiguration.AppCallbackUrl))
{
    MediaStreamingOptions = mediaStreamingOptions
};
var createCallResult = await client.CreateCallAsync(createCallOptions);
Handling media streams in your websocket server
The sample below demonstrates how to listen to media streams using your WebSocket server.
using System;
using System.Net;
using System.Net.WebSockets;
using System.Text;
using System.Threading;
using Newtonsoft.Json;

// AudioBaseClass is an application-defined type that mirrors the message schema in this article.
HttpListener httpListener = new HttpListener();
httpListener.Prefixes.Add("http://localhost:80/");
httpListener.Start();

while (true)
{
    HttpListenerContext httpListenerContext = await httpListener.GetContextAsync();
    if (httpListenerContext.Request.IsWebSocketRequest)
    {
        WebSocketContext websocketContext;
        try
        {
            websocketContext = await httpListenerContext.AcceptWebSocketAsync(subProtocol: null);
        }
        catch (Exception)
        {
            // The WebSocket handshake failed; stop listening.
            return;
        }

        WebSocket webSocket = websocketContext.WebSocket;
        try
        {
            while (webSocket.State == WebSocketState.Open || webSocket.State == WebSocketState.CloseSent)
            {
                byte[] receiveBuffer = new byte[2048];
                var cancellationToken = new CancellationTokenSource(TimeSpan.FromSeconds(60)).Token;
                WebSocketReceiveResult receiveResult = await webSocket.ReceiveAsync(new ArraySegment<byte>(receiveBuffer), cancellationToken);

                if (receiveResult.MessageType != WebSocketMessageType.Close)
                {
                    // Decode only the bytes that were actually received.
                    var data = Encoding.UTF8.GetString(receiveBuffer, 0, receiveResult.Count);
                    try
                    {
                        var eventData = JsonConvert.DeserializeObject<AudioBaseClass>(data);
                        if (eventData != null)
                        {
                            if (eventData.kind == "AudioMetadata")
                            {
                                // Process audio metadata
                            }
                            else if (eventData.kind == "AudioData")
                            {
                                // Process audio data
                                var byteArray = eventData.audioData.data;
                                // Use the audio byteArray as needed
                            }
                        }
                    }
                    catch
                    {
                        // Ignore messages that can't be deserialized.
                    }
                }
            }
        }
        catch (Exception)
        {
            // The connection failed or timed out; wait for the next request.
        }
    }
}
Prerequisites
- An Azure account with an active subscription. For details, see Create an account for free.
- An Azure Communication Services resource. See Create an Azure Communication Services resource.
- A new web service application created using the Call Automation SDK.
- Java Development Kit version 8 or above.
- Apache Maven.
Set up a websocket server
Azure Communication Services requires your server application to set up a WebSocket server to stream audio in real time. WebSocket is a standardized protocol that provides a full-duplex communication channel over a single TCP connection. You can optionally use Azure Web Apps to create an application that receives audio streams over a WebSocket connection. Follow this quickstart.
Establish a call
This quickstart assumes that you're already familiar with starting calls. If you need to learn more about starting and establishing calls, you can follow our quickstart. The sections that follow walk through starting media streaming for both incoming and outbound calls.
Start media streaming - incoming call
Your application will start receiving media streams once you answer the call and provide ACS with the WebSocket information.
var mediaStreamingOptions = new MediaStreamingOptions(
    "wss://{yourwebsocketurl}",
    MediaStreamingTransport.WebSocket,
    MediaStreamingContent.Audio,
    MediaStreamingAudioChannel.Mixed
);
// Attach the streaming options when answering the incoming call.
var answerCallOptions = new AnswerCallOptions("<incomingCallContext>", callConfiguration.AppCallbackUrl)
    .setMediaStreamingConfiguration(mediaStreamingOptions);
var answerCallResponse = callAutomationAsyncClient.answerCallWithResponse(answerCallOptions).block();
Start media streaming - outbound call
Your application will start receiving media streams once you create the call and provide ACS with the WebSocket information.
var mediaStreamingOptions = new MediaStreamingOptions(
    "wss://{yourwebsocketurl}",
    MediaStreamingTransportType.WebSocket,
    MediaStreamingContentType.Audio,
    MediaStreamingAudioChannelType.Mixed
);
var createCallOptions = new CreateCallOptions(
    callSource,
    Collections.singletonList(target),
    callConfiguration.AppCallbackUrl
);
// Attach the streaming options when creating the outbound call.
createCallOptions.setMediaStreamingConfiguration(mediaStreamingOptions);
var createCallResponse = callAutomationAsyncClient.createCallWithResponse(
    createCallOptions
).block();
Handling media streams in your websocket server
The sample below demonstrates how to listen to media streams using your WebSocket server.
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.net.ServerSocket;
import java.net.Socket;

// Note: ServerSocket accepts raw TCP connections and does not perform the
// WebSocket handshake or framing. To receive media streams, the endpoint
// must speak the WebSocket protocol, for example through a WebSocket library.
public class WebsocketServer {
    public static void main(String[] args) throws IOException {
        try (ServerSocket serverSocket = new ServerSocket(1234)) {
            while (true) {
                // try-with-resources closes the socket and streams for each client.
                try (Socket socket = serverSocket.accept();
                     BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(socket.getInputStream()));
                     BufferedWriter bufferedWriter = new BufferedWriter(new OutputStreamWriter(socket.getOutputStream()))) {
                    String msgFromClient;
                    // readLine() returns null once the client closes the connection.
                    while ((msgFromClient = bufferedReader.readLine()) != null) {
                        // You can process the message however you want
                        System.out.println("Client:" + msgFromClient);
                        bufferedWriter.write("MSG Received");
                        bufferedWriter.newLine();
                        bufferedWriter.flush();
                    }
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            }
        }
    }
}
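The listener above prints each message without inspecting it. As a minimal sketch of the dispatch step, assuming each message arrives as one complete JSON text in the format described under Message schema, the hypothetical `MediaMessageKind` helper below reads the `kind` field and decides how the message would be routed. The class name, method names, and the regular-expression shortcut are illustrative assumptions, not part of the Azure SDK; a JSON library is the more robust choice in a real service.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Azure SDK: classifies a media
// streaming message by its "kind" field.
final class MediaMessageKind {
    // Matches "kind": "<value>". A sketch only; it does not parse JSON in general.
    private static final Pattern KIND =
        Pattern.compile("\"kind\"\\s*:\\s*\"([^\"]*)\"");

    private MediaMessageKind() {}

    /** Returns the value of the "kind" field, or null when the field is absent. */
    static String kindOf(String json) {
        Matcher matcher = KIND.matcher(json);
        return matcher.find() ? matcher.group(1) : null;
    }

    /** Names the handler a message would be routed to. */
    static String route(String json) {
        String kind = kindOf(json);
        if ("AudioMetadata".equals(kind)) {
            return "metadata";
        } else if ("AudioData".equals(kind)) {
            return "audio";
        }
        return "ignored";
    }
}
```

A listener could call `MediaMessageKind.route(msgFromClient)` for each message and hand the text to the matching handler.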
Message schema
After ACS receives the URL for your WebSocket server, it creates a connection to it. Once connected, ACS sends the first data packet, which contains metadata about the incoming media packets.
{
    "kind": <string>, // What kind of data this is, e.g. AudioMetadata, AudioData.
    "audioMetadata": {
        "subscriptionId": <string>, // Unique identifier for a subscription request
        "encoding": <string>, // Only PCM is supported
        "sampleRate": <int>, // 16000 by default
        "channels": <int>, // 1 by default
        "length": <int> // 640 by default
    }
}
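The metadata values determine how much audio each packet carries. The sketch below works through that arithmetic under two assumptions that the schema does not state: that `length` is the packet size in bytes, and that PCM samples are 16 bits wide. The `AudioMetadata` class and its method names are hypothetical and exist only for illustration.

```java
// Hypothetical value type mirroring the audioMetadata fields in the schema above.
final class AudioMetadata {
    // Assumption: PCM samples are 16 bits (2 bytes) wide; the schema does not say so.
    static final int BYTES_PER_SAMPLE = 2;

    final String encoding;
    final int sampleRate;
    final int channels;
    final int length;

    AudioMetadata(String encoding, int sampleRate, int channels, int length) {
        this.encoding = encoding;
        this.sampleRate = sampleRate;
        this.channels = channels;
        this.length = length;
    }

    /** Samples per channel in one packet, assuming length is the packet size in bytes. */
    int samplesPerChannel() {
        return length / (BYTES_PER_SAMPLE * channels);
    }

    /** Duration of one packet in milliseconds. */
    double packetDurationMillis() {
        return samplesPerChannel() * 1000.0 / sampleRate;
    }
}
```

Under these assumptions, the default values (16000 Hz, 1 channel, length 640) give 320 samples per packet, or 20 ms of audio.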
Audio streaming schema
After sending the metadata packet, ACS starts streaming audio media to your WebSocket server. The media object your server receives has the following structure.
{
    "kind": <string>, // What kind of data this is, e.g. AudioMetadata, AudioData.
    "audioData": {
        "data": <string>, // Base64-encoded audio buffer data
        "timestamp": <string>, // In ISO 8601 format (yyyy-mm-ddThh:mm:ssZ)
        "participantRawID": <string>,
        "silent": <boolean> // Indicates if the received audio buffer contains only silence.
    }
}
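Besides the audio buffer itself, each packet carries a timestamp and a `silent` flag that a server can use to order packets and skip empty ones. The following is a minimal sketch, assuming the fields have already been extracted from the JSON; `AudioFrameInfo` and its methods are hypothetical names, and skipping silent packets is one possible policy rather than a requirement of the service.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical holder for the non-audio fields of an AudioData message.
final class AudioFrameInfo {
    final Instant timestamp;
    final String participantRawId;
    final boolean silent;

    AudioFrameInfo(String isoTimestamp, String participantRawId, boolean silent) {
        // Instant.parse accepts ISO 8601 UTC timestamps such as 2022-10-03T19:16:12.925Z.
        this.timestamp = Instant.parse(isoTimestamp);
        this.participantRawId = participantRawId;
        this.silent = silent;
    }

    /** Milliseconds elapsed between two packets, based on their timestamps. */
    static long gapMillis(AudioFrameInfo earlier, AudioFrameInfo later) {
        return Duration.between(earlier.timestamp, later.timestamp).toMillis();
    }

    /** One possible policy: only process packets that are not marked silent. */
    boolean shouldProcess() {
        return !silent;
    }
}
```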
Example of audio data being streamed
{
"kind": "AudioData",
"audioData": {
"timestamp": "2022-10-03T19:16:12.925Z",
"participantRawID": "8:acs:3d20e1de-0f28-41c5-84a0-4960fde5f411_0000000b-faeb-c708-99bf-a43a0d0036b0",
"data": "5ADwAOMA6AD0AOIA4ADkAN8AzwDUANEAywC+ALQArgC0AKYAnACJAIoAlACWAJ8ApwCiAKkAqgCqALUA0wDWANAA3QDVAN0A8wDzAPAA7wDkANkA1QDPAPIA6QDmAOcA0wDYAPMA8QD8AP0AAwH+AAAB/QAAAREBEQEDAQoB9wD3APsA7gDxAPMA7wDpAN0A6gD5APsAAgEHAQ4BEAETARsBMAFHAUABPgE2AS8BKAErATEBLwE7ASYBGQEAAQcBBQH5AAIBBwEMAQ4BAAH+APYA6gDzAPgA7gDkAOUA3wDcANQA2gDWAN8A3wDcAMcAxwDIAMsA1wDfAO4A3wDUANQA3wDvAOUA4QDpAOAA4ADhAOYA5wDkAOUA1gDxAOcA4wDpAOEA4gD0APoA7wD9APkA6ADwAPIA7ADrAPEA6ADfANQAzQDLANIAzwDaANcA3QDZAOQA4wDXANwA1ADbAOsA7ADyAPkA7wDiAOIA6gDtAOsA7gDeAOIA4ADeANUA6gD1APAA8ADgAOQA5wDgAPgA8ADnAN8A5gDgAOoA6wDcAOgA2gDZANUAyQDPANwA3gDgAO4A8QDyAAQBEwEDAewA+gDpAN4A6wDeAO8A8QDwAO8ABAEKAQUB/gD5AAMBAwEIARoBFAEeARkBDgH8AP0A+gD8APcA+gDrAO0A5wDcANEA0QDHAM4A0wDUAM4A0wDZANQAxgDSAM4A1ADVAOMA4QDhANUA2gDjAOYA5wDrANQA5wDrAMsAxQDWANsA5wDpAOEA4QDFAMoA0QDKAMgAwgDNAMsAwgCwAKkAtwCrAKoAsACgAJ4AlQCeAKAAoQCmAKwApwCsAK0AnQCVAA==",
"silent": false
}
}
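To work with the audio, the `data` string has to be Base64-decoded first. The sketch below decodes it and then, as an assumption not stated in the schema, interprets the bytes as 16-bit little-endian PCM samples. `PcmDecoder` is a hypothetical helper name; `java.util.Base64` and `java.nio.ByteBuffer` are standard JDK classes.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Base64;

// Hypothetical helper: turns the Base64 "data" field into PCM samples.
final class PcmDecoder {
    private PcmDecoder() {}

    /**
     * Decodes the Base64 payload and reads it as 16-bit little-endian samples.
     * Assumption: the sample width and byte order are not given by the schema.
     */
    static short[] decode(String base64Data) {
        byte[] raw = Base64.getDecoder().decode(base64Data);
        short[] samples = new short[raw.length / 2];
        ByteBuffer.wrap(raw)
            .order(ByteOrder.LITTLE_ENDIAN)
            .asShortBuffer()
            .get(samples);
        return samples;
    }
}
```

For instance, the first eight characters of the example payload above, `5ADwAOMA`, decode to the bytes E4 00 F0 00 E3 00, which this sketch reads as the samples 228, 240, and 227.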
Stop audio streaming
Audio streaming will automatically stop when the call ends or is canceled.
Clean up resources
If you want to clean up and remove a Communication Services subscription, you can delete the resource or resource group. Deleting the resource group also deletes any other resources associated with it. Learn more about cleaning up resources.
Next steps
- Learn more about Media Streaming.
- Learn more about Call Automation and its features.
- Learn more about Play action.
- Learn more about Recognize action.