Get active speakers within a call
During an active call, you may want to get a list of active speakers in order to render or display them differently. Here's how.
Prerequisites
- An Azure account with an active subscription. Create an account for free.
- A deployed Communication Services resource. Create a Communication Services resource.
- A user access token to enable the calling client. For more information, see Create and manage access tokens.
- Optional: Complete the quickstart to add voice calling to your application
Install the SDK
Use the npm install command to install the Azure Communication Services Calling and Common SDKs for JavaScript:
npm install @azure/communication-common --save
npm install @azure/communication-calling --save
Initialize required objects
A CallClient instance is required for most call operations. Create a new CallClient instance; you can configure it with custom options, such as a Logger instance.
When you have a CallClient instance, you can create a CallAgent instance by calling the createCallAgent method on the CallClient instance. This method asynchronously returns a CallAgent instance object.
The createCallAgent method takes a CommunicationTokenCredential as an argument, which wraps a user access token.
You can use the getDeviceManager method on the CallClient instance to access the DeviceManager.
const { CallClient } = require('@azure/communication-calling');
const { AzureCommunicationTokenCredential} = require('@azure/communication-common');
const { AzureLogger, setLogLevel } = require("@azure/logger");
// Set the logger's log level
setLogLevel('verbose');
// Redirect log output to wherever desired. To console, file, buffer, REST API, etc...
AzureLogger.log = (...args) => {
console.log(...args); // Redirect log output to console
};
const userToken = '<USER_TOKEN>';
// Optionally pass an options object to the constructor to configure the client
const callClient = new CallClient();
const tokenCredential = new AzureCommunicationTokenCredential(userToken);
const callAgent = await callClient.createCallAgent(tokenCredential, {displayName: 'optional Azure Communication Services user name'});
const deviceManager = await callClient.getDeviceManager();
Dominant speakers for a call is an extended feature of the core Call API that lets you obtain a list of the active speakers in the call.
This is a ranked list, where the first element represents the last active speaker on the call, and so on.
To obtain the dominant speakers in a call, first obtain the dominant speakers feature API object:
const callDominantSpeakersApi = call.feature(Features.DominantSpeakers);
Then, obtain the list of dominant speakers by reading the dominantSpeakers property. It has the type DominantSpeakersInfo, which has the following members:
- speakersList contains the ranked list of dominant speakers in the call. Speakers are represented by their participant ID.
- timestamp is the latest update time for the dominant speakers in the call.
let dominantSpeakers: DominantSpeakersInfo = callDominantSpeakersApi.dominantSpeakers;
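Because the list is ranked, an application often only needs the first few entries. The following is a minimal sketch of that idea, not SDK code: SpeakerIdentifier, DominantSpeakersSnapshot, and topDominantSpeakers are hypothetical names, and the identifier shape is simplified to a single id field.

```typescript
// Hypothetical, simplified shapes; the real SDK types carry more fields.
interface SpeakerIdentifier {
  kind: string;
  id: string;
}

interface DominantSpeakersSnapshot {
  speakersList: SpeakerIdentifier[]; // ranked list, first element first
  timestamp: Date;                   // latest update time
}

// Return at most `limit` speakers from the front of the ranked list.
function topDominantSpeakers(
  snapshot: DominantSpeakersSnapshot,
  limit: number
): SpeakerIdentifier[] {
  if (limit <= 0) {
    return [];
  }
  return snapshot.speakersList.slice(0, limit);
}

const sampleSnapshot: DominantSpeakersSnapshot = {
  speakersList: [
    { kind: 'communicationUser', id: 'user-c' },
    { kind: 'communicationUser', id: 'user-a' },
    { kind: 'communicationUser', id: 'user-b' },
  ],
  timestamp: new Date(),
};

// Keep only the two highest-ranked speakers for a spotlight area.
const spotlightIds = topDominantSpeakers(sampleSnapshot, 2).map(s => s.id);
console.log(spotlightIds);
```

In a real application, the snapshot would come from callDominantSpeakersApi.dominantSpeakers rather than from a hand-built object.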
You can also subscribe to the dominantSpeakersChanged event to know when the dominant speakers list has changed:
const dominantSpeakersChangedHandler = () => {
// Get the most up to date list of dominant speakers
let dominantSpeakers = callDominantSpeakersApi.dominantSpeakers;
};
callDominantSpeakersApi.on('dominantSpeakersChanged', dominantSpeakersChangedHandler);
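The event only signals that the list changed, so a handler typically compares the new list with the one it saw last. The sketch below shows one way to do that comparison on plain string IDs. SpeakerListDiff and diffSpeakerLists are hypothetical helpers, and mapping SDK identifiers to strings is assumed to happen elsewhere.

```typescript
// Result of comparing two ranked speaker lists (hypothetical helper type).
interface SpeakerListDiff {
  joined: string[];       // IDs present now but not before
  left: string[];         // IDs present before but not now
  leaderChanged: boolean; // true when the first-ranked speaker differs
}

// Compare the previously seen list with the current one.
function diffSpeakerLists(previousIds: string[], currentIds: string[]): SpeakerListDiff {
  const joined = currentIds.filter(id => previousIds.indexOf(id) === -1);
  const left = previousIds.filter(id => currentIds.indexOf(id) === -1);
  const leaderChanged = previousIds[0] !== currentIds[0];
  return { joined, left, leaderChanged };
}

// Example: carol becomes the top speaker and bob drops off the list.
const speakerDiff = diffSpeakerLists(['alice', 'bob'], ['carol', 'alice']);
console.log(speakerDiff);
```

A handler could store the current IDs after each event and pass them as previousIds on the next one, updating only the parts of the UI that the diff says have changed.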
Handle the Dominant Speaker's video streams
Your application can use the DominantSpeakers feature to render the video streams of one or more dominant speakers, and to keep the UI up to date whenever the dominant speakers list changes. The following code example shows one way to do this.
// RemoteParticipant obj representation of the dominant speaker
let dominantRemoteParticipant: RemoteParticipant;
// It is recommended to use a map to keep track of a stream's associated renderer
const streamRenderersMap = new Map<RemoteVideoStream, VideoStreamRenderer>();
function getRemoteParticipantForDominantSpeaker(dominantSpeakerIdentifier) {
let dominantRemoteParticipant: RemoteParticipant;
switch(dominantSpeakerIdentifier.kind) {
case 'communicationUser': {
dominantRemoteParticipant = currentCall.remoteParticipants.find(rm => {
return (rm.identifier as CommunicationUserIdentifier).communicationUserId === dominantSpeakerIdentifier.communicationUserId
});
break;
}
case 'microsoftTeamsUser': {
dominantRemoteParticipant = currentCall.remoteParticipants.find(rm => {
return (rm.identifier as MicrosoftTeamsUserIdentifier).microsoftTeamsUserId === dominantSpeakerIdentifier.microsoftTeamsUserId
});
break;
}
case 'unknown': {
dominantRemoteParticipant = currentCall.remoteParticipants.find(rm => {
return (rm.identifier as UnknownIdentifier).id === dominantSpeakerIdentifier.id
});
break;
}
}
return dominantRemoteParticipant;
}
// Handler function for when the dominant speaker changes
const dominantSpeakersChangedHandler = async () => {
// Get the new dominant speaker's identifier
const newDominantSpeakerIdentifier = currentCall.feature(Features.DominantSpeakers).dominantSpeakers.speakersList[0];
if (newDominantSpeakerIdentifier) {
// Get the remote participant object that matches newDominantSpeakerIdentifier
const newDominantRemoteParticipant = getRemoteParticipantForDominantSpeaker(newDominantSpeakerIdentifier);
// Create the new dominant speaker's stream renderers
const streamViews = [];
for (const stream of newDominantRemoteParticipant.videoStreams) {
if (stream.isAvailable && !streamRenderersMap.get(stream)) {
const renderer = new VideoStreamRenderer(stream);
streamRenderersMap.set(stream, renderer);
const view = await renderer.createView();
streamViews.push(view);
}
}
// Remove the old dominant speaker's video streams by disposing of their associated renderers
if (dominantRemoteParticipant) {
for (const stream of dominantRemoteParticipant.videoStreams) {
const renderer = streamRenderersMap.get(stream);
if (renderer) {
streamRenderersMap.delete(stream);
renderer.dispose();
}
}
}
// Set the new dominant remote participant obj
dominantRemoteParticipant = newDominantRemoteParticipant
// Render the new dominant remote participant's streams
for (const view of streamViews) {
htmlElement.appendChild(view.target);
}
}
};
// When call is disconnected, set the dominant speaker to undefined
currentCall.on('stateChanged', () => {
if (currentCall.state === 'Disconnected') {
dominantRemoteParticipant = undefined;
}
});
const dominantSpeakerIdentifier = currentCall.feature(Features.DominantSpeakers).dominantSpeakers.speakersList[0];
dominantRemoteParticipant = getRemoteParticipantForDominantSpeaker(dominantSpeakerIdentifier);
currentCall.feature(Features.DominantSpeakers).on('dominantSpeakersChanged', dominantSpeakersChangedHandler);
const subscribeToRemoteVideoStream = async (stream: RemoteVideoStream, participant: RemoteParticipant) => {
let renderer: VideoStreamRenderer;
const displayVideo = async () => {
renderer = new VideoStreamRenderer(stream);
streamRenderersMap.set(stream, renderer);
const view = await renderer.createView();
htmlElement.appendChild(view.target);
}
stream.on('isAvailableChanged', async () => {
if (dominantRemoteParticipant !== participant) {
return;
}
renderer = streamRenderersMap.get(stream);
if (stream.isAvailable && !renderer) {
await displayVideo();
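} else if (!stream.isAvailable && renderer) {
// Only dispose of a renderer that actually exists for this stream
streamRenderersMap.delete(stream);
renderer.dispose();
}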
});
if (dominantRemoteParticipant !== participant) {
return;
}
renderer = streamRenderersMap.get(stream);
if (stream.isAvailable && !renderer) {
await displayVideo();
}
}
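The per-kind switch in getRemoteParticipantForDominantSpeaker can also be expressed by reducing each identifier to a comparable string key. The sketch below illustrates that alternative with simplified stand-in types. SketchIdentifier, SketchParticipant, identifierKey, and findParticipant are hypothetical names, and only the three identifier kinds handled in the example above are covered.

```typescript
// Simplified, hypothetical stand-ins for the identifier kinds used above.
type SketchIdentifier =
  | { kind: 'communicationUser'; communicationUserId: string }
  | { kind: 'microsoftTeamsUser'; microsoftTeamsUserId: string }
  | { kind: 'unknown'; id: string };

interface SketchParticipant {
  identifier: SketchIdentifier;
  displayName: string;
}

// Reduce an identifier to a string that is unique per kind and raw ID.
function identifierKey(identifier: SketchIdentifier): string {
  if (identifier.kind === 'communicationUser') {
    return 'acs:' + identifier.communicationUserId;
  }
  if (identifier.kind === 'microsoftTeamsUser') {
    return 'teams:' + identifier.microsoftTeamsUserId;
  }
  return 'unknown:' + identifier.id;
}

// Find the participant whose identifier matches the dominant speaker.
function findParticipant(
  participants: SketchParticipant[],
  speaker: SketchIdentifier
): SketchParticipant | undefined {
  const wanted = identifierKey(speaker);
  for (const participant of participants) {
    if (identifierKey(participant.identifier) === wanted) {
      return participant;
    }
  }
  return undefined;
}

const sketchParticipants: SketchParticipant[] = [
  { identifier: { kind: 'communicationUser', communicationUserId: 'u1' }, displayName: 'Ana' },
  { identifier: { kind: 'microsoftTeamsUser', microsoftTeamsUserId: 't1' }, displayName: 'Bo' },
];

const matchedParticipant = findParticipant(sketchParticipants, {
  kind: 'microsoftTeamsUser',
  microsoftTeamsUserId: 't1',
});
console.log(matchedParticipant ? matchedParticipant.displayName : 'not found');
```

Prefixing the key with the kind keeps two identifiers of different kinds from matching just because their raw IDs happen to be equal.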
Install the SDK
Locate your project-level build.gradle and make sure to add mavenCentral() to the list of repositories under buildscript and allprojects:
buildscript {
repositories {
...
mavenCentral()
...
}
}
allprojects {
repositories {
...
mavenCentral()
...
}
}
Then, in your module-level build.gradle, add the following lines to the dependencies section:
dependencies {
...
implementation 'com.azure.android:azure-communication-calling:1.0.0'
...
}
Initialize the required objects
To create a CallAgent instance, call the createCallAgent method on a CallClient instance. This asynchronously returns a CallAgent instance object.
The createCallAgent method takes a CommunicationTokenCredential as an argument, which encapsulates an access token.
To access the DeviceManager, a callAgent instance must be created first, and then you can use the CallClient.getDeviceManager method to get the DeviceManager.
String userToken = "<user token>";
CallClient callClient = new CallClient();
CommunicationTokenCredential tokenCredential = new CommunicationTokenCredential(userToken);
android.content.Context appContext = this.getApplicationContext(); // From within an Activity for instance
CallAgent callAgent = callClient.createCallAgent(appContext, tokenCredential).get();
DeviceManager deviceManager = callClient.getDeviceManager(appContext).get();
To set a display name for the caller, use this alternative method:
String userToken = "<user token>";
CallClient callClient = new CallClient();
CommunicationTokenCredential tokenCredential = new CommunicationTokenCredential(userToken);
android.content.Context appContext = this.getApplicationContext(); // From within an Activity for instance
CallAgentOptions callAgentOptions = new CallAgentOptions();
callAgentOptions.setDisplayName("Alice Bob");
DeviceManager deviceManager = callClient.getDeviceManager(appContext).get();
CallAgent callAgent = callClient.createCallAgent(appContext, tokenCredential, callAgentOptions).get();
Dominant Speakers is an extended feature of the core Call object that allows the user to monitor the most dominant speakers in the current call. Participants can join and leave the list based on how they are performing in the call.
When joined to a group call with multiple participants, the calling SDKs identify which participants are currently speaking. Active speakers are the participants heard in each received audio frame. Dominant speakers are the participants who are currently most active in the group conversation, even though their voice is not necessarily heard in every audio frame. The set of dominant speakers changes as different participants take turns speaking, so you can implement video subscription requests based on dominant speaker logic.
As participants join, leave, or move up and down in this list, the client application can use this information to customize the call experience. For example, it can show the most dominant speakers in a separate part of the UI from participants who are not actively speaking.
Developers can receive updates and obtain information about the most dominant speakers in a call. This information is represented as:
- An ordered list of the Remote Participants that represents the Dominant Speakers in the call.
- A timestamp marking the date when this list was last modified.
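The idea of showing dominant speakers in a separate part of the UI can be sketched as a small partitioning step. The example is written in TypeScript to stay platform-neutral; RosterEntry, TileLayout, and layoutTiles are hypothetical names, and participants are reduced to an ID and a display name.

```typescript
// Hypothetical, simplified participant record.
interface RosterEntry {
  id: string;
  displayName: string;
}

// Two UI groups: highlighted dominant speakers and everyone else.
interface TileLayout {
  spotlight: RosterEntry[];
  others: RosterEntry[];
}

// Split the roster using the ordered dominant speaker IDs.
// The spotlight group keeps the ranking order of dominantIds.
function layoutTiles(roster: RosterEntry[], dominantIds: string[]): TileLayout {
  const spotlight: RosterEntry[] = [];
  for (const id of dominantIds) {
    const match = roster.filter(entry => entry.id === id)[0];
    if (match) {
      spotlight.push(match); // skip IDs that are no longer in the roster
    }
  }
  const others = roster.filter(entry => dominantIds.indexOf(entry.id) === -1);
  return { spotlight, others };
}

const sampleRoster: RosterEntry[] = [
  { id: 'a', displayName: 'Ana' },
  { id: 'b', displayName: 'Bo' },
  { id: 'c', displayName: 'Cy' },
  { id: 'd', displayName: 'Di' },
];

// 'zzz' stands for a speaker who has already left the call.
const tileLayout = layoutTiles(sampleRoster, ['c', 'a', 'zzz']);
console.log(tileLayout.spotlight.map(e => e.displayName));
console.log(tileLayout.others.map(e => e.displayName));
```

Recomputing this layout each time the dominant speakers list changes keeps both groups consistent with the latest ranking.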
In order to use the Dominant Speakers call feature for Android, the first step is to obtain the Dominant Speakers feature API object:
DominantSpeakersFeature dominantSpeakersFeature = call.feature(Features.DOMINANT_SPEAKERS);
The Dominant Speakers feature object has the following API structure:
- OnDominantSpeakersChanged: Event for listening for changes in the dominant speakers list.
- getDominantSpeakersInfo(): Gets the DominantSpeakersInfo object. This object has:
  - getSpeakers(): A list of participant identifiers representing the dominant speakers list.
  - getLastUpdatedAt(): The date when the dominant speakers list was updated.
To subscribe to changes in the Dominant Speakers list:
// Obtain the extended feature object from the call object.
DominantSpeakersFeature dominantSpeakersFeature = call.feature(Features.DOMINANT_SPEAKERS);
// Subscribe to the OnDominantSpeakersChanged event.
dominantSpeakersFeature.addOnDominantSpeakersChangedListener(this::handleCallOnDominantSpeakersChanged);
private void handleCallOnDominantSpeakersChanged(PropertyChangedEvent args) {
// When the list changes, get the timestamp of the last change and the current list of Dominant Speakers
DominantSpeakersInfo dominantSpeakersInfo = dominantSpeakersFeature.getDominantSpeakersInfo();
Date timestamp = dominantSpeakersInfo.getLastUpdatedAt();
List<CommunicationIdentifier> dominantSpeakers = dominantSpeakersInfo.getSpeakers();
}
Setting up
Creating the Visual Studio project
For a UWP app, in Visual Studio 2022, create a new Blank App (Universal Windows) project. After entering the project name, feel free to pick any Windows SDK later than 10.0.17763.0.
For a WinUI 3 app, create a new project with the Blank App, Packaged (WinUI 3 in Desktop) template to set up a single-page WinUI 3 app. Windows App SDK version 1.3 or later is required.
Install the package and dependencies with NuGet Package Manager
The Calling SDK APIs and libraries are publicly available via a NuGet package. The following steps exemplify how to find, download, and install the Calling SDK NuGet package.
- Open NuGet Package Manager (Tools -> NuGet Package Manager -> Manage NuGet Packages for Solution).
- Click Browse, and then type Azure.Communication.Calling.WindowsClient in the search box.
- Make sure that the Include prerelease check box is selected.
- Click the Azure.Communication.Calling.WindowsClient package, and select Azure.Communication.Calling.WindowsClient 1.4.0-beta.1 or a newer version.
- Select the checkbox corresponding to the CS project on the right-side tab.
- Click the Install button.
Dominant Speakers is an extended feature of the core Call object that allows the user to monitor the most dominant speakers in the current call. Participants can join and leave the list based on how they are performing in the call.
When joined to a group call with multiple participants, the calling SDKs identify which participants are currently speaking. Active speakers are the participants heard in each received audio frame. Dominant speakers are the participants who are currently most active in the group conversation, even though their voice is not necessarily heard in every audio frame. The set of dominant speakers changes as different participants take turns speaking, so you can implement video subscription requests based on dominant speaker logic.
As participants join, leave, or move up and down in this list, the client application can use this information to customize the call experience. For example, it can show the most dominant speakers in a separate part of the UI from participants who are not actively speaking.
Developers can receive updates and obtain information about the most dominant speakers in a call. This information is represented as:
- An ordered list of the Remote Participants that represents the Dominant Speakers in the call.
- A timestamp marking the date when this list was last modified.
In order to use the Dominant Speakers call feature for Windows, the first step is to obtain the Dominant Speakers feature API object:
DominantSpeakersCallFeature dominantSpeakersFeature = call.Features.DominantSpeakers;
The Dominant Speakers feature object has the following API structure:
- OnDominantSpeakersChanged: Event for listening for changes in the dominant speakers list.
- DominantSpeakersInfo: Gets the DominantSpeakersInfo object. This object has:
  - Speakers: A list of participant identifiers representing the dominant speakers list.
  - LastUpdatedAt: The date when the dominant speakers list was updated.
To subscribe to changes in the dominant speakers list:
// Obtain the extended feature object from the call object.
DominantSpeakersCallFeature dominantSpeakersFeature = call.Features.DominantSpeakers;
// Subscribe to the OnDominantSpeakersChanged event.
dominantSpeakersFeature.OnDominantSpeakersChanged += DominantSpeakersFeature__OnDominantSpeakersChanged;
private void DominantSpeakersFeature__OnDominantSpeakersChanged(object sender, PropertyChangedEventArgs args) {
// When the list changes, get the timestamp of the last change and the current list of Dominant Speakers
DominantSpeakersInfo dominantSpeakersInfo = dominantSpeakersFeature.DominantSpeakersInfo;
DateTimeOffset date = dominantSpeakersInfo.LastUpdatedAt;
IReadOnlyList<ICommunicationIdentifier> speakersList = dominantSpeakersInfo.Speakers;
}
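Each update carries a last-updated timestamp, which an application can use to make sure it never replaces its cached list with an older one. The text does not say whether updates can arrive out of order, so treat the following as a defensive sketch only. It is written in TypeScript to stay platform-neutral, and SpeakersUpdate and LatestSpeakersStore are hypothetical names.

```typescript
// Hypothetical, simplified view of one dominant speakers update.
interface SpeakersUpdate {
  speakerIds: string[];
  lastUpdatedAt: Date;
}

// Keeps only the most recent update, judged by its timestamp.
class LatestSpeakersStore {
  private current: SpeakersUpdate | undefined;

  // Returns true when the update is newer than the cached one and was stored.
  apply(update: SpeakersUpdate): boolean {
    if (
      this.current &&
      update.lastUpdatedAt.getTime() <= this.current.lastUpdatedAt.getTime()
    ) {
      return false; // stale or duplicate update, keep the cached list
    }
    this.current = update;
    return true;
  }

  // Returns a copy of the cached IDs, or an empty list before any update.
  getSpeakerIds(): string[] {
    return this.current ? this.current.speakerIds.slice() : [];
  }
}

const speakersStore = new LatestSpeakersStore();
const acceptedFirst = speakersStore.apply({
  speakerIds: ['ana', 'bo'],
  lastUpdatedAt: new Date(2000),
});
const acceptedStale = speakersStore.apply({
  speakerIds: ['cy'],
  lastUpdatedAt: new Date(1000), // older than the cached update
});
console.log(acceptedFirst, acceptedStale, speakersStore.getSpeakerIds());
```

In the handler above, the values read from LastUpdatedAt and Speakers would feed such a store before the UI is refreshed.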
Set up your system
Create the Xcode project
In Xcode, create a new iOS project and select the Single View App template. This quickstart uses the SwiftUI framework, so you should set the Language to Swift and User Interface to SwiftUI.
You're not going to create unit tests or UI tests during this quickstart. Feel free to clear the Include Unit Tests and Include UI Tests check boxes.
Install the package and dependencies with CocoaPods
Create a Podfile for your application, like this:
platform :ios, '13.0'
use_frameworks!
target 'AzureCommunicationCallingSample' do
  pod 'AzureCommunicationCalling', '~> 1.0.0'
end
Run pod install.
Open .xcworkspace with Xcode.
Request access to the microphone
To access the device's microphone, you need to update your app's information property list with NSMicrophoneUsageDescription. Set the associated value to a string that will be included in the dialog that the system uses to request access from the user.
Right-click the Info.plist entry of the project tree and select Open As > Source Code. Add the following lines in the top-level <dict> section, and then save the file.
<key>NSMicrophoneUsageDescription</key>
<string>Need microphone access for VOIP calling.</string>
Set up the app framework
Open your project's ContentView.swift file and add an import declaration to the top of the file to import the AzureCommunicationCalling library. In addition, import AVFoundation. You'll need it for audio permission requests in the code.
import AzureCommunicationCalling
import AVFoundation
Initialize CallAgent
To create a CallAgent instance from CallClient, use the callClient.createCallAgent method, which asynchronously returns a CallAgent object after it's initialized.
To create a call client, you have to pass a CommunicationTokenCredential object.
import AzureCommunication
let tokenString = "token_string"
var userCredential: CommunicationTokenCredential?
do {
let options = CommunicationTokenRefreshOptions(initialToken: tokenString, refreshProactively: true, tokenRefresher: self.fetchTokenSync)
userCredential = try CommunicationTokenCredential(withOptions: options)
} catch {
updates("Couldn't create Credential object", false)
initializationDispatchGroup!.leave()
return
}
// tokenProvider needs to be implemented by Contoso, which fetches a new token
public func fetchTokenSync(then onCompletion: TokenRefreshOnCompletion) {
let newToken = self.tokenProvider!.fetchNewToken()
onCompletion(newToken, nil)
}
Pass the CommunicationTokenCredential object that you created to CallClient, and set the display name.
self.callClient = CallClient()
let callAgentOptions = CallAgentOptions()
callAgentOptions.displayName = "iOS Azure Communication Services User"
self.callClient!.createCallAgent(userCredential: userCredential!,
options: callAgentOptions) { (callAgent, error) in
if error == nil {
print("Create agent succeeded")
self.callAgent = callAgent
} else {
print("Create agent failed")
}
}
Dominant Speakers is an extended feature of the core Call object that allows the user to monitor the most dominant speakers in the current call. Participants can join and leave the list based on how they are performing in the call.
When joined to a group call with multiple participants, the calling SDKs identify which participants are currently speaking. Active speakers are the participants heard in each received audio frame. Dominant speakers are the participants who are currently most active in the group conversation, even though their voice is not necessarily heard in every audio frame. The set of dominant speakers changes as different participants take turns speaking, so you can implement video subscription requests based on dominant speaker logic.
As participants join, leave, or move up and down in this list, the client application can use this information to customize the call experience. For example, it can show the most dominant speakers in a separate part of the UI from participants who are not actively speaking.
Developers can receive updates and obtain information about the most dominant speakers in a call. This information is represented as:
- An ordered list of the Remote Participants that represents the Dominant Speakers in the call.
- A timestamp marking the date when this list was last modified.
In order to use the Dominant Speakers call feature for iOS, the first step is to obtain the Dominant Speakers feature API object:
let dominantSpeakersFeature = call.feature(Features.dominantSpeakers)
The Dominant Speakers feature object has the following API structure:
- didChangeDominantSpeakers: Event for listening for changes in the dominant speakers list.
- dominantSpeakersInfo: Gets the DominantSpeakersInfo object. This object has:
  - speakers: A list of participant identifiers representing the dominant speakers list.
  - lastUpdatedAt: The date when the dominant speakers list was updated.
To subscribe to changes in the dominant speakers list:
// Obtain the extended feature object from the call object.
let dominantSpeakersFeature = call.feature(Features.dominantSpeakers)
// Set the delegate object to obtain the event callback.
dominantSpeakersFeature.delegate = DominantSpeakersDelegate()
public class DominantSpeakersDelegate : DominantSpeakersCallFeatureDelegate
{
public func dominantSpeakersCallFeature(_ dominantSpeakersCallFeature: DominantSpeakersCallFeature, didChangeDominantSpeakers args: PropertyChangedEventArgs) {
// When the list changes, get the timestamp of the last change and the current list of Dominant Speakers
let dominantSpeakersInfo = dominantSpeakersCallFeature.dominantSpeakersInfo
let timestamp = dominantSpeakersInfo.lastUpdatedAt
let dominantSpeakersList = dominantSpeakersInfo.speakers
}
}
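Because the set of dominant speakers changes as participants take turns speaking, a UI that follows the top speaker directly may switch very often. One possible mitigation is to hold the current spotlight for a minimum time before switching. The sketch below is an application-level design choice, not SDK behavior. It is written in TypeScript to stay platform-neutral, and SpotlightState, nextSpotlight, and the minHoldMs parameter are hypothetical.

```typescript
// Who is currently highlighted and when that choice was made.
interface SpotlightState {
  speakerId: string | undefined;
  switchedAtMs: number;
}

// Decide whether to switch the spotlight to the candidate speaker.
// The current speaker is kept until it has been shown for minHoldMs.
function nextSpotlight(
  state: SpotlightState,
  candidateId: string | undefined,
  nowMs: number,
  minHoldMs: number
): SpotlightState {
  if (candidateId === undefined || candidateId === state.speakerId) {
    return state; // nothing to switch to
  }
  const heldLongEnough =
    state.speakerId === undefined || nowMs - state.switchedAtMs >= minHoldMs;
  return heldLongEnough ? { speakerId: candidateId, switchedAtMs: nowMs } : state;
}

// Example with a 2-second hold time.
let spotlightState: SpotlightState = { speakerId: undefined, switchedAtMs: 0 };
spotlightState = nextSpotlight(spotlightState, 'alice', 1000, 2000); // first speaker is shown at once
const afterEarlyChange = nextSpotlight(spotlightState, 'bob', 1500, 2000); // too early, alice stays
const afterLateChange = nextSpotlight(spotlightState, 'bob', 3000, 2000);  // hold elapsed, bob is shown
console.log(afterEarlyChange.speakerId, afterLateChange.speakerId);
```

The delegate callback shown above could call such a function with the first entry of the speakers list and the current time, and only update the view when the returned state differs from the previous one.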