Get active speakers within a call

During an active call, you may want to get a list of active speakers in order to render or display them differently. Here's how.

Prerequisites

Install the SDK

Use the npm install command to install the Azure Communication Services calling and common SDKs for JavaScript.

npm install @azure/communication-common --save
npm install @azure/communication-calling --save

Initialize required objects

A CallClient instance is required for most call operations. When you create a new CallClient instance, you can configure it with custom options like a Logger instance.

When you have a CallClient instance, you can create a CallAgent instance by calling the createCallAgent method on the CallClient instance. This method asynchronously returns a CallAgent instance object.

The createCallAgent method uses CommunicationTokenCredential as an argument. It accepts a user access token.

You can use the getDeviceManager method on the CallClient instance to access deviceManager.

const { CallClient } = require('@azure/communication-calling');
const { AzureCommunicationTokenCredential} = require('@azure/communication-common');
const { AzureLogger, setLogLevel } = require("@azure/logger");

// Set the logger's log level
setLogLevel('verbose');

// Redirect log output to wherever desired. To console, file, buffer, REST API, etc...
AzureLogger.log = (...args) => {
    console.log(...args); // Redirect log output to console
};

const userToken = '<USER_TOKEN>';
const callClient = new CallClient();
const tokenCredential = new AzureCommunicationTokenCredential(userToken);
const callAgent = await callClient.createCallAgent(tokenCredential, {displayName: 'optional Azure Communication Services user name'});
const deviceManager = await callClient.getDeviceManager();

Dominant speakers for a call is an extended feature of the core Call API and allows you to obtain a list of the active speakers in the call.

This is a ranked list, where the first element in the list represents the last active speaker on the call and so on.

In order to obtain the dominant speakers in a call, you first need to obtain the call dominant speakers feature API object:

const callDominantSpeakersApi = call.feature(Features.DominantSpeakers);

Then, obtain the list of dominant speakers by reading the dominantSpeakers property. It has the type DominantSpeakersInfo, which has the following members:

  • speakersList contains the list of the ranked dominant speakers in the call. These are represented by their participant ID.
  • timestamp is the latest update time for the dominant speakers in the call.

let dominantSpeakers: DominantSpeakersInfo = callDominantSpeakersApi.dominantSpeakers;

You can also subscribe to the dominantSpeakersChanged event to know when the dominant speakers list changes:

const dominantSpeakersChangedHandler = () => {
    // Get the most up to date list of dominant speakers
    let dominantSpeakers = callDominantSpeakersApi.dominantSpeakers;
};
callDominantSpeakersApi.on('dominantSpeakersChanged', dominantSpeakersChangedHandler);
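Because the list is ranked, a UI that highlights only a few speakers can take the first entries. The following is a minimal sketch rather than SDK code: SpeakerId and DominantSpeakersSnapshot are stand-in types that only mirror the shape of DominantSpeakersInfo described above, and topSpeakers is a hypothetical helper.

```typescript
// Stand-in types (assumptions): they mirror the shape described above
// and are not imported from the SDK.
type SpeakerId = { kind: string; id: string };

interface DominantSpeakersSnapshot {
    speakersList: ReadonlyArray<SpeakerId>; // ranked, first element is the last active speaker
    timestamp: Date;
}

// Hypothetical helper: return at most `count` speakers from the top of the ranked list.
function topSpeakers(snapshot: DominantSpeakersSnapshot, count: number): SpeakerId[] {
    return snapshot.speakersList.slice(0, Math.max(0, count));
}

// Example data standing in for callDominantSpeakersApi.dominantSpeakers.
const snapshot: DominantSpeakersSnapshot = {
    speakersList: [
        { kind: 'communicationUser', id: 'user-a' },
        { kind: 'communicationUser', id: 'user-b' },
        { kind: 'microsoftTeamsUser', id: 'user-c' },
    ],
    timestamp: new Date(),
};

const highlighted = topSpeakers(snapshot, 2);
console.log(highlighted.map(speaker => speaker.id));
```

In a real handler, the snapshot would come from the dominantSpeakers property each time dominantSpeakersChanged fires.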

Handle the Dominant Speaker's video streams

Your application can use the DominantSpeakers feature to render one or more of the dominant speakers' video streams and update the UI whenever the dominant speakers list changes. The following code example shows one way to do this.

// RemoteParticipant obj representation of the dominant speaker
let dominantRemoteParticipant: RemoteParticipant;
// It is recommended to use a map to keep track of a stream's associated renderer
let streamRenderersMap = new Map<RemoteVideoStream, VideoStreamRenderer>();

function getRemoteParticipantForDominantSpeaker(dominantSpeakerIdentifier) {
    let dominantRemoteParticipant: RemoteParticipant;
    switch(dominantSpeakerIdentifier.kind) {
        case 'communicationUser': {
            dominantRemoteParticipant = currentCall.remoteParticipants.find(rm => {
                return (rm.identifier as CommunicationUserIdentifier).communicationUserId === dominantSpeakerIdentifier.communicationUserId
            });
            break;
        }
        case 'microsoftTeamsUser': {
            dominantRemoteParticipant = currentCall.remoteParticipants.find(rm => {
                return (rm.identifier as MicrosoftTeamsUserIdentifier).microsoftTeamsUserId === dominantSpeakerIdentifier.microsoftTeamsUserId
            });
            break;
        }
        case 'unknown': {
            dominantRemoteParticipant = currentCall.remoteParticipants.find(rm => {
                return (rm.identifier as UnknownIdentifier).id === dominantSpeakerIdentifier.id
            });
            break;
        }
    }
    return dominantRemoteParticipant;
}
// Handler function for when the dominant speaker changes
const dominantSpeakersChangedHandler = async () => {
    // Get the new dominant speaker's identifier
    const newDominantSpeakerIdentifier = currentCall.feature(Features.DominantSpeakers).dominantSpeakers.speakersList[0];

    if (newDominantSpeakerIdentifier) {
        // Get the remote participant object that matches newDominantSpeakerIdentifier
        const newDominantRemoteParticipant = getRemoteParticipantForDominantSpeaker(newDominantSpeakerIdentifier);

        // Create the new dominant speaker's stream renderers
        const streamViews = [];
        for (const stream of newDominantRemoteParticipant.videoStreams) {
            if (stream.isAvailable && !streamRenderersMap.get(stream)) {
                const renderer = new VideoStreamRenderer(stream);
                streamRenderersMap.set(stream, renderer);
                const view = await renderer.createView();
                streamViews.push(view);
            }
        }

        // Remove the old dominant speaker's video streams by disposing of their associated renderers
        if (dominantRemoteParticipant) {
            for (const stream of dominantRemoteParticipant.videoStreams) {
                const renderer = streamRenderersMap.get(stream);
                if (renderer) {
                    streamRenderersMap.delete(stream);
                    renderer.dispose();
                }
            }
        }

        // Set the new dominant remote participant obj
        dominantRemoteParticipant = newDominantRemoteParticipant

        // Render the new dominant remote participant's streams
        for (const view of streamViews) {
            htmlElement.appendChild(view.target);
        }
    }
};

// When call is disconnected, set the dominant speaker to undefined
currentCall.on('stateChanged', () => {
    if (currentCall.state === 'Disconnected') {
        dominantRemoteParticipant = undefined;
    }
});

const dominantSpeakerIdentifier = currentCall.feature(Features.DominantSpeakers).dominantSpeakers.speakersList[0];
dominantRemoteParticipant = getRemoteParticipantForDominantSpeaker(dominantSpeakerIdentifier);
currentCall.feature(Features.DominantSpeakers).on('dominantSpeakersChanged', dominantSpeakersChangedHandler);

const subscribeToRemoteVideoStream = async (stream: RemoteVideoStream, participant: RemoteParticipant) => {
    let renderer: VideoStreamRenderer;

    const displayVideo = async () => {
        renderer = new VideoStreamRenderer(stream);
        streamRenderersMap.set(stream, renderer);
        const view = await renderer.createView();
        htmlElement.appendChild(view.target);
    }

    stream.on('isAvailableChanged', async () => {
        if (dominantRemoteParticipant !== participant) {
            return;
        }

        renderer = streamRenderersMap.get(stream);
        if (stream.isAvailable && !renderer) {
            await displayVideo();
        } else if (!stream.isAvailable && renderer) {
            streamRenderersMap.delete(stream);
            renderer.dispose();
        }
    });

    if (dominantRemoteParticipant !== participant) {
        return;
    }

    renderer = streamRenderersMap.get(stream);
    if (stream.isAvailable && !renderer) {
        await displayVideo();
    }
}
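The three branches of the switch statement above differ only in which ID property they compare. One way to fold them together is to derive a single string key per identifier and compare keys. This is a sketch under assumptions: SpeakerIdentifier and ParticipantLike are stand-in types modeled on the cases above (not SDK imports), and identifierKey and findByIdentifier are hypothetical helpers.

```typescript
// Stand-in identifier shapes (assumptions) modeled on the three cases handled above.
type SpeakerIdentifier =
    | { kind: 'communicationUser'; communicationUserId: string }
    | { kind: 'microsoftTeamsUser'; microsoftTeamsUserId: string }
    | { kind: 'unknown'; id: string };

// Stand-in for a remote participant; only the fields used here are modeled.
interface ParticipantLike {
    displayName: string;
    identifier: SpeakerIdentifier;
}

// Hypothetical helper: build a key from the identifier kind and its raw ID.
function identifierKey(identifier: SpeakerIdentifier): string {
    switch (identifier.kind) {
        case 'communicationUser':
            return `communicationUser:${identifier.communicationUserId}`;
        case 'microsoftTeamsUser':
            return `microsoftTeamsUser:${identifier.microsoftTeamsUserId}`;
        case 'unknown':
            return `unknown:${identifier.id}`;
    }
}

// Hypothetical helper: find the participant whose identifier matches the speaker.
function findByIdentifier<T extends { identifier: SpeakerIdentifier }>(
    participants: ReadonlyArray<T>,
    speaker: SpeakerIdentifier
): T | undefined {
    const key = identifierKey(speaker);
    return participants.find(participant => identifierKey(participant.identifier) === key);
}

// Example data standing in for currentCall.remoteParticipants.
const participants: ParticipantLike[] = [
    { displayName: 'Alice', identifier: { kind: 'communicationUser', communicationUserId: 'acs-alice' } },
    { displayName: 'Bob', identifier: { kind: 'microsoftTeamsUser', microsoftTeamsUserId: 'teams-bob' } },
];

const match = findByIdentifier(participants, { kind: 'microsoftTeamsUser', microsoftTeamsUserId: 'teams-bob' });
console.log(match ? match.displayName : 'no match');
```

Including the kind in the key keeps two identifiers of different kinds from matching just because their raw IDs happen to be equal.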

Install the SDK

Locate your project-level build.gradle file and add mavenCentral() to the list of repositories under buildscript and allprojects:

buildscript {
    repositories {
    ...
        mavenCentral()
    ...
    }
}
allprojects {
    repositories {
    ...
        mavenCentral()
    ...
    }
}

Then, in your module-level build.gradle file, add the following lines to the dependencies section:

dependencies {
    ...
    implementation 'com.azure.android:azure-communication-calling:1.0.0'
    ...
}

Initialize the required objects

To create a CallAgent instance, call the createCallAgent method on a CallClient instance. This call asynchronously returns a CallAgent instance object. The createCallAgent method takes a CommunicationTokenCredential as an argument, which encapsulates an access token. To access DeviceManager, first create a callAgent instance, and then use the CallClient.getDeviceManager method to get DeviceManager.

String userToken = "<user token>";
CallClient callClient = new CallClient();
CommunicationTokenCredential tokenCredential = new CommunicationTokenCredential(userToken);
android.content.Context appContext = this.getApplicationContext(); // From within an Activity for instance
CallAgent callAgent = callClient.createCallAgent(appContext, tokenCredential).get();
DeviceManager deviceManager = callClient.getDeviceManager(appContext).get();

To set a display name for the caller, use this alternative method:

String userToken = "<user token>";
CallClient callClient = new CallClient();
CommunicationTokenCredential tokenCredential = new CommunicationTokenCredential(userToken);
android.content.Context appContext = this.getApplicationContext(); // From within an Activity for instance
CallAgentOptions callAgentOptions = new CallAgentOptions();
callAgentOptions.setDisplayName("Alice Bob");
DeviceManager deviceManager = callClient.getDeviceManager(appContext).get();
CallAgent callAgent = callClient.createCallAgent(appContext, tokenCredential, callAgentOptions).get();

Dominant Speakers is an extended feature of the core Call object that allows the user to monitor the most dominant speakers in the current call. Participants can join and leave the list based on how active they are in the call.

When joined to a group call consisting of multiple participants, the calling SDKs identify which meeting participants are currently speaking. Active speakers identify which participants are being heard in each received audio frame. Dominant speakers identify which participants are currently most active or dominant in the group conversation, though their voice is not necessarily heard in every audio frame. The set of dominant speakers can change as different participants take turns speaking, so you can implement video subscription requests based on dominant speaker logic.

The main idea is that as participants join, leave, or move up or down in this list, the client application can use this information to customize the call experience. For example, the client application can show the most dominant speakers in a different UI to separate them from participants who aren't actively participating in the call.
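The join, leave, and move-up-or-down behavior described above can be detected by comparing two consecutive snapshots of the ranked list. The logic is platform independent; it is sketched here in TypeScript for illustration only, using plain string IDs as a stand-in for participant identifiers. RankingChange and diffRankings are hypothetical names, not SDK APIs.

```typescript
// Hypothetical result shape describing how the ranked list changed.
interface RankingChange {
    joined: string[]; // IDs in the current list but not in the previous one
    left: string[];   // IDs in the previous list but not in the current one
    moved: string[];  // IDs in both lists whose rank changed
}

// Hypothetical helper: compare two ranked lists of participant IDs.
function diffRankings(
    previous: ReadonlyArray<string>,
    current: ReadonlyArray<string>
): RankingChange {
    const previousRank = new Map<string, number>();
    previous.forEach((id, index) => previousRank.set(id, index));
    const currentIds = new Set<string>(current);

    const joined: string[] = [];
    const moved: string[] = [];
    current.forEach((id, index) => {
        const oldIndex = previousRank.get(id);
        if (oldIndex === undefined) {
            joined.push(id);
        } else if (oldIndex !== index) {
            moved.push(id);
        }
    });

    const left = previous.filter(id => !currentIds.has(id));
    return { joined, left, moved };
}

const change = diffRankings(['alice', 'bob', 'carol'], ['bob', 'alice', 'dave']);
console.log(change);
```

An application could use the joined and left lists to add or remove highlight tiles, and the moved list to reorder the tiles that stay.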

Developers can receive updates and obtain information about the most dominant speakers in a call. This information is represented as:

  • An ordered list of the Remote Participants that represents the Dominant Speakers in the call.
  • A timestamp marking the date when this list was last modified.

In order to use the Dominant Speakers call feature for Android, the first step is to obtain the Dominant Speakers feature API object:

DominantSpeakersFeature dominantSpeakersFeature = call.feature(Features.DOMINANT_SPEAKERS);

The Dominant Speakers feature object has the following API structure:

  • OnDominantSpeakersChanged: Event for listening for changes in the dominant speakers list.
  • getDominantSpeakersInfo(): Gets the DominantSpeakersInfo object. This object has:
    • getSpeakers(): A list of participant identifiers representing the dominant speakers list.
    • getLastUpdatedAt(): The date when the dominant speakers list was updated.

To subscribe to changes in the Dominant Speakers list:


// Obtain the extended feature object from the call object.
DominantSpeakersFeature dominantSpeakersFeature = call.feature(Features.DOMINANT_SPEAKERS);
// Subscribe to the OnDominantSpeakersChanged event.
dominantSpeakersFeature.addOnDominantSpeakersChangedListener(this::handleCallOnDominantSpeakersChanged);

private void handleCallOnDominantSpeakersChanged(PropertyChangedEvent args) {
    // When the list changes, get the timestamp of the last change and the current list of Dominant Speakers
    DominantSpeakersInfo dominantSpeakersInfo = dominantSpeakersFeature.getDominantSpeakersInfo();
    Date timestamp = dominantSpeakersInfo.getLastUpdatedAt();
    List<CommunicationIdentifier> dominantSpeakers = dominantSpeakersInfo.getSpeakers();
}

Setting up

Creating the Visual Studio project

For a UWP app, in Visual Studio 2022, create a new Blank App (Universal Windows) project. After entering the project name, pick any Windows SDK later than 10.0.17763.0.

For a WinUI 3 app, create a new project with the Blank App, Packaged (WinUI 3 in Desktop) template to set up a single-page WinUI 3 app. Windows App SDK version 1.3 or later is required.

Install the package and dependencies with NuGet Package Manager

The Calling SDK APIs and libraries are publicly available via a NuGet package. The following steps show how to find, download, and install the Calling SDK NuGet package.

  1. Open NuGet Package Manager (Tools -> NuGet Package Manager -> Manage NuGet Packages for Solution)
  2. Click on Browse and then type Azure.Communication.Calling.WindowsClient in the search box.
  3. Make sure that the Include prerelease check box is selected.
  4. Click on the Azure.Communication.Calling.WindowsClient package, and then select Azure.Communication.Calling.WindowsClient 1.4.0-beta.1 or a newer version.
  5. Select the checkbox corresponding to the CS project on the right-side tab.
  6. Click on the Install button.

Dominant Speakers is an extended feature of the core Call object that allows the user to monitor the most dominant speakers in the current call. Participants can join and leave the list based on how active they are in the call.

When joined to a group call consisting of multiple participants, the calling SDKs identify which meeting participants are currently speaking. Active speakers identify which participants are being heard in each received audio frame. Dominant speakers identify which participants are currently most active or dominant in the group conversation, though their voice is not necessarily heard in every audio frame. The set of dominant speakers can change as different participants take turns speaking, so you can implement video subscription requests based on dominant speaker logic.

The main idea is that as participants join, leave, or move up or down in this list, the client application can use this information to customize the call experience. For example, the client application can show the most dominant speakers in a different UI to separate them from participants who aren't actively participating in the call.
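Showing dominant speakers in a separate UI comes down to splitting the participant roster into two groups. The following platform-independent sketch is written in TypeScript for illustration only; Participant is a stand-in type with plain string IDs, and partitionByDominance is a hypothetical helper rather than an SDK API.

```typescript
// Stand-in participant shape (assumption): only the fields needed for the example.
interface Participant {
    id: string;
    displayName: string;
}

// Hypothetical helper: split the roster into dominant speakers (in ranked order)
// and everyone else (in roster order).
function partitionByDominance(
    roster: ReadonlyArray<Participant>,
    dominantIds: ReadonlyArray<string>
): { dominant: Participant[]; others: Participant[] } {
    const byId = new Map<string, Participant>();
    roster.forEach(participant => byId.set(participant.id, participant));

    const dominant: Participant[] = [];
    dominantIds.forEach(id => {
        const participant = byId.get(id);
        if (participant) {
            dominant.push(participant); // skip IDs that are no longer in the roster
        }
    });

    const dominantSet = new Set<string>(dominantIds);
    const others = roster.filter(participant => !dominantSet.has(participant.id));
    return { dominant, others };
}

const roster: Participant[] = [
    { id: 'a', displayName: 'Alice' },
    { id: 'b', displayName: 'Bob' },
    { id: 'c', displayName: 'Carol' },
];

const groups = partitionByDominance(roster, ['c', 'a', 'gone']);
console.log(groups.dominant.map(p => p.displayName), groups.others.map(p => p.displayName));
```

The dominant group keeps the ranking order so that the UI can place the top speaker first.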

Developers can receive updates and obtain information about the most dominant speakers in a call. This information is represented as:

  • An ordered list of the Remote Participants that represents the Dominant Speakers in the call.
  • A timestamp marking the date when this list was last modified.

In order to use the Dominant Speakers call feature for Windows, the first step is to obtain the Dominant Speakers feature API object:

DominantSpeakersCallFeature dominantSpeakersFeature = call.Features.DominantSpeakers;

The Dominant Speakers feature object has the following API structure:

  • OnDominantSpeakersChanged: Event for listening for changes in the dominant speakers list.
  • DominantSpeakersInfo: Gets the DominantSpeakersInfo object. This object has:
    • Speakers: A list of participant identifiers representing the dominant speakers list.
    • LastUpdatedAt: The date when the dominant speakers list was updated.

To subscribe to changes in the dominant speakers list:

// Obtain the extended feature object from the call object.
DominantSpeakersCallFeature dominantSpeakersFeature = call.Features.DominantSpeakers;
// Subscribe to the OnDominantSpeakersChanged event.
dominantSpeakersFeature.OnDominantSpeakersChanged += DominantSpeakersFeature__OnDominantSpeakersChanged;

private void DominantSpeakersFeature__OnDominantSpeakersChanged(object sender, PropertyChangedEventArgs args) {
  // When the list changes, get the timestamp of the last change and the current list of Dominant Speakers
  DominantSpeakersInfo dominantSpeakersInfo = dominantSpeakersFeature.DominantSpeakersInfo;
  DateTimeOffset date = dominantSpeakersInfo.LastUpdatedAt;
  IReadOnlyList<ICommunicationIdentifier> speakersList = dominantSpeakersInfo.Speakers;
}

Set up your system

Create the Xcode project

In Xcode, create a new iOS project and select the Single View App template. This quickstart uses the SwiftUI framework, so you should set the Language to Swift and User Interface to SwiftUI.

You're not going to create unit tests or UI tests during this quickstart. Feel free to clear the Include Unit Tests and Include UI Tests check boxes.

Install the package and dependencies with CocoaPods

  1. Create a Podfile for your application, like this:

    platform :ios, '13.0'
    use_frameworks!
    target 'AzureCommunicationCallingSample' do
        pod 'AzureCommunicationCalling', '~> 1.0.0'
    end
    
  2. Run pod install.

  3. Open .xcworkspace with Xcode.

Request access to the microphone

To access the device's microphone, you need to update your app's information property list with NSMicrophoneUsageDescription. You set the associated value to a string that will be included in the dialog that the system uses to request access from the user.

Right-click the Info.plist entry of the project tree and select Open As > Source Code. Add the following lines in the top-level <dict> section, and then save the file.

<key>NSMicrophoneUsageDescription</key>
<string>Need microphone access for VOIP calling.</string>

Set up the app framework

Open your project's ContentView.swift file and add an import declaration to the top of the file to import the AzureCommunicationCalling library. In addition, import AVFoundation. You'll need it for audio permission requests in the code.

import AzureCommunicationCalling
import AVFoundation

Initialize CallAgent

To create a CallAgent instance from CallClient, use the callClient.createCallAgent method, which asynchronously returns a CallAgent object after it's initialized.

To create a call client, you have to pass a CommunicationTokenCredential object.

import AzureCommunication

let tokenString = "token_string"
var userCredential: CommunicationTokenCredential?
do {
    let options = CommunicationTokenRefreshOptions(initialToken: tokenString, refreshProactively: true, tokenRefresher: self.fetchTokenSync)
    userCredential = try CommunicationTokenCredential(withOptions: options)
} catch {
    updates("Couldn't create Credential object", false)
    initializationDispatchGroup!.leave()
    return
}

// tokenProvider needs to be implemented by Contoso, which fetches a new token
public func fetchTokenSync(then onCompletion: TokenRefreshOnCompletion) {
    let newToken = self.tokenProvider!.fetchNewToken()
    onCompletion(newToken, nil)
}

Pass the CommunicationTokenCredential object that you created to CallClient, and set the display name.

self.callClient = CallClient()
let callAgentOptions = CallAgentOptions()
callAgentOptions.displayName = "iOS Azure Communication Services User"

self.callClient!.createCallAgent(userCredential: userCredential!,
    options: callAgentOptions) { (callAgent, error) in
        if error == nil {
            print("Create agent succeeded")
            self.callAgent = callAgent
        } else {
            print("Create agent failed")
        }
}

Dominant Speakers is an extended feature of the core Call object that allows the user to monitor the most dominant speakers in the current call. Participants can join and leave the list based on how active they are in the call.

When joined to a group call consisting of multiple participants, the calling SDKs identify which meeting participants are currently speaking. Active speakers identify which participants are being heard in each received audio frame. Dominant speakers identify which participants are currently most active or dominant in the group conversation, though their voice is not necessarily heard in every audio frame. The set of dominant speakers can change as different participants take turns speaking, so you can implement video subscription requests based on dominant speaker logic.
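As one illustration of video subscription logic driven by dominant speakers, an application might keep video only for the top few speakers and work out which subscriptions to add or drop after each update. The following platform-independent sketch is in TypeScript for illustration only. It uses plain string IDs, and SubscriptionPlan, planVideoSubscriptions, and the maxVideos limit are hypothetical names and parameters, not SDK APIs.

```typescript
// Hypothetical result shape: which participants to start and stop rendering.
interface SubscriptionPlan {
    subscribe: string[];
    unsubscribe: string[];
}

// Hypothetical helper: keep video for at most `maxVideos` top-ranked speakers.
function planVideoSubscriptions(
    currentlySubscribed: ReadonlyArray<string>,
    rankedSpeakers: ReadonlyArray<string>,
    maxVideos: number
): SubscriptionPlan {
    const wanted = rankedSpeakers.slice(0, Math.max(0, maxVideos));
    const wantedSet = new Set<string>(wanted);
    const currentSet = new Set<string>(currentlySubscribed);

    return {
        // Wanted speakers that are not rendered yet.
        subscribe: wanted.filter(id => !currentSet.has(id)),
        // Rendered speakers that fell out of the top `maxVideos`.
        unsubscribe: currentlySubscribed.filter(id => !wantedSet.has(id)),
    };
}

const plan = planVideoSubscriptions(['alice', 'bob'], ['bob', 'carol', 'alice'], 2);
console.log(plan);
```

Speakers that stay within the limit appear in neither list, so their video keeps rendering without interruption.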

The main idea is that as participants join, leave, or move up or down in this list, the client application can use this information to customize the call experience. For example, the client application can show the most dominant speakers in a different UI to separate them from participants who aren't actively participating in the call.

Developers can receive updates and obtain information about the most dominant speakers in a call. This information is represented as:

  • An ordered list of the Remote Participants that represents the Dominant Speakers in the call.
  • A timestamp marking the date when this list was last modified.

In order to use the Dominant Speakers call feature for iOS, the first step is to obtain the Dominant Speakers feature API object:

let dominantSpeakersFeature = call.feature(Features.dominantSpeakers)

The Dominant Speakers feature object has the following API structure:

  • didChangeDominantSpeakers: Event for listening for changes in the dominant speakers list.
  • dominantSpeakersInfo: Gets the DominantSpeakersInfo object. This object has:
    • speakers: A list of participant identifiers representing the dominant speakers list.
    • lastUpdatedAt: The date when the dominant speakers list was updated.

To subscribe to changes in the dominant speakers list:

// Obtain the extended feature object from the call object.
let dominantSpeakersFeature = call.feature(Features.dominantSpeakers)
// Set the delegate object to obtain the event callback.
dominantSpeakersFeature.delegate = DominantSpeakersDelegate()

public class DominantSpeakersDelegate : DominantSpeakersCallFeatureDelegate
{
    public func dominantSpeakersCallFeature(_ dominantSpeakersCallFeature: DominantSpeakersCallFeature, didChangeDominantSpeakers args: PropertyChangedEventArgs) {
        // When the list changes, get the timestamp of the last change and the current list of Dominant Speakers
        let dominantSpeakersInfo = dominantSpeakersCallFeature.dominantSpeakersInfo
        let timestamp = dominantSpeakersInfo.lastUpdatedAt
        let dominantSpeakersList = dominantSpeakersInfo.speakers
    }
}

Next steps