
Missing ACS call diagnostics

Anonymous
2025-09-24T11:18:14.85+00:00

Hi there,

We use ACS, and we rely on the Call Diagnostics tab and the ACS call summaries export to check after the fact whether a video call in our app really went through ACS, so that we can report on the feature.

My process:

  • When the first user joins the page where their video call takes place, I save the created call_id in my DB.
  • Then we extract the diagnostics data to our data lake and reconcile on this ID, checking whether there were at least 2 participants for x minutes.

Problems

  1. The main reason for this ticket: sometimes the call_id that I saved in my DB returns no result in the Call Diagnostics tab of the ACS Azure resource, even though the selected time range is correct.
  2. Sometimes the 2 call participants join the call, which triggers a save of the call_id, but then leave and rejoin, which creates a separate call with a separate call_id.
    I then have the wrong call_id, and my success report is affected.

Questions:

  1. Can you help with problem 1 as a priority?
  2. Can you see a better way to do this?

Thanks
Lucas

Azure Communication Services

2 answers

  1. Anonymous
    2025-09-29T09:14:28.18+00:00

    Hi there!

    Missing call-ids:

    • We have only one ACS resource here, for our prod environment.
    • So I could get a call ID from the server and yet it might not be registered in the logs? How could we have non-ACS endpoints? We strictly use the SDK in our app.
    • We already send the logs to a Log Analytics workspace and look in there. There is no trace of the correlationId provided by the SDK.
    • We also save participantIds, even if we don't really care about them here; we are just trying to check whether two people joined this specific call.
    • I am aware of the delay before data reaches the diagnostics/summaries. I am talking about calls from 2 to 28 days ago.
    • I don't think my call_id stays the same even though I try to use groupId. If both users disconnect and reconnect, I am pretty sure it creates a new call_id (aka correlation_id?). The call only lives on when one user disconnects and the other stays on?

    New questions you can focus on:

    1. I am actually trying to use groupIds when joining a call. We even changed our internal ID, called 'roomId' below, to a UUID to make it compliant.

    ====> There is absolutely no trace of a groupId in either the CallSummary or the CallDiagnostics logs. How can I retrieve it to correlate? That would probably solve all my problems.

    Here's our code

    this.call = this._callAgent.join({ groupId: this.roomId }, { videoOptions, audioOptions })
    
    2. Is there a minimum and maximum version of the SDK you recommend? We are on communication-calling 1.36.1 and had difficulties upgrading further in August.



  2. Praneeth Maddali 10,460 Reputation points Microsoft External Staff Moderator
    2025-09-24T12:54:06.0466667+00:00

    Hi @Lucas Lehembre
    Thank you for your thorough explanation. It's evident you're establishing a strong reporting pipeline for ACS video calls, though these reconciliation gaps are proving to be challenging. I'll start by addressing your main concern (missing call_id entries in the Call Diagnostics tab), discuss the rejoin issue, and then propose a more effective strategy for consistent tracking. I'll support these suggestions with relevant Microsoft documentation and actionable steps.

    Possible Reasons Why call_id May Not Show Up in Call Diagnostics (Even Within the Correct Time Range):

    The call_id, also known as correlationId in ACS logs, is a unique identifier assigned to each call. It connects all participant events and endpoints within that session and is created at the start of the call, staying the same throughout. However, logs are not saved by default, and there are several reasons why a call_id might be missing:

    • Diagnostic Settings Not Enabled or Incomplete: ACS will not store call data unless you set up Azure Monitor diagnostic settings to send logs to a Log Analytics workspace. Without this configuration, or if some log categories are missing, Call Diagnostics will not receive any data.
    • Log Processing Delays: After a call ends, it may take several hours for the data to appear in the Azure portal and Call Diagnostics tab. If you check immediately, the logs may not be available yet.
    • Privacy Constraints Across Resources: When participants connect from different ACS resources, such as separate Azure subscriptions, privacy policies may restrict access—so you might only see partial or no data for calls spanning multiple resources.
    • Other Gaps: Logs are not generated retroactively; they only start once diagnostic settings are enabled. Very short or aborted calls might not have complete entries, and calls involving non-ACS endpoints (like PSTN) may have incomplete diagnostics.

    Quick Troubleshooting Steps:

    1. Verify or Enable Diagnostic Settings:
      • Navigate to your ACS resource in the Azure portal, then select Diagnostic settings under Monitoring and choose Add diagnostic setting.
      • Select All logs (such as CallDiagnostics, CallSummary, etc.) and send them to a Log Analytics workspace.
      • Repeat this for each ACS resource ID. After enabling, test with a new call.
      • For more information, see: Enable logging in diagnostic settings.
    2. Wait and Refresh: Once you make a test call, wait 4-6 hours, then refresh the Call Diagnostics tab (under Monitoring in your ACS resource) and search by call_id.
    3. Check for Multi-Resource Calls: If your application uses multiple ACS resources, enable diagnostics on all of them and review the shared Log Analytics workspace. Use KQL queries like ACSCallSummary | where CorrelationId == "your-call-id" | summarize by _ResourceId to identify any cross-resource issues.
    4. View Hidden Columns for Additional Details: In the Call Search or Overview tab, select Edit columns and enable DiagnosticOptions. This will display any custom tags set in your SDK, which can help with filtering or identifying calls.
    5. Query Logs Directly: You can also run a KQL query directly in Log Analytics instead of using the portal tab.
         ACSCallDiagnostics | where CorrelationId == "your-call-id" | take 10

      If the query returns no rows, recheck the diagnostic settings. Note that Log Analytics column names are case-sensitive and PascalCase (CorrelationId), unlike the camelCase property names in the schema documentation. For more information about the schema, including the callId structure, see the references below. If these steps do not resolve the issue, please obtain the MS-CV ID from your client SDK using diagnosticInfo in CallClientOptions and provide it to Azure support for further investigation.
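To check many saved IDs at once rather than one by one, the query text can be generated from the IDs stored in the application database. The following is only a sketch under stated assumptions: the helper name `buildMissingCallIdsQuery` is hypothetical, the saved IDs are assumed to be GUID-shaped, and the column names (`CorrelationId`, `TimeGenerated`) are assumed to match what your workspace shows. It builds a left anti join that returns the saved IDs with no row in ACSCallSummary.

```javascript
// Builds a KQL query that lists which saved call IDs have no ACSCallSummary rows.
// Assumptions: IDs are GUID-shaped; column names match your Log Analytics workspace.
const GUID = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function buildMissingCallIdsQuery(savedCallIds, lookbackDays = 30) {
  if (!Number.isInteger(lookbackDays) || lookbackDays <= 0) {
    throw new Error("lookbackDays must be a positive integer");
  }
  const ids = [...new Set(savedCallIds)]; // drop duplicates, keep first-seen order
  if (ids.length === 0) throw new Error("no call IDs supplied");
  for (const id of ids) {
    // Reject anything that is not a GUID so no value can escape the string literal.
    if (!GUID.test(id)) throw new Error(`not a GUID: ${id}`);
  }
  const literals = ids.map((id) => `"${id}"`).join(", ");
  return [
    `let saved = datatable(CorrelationId: string) [${literals}];`,
    "saved",
    "| join kind=leftanti (",
    "    ACSCallSummary",
    `    | where TimeGenerated > ago(${lookbackDays}d)`,
    "    | distinct CorrelationId",
    "  ) on CorrelationId",
  ].join("\n");
}
```

Pasting the resulting text into Log Analytics should return one row per saved ID that never reached the logs, which gives a concrete list to hand to support.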

    Managing rejoins that result in separate call_ids:

    This appears to be related to how your client-side call flow is implemented. In the ACS Calling SDK, the call.id (your call_id) remains constant throughout the call instance; it doesn't change if participants leave and rejoin. When someone rejoins, the existing call context should be reused (such as the groupId for group calls or the roomId for Rooms calls). Each join generates a new participantId, but the same correlationId and endpointId (for the user or device) are maintained.

    This issue may occur if your app logic handles a rejoin as starting a new call, such as using callAgent.startCall() instead of callAgent.join() with the original locator. Doing so generates a new call_id, which prevents proper reconciliation.

    Solution Steps:

    1. Modify Rejoin Logic: When a user rejoins, ensure the original call locator is passed to callAgent.join().
         // First join (save the groupId/roomId locator, not just call.id)
         const groupCallLocator = { groupId: roomId }; // roomId must be a UUID
         const call = callAgent.join(groupCallLocator, { videoOptions, audioOptions });
         // On rejoin (e.g., after a disconnect), reuse the same locator
         const rejoinCall = callAgent.join(groupCallLocator, { videoOptions, audioOptions });
         // Compare rejoinCall.id with the original call.id to confirm both joins landed in the same call

      For added robustness, store the locator (group or room ID) in your database along with the call_id.
    2. Manage Disconnect/Reconnect Events: Subscribe to the call's stateChanged and remoteParticipantsUpdated events (for example call.on('stateChanged', ...)) and watch for the Disconnected state to monitor when participants leave or rejoin, without needing to restart the call. Use call.addParticipant() for participants who join late, rather than initiating a new call.
    3. Check Logs: After a participant rejoins, review the call summary logs. Rejoins will have the same correlationId but a different participantId. Example query: ACSCallSummary | where EndpointId == "user-endpoint" | order by CallStartTime.
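Since a rejoin can still end up with a different call_id in practice, one option is to record every call ID observed for a room instead of overwriting a single saved value. The sketch below assumes the call object exposes `id`, `state`, and `on(eventName, handler)` with `idChanged` and `stateChanged` events, as the Web Calling SDK's Call object is understood to do. `RoomCallRegistry` and `trackCallIds` are hypothetical names, and the in-memory Map stands in for your database.

```javascript
// Records every call ID seen for a room, instead of overwriting a single value.
class RoomCallRegistry {
  constructor() {
    this.byRoom = new Map(); // roomId -> Set of call IDs (stand-in for a DB table)
  }
  record(roomId, callId) {
    if (!callId) return; // defensively skip empty IDs
    if (!this.byRoom.has(roomId)) this.byRoom.set(roomId, new Set());
    this.byRoom.get(roomId).add(callId);
  }
  callIdsFor(roomId) {
    return [...(this.byRoom.get(roomId) || [])];
  }
}

// `call` is assumed to look like the SDK's Call object: id, state, on(event, handler).
function trackCallIds(call, roomId, registry) {
  registry.record(roomId, call.id); // ID available right after join
  call.on("idChanged", () => registry.record(roomId, call.id)); // ID replaced later
  call.on("stateChanged", () => {
    if (call.state === "Connected") registry.record(roomId, call.id);
  });
}
```

In the app this would be called right after the join, for example `trackCallIds(callAgent.join({ groupId: roomId }, { videoOptions, audioOptions }), roomId, registry)`. Reconciliation can then look up all IDs for the room rather than a single one.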

    A Better Way to Track and Report Call Success

    Depending only on a single call_id can be unreliable, especially when there are delays or users rejoining. It's better to use a multi-key approach that incorporates user identities, timestamps, and aggregated logs. This method naturally accommodates rejoins through endpointId and lessens the reliance on exact call_id matches.

    Recommended Process:

    1. When First Joining: Save not only the call_id, but also:
      • User identities (CommunicationUserIdentifier for both participants).
      • Approximate start time.
      • Call context (such as group or room ID).
    2. Data Extraction & Reconciliation in Data Lake:
      • Retrieve all ACS logs by enabling diagnostic settings for all 8 categories.
      • Query logs using user identities and a time range (for example, within 30 minutes of the saved timestamp).
             ACSCallSummary
             | where Identifier in ("user1-id", "user2-id")
             | where CallStartTime between (datetime(2025-09-24T10:00:00Z) .. datetime(2025-09-24T11:00:00Z))
             | summarize TotalDuration = sum(ParticipantDuration), NumParticipants = dcount(Identifier) by CorrelationId
             | where NumParticipants >= 2 and TotalDuration >= (x * 60)  // replace x with your threshold in minutes; durations are in seconds
      • This combines data across rejoins (where multiple participantIds are linked to a single endpointId) and marks success if any associated call meets your threshold.
      • You can use endpointId to monitor user sessions across different calls if necessary, as it is reused in native SDKs.
    3. Benefits & Tips:
      • Supports rejoins: Duration is calculated by summing across participantIds within the same correlationId.
      • Improved resilience to delays: Use wider time windows when querying.
      • Cost-efficient: Track Log Analytics ingestion, start with minimal retention and optimize as needed.
      • For near real-time reporting, use Event Grid to capture call events such as CallStarted and ParticipantsUpdated.
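Summing durations does not by itself show that both users were present at the same time. As one possible refinement on the data lake side, the overlap can be computed from the exported summary rows. This is a minimal sketch, not a definitive implementation: the row fields (`correlationId`, `identifier`, `participantStartTime` as ISO 8601, `participantDuration` in seconds) are assumed names to map onto your export, and `overlapSeconds` and `isSuccessfulVisit` are hypothetical helpers.

```javascript
// Seconds during which at least `minUsers` distinct users were present at once.
// Row fields are assumptions: identifier, participantStartTime (ISO 8601),
// participantDuration (seconds).
function overlapSeconds(rows, minUsers = 2) {
  const events = [];
  for (const r of rows) {
    const start = Date.parse(r.participantStartTime) / 1000;
    events.push({ t: start, user: r.identifier, delta: +1 });
    events.push({ t: start + r.participantDuration, user: r.identifier, delta: -1 });
  }
  // Sort by time; at equal times process leaves before joins.
  events.sort((a, b) => a.t - b.t || a.delta - b.delta);
  const open = new Map(); // user -> number of open sessions
  let distinct = 0;
  let total = 0;
  let prev = null;
  for (const e of events) {
    if (prev !== null && distinct >= minUsers) total += e.t - prev;
    const before = open.get(e.user) || 0;
    const after = before + e.delta;
    open.set(e.user, after);
    if (before === 0 && after > 0) distinct++;
    if (before > 0 && after === 0) distinct--;
    prev = e.t;
  }
  return total;
}

// True if the users overlapped for at least `minMinutes` across the given call IDs,
// so rejoins that produced separate call IDs are counted together.
function isSuccessfulVisit(rows, callIds, minMinutes) {
  const wanted = new Set(callIds);
  const relevant = rows.filter((r) => wanted.has(r.correlationId));
  return overlapSeconds(relevant) >= minMinutes * 60;
}
```

Passing every call ID saved for a room lets a leave-and-rejoin count toward the same visit, while the overlap check keeps a single user's long solo session from being reported as a successful call.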

    reference:
    https://learn.microsoft.com/en-us/azure/communication-services/concepts/analytics/logs/call-diagnostics-log-schema

    https://learn.microsoft.com/en-us/azure/communication-services/concepts/troubleshooting-info?tabs=csharp%2Cjavascript%2Cdotnet

    https://learn.microsoft.com/en-us/azure/communication-services/how-tos/calling-sdk/manage-calls?pivots=platform-web

    https://learn.microsoft.com/en-us/azure/communication-services/concepts/analytics/logs/call-summary-log-schema

    https://learn.microsoft.com/en-us/azure/communication-services/concepts/analytics/logs/voice-and-video-logs

    https://learn.microsoft.com/en-us/azure/communication-services/concepts/analytics/logs/voice-and-video-logs#best-practices

    Kindly let us know if the above helps or you need further assistance on this issue.

     
