Hi @Lucas Lehembre
Thank you for the thorough explanation. It's clear you're building a solid reporting pipeline for ACS video calls, and that these reconciliation gaps are getting in the way. I'll start with your main concern (missing call_id entries in the Call Diagnostics tab), then cover the rejoin issue, and finally propose a more reliable tracking strategy, with links to the relevant Microsoft documentation and actionable steps.
Possible Reasons Why call_id May Not Show Up in Call Diagnostics (Even Within the Correct Time Range):
The call_id, also known as correlationId in ACS logs, is a unique identifier assigned to each call. It connects all participant events and endpoints within that session and is created at the start of the call, staying the same throughout. However, logs are not saved by default, and there are several reasons why a call_id might be missing:
- Diagnostic Settings Not Enabled or Incomplete: ACS will not store call data unless you set up Azure Monitor diagnostic settings to send logs to a Log Analytics workspace. Without this configuration, or if some log categories are missing, Call Diagnostics will not receive any data.
- Log Processing Delays: After a call ends, it may take several hours for the data to appear in the Azure portal and Call Diagnostics tab. If you check immediately, the logs may not be available yet.
- Privacy Constraints Across Resources: When participants connect from different ACS resources, such as separate Azure subscriptions, privacy policies may restrict access—so you might only see partial or no data for calls spanning multiple resources.
- Other Gaps: Logs are not generated retroactively; they only start once diagnostic settings are enabled. Very short or aborted calls might not have complete entries, and calls involving non-ACS endpoints (like PSTN) may have incomplete diagnostics.
Quick Troubleshooting Steps:
- Verify or Enable Diagnostic Settings:
  - Navigate to your ACS resource in the Azure portal, then select Diagnostic settings under Monitoring and choose Add diagnostic setting.
  - Select All logs (such as CallDiagnostics, CallSummary, etc.) and send them to a Log Analytics workspace.
  - Repeat this for each ACS resource ID. After enabling, test with a new call.
  - For more information, see: Enable logging in diagnostic settings.
- Wait and Refresh: Once you make a test call, wait 4-6 hours, then refresh the Call Diagnostics tab (under Monitoring in your ACS resource) and search by call_id.
- Check for Multi-Resource Calls: If your application uses multiple ACS resources, enable diagnostics on all of them and review the shared Log Analytics workspace. Use a KQL query like ACSCallSummary | where CorrelationId == "your-call-id" | summarize by _ResourceId to identify any cross-resource issues. (Column names in Log Analytics are case-sensitive.)
- View Hidden Columns for Additional Details: In the Call Search or Overview tab, select Edit columns and enable DiagnosticOptions. This will display any custom tags set in your SDK, which can help with filtering or identifying calls.
- Query Logs Directly: You can also run a KQL query directly in Log Analytics instead of using the portal tab.
ACSCallDiagnostics | where CorrelationId == "your-call-id" | take 10
If the query returns no rows, recheck the diagnostic settings. For more information about the schema, including the callId structure, refer to the call diagnostics log schema linked in the references below. If these steps do not resolve the issue, please obtain the MS-CV ID from your client SDK using diagnosticInfo in CallClientOptions and provide it to Azure support for further investigation.
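If you run this lookup regularly from your own tooling, the query text can be generated from the call_id you stored. The following is only a minimal sketch: buildCallLookupQuery is a hypothetical helper (not part of any Azure SDK), it only produces the KQL string, and the default column name is an assumption that you should check against your workspace schema.

```javascript
// Hypothetical helper: builds the KQL text for looking up one call in a
// Log Analytics table. It only returns a string; running the query is up to you.
function buildCallLookupQuery(callId, options = {}) {
  const table = options.table ?? "ACSCallDiagnostics";
  // Assumption: adjust the column name to match your workspace schema.
  const column = options.column ?? "CorrelationId";
  const limit = options.limit ?? 10;

  if (typeof callId !== "string" || callId.trim() === "") {
    throw new Error("callId must be a non-empty string");
  }

  // Escape backslashes and double quotes so the ID cannot break the string literal.
  const escaped = callId.trim().replace(/\\/g, "\\\\").replace(/"/g, '\\"');
  return `${table} | where ${column} == "${escaped}" | take ${limit}`;
}

const lookupQuery = buildCallLookupQuery("your-call-id");
console.log(lookupQuery);
```

Passing { table: "ACSCallSummary" } as the second argument gives the equivalent lookup against the summary table.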
How to Manage Rejoins That Result in Separate call_ids:
This appears to be related to how your client-side call flow is implemented. In the ACS Calling SDK, the call.id (your call_id) remains constant throughout the call instance; it doesn't change when participants leave and rejoin. When someone rejoins, the existing call context should be reused (such as the groupId for group calls or the roomId for rooms). Each join generates a new participantId, but the same correlationId and endpointId (for the user or device) are maintained.
This issue may occur if your app logic handles a rejoin as starting a new call, such as calling callAgent.startCall() instead of callAgent.join() with the original context. Doing so generates a new call_id, which prevents proper reconciliation.
Solution Steps:
- Modify Rejoin Logic: When a user rejoins, ensure the original call context (the same locator) is passed to callAgent.join().
// First join: create the group ID (a GUID) once and save it, not just call.id
const groupCallLocator = { groupId: "<your-group-guid>" };
const call = callAgent.join(groupCallLocator);
// On rejoin (e.g., after a disconnect), reuse the saved locator
const rejoinCall = callAgent.join(groupCallLocator);
// While the group call is still active, rejoinCall.id should match the original call.id
For added robustness, store the locator (group or room ID) in your database along with the call_id.
- Manage Disconnect/Reconnect Events: Subscribe to the call's stateChanged event (call.on('stateChanged', ...)) and check for the Disconnected state, and use the remoteParticipantsUpdated event to see when participants leave or rejoin, without needing to restart the call. Use call.addParticipant() for participants who join late, rather than initiating a new call.
- Check Logs: After a participant rejoins, review the call summary logs. Rejoins will have the same correlationId but a different participantId. Example query: ACSCallSummary | where EndpointId == "user-endpoint" | order by CallStartTime.
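The rejoin steps above can be sketched as a small wrapper that always joins through a stored locator. This is an illustration under assumptions, not the SDK's prescribed pattern: joinSession, the in-memory callContexts map, and the session key are hypothetical names, and the stub agent only stands in for the real CallAgent so the sketch can run on its own.

```javascript
// Hypothetical in-memory store; in production this would be your database.
const callContexts = new Map();

// Joins the call for a session, reusing the stored locator on a rejoin.
// `callAgent` only needs a join(locator) method, like the ACS CallAgent.
function joinSession(callAgent, sessionKey, createLocator) {
  let context = callContexts.get(sessionKey);
  if (!context) {
    // First join: create the locator once and remember it.
    context = {
      locator: createLocator(),
      callIds: [],
      firstJoinedAt: new Date().toISOString(),
    };
    callContexts.set(sessionKey, context);
  }
  const call = callAgent.join(context.locator);
  // Record every call ID seen for this session so reports can reconcile them.
  if (!context.callIds.includes(call.id)) {
    context.callIds.push(call.id);
  }
  return call;
}

// Stub standing in for the real CallAgent, only to make the sketch runnable.
const stubAgent = {
  join(locator) {
    return { id: `call-for-${locator.groupId}` };
  },
};

const makeLocator = () => ({ groupId: "00000000-0000-0000-0000-000000000001" });
const first = joinSession(stubAgent, "appointment-42", makeLocator);
const rejoin = joinSession(stubAgent, "appointment-42", makeLocator);
console.log(first.id === rejoin.id);
```

Keeping a list of call IDs per session, rather than a single value, means that even if a rejoin does produce a new call_id, your reporting job still knows which IDs belong together.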
A Better Way to Track and Report Call Success
Depending only on a single call_id can be unreliable, especially when there are delays or users rejoining. It's better to use a multi-key approach that incorporates user identities, timestamps, and aggregated logs. This method naturally accommodates rejoins through endpointId and lessens the reliance on exact call_id matches.
Recommended Process:
- When First Joining: Save not only the call_id, but also:
  - User identities (CommunicationUserIdentifier for both participants).
  - Approximate start time.
  - Call context (such as group or room ID).
- Data Extraction & Reconciliation in Data Lake:
  - Retrieve all ACS logs by enabling diagnostic settings for all 8 categories.
  - Query logs using user identities and a time range (for example, within 30 minutes of the saved timestamp).
ACSCallSummary
| where Identifier in ("user1-id", "user2-id") // user identities, not the per-join ParticipantId
| where CallStartTime between (datetime(2025-09-24T10:00:00Z) .. datetime(2025-09-24T11:00:00Z))
| summarize TotalDuration = sum(ParticipantDuration), NumParticipants = dcount(Identifier), ParticipantIds = make_set(ParticipantId) by CorrelationId
| where NumParticipants >= 2 and TotalDuration >= (x * 60) // replace x with your threshold in minutes; durations are in seconds
  - This combines data across rejoins (where multiple participantIds are linked to a single endpointId) and marks success if any associated call meets your threshold.
  - You can use endpointId to monitor user sessions across different calls if necessary, as it is reused in native SDKs.
- Benefits & Tips:
  - Supports rejoins: Duration is calculated by summing across participantIds within the same correlationId.
  - Improved resilience to delays: Use wider time windows when querying.
  - Cost-efficient: Track Log Analytics ingestion, start with minimal retention and optimize as needed.
  - For near real-time reporting, use Event Grid to capture call events such as CallStarted and ParticipantsUpdated.
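If the reconciliation runs in your own data lake job rather than in KQL, the same multi-key logic can be sketched in a few lines. This is only an illustration under assumptions: reconcileCalls is a hypothetical function, the row field names (correlationId, identifier, participantDuration) are assumed to mirror the exported ACSCallSummary records, and the success rule (all expected users present, summed participant duration above the threshold) is one possible interpretation of your requirement.

```javascript
// Groups exported call summary rows by call and decides whether each call
// counts as successful. Field names on `rows` are assumptions.
function reconcileCalls(rows, expectedUsers, minMinutes) {
  const byCall = new Map();
  for (const row of rows) {
    // Ignore participants that are not part of this appointment.
    if (!expectedUsers.includes(row.identifier)) continue;
    const entry = byCall.get(row.correlationId) ?? { users: new Set(), seconds: 0 };
    entry.users.add(row.identifier);
    // Each rejoin is a separate row, so summing covers all of a user's joins.
    entry.seconds += row.participantDuration;
    byCall.set(row.correlationId, entry);
  }
  return [...byCall].map(([correlationId, entry]) => ({
    correlationId,
    users: entry.users.size,
    totalSeconds: entry.seconds,
    success:
      entry.users.size >= expectedUsers.length &&
      entry.seconds >= minMinutes * 60,
  }));
}

// Made-up sample rows for illustration only.
const sampleRows = [
  { correlationId: "call-A", identifier: "user1", participantId: "p1", participantDuration: 300 },
  { correlationId: "call-A", identifier: "user2", participantId: "p2", participantDuration: 200 },
  { correlationId: "call-A", identifier: "user2", participantId: "p3", participantDuration: 150 }, // rejoin
  { correlationId: "call-B", identifier: "user1", participantId: "p4", participantDuration: 60 },
  { correlationId: "call-A", identifier: "user3", participantId: "p5", participantDuration: 999 }, // not expected
];

const report = reconcileCalls(sampleRows, ["user1", "user2"], 10);
console.log(report);
```

Counting distinct user identities instead of participantIds keeps a single user who rejoins from being mistaken for two participants.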
References:
https://learn.microsoft.com/en-us/azure/communication-services/concepts/analytics/logs/call-diagnostics-log-schema
https://learn.microsoft.com/en-us/azure/communication-services/concepts/troubleshooting-info?tabs=csharp%2Cjavascript%2Cdotnet
https://learn.microsoft.com/en-us/azure/communication-services/how-tos/calling-sdk/manage-calls?pivots=platform-web
https://learn.microsoft.com/en-us/azure/communication-services/concepts/analytics/logs/call-summary-log-schema
https://learn.microsoft.com/en-us/azure/communication-services/concepts/analytics/logs/voice-and-video-logs
https://learn.microsoft.com/en-us/azure/communication-services/concepts/analytics/logs/voice-and-video-logs#best-practices
Kindly let us know if the above helps or you need further assistance on this issue.
Please "upvote" if the information helped you. This will help us and others in the community as well.