Troubleshoot issues with advanced diagnostics queries with Azure Cosmos DB for NoSQL

APPLIES TO: NoSQL

In this article, we'll cover how to write more advanced queries to help troubleshoot issues with your Azure Cosmos DB account by using diagnostics logs sent to Azure Diagnostics (legacy) and resource-specific (preview) tables.

For Azure Diagnostics tables, all data is written into one single table. Users specify which category they want to query. If you want to view the full-text query of your request, see Monitor Azure Cosmos DB data by using diagnostic settings in Azure to learn how to enable this feature.

For resource-specific tables, data is written into individual tables for each category of the resource. We recommend this mode because it:

  • Makes it much easier to work with the data.
  • Provides better discoverability of the schemas.
  • Improves performance across both ingestion latency and query times.

Common queries

Common queries are shown in the resource-specific and Azure Diagnostics tables.

Top N(10) queries ordered by Request Unit (RU) consumption in a specific time frame

let topRequestsByRUcharge = CDBDataPlaneRequests 
| where TimeGenerated > ago(24h)
| project  RequestCharge , TimeGenerated, ActivityId;
CDBQueryRuntimeStatistics
| project QueryText, ActivityId, DatabaseName , CollectionName
| join kind=inner topRequestsByRUcharge on ActivityId
| project DatabaseName , CollectionName , QueryText , RequestCharge, TimeGenerated
| order by RequestCharge desc
| take 10

Requests throttled (statusCode = 429) in a specific time window

let throttledRequests = CDBDataPlaneRequests
| where StatusCode == "429"
| project  OperationName , TimeGenerated, ActivityId;
CDBQueryRuntimeStatistics
| project QueryText, ActivityId, DatabaseName , CollectionName
| join kind=inner throttledRequests on ActivityId
| project DatabaseName , CollectionName , QueryText , OperationName, TimeGenerated

Queries with the largest response lengths (payload size of the server response)

let operationsbyUserAgent = CDBDataPlaneRequests
| project OperationName, DurationMs, RequestCharge, ResponseLength, ActivityId;
CDBQueryRuntimeStatistics
//specify collection and database
//| where DatabaseName == "DBNAME" and CollectionName == "COLLECTIONNAME"
| join kind=inner operationsbyUserAgent on ActivityId
| summarize max(ResponseLength) by QueryText
| order by max_ResponseLength desc

RU consumption by physical partition (across all replicas in the replica set)

CDBPartitionKeyRUConsumption
| where TimeGenerated >= now(-1d)
//specify collection and database
//| where DatabaseName == "DBNAME" and CollectionName == "COLLECTIONNAME"
// filter by operation type
//| where operationType_s == 'Create'
| summarize sum(todouble(RequestCharge)) by toint(PartitionKeyRangeId)
| render columnchart

RU consumption by logical partition (across all replicas in the replica set)

CDBPartitionKeyRUConsumption
| where TimeGenerated >= now(-1d)
//specify collection and database
//| where DatabaseName == "DBNAME" and CollectionName == "COLLECTIONNAME"
// filter by operation type
//| where operationType_s == 'Create'
| summarize sum(todouble(RequestCharge)) by PartitionKey, PartitionKeyRangeId
| render columnchart  

Next steps