Troubleshoot issues with advanced diagnostics queries with Azure Cosmos DB for Apache Gremlin

APPLIES TO: NoSQL MongoDB Cassandra Gremlin

In this article, we'll cover how to write more advanced queries to help troubleshoot issues with your Azure Cosmos DB account by using diagnostics logs sent to Azure Diagnostics (legacy) and resource-specific (preview) tables.

For Azure Diagnostics tables, all data is written into a single table, and you specify which category to query. To view the full-text query of a request, you must first enable that feature; see Monitor Azure Cosmos DB data by using diagnostic settings in Azure to learn how.
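As a minimal sketch of what specifying a category looks like, the query below filters the shared AzureDiagnostics table before doing anything else. The category name "GremlinRequests" and the one-hour window are assumptions for illustration; use the categories you enabled in your diagnostic settings.

```kusto
// Sketch only: every category lands in the one AzureDiagnostics table,
// so filter on Category first. "GremlinRequests" is an assumed category name.
AzureDiagnostics
| where Category == "GremlinRequests"
| where TimeGenerated > ago(1h)   // assumed window; adjust as needed
| take 10
```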

For resource-specific tables, data is written into individual tables for each category of the resource. We recommend this mode because it:

  • Makes it much easier to work with the data.
  • Provides better discoverability of the schemas.
  • Improves performance across both ingestion latency and query times.
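To illustrate the points above, the following sketch queries a resource-specific table directly, with no category filter. It assumes that Gremlin request logs are routed to the CDBGremlinRequests table used later in this article; the one-hour window is arbitrary. Run each query separately.

```kusto
// List the columns and types of the table (schema discoverability).
CDBGremlinRequests
| getschema

// Sample recent Gremlin requests from the dedicated table.
CDBGremlinRequests
| where TimeGenerated > ago(1h)   // assumed window; adjust as needed
| project TimeGenerated, DatabaseName, CollectionName, PIICommandText, ActivityId
| take 10
```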

Common queries

The following common queries target the resource-specific tables.

Top N(10) Request Unit (RU) consuming requests or queries in a specific time frame

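// Request charge per request from the data plane table; adjust the time window as needed
let topRequestsByRUcharge = CDBDataPlaneRequests
| where TimeGenerated > ago(24h)
| project RequestCharge, TimeGenerated, ActivityId;
CDBGremlinRequests
| project PIICommandText, ActivityId, DatabaseName, CollectionName
// match each Gremlin request to its charge by activity ID
| join kind=inner topRequestsByRUcharge on ActivityId
| project DatabaseName, CollectionName, PIICommandText, RequestCharge, TimeGenerated
| order by RequestCharge desc
| take 10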

Requests throttled (statusCode = 429) in a specific time window

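// Throttled (429) requests from the data plane table; adjust the time window as needed
let throttledRequests = CDBDataPlaneRequests
| where StatusCode == 429 and TimeGenerated > ago(24h)
| project OperationName, TimeGenerated, ActivityId;
CDBGremlinRequests
| project PIICommandText, ActivityId, DatabaseName, CollectionName
| join kind=inner throttledRequests on ActivityId
| project DatabaseName, CollectionName, PIICommandText, OperationName, TimeGenerated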
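Before drilling into individual throttled queries, it can help to see when throttling occurs. The sketch below counts 429 responses in five-minute buckets. It assumes that status codes are recorded in the CDBDataPlaneRequests table with an integer StatusCode column; the 24-hour window and bucket size are arbitrary.

```kusto
// Sketch: throttled requests over time, per database and collection.
CDBDataPlaneRequests
| where TimeGenerated > ago(24h)   // assumed window; adjust as needed
| where StatusCode == 429
| summarize ThrottledRequests = count() by bin(TimeGenerated, 5m), DatabaseName, CollectionName
| render timechart
```

Spikes in the chart point to the time ranges worth inspecting with the per-request query above.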

Queries with large response lengths (payload size of the server response)

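// Response length per request from the data plane table
let operationsbyUserAgent = CDBDataPlaneRequests
| project ResponseLength, ActivityId;
CDBGremlinRequests
//specify collection and database
//| where DatabaseName == "DB NAME" and CollectionName == "COLLECTIONNAME"
| join kind=inner operationsbyUserAgent on ActivityId
| summarize max(ResponseLength) by PIICommandText
| order by max_ResponseLength desc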

RU consumption by physical partition (across all replicas in the replica set)

CDBPartitionKeyRUConsumption
| where TimeGenerated >= now(-1d)
//specify collection and database
//| where DatabaseName == "DB NAME" and CollectionName == "COLLECTIONNAME"
// filter by operation type
//| where OperationName == 'Create'
| summarize sum(todouble(RequestCharge)) by toint(PartitionKeyRangeId)
| render columnchart
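A column chart of totals hides when a partition was busy. As a hedged variation on the query above, this sketch splits the same RU sums into hourly buckets so that a sustained hot physical partition can be told apart from a short burst. The one-hour bin size is an assumption; pick one that suits your workload.

```kusto
// Sketch: hourly RU consumption per physical partition over the last day.
CDBPartitionKeyRUConsumption
| where TimeGenerated >= ago(1d)
| summarize TotalRU = sum(todouble(RequestCharge)) by bin(TimeGenerated, 1h), PartitionKeyRangeId
| render timechart
```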

RU consumption by logical partition (across all replicas in the replica set)

CDBPartitionKeyRUConsumption
| where TimeGenerated >= now(-1d)
//specify collection and database
//| where DatabaseName == "DB NAME" and CollectionName == "COLLECTIONNAME"
// filter by operation type
//| where OperationName == 'Create'
| summarize sum(todouble(RequestCharge)) by PartitionKey, PartitionKeyRangeId
| render columnchart
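When a container has many logical partitions, the chart above can become hard to read. The following sketch, which uses the same table and columns, returns only the ten logical partition keys that consumed the most RUs. The cutoff of ten and the one-day window are arbitrary choices.

```kusto
// Sketch: ten logical partitions with the highest total RU consumption.
CDBPartitionKeyRUConsumption
| where TimeGenerated >= ago(1d)
| summarize TotalRU = sum(todouble(RequestCharge)) by PartitionKey
| top 10 by TotalRU desc
```

A partition key that dominates this list is a candidate hot partition and may warrant revisiting the partitioning strategy.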

Next steps