Get SQL query execution metrics and analyze query performance using .NET SDK
APPLIES TO: NoSQL
This article shows how to profile SQL query performance in Azure Cosmos DB using QueryMetrics retrieved from the .NET SDK. QueryMetrics is a strongly typed object with information about the backend query execution. These metrics are documented in more detail in the Tune Query Performance article.
Set the FeedOptions parameter
All of the overloads for DocumentClient.CreateDocumentQuery take an optional FeedOptions parameter, which is used to tune and parameterize query execution.
To collect the NoSQL query execution metrics, set the PopulateQueryMetrics parameter in the FeedOptions to true. When PopulateQueryMetrics is true, each FeedResponse contains the relevant QueryMetrics.
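For example, here is a minimal sketch of a FeedOptions instance that enables metrics collection; EnableCrossPartitionQuery and MaxItemCount are shown only as commonly tuned options and aren't required for collecting QueryMetrics:
// PopulateQueryMetrics is the only setting required for metrics collection
FeedOptions feedOptions = new FeedOptions
{
    PopulateQueryMetrics = true,

    // Commonly tuned options, shown here as examples only
    EnableCrossPartitionQuery = true,
    MaxItemCount = 100
};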
Get query metrics with AsDocumentQuery()
The following code sample shows how to retrieve metrics when using the AsDocumentQuery() method:
// Initialize this DocumentClient and Collection
DocumentClient documentClient = null;
DocumentCollection collection = null;

// Set PopulateQueryMetrics to true in the FeedOptions
FeedOptions feedOptions = new FeedOptions
{
    PopulateQueryMetrics = true
};

string query = "SELECT TOP 5 * FROM c";
IDocumentQuery<dynamic> documentQuery = documentClient.CreateDocumentQuery(collection.SelfLink, query, feedOptions).AsDocumentQuery();

while (documentQuery.HasMoreResults)
{
    // Execute one continuation of the query
    FeedResponse<dynamic> feedResponse = await documentQuery.ExecuteNextAsync();

    // This dictionary maps the partitionId to the QueryMetrics of that query
    IReadOnlyDictionary<string, QueryMetrics> partitionIdToQueryMetrics = feedResponse.QueryMetrics;

    // At this point you have QueryMetrics, which you can serialize using .ToString()
    foreach (KeyValuePair<string, QueryMetrics> kvp in partitionIdToQueryMetrics)
    {
        string partitionId = kvp.Key;
        QueryMetrics queryMetrics = kvp.Value;

        // Do whatever logging you need
        DoSomeLoggingOfQueryMetrics(query, partitionId, queryMetrics);
    }
}
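DoSomeLoggingOfQueryMetrics is a placeholder in the sample above. A minimal sketch of what such a helper could look like; the helper name and the use of console output are illustrative assumptions, not part of the SDK:
// Hypothetical logging helper: QueryMetrics.ToString() emits the full human-readable metrics text
private static void DoSomeLoggingOfQueryMetrics(string query, string partitionId, QueryMetrics queryMetrics)
{
    Console.WriteLine($"Query: {query}");
    Console.WriteLine($"Partition: {partitionId}");
    Console.WriteLine(queryMetrics.ToString());
}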
Aggregating QueryMetrics
In the previous section, notice that there were multiple calls to the ExecuteNextAsync method. Each call returned a FeedResponse object that has a dictionary of QueryMetrics, one for every continuation of the query. The following example shows how to aggregate these QueryMetrics using LINQ:
List<QueryMetrics> queryMetricsList = new List<QueryMetrics>();

while (documentQuery.HasMoreResults)
{
    // Execute one continuation of the query
    FeedResponse<dynamic> feedResponse = await documentQuery.ExecuteNextAsync();

    // This dictionary maps the partitionId to the QueryMetrics of that query
    IReadOnlyDictionary<string, QueryMetrics> partitionIdToQueryMetrics = feedResponse.QueryMetrics;
    queryMetricsList.AddRange(partitionIdToQueryMetrics.Values);
}

// Aggregate the QueryMetrics using the + operator overload of the QueryMetrics class.
QueryMetrics aggregatedQueryMetrics = queryMetricsList.Aggregate((curr, acc) => curr + acc);

Console.WriteLine(aggregatedQueryMetrics);
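An equivalent approach is to accumulate with the + operator as each continuation arrives, instead of collecting every QueryMetrics instance into a list first. A minimal sketch, using only the same + operator overload shown above:
QueryMetrics aggregatedQueryMetrics = null;

while (documentQuery.HasMoreResults)
{
    // Execute one continuation of the query
    FeedResponse<dynamic> feedResponse = await documentQuery.ExecuteNextAsync();

    foreach (QueryMetrics queryMetrics in feedResponse.QueryMetrics.Values)
    {
        // Combine the metrics of this continuation into the running total
        aggregatedQueryMetrics = aggregatedQueryMetrics == null
            ? queryMetrics
            : aggregatedQueryMetrics + queryMetrics;
    }
}

Console.WriteLine(aggregatedQueryMetrics);
This avoids keeping an intermediate list in memory, which can matter for queries with many continuations.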
Grouping query metrics by partition ID
You can group the QueryMetrics by partition ID. Grouping by partition ID lets you see whether a specific partition is causing performance issues compared to others. The following example shows how to group the QueryMetrics with LINQ:
List<KeyValuePair<string, QueryMetrics>> partitionedQueryMetrics = new List<KeyValuePair<string, QueryMetrics>>();

while (documentQuery.HasMoreResults)
{
    // Execute one continuation of the query
    FeedResponse<dynamic> feedResponse = await documentQuery.ExecuteNextAsync();

    // This dictionary maps the partitionId to the QueryMetrics of that query
    IReadOnlyDictionary<string, QueryMetrics> partitionIdToQueryMetrics = feedResponse.QueryMetrics;
    partitionedQueryMetrics.AddRange(partitionIdToQueryMetrics.ToList());
}

// Now we are able to group the query metrics by partitionId
IEnumerable<IGrouping<string, KeyValuePair<string, QueryMetrics>>> groupedByQueryMetrics = partitionedQueryMetrics
    .GroupBy(kvp => kvp.Key);

// If we wanted to, we could even aggregate the grouped QueryMetrics
foreach (IGrouping<string, KeyValuePair<string, QueryMetrics>> grouping in groupedByQueryMetrics)
{
    string partitionId = grouping.Key;
    QueryMetrics aggregatedQueryMetricsForPartition = grouping
        .Select(kvp => kvp.Value)
        .Aggregate((curr, acc) => curr + acc);

    DoSomeLoggingOfQueryMetrics(query, partitionId, aggregatedQueryMetricsForPartition);
}
LINQ on DocumentQuery
You can also get the FeedResponse from a LINQ query by using the AsDocumentQuery() method:
IDocumentQuery<Document> linqQuery = documentClient.CreateDocumentQuery(collection.SelfLink, feedOptions)
    .Take(1)
    .Where(document => document.Id == "42")
    .OrderBy(document => document.Timestamp)
    .AsDocumentQuery();

FeedResponse<Document> feedResponse = await linqQuery.ExecuteNextAsync<Document>();
IReadOnlyDictionary<string, QueryMetrics> queryMetrics = feedResponse.QueryMetrics;
Expensive queries
You can capture the request units consumed by each query to investigate expensive queries or queries that consume high throughput. You can get the request charge by using the RequestCharge property of FeedResponse. To learn more about how to get the request charge using the Azure portal and different SDKs, see the find the request unit charge article.
string query = "SELECT * FROM c";
IDocumentQuery<dynamic> documentQuery = documentClient.CreateDocumentQuery(collection.SelfLink, query, feedOptions).AsDocumentQuery();

while (documentQuery.HasMoreResults)
{
    // Execute one continuation of the query
    FeedResponse<dynamic> feedResponse = await documentQuery.ExecuteNextAsync();
    double requestCharge = feedResponse.RequestCharge;

    // Log the RequestCharge however you want.
    DoSomeLogging(requestCharge);
}
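Because each continuation reports its own charge, the per-continuation values can be summed to get the total cost of draining the query. A minimal sketch; DoSomeLogging remains a hypothetical placeholder:
double totalRequestCharge = 0;

while (documentQuery.HasMoreResults)
{
    // Execute one continuation of the query
    FeedResponse<dynamic> feedResponse = await documentQuery.ExecuteNextAsync();

    // Accumulate the charge reported by every continuation
    totalRequestCharge += feedResponse.RequestCharge;
}

// Log the total request units consumed by the whole query
DoSomeLogging(totalRequestCharge);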
Get the query execution time
When calculating the time required to execute a client-side query, make sure that you only include the calls to the ExecuteNextAsync method and not other parts of your code base. Measuring only these calls tells you how long the query execution took, as shown in the following example:
string query = "SELECT * FROM c";
IDocumentQuery<dynamic> documentQuery = documentClient.CreateDocumentQuery(collection.SelfLink, query, feedOptions).AsDocumentQuery();

Stopwatch queryExecutionTimeEndToEndTotal = new Stopwatch();
while (documentQuery.HasMoreResults)
{
    // Execute one continuation of the query
    queryExecutionTimeEndToEndTotal.Start();
    FeedResponse<dynamic> feedResponse = await documentQuery.ExecuteNextAsync();
    queryExecutionTimeEndToEndTotal.Stop();
}

// Log the elapsed time
DoSomeLogging(queryExecutionTimeEndToEndTotal.Elapsed);
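If you also want to see how that time is distributed across continuations, you can time each call separately. A minimal sketch; the continuation counter and the console output are illustrative:
Stopwatch continuationStopwatch = new Stopwatch();
int continuationCount = 0;

while (documentQuery.HasMoreResults)
{
    // Time only this single ExecuteNextAsync call
    continuationStopwatch.Restart();
    FeedResponse<dynamic> feedResponse = await documentQuery.ExecuteNextAsync();
    continuationStopwatch.Stop();

    continuationCount++;
    Console.WriteLine($"Continuation {continuationCount}: {continuationStopwatch.Elapsed}");
}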
Scan queries (commonly slow and expensive)
A scan query is a query that isn't served by the index, so many documents have to be loaded before the result set is returned.
Below is an example of a scan query:
SELECT VALUE c.description
FROM c
WHERE UPPER(c.description) = "BABYFOOD, DESSERT, FRUIT DESSERT, WITHOUT ASCORBIC ACID, JUNIOR"
This query's filter uses the system function UPPER, which isn't served from the index. Executing this query against a large collection produced the following query metrics for the first continuation:
QueryMetrics

Retrieved Document Count : 60,951
Retrieved Document Size : 399,998,938 bytes
Output Document Count : 7
Output Document Size : 510 bytes
Index Utilization : 0.00 %
Total Query Execution Time : 4,500.34 milliseconds
  Query Preparation Times
    Query Compilation Time : 0.09 milliseconds
    Logical Plan Build Time : 0.05 milliseconds
    Physical Plan Build Time : 0.04 milliseconds
    Query Optimization Time : 0.01 milliseconds
  Index Lookup Time : 0.01 milliseconds
  Document Load Time : 4,177.66 milliseconds
  Runtime Execution Times
    Query Engine Times : 322.16 milliseconds
    System Function Execution Time : 85.74 milliseconds
    User-defined Function Execution Time : 0.00 milliseconds
  Document Write Time : 0.01 milliseconds
Client Side Metrics
  Retry Count : 0
  Request Charge : 4,059.95 RUs
Note the following values from the query metrics output:
Retrieved Document Count : 60,951
Retrieved Document Size : 399,998,938 bytes
This query loaded 60,951 documents, totaling 399,998,938 bytes. Loading this many bytes results in a high cost, or request unit charge. It also makes the query take a long time to execute, which is clear from the total query execution time:
Total Query Execution Time : 4,500.34 milliseconds
The query took 4.5 seconds to execute, and this was only one continuation.
To optimize this example query, avoid using UPPER in the filter. Instead, normalize the c.description values to uppercase when documents are created or updated; a sketch of this write-side normalization follows the rewritten query below. The query then becomes:
SELECT VALUE c.description
FROM c
WHERE c.description = "BABYFOOD, DESSERT, FRUIT DESSERT, WITHOUT ASCORBIC ACID, JUNIOR"
This query can now be served from the index.
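A minimal sketch of that write-side normalization, assuming a hypothetical FoodDocument type and Newtonsoft.Json's JsonProperty attribute for the lowercase description field; the type name and upsert flow are illustrative, not prescribed by the SDK:
public class FoodDocument
{
    [JsonProperty("id")]
    public string Id { get; set; }

    [JsonProperty("description")]
    public string Description { get; set; }
}

// Store the description already uppercased so the rewritten equality filter
// can compare directly and be served from the index
FoodDocument document = new FoodDocument
{
    Id = "1",
    Description = "Babyfood, dessert, fruit dessert, without ascorbic acid, junior".ToUpperInvariant()
};

await documentClient.UpsertDocumentAsync(collection.SelfLink, document);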
To learn more about tuning query performance, see the Tune Query Performance article.