How can I improve query performance in Azure Data Explorer for high-volume application logs?

pr2380 105 Reputation points
2025-10-16T18:33:02.59+00:00

I'm analyzing web platform logs stored in an AppLogs table with columns like LogID, Timestamp, EventType, and Details. Most queries filter by EventType or time ranges to track errors.

During periods of high log volume, I’m seeing issues like:

Slow query responses (up to 15 seconds, target is <2 seconds)

Errors such as “Query exceeded resource limits”

I've already:

Indexed EventType

Applied ingestion-time policies

However, queries over large datasets are still sluggish, affecting real-time dashboards.

What specific steps or best practices can I follow to troubleshoot and optimize query performance in ADX under high load?


Answer accepted by question author
  Pratyush Vashistha 5,120 Reputation points Microsoft External Staff Moderator
    2025-10-16T18:46:13.45+00:00

    Hello pr2380!

    To resolve “Query exceeded resource limits”, start by optimizing query patterns: narrow the dataset with precise filters (e.g., where Timestamp > ago(1d) and EventType == 'Error') to reduce scan scope, and use the summarize operator for aggregations instead of returning raw rows; this pattern has cut response times to under 2 seconds in past projects. Additionally, you can increase the cluster’s cache size or scale out to add nodes for higher query throughput.
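
    A minimal sketch of that pattern, assuming the AppLogs schema from your question:

        AppLogs
        | where Timestamp > ago(1d)            // time filter first to prune extents
        | where EventType == 'Error'           // exact match on an indexed column
        | summarize ErrorCount = count() by bin(Timestamp, 1h)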

    For slow performance, configure a partitioning policy that partitions extents by Timestamp (e.g., daily ranges), which improves extent pruning for time-based filters, and update the table’s caching policy to retain hot data for 30 days so frequently accessed logs are served from local SSD.
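
    A sketch of the corresponding management commands (run each separately; the table name comes from your question, and the partitioning JSON follows the documented datetime-column format, so adjust the range size to your workload):

        // keep the last 30 days of AppLogs in hot cache
        .alter table AppLogs policy caching hot = 30d

        // partition extents by day on Timestamp; RangeSize here is an example value
        .alter table AppLogs policy partitioning @'{"PartitionKeys":[{"ColumnName":"Timestamp","Kind":"UniformRange","Properties":{"Reference":"2025-01-01T00:00:00","RangeSize":"1.00:00:00","OverrideCreationTime":false}}]}'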

    Best practices you can follow:

    • You can use materialized views for pre-aggregated log metrics to speed up dashboards.
    • You can set query limits to prevent resource-intensive operations.
    • You can configure Azure Monitor alerts for query durations above 5 seconds (a query-log sketch for spotting these follows this list).
    • You can periodically review extent sizes to optimize partitioning.
    • You can use ingestion batching to streamline log data loading.
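
    On the alerting point, a quick way to spot slow queries before wiring up alerts is the cluster’s query log; a sketch using the .show queries command (the filter values are examples):

        .show queries
        | where StartedOn > ago(1h) and Duration > 5s
        | project StartedOn, Duration, User, Text, State
        | order by Duration desc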

    References:

    https://learn.microsoft.com/en-us/kusto/query/best-practices?view=microsoft-fabric

    https://learn.microsoft.com/en-us/azure/data-explorer/ingest-data-overview

    If this answers your query, do click Upvote, and if you have any further questions, do let us know.

    Thanks

    Pratyush

    1 person found this answer helpful.

1 additional answer

  Marcin Policht 67,980 Reputation points MVP Volunteer Moderator
    2025-10-16T18:46:42.9166667+00:00

    Here are a few options you can try:

    1. Review query patterns - slow queries often come from scanning too much data unnecessarily. Consider:
    • Time filters first: Always filter by Timestamp or ingestion time at the very start (ADX performs better when the filter reduces the dataset early). For example:
        AppLogs
        | where Timestamp >= ago(1h)
        | where EventType == "Error"
      
    • Avoid contains or regex over large string columns if possible; use has or == for term/exact matches, since those can use the term index (see the example after this list).
    • Summarize early: Aggregate before doing joins or complex operations.
        AppLogs
        | where Timestamp >= ago(1h)
        | summarize count() by EventType
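
    For the string-matching point above, a small sketch of the difference (Details is the free-text column from the question; "timeout" is just an example term):

        AppLogs
        | where Timestamp >= ago(1h)
        | where Details has "timeout"          // whole-term match, can use the term index
        // slower: | where Details contains "timeout"   // substring scan over every row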
      
    2. Leverage materialized views and caching

    For dashboards and repeated queries:

    • Materialized views:
        .create materialized-view AppLogs_ErrorCount
        on table AppLogs
        {
            AppLogs
            | summarize ErrorCount = count() by bin(Timestamp, 1m), EventType
        }
      
      • Speeds up repeated queries.
      • Reduces scan over raw logs.
    • Query result caching: ADX can serve repeated queries from its results cache, which is useful for dashboards; you opt in per query (see the example after this list).
    • Pre-aggregations: If you track error trends by hour/minute, maintain a summary table instead of scanning raw logs each time.
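
    To opt in to the results cache, set a maximum result age in front of the query, for example:

        set query_results_cache_max_age = time(5m);
        AppLogs
        | where Timestamp >= ago(1h)
        | summarize count() by EventType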
    3. Review resource and service-level optimizations
    • Cluster sizing: Check if your cluster is under-provisioned for peak load; consider scale-out for more nodes or scale-up for more powerful nodes (this is useful for queries that process high-cardinality data).
    • Use shuffle strategy hints for high-cardinality aggregations (the hint goes on the operator itself, not on its own pipe stage):
        AppLogs
        | summarize hint.strategy=shuffle count() by EventType
    • Query throttling: Use set query_take_max_records or set max_memory_consumption_per_query_per_node to keep individual queries from hitting cluster limits (see the sketch below).
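
    A sketch of those set statements in front of a query (the limits shown are illustrations, not recommendations):

        set query_take_max_records = 1000000;
        set max_memory_consumption_per_query_per_node = 8589934592;  // ~8 GB, example value
        AppLogs
        | where Timestamp >= ago(1d)
        | summarize count() by EventType, bin(Timestamp, 1h)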
    4. Optimize ingestion and retention
    • Batching: Larger ingestion batches reduce overhead.
    • Retention policies: Keep raw logs only as long as needed; older data can move to cold storage and be queried separately.
    • Hot/cold separation: Use hot cache for last N days to improve dashboard performance.
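
    The corresponding policy commands, as a sketch (run separately; the windows are example values to tune):

        // keep raw logs for 90 days total, with the last 7 days in hot cache
        .alter-merge table AppLogs policy retention softdelete = 90d
        .alter table AppLogs policy caching hot = 7d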

    If the above response helps answer your question, remember to "Accept Answer" so that others in the community facing similar issues can easily find the solution. Your contribution is highly appreciated.

    hth

    Marcin

