Share via

Number an kind of queries with Direct Query to Azure Databricks

Michael Riedmüller 20 Reputation points
2026-02-04T10:46:33.18+00:00

Using Direct Query against Azure Datebricks from a Power BI semantic model takes too long from the point of view of the customer and generates too many queries.
The customer thinks that there should be more suitable set of queries to obtain the data required in the report.

We appreciate help in two possible variants:

  • verify the model and relationsships are suited for that task
  • explanation about the queries and the reasons for their structure
Azure Databricks
Azure Databricks

An Apache Spark-based analytics platform optimized for Azure.

0 comments No comments

Answer accepted by question author

SAI JAGADEESH KUDIPUDI 3,205 Reputation points Microsoft External Staff Moderator
2026-02-04T12:40:14.4+00:00

Hi Michael Riedmüller,
It sounds like you're experiencing performance issues with Direct Query to Azure Databricks from a Power BI semantic model. You're looking for help in two areas: verifying the model and relationships and getting an explanation about the structure of the generated queries.

Here are some suggestions to address your concerns:
Verify Model and Relationships
Check Relationships: Ensure that the relationships in your Power BI model are properly defined. Look for any unnecessary relationships which might lead to multiple queries being fired or confusion in data retrieval.
Use Stargate: Consider leveraging stargate patterns in your model design to aggregate data, which might help reduce the number of queries fired.
Optimize Database Schema: If your data structure is not optimized, it may result in inefficient query performance. Ensure that your tables are well-indexed and that relationships are properly set up.
Explanation of Queries and Structure
Query Performance Insights: You can run diagnostics to determine if there are known performance issues affecting your SQL Analytics:
Use the query profile tools in Databricks to visualize the query execution which can help identify bottlenecks.
Review the execution plan using the explain Command to understand how Databricks is interpreting your query.
Common Issues:
Ensure you aren’t blocking operations that can lead to increased response times.
Look into known issues related to SQL queries in Databricks SQL that can affect query performance.
Using Caching: Implementing query caching can improve the performance of frequently executed queries. Follow the guidelines on query caching in Databricks.

Here are a couple of additional suggestions that might help you:
Analyze the SQL Query: Review the SQL scripts generated. Sometimes, transforming aggregate queries into smaller steps can help optimize performance.
Adaptive Query Execution: If it's not already enabled, consider using Adaptive Query Execution (AQE) settings to help optimize query plans dynamically.
Follow-Up Questions

To provide more tailored advice, I have a few questions:

  1. Can you share more details about the specific model and the types of relationships you are using in Power BI?
  2. Are there specific queries that are taking an unexpectedly long time to execute, or is it a general issue?
  3. Have you already tried any performance optimization techniques, such as caching or changing the query structure? If so, which ones?

This response was AI-generated. If the issue is not resolved, please follow up in the comments so we can assist further. Hope this helps! Let me know if you need further clarification or help!
References

Was this answer helpful?

1 person found this answer helpful.

1 additional answer

Sort by: Most helpful
  1. AI answer

    2026-02-04T10:47:45.6133333+00:00

    Using DirectQuery with Azure Databricks can lead to performance issues if not optimized properly. Here are some considerations and recommendations to improve the efficiency of your queries:

    1. Model and Relationships: Ensure that your model is designed efficiently. Here are some tips:
      • Hide One-Side Columns: If you have relationships, hide the one-side column of relationships to avoid unnecessary joins in queries. This can help reduce the complexity of the generated SQL.
      • Set Relationships to Enforce Integrity: Use the Assume Referential Integrity property to improve query performance. This allows Power BI to generate more efficient INNER JOIN queries instead of OUTER JOIN queries.
      • Limit Bi-Directional Filtering: Avoid using bi-directional relationship filtering unless necessary, as it can lead to inefficient queries.
      • Limit Parallel Queries: Control the maximum number of connections that DirectQuery opens for each underlying data source to manage the load on Azure Databricks.
    2. Query Structure: The structure of the queries generated by DirectQuery can significantly impact performance. Here are some factors to consider:
      • Response Time: Ensure that the underlying data source can return results in five seconds or less. Queries taking longer than 30 seconds can lead to a poor user experience.
      • Row Limits: Be aware of the one-million-row limit for data returned from cloud data sources. Ensure that your queries are designed to stay within this limit to avoid failures.
      • Query Folding: Confirm that query folding is happening. This means that transformations in Power Query should be pushed down to the source database to minimize the amount of data transferred.
      • Indexing: Ensure that appropriate indexes are created on join and filter columns in Azure Databricks to improve query performance.

    By verifying the model design and optimizing the query structure, you can enhance the performance of your DirectQuery reports in Power BI.


    References:

    AI-generated content may be incorrect. Read our transparency notes for more information.

    Was this answer helpful?

Your answer

Answers can be marked as 'Accepted' by the question author and 'Recommended' by moderators, which helps users know the answer solved the author's problem.