Редагувати

Поділитися через


Use DirectQuery in Power BI Desktop

When you connect to any data source with Power BI Desktop, you can import a copy of the data. For some data sources, you can also connect directly to the data source without importing data by using DirectQuery. This article explains the differences between Import and DirectQuery connectivity modes and tells you how to connect to data sources using DirectQuery. It also covers the considerations and limitations of using DirectQuery, such as performance and security.

To determine whether a data source supports DirectQuery, view the full listing of available data sources found in the article Connectors in Power Query, which also applies to Power BI. Select the article that describes the data source you're interested in from the list of supported connectors, then see the section in that connector's article titled Capabilities supported. If DirectQuery isn't listed in that section for the data source's article, DirectQuery isn't supported for that data connector.

Here are the differences between using Import and DirectQuery connectivity modes:

  • Import: A copy of the data from the selected tables and columns imports into Power BI Desktop. As you create or interact with visualizations, Power BI Desktop uses the imported data. To see underlying data changes after the initial import or the most recent refresh, you must import the full semantic model again to refresh the data.

  • DirectQuery: No data imports into Power BI Desktop. For relational sources, you can select tables and columns to appear in the Power BI Desktop Data pane. For multidimensional sources like SAP Business Warehouse (SAP BW), the dimensions and measures of the selected cube appear in the Data pane. As you create or interact with visualizations, Power BI Desktop queries the underlying data source, so you're always viewing current data.

With DirectQuery, when you create or interact with a visualization, you must query the underlying source. The time that's needed to refresh the visualization depends on the performance of the underlying data source. If the data needed to service the request was recently requested, Power BI Desktop uses the recent data to reduce the time required to show the visualization. Selecting Refresh from the Home ribbon refreshes all visualizations with current data.

Many data modeling and data transformations are available when using DirectQuery, although with some performance-based limitations. For more information about DirectQuery benefits, limitations, and recommendations, see DirectQuery in Power BI.

DirectQuery benefits

Some benefits of using DirectQuery include:

  • DirectQuery lets you build visualizations over very large semantic models, where it would be infeasible to import all the data with pre-aggregation.

  • DirectQuery reports always use current data. Seeing underlying data changes requires you to refresh the data, and reimporting large semantic models to refresh data could be infeasible.

  • The 1-GB semantic model limitation doesn't apply with DirectQuery.

Connect using DirectQuery

To connect to a data source with DirectQuery:

  1. In the Home group of the Power BI Desktop ribbon, select Get data, and then select a data source that DirectQuery supports, such as SQL Server.

  2. In the dialog box for the connection, under Data Connectivity mode, select DirectQuery.

Import and DirectQuery options, SQL Server Database dialog, Power BI Desktop

Publish to the Power BI service

You can publish DirectQuery reports to the Power BI service, but you need to take extra steps for the Power BI service to open the reports.

  • To connect the Power BI service to DirectQuery data sources other than Azure SQL Database, Azure Synapse Analytics (formerly SQL Data Warehouse), Amazon Redshift, and Snowflake Data Warehouse, install an on-premises data gateway and register the data source.

  • If you used DirectQuery with cloud sources like Azure SQL Database, Azure Synapse, Amazon Redshift, or Snowflake Data Warehouse, you don't need an on-premises data gateway. You still must provide credentials for the Power BI service to open the published report. Without credentials, an error occurs when you try to open a published report or explore a semantic model created with a DirectQuery connection.

To provide credentials for opening the report and refreshing the data:

  1. In the Power BI service, go to the workspace and locate the semantic model that uses DirectQuery in the workspace content list.

  2. Select the More options three horizontal dots icon next to the name of the semantic model, then choose Settings.

  3. Under Data source credentials, provide the credentials to connect to the data source.

Note

If you used DirectQuery with an Azure SQL Database that has a private IP address, you need to use an on-premises gateway.

Considerations and limitations

Some Power BI Desktop features aren't supported in DirectQuery mode, or they have limitations. Some capabilities in the Power BI service, such as quick insights, also aren't available for semantic models that use DirectQuery. When you decide whether to use DirectQuery, consider these feature limitations. Also consider the following factors:

Performance and load considerations

DirectQuery sends all requests to the source database, so the required refresh time for visuals depends on how long the underlying source takes to return results. Five seconds or less is the recommended response time for receiving requested data for visuals. Refresh times over 30 seconds produce an unacceptably poor experience for users consuming the report. A query that takes longer than four minutes times out in the Power BI service, and the user receives an error.

Load on the source database also depends on the number of Power BI users who consume the published report, especially if the report uses row-level security (RLS). The refresh of a non-RLS dashboard tile shared by multiple users sends a single query to the database, but refreshing a dashboard tile that uses RLS requires one query per user. The increased queries significantly increase load and potentially affect performance.

1 million-row limit

DirectQuery defines a 1 million-row limit for data returned from cloud data sources, which are any data sources that aren't on-premises. On-premises sources are limited to a defined payload of about 4 MB per row, depending on proprietary compression algorithm, or 16 MB for the entire visual. Premium capacities can set different maximum row limits, as described in the blog post Power BI Premium new capacity settings.

Power BI creates queries that are as efficient as possible, but some generated queries might retrieve too many rows from the underlying data source. For example, this situation can occur with a simple chart that includes a high cardinality column with the aggregation option set to No Calculation. The visual must have only columns with a cardinality below 1 million, or must apply the appropriate filters.

The row limit doesn't apply to aggregations or calculations used to select the semantic model DirectQuery returns, only to the rows returned. For example, the query that runs on the data source can aggregate 10 million rows. As long as the data returned to Power BI is less than 1 million rows, the query can accurately return the results. If the data is over 1 million rows, Power BI shows an error, except in Premium capacity with different admin-set limits. The error states: The resultset of a query to external data source has exceeded the maximum allowed size of '1000000' rows.

Security considerations

By default, all users who consume a published report in the Power BI service connect to the underlying data source by using the credentials entered after publication. This situation is the same as for imported data. All users see the same data, regardless of any security rules that the underlying source defines.

If you need per-user security implemented with DirectQuery sources, either use RLS or configure Kerberos constrained authentication against the source. Kerberos isn't available for all sources. For more information, see Row-level security (RLS) with Power BI and Configure Kerberos-based SSO from Power BI service to on-premises data sources.

Other DirectQuery limitations

Some other limitations of using DirectQuery include:

  • If the Power Query Editor query is overly complex, an error occurs. To fix the error, you must either delete the problematic step in Power Query Editor, or switch to Import mode. Multidimensional sources like SAP BW can't use the Power Query Editor.

  • Automatic date/time hierarchy is unavailable in DirectQuery. DirectQuery mode doesn't support date column drilldown by year, quarter, month, or day.

  • For table or matrix visualizations, there's a 125-column limit for results that return more than 500 rows from DirectQuery sources. These results display a scroll bar in the table or matrix that lets you fetch more data. In that situation, the maximum number of columns in the table or matrix is 125. If you must include more than 125 columns in a single table or matrix, consider creating measures that use MIN, MAX, FIRST, or LAST, because they don't count against this maximum.

  • You can't change from Import to DirectQuery mode. You can switch from DirectQuery mode to Import mode if you import all the necessary data. It's not possible to switch back, mostly because of the feature set that DirectQuery doesn't support. DirectQuery models over multidimensional sources, like SAP BW, can't be switched from DirectQuery to Import mode either, because of the different treatment of external measures.

  • Calculated tables and calculated columns that reference a DirectQuery table from a data source with single sign-on (SSO) authentication are supported in the Power BI service with an assigned shareable cloud connection and/or granular access control.