Use DirectQuery in Power BI Desktop
When you connect to any data source with Power BI Desktop, you can import a copy of the data. For some data sources, you can also connect directly to the data source without importing data by using DirectQuery. For a full listing of data sources that support DirectQuery, see Data sources supported by DirectQuery.
Here are the differences between using import and DirectQuery connectivity modes:
Import: A copy of the data from the selected tables and columns imports into Power BI Desktop. As you create or interact with visualizations, Power BI Desktop uses the imported data. To see underlying data changes after the initial import or the most recent refresh, you must import the full dataset again to refresh the data.
DirectQuery: No data imports into Power BI Desktop. For relational sources, you can select tables and columns to appear in the Power BI Desktop Fields list. For multidimensional sources like SAP Business Warehouse (SAP BW), the dimensions and measures of the selected cube appear in the Fields list. As you create or interact with visualizations, Power BI Desktop queries the underlying data source, so you're always viewing current data.
With DirectQuery, when you create or interact with a visualization, you must query the underlying source. The time that's needed to refresh the visualization depends on the performance of the underlying data source. If the data needed to service the request was recently requested, Power BI Desktop uses the recent data to reduce the time required to show the visualization. Selecting Refresh from the Home ribbon refreshes all visualizations with current data.
Many data modeling and data transformations are available when using DirectQuery, although with some performance-based limitations. For more information about DirectQuery benefits, limitations, and recommendations, see DirectQuery in Power BI.
Some benefits of using DirectQuery include:
DirectQuery lets you build visualizations over very large datasets, where it would be unfeasible to import all the data with pre-aggregation.
DirectQuery reports always use current data. Seeing underlying data changes requires you to refresh the data, and reimporting large datasets to refresh data could be unfeasible.
The 1-GB dataset limitation doesn't apply with DirectQuery.
Connect using DirectQuery
To connect to a data source with DirectQuery:
In the Home group of the Power BI Desktop ribbon, select Get data, and then select a data source that DirectQuery supports, such as SQL Server.
In the dialog box for the connection, under Data connectivity mode, select DirectQuery.
Publish to the Power BI service
You can publish DirectQuery reports to the Power BI service, but you need to take extra steps for the Power BI service to open the reports.
To connect the Power BI service to DirectQuery data sources other than Azure SQL Database, Azure Synapse Analytics (formerly SQL Data Warehouse), Amazon Redshift, and Snowflake Data Warehouse, install an on-premises data gateway and register the data source.
If you used DirectQuery with cloud sources like Azure SQL Database, Azure Synapse, Amazon Redshift, or Snowflake Data Warehouse, you don't need an on-premises data gateway. You still must provide credentials for the Power BI service to open the published report. Without credentials, an error occurs when you try to open a published report or explore a dataset created with a DirectQuery connection.
To provide credentials for opening the report and refreshing the data:
In the Power BI service, select the gear icon at upper-right and choose Settings.
On the Settings page, select the Datasets tab, and choose the dataset that uses DirectQuery.
Under Data source connection, provide the credentials to connect to the data source.
If you used DirectQuery with an Azure SQL Database that has a private IP address, you need to use an on-premises gateway.
Considerations and limitations
Some Power BI Desktop features aren't supported in DirectQuery mode, or they have limitations. Some capabilities in the Power BI service, such as quick insights, also aren't available for datasets that use DirectQuery. When you decide whether to use DirectQuery, consider these feature limitations. Also consider the following factors:
Performance and load considerations
DirectQuery sends all requests to the source database, so the required refresh time for visuals depends on how long the underlying source takes to return results. Five seconds or less is the recommended response time for receiving requested data for visuals. Refresh times over 30 seconds produce an unacceptably poor experience for users consuming the report. A query that takes longer than four minutes times out in the Power BI service, and the user receives an error.
Load on the source database also depends on the number of Power BI users who consume the published report, especially if the report uses row-level security (RLS). The refresh of a non-RLS dashboard tile shared by multiple users sends a single query to the database, but refreshing a dashboard tile that uses RLS requires one query per user. The increased queries significantly increase load and potentially affect performance.
One-million row limit
DirectQuery defines a one-million row limit for data returned from cloud data sources, which are any data sources that aren't on-premises. On-premises sources are limited to a defined payload of about 4 MB per row, depending on proprietary compression algorithm, or 16 MB for the entire visual. Premium capacities can set different maximum row limits, as described in the blog post Power BI Premium new capacity settings.
Power BI creates queries that are as efficient as possible, but some generated queries might retrieve too many rows from the underlying data source. For example, this situation can occur with a simple chart that includes a high cardinality column with the aggregation option set to Don't Summarize. The visual must have only columns with a cardinality below 1 million, or must apply the appropriate filters.
The row limit doesn't apply to aggregations or calculations used to select the dataset DirectQuery returns, only to the rows returned. For example, the query that runs on the data source can aggregate 10 million rows. As long as the data returned to Power BI is less than 1 million rows, the query can accurately return the results. If the data is over 1 million rows, Power BI shows an error, except in Premium capacity with different admin-set limits. The error states: The resultset of a query to external data source has exceeded the maximum allowed size of '1000000' rows.
By default, all users who consume a published report in the Power BI service connect to the underlying data source by using the credentials entered after publication. This situation is the same as for imported data. All users see the same data, regardless of any security rules that the underlying source defines.
If you need per-user security implemented with DirectQuery sources, either use RLS or configure Kerberos-constrained authentication against the source. Kerberos isn't available for all sources. For more information, see Row-level security (RLS) with Power BI and Configure Kerberos-based SSO from Power BI service to on-premises data sources.
Other DirectQuery limitations
Some other limitations of using DirectQuery include:
If the Power Query Editor query is overly complex, an error occurs. To fix the error, you must either delete the problematic step in Power Query Editor, or switch to import mode. Multidimensional sources like SAP BW can't use the Power Query Editor.
Automatic date/time hierarchy is unavailable in DirectQuery. DirectQuery mode doesn't support date column drilldown by year, quarter, month, or day.
For table or matrix visualizations, there's a 125-column limit for results that return more than 500 rows from DirectQuery sources. These results display a scroll bar in the table or matrix that lets you fetch more data. In that situation, the maximum number of columns in the table or matrix is 125. If you must include more than 125 columns in a single table or matrix, consider creating measures that use
LAST, because they don't count against this maximum.
You can't change from import to DirectQuery mode. You can switch from DirectQuery mode to import mode if you import all the necessary data. It's not possible to switch back, mostly because of the feature set that DirectQuery doesn't support. DirectQuery models over multidimensional sources, like SAP BW, can't be switched from DirectQuery to import mode either, because of the different treatment of external measures.
Calculated tables and calculated columns that reference a DirectQuery table from a data source with single sign-on (SSO) authentication aren't supported in the Power BI service.
Submit and view feedback for