DirectQuery Mode (SSAS Tabular)
Analysis Services lets you create tabular models and reports that retrieve data and aggregates directly from a relational database system, using DirectQuery mode. Commonly cited benefits of using DirectQuery mode include the ability to query very large data sets that cannot fit in memory, and having data refreshed in real time.
This topic introduces the differences between standard tabular models that reside only in memory and tabular models that can query a relational data source, and explains how you can author and deploy a model for use in DirectQuery mode.
Sections in this topic:
Overview of DirectQuery Mode
Authoring Models for Use in DirectQuery Mode
Data Sources for DirectQuery Models
Validation and Design Restrictions for DirectQuery Mode
Formula Compatibility
Connecting to DirectQuery Models
Security
DirectQuery Properties
Related Topics and Tasks
Overview of DirectQuery Mode
By default, tabular models use an in-memory cache to store and query data. Because tabular models use data that resides in memory, even complex queries can be incredibly fast. However, there are some drawbacks to using cached data:
Data is not refreshed when the source data changes. You must process the model to get updates to the data.
When you turn off the computer that hosts the model, the cache is saved to disk and must be reopened when you load the model or open the PowerPivot file. The save and load operations can be time-consuming.
In contrast, a tabular model in DirectQuery mode uses data that is stored in a SQL Server database, or in a SQL Server PDW data warehouse. At design time, you import all or a small sample of the data into the cache and build your model as usual. When you are ready to deploy the model, you change the operating mode to DirectQuery. After you change the operating mode, any queries against the model will use the specified relational data source (either SQL Server or SQL Server PDW), not the cached data.
When you create reports or queries against the model, you use DAX, but the DAX queries are translated by Analysis Services into equivalent Transact-SQL statements against the specified relational data source.
There are many advantages to deploying a model using DirectQuery mode:
It is possible to have a model over data sets that are too large to fit in memory on the Analysis Services server.
The data is guaranteed to be up-to-date, and there is no extra management overhead of having to maintain a separate copy of the data. Changes to the underlying source data can be immediately reflected in queries against the data model.
DirectQuery can take advantage of provider-side query acceleration, such as that provided by xVelocity memory optimized column indexes. xVelocity columnstore indexes are provided in both SQL Server 2012 and SQL Server PDW, to support improved DirectQuery performance.
Any security enforced by the back-end database is guaranteed to be enforced, using row-level security. In contrast, if you are using cached data, it can be difficult to ensure that the cache is secured exactly as on the server.
If the model contains complex formulas that might require multiple queries, Analysis Services can perform optimization to ensure that the query plan for the query executed against the back-end database will be as efficient as possible.
Designing Models for DirectQuery Mode
All tabular models are authored by using the model designer in SQL Server Data Tools (SSDT), which by default always creates models in memory. This means that if you are working with data that is too large to fit into memory, you should plan to work on a smaller subset of data during design time, and access the full data source only after the model is deployed. There are several ways you can work with a smaller subset of data:
When you add data to the model using the Table Import wizard, use a filter to import only a subset of data into the cache used by the workspace database. After you deploy the model, you can edit the query definition to remove or alter the filter.
Use a staging data source, and design the model while the staging tables contain only a subset of data.
Add partitions to your tables, to manage the amount of data that needs to be processed at any one time.
When you are ready to switch to DirectQuery mode, you can change a property that enables DirectQuery mode. For more information, see Enable DirectQuery Design Mode (SSAS Tabular).
When you do this, the model designer automatically configures the workspace database to run in a hybrid mode that lets you continue to work with the cached data. The model designer will also notify you of any features in your model that are incompatible with DirectQuery mode. The following list summarizes the main requirements to keep in mind:
Data sources: DirectQuery models can only use data from a single relational data source. The source can be either an instance of SQL Server, or an instance of SQL Server PDW. After DirectQuery mode has been enabled for the model, you can use no other types of data in the model designer, including tables added by copy-paste operations. All other import options are disabled. Moreover, any tables included in a query must be part of the specified data source. See Data Sources for more information.
Support for calculated columns: Calculated columns are not supported for DirectQuery models. However, you can create measures and KPIs, which operate over sets of data. See the section on validation for more information.
Limited use of DAX functions: Some DAX functions cannot be used in DirectQuery mode, so you must either replace them with other functions, or create the values using derived columns in the data source. The model designer provides design-time validation for any errors that arise when you create formulas that are incompatible with DirectQuery mode. See the following sections for more information: Validation.
Formula compatibility: In certain known cases, the same formula can return different results in a cached or hybrid model compared to a DirectQuery model that uses only the relational data store. These differences are a consequence of semantic differences between the xVelocity in-memory analytics (VertiPaq) engine and SQL Server. For more information about these differences, see this section: Formula Compatibility.
Security: You can use different methods to secure models depending on how they are deployed. Cached data for tabular models is secured by using the security model of the Analysis Services instance. DirectQuery models can be secured by using roles, but you can also use the security defined in the relational data store. The model can be configured so that users who open a report based on a DirectQuery-only model can see only the data that is allowed to them under their permissions in SQL Server. See this section for more information: Security.
Client restrictions: When a model is in DirectQuery mode, it can only be queried by using DAX. You cannot use MDX to create queries. This means that you cannot use the Excel Pivot Client, because Excel uses MDX.
However, you can create queries against a DirectQuery model in SQL Server Management Studio if you use a DAX table query as part of an XMLA Execute statement, For more information, see DAX Query Syntax Reference.
When you have resolved all the design issues and tested your model, you are ready for deployment. At this point, you can set the preferred method for answering queries against the model. Do you want users to have access to the cache, or always use only the relational data source?
If you deploy the model in a hybrid mode, the cache is still available and can be used for queries. A hybrid mode provides you with many options:
When both the cache and the relational data source are available, you can set the preferred connection method, but ultimately the client controls which source is used, using the DirectQueryMode connection string property.
You can also configure partitions on the cache in such a way that the primary partition used for DirectQuery mode is never processed and must always reference the relational source. There are many ways to use partitions to optimize the model design and reporting experience. For more information, see Partitions and DirectQuery Mode (SSAS Tabular).
After the model has been deployed, you can change the preferred connection method. For example, you might use a hybrid mode for testing, and switch the model over to DirectQuery only mode only after thoroughly testing any reports or queries that use the model. For more information, see Set or Change the Preferred Connection Method for DirectQuery.
Data Sources for DirectQuery Models
As soon as you change the design environment to enable DirectQuery mode, the data sources for the workspace database are validated to ensure that they come from a single relational data source. Data from other sources, including copy-pasted data, is not allowed in DirectQuery models.
If you intend to use the model in DirectQuery mode, you must ensure that all the data you need for reporting is stored in the specified data source. If the data you need for modeling is not available in that source, consider use of Integration Services or other data warehousing tools to import the data into the database or data warehouse that serves as the DirectQuery data source.
Validation and Design Restrictions for DirectQuery Mode
When you author a model for use in DirectQuery mode, you must initially load some portion of the data into the cache. If the data you will eventually use is too large to fit into memory, you can use the Preview & Filter option in the Table Import wizard to select a subset of data, or write a SQL script to get the data that you want. Another option is to stage a subset of the data in a data warehouse, and then increase the data store after the design phase is finished.
Przestroga |
---|
Because DirectQuery mode does not support the use of calculated columns, you must remove any calculated columns. You can convert some calculated columns to measures, but in other cases you might build derived columns as part of your data import query or script, or add them in the relational data source. |
To view and resolve validation errors, open the Error List in SQL Server Data Tools. Critical errors that prevent use of DirectQuery mode are displayed on the Errors tab. You must fix these errors before changing to DirectQuery mode. The validation errors that are more difficult to resolve typically are related to formulas that are not supported in DirectQuery mode. See the section, Formula Compatibility, for an overview of errors related to formulas and calculated columns.
The following list describes other considerations to keep in mind when authoring a model for DirectQuery access:
When in DirectQuery only mode, the results in a report can vary depending on the security context of the user who is viewing the results. You should test models with different credentials to ensure that users get the expected results.
If you configure a model to operate in hybrid mode, which allows the use of either the cache or data from the relational source, you should be aware of the possibility that clients might see different results, depending on the mode specified in the connection string. If you need to ensure that your report users see only data from the relational source, you must clear the cache or change the model to DirectQueryOnly.
Formula Compatibility for DirectQuery Models
Some models might contain formulas that are not supported in DirectQuery mode, and the model must be redesigned to prevent validation errors. Restrictions on formulas that are supported in DirectQuery mode include the following:
Calculated columns are not supported in any tabular model that has DirectQuery mode enabled, not even hybrid models. If you need calculated columns for a model, consider converting them to derived columns by using Transact-SQL in your import definition.
DirectQuery models do support the use of DAX formulas for use in measures, which are converted to set-based operations against the relational data store. Measures that you can create by using implicit measures are also supported.
Not all functions are supported. Because Analysis Services converts all DAX formulas and measure definitions into SQL statements when querying a DirectQuery model, any formula containing elements that cannot be converted into Transact-SQL will trigger validation errors on the model. For example, time intelligence functions are not supported. Even functions that are supported might behave differently, such as statistical functions. For a complete list of compatibility issues, see Formula Compatibility in DirectQuery Mode.
Some formulas in the model might validate when you switch the model to DirectQuery mode, but return different results when executed against the cache vs. the relational data store. This is because calculations against the cache use the semantics of the xVelocity in-memory analytics (VertiPaq) engine, which contains many features meant to emulate the behavior of Excel, whereas queries against data stored in the relational data store necessarily use the semantics of SQL Server. For a list of DAX functions that might return different results when the model is deployed to real-time, see Formula Compatibility in DirectQuery Mode.
Connecting to DirectQuery Models
Clients that use MDX as the query language cannot connect to models that use DirectQuery mode. If you attempt to create an MDX query against a DirectQuery model, you will get an error indicating that the cube cannot be found, or has not been processed. You can create queries against DirectQuery models by using Power View, DAX formulas, or XMLA queries. For more information about how you can perform ad hoc queries against tabular models, see Tabular Model Data Access.
If you are using a hybrid model, you can specify whether users connect to the cache or use DirectQuery data by specifying the connection string property, DirectQueryMode.
Security in DirectQuery Mode
During model authoring, you specify the permissions that are used to retrieve the source data. This will often be your own credentials, or an account used for development. However, when you switch the model to use DirectQuery mode, the security context is more complex:
Consider whether users have the necessary level of access to the data in the relational data store.
Users who view the same model or report might see different data, depending on the user’s security context.
If the model cache has been preserved, the cache is secured using the Analysis Services security model (roles). The cache might contain data that the model designer is privileged to see but the user is not. Model and report designers should either clear the cache, or secure this data by controlling access via roles.
A model that answers queries from the cache cannot impersonate the current user when connecting to the data source. If you want to impersonate the current user when connecting to the data source, you must use DirectQuery mode.
If your report model requires security, you have two options: you can either use Analysis Services roles, or you can set row-level permissions on the data source. Security in the relational data source is used to control access to tables, and column-level security is not supported. Therefore, if users in one region do not have permission to view sales figures from different regions, a report that includes a measure based on the Sales table would return blanks or an error.
The impersonation settings property specifies the credentials used when you are connecting to a model using DirectQuery, either for a DirectQuery only model or for a hybrid model answering queries using DirectQuery. The property has the following values:
Default
Uses the credentials specified in the import wizard to connect to the data source. This can be a specific Windows user or the service account.ImpersonateCurrentUser
Uses the credentials of the current user to connect to the data source.
For information on how to set these properties, see DirectQuery Deployment Scenarios (SSAS Tabular).
DirectQuery Properties
The following table lists the properties that you can set in SQL Server Data Tools, and in SQL Server Management Studio, to enable DirectQuery and to control the source of data used for queries against the model.
Property name |
Description |
---|---|
DirectQueryMode property |
This property enables use of DirectQuery mode in the model designer. You must set this property to On to change any of the other DirectQuery properties. For more information, see Enable DirectQuery Design Mode (SSAS Tabular). |
QueryMode property |
This property specifies the default query method for a DirectQuery model. You set this property in the model designer when you deploy the model, but you can override it later. The property has these values:
For more information, see Set or Change the Preferred Connection Method for DirectQuery. |
DirectQueryMode property |
After the model has been deployed, you can change the preferred query data source for a DirectQuery model, by changing this property in SQL Server Management Studio Like the previous property, this property specifies the default data source for the model, and has these values:
For more information, see Set or Change the Preferred Connection Method for DirectQuery. |
Impersonation Settings property |
This property defines the credentials used to connect to the relational data source at query time. You can set this property in the model designer, and you can change the value later, after the model has been deployed. Note that these credentials are used only for answering queries against the relational data store; they are not the same as the credentials used for processing the cache of a hybrid model. Impersonation cannot be used when the model is used within memory only. The setting ImpersonateCurrentUser, is invalid unless the model is using DirectQuery mode. |
Additionally, if your model includes partitions, you must choose one partition to use as the source for queries in DirectQuery mode. For more information, see Partitions and DirectQuery Mode (SSAS Tabular).
Related Topics and Tasks
Topic |
Description |
---|---|
Describes how partitions are used in models configured for DirectQuery mode. |
|
Describes restrictions and compatibility requirements on the formulas that you can use in models configured for DirectQuery mode. |
|
Describes how you can change the design-time environment so that it supports using DirectQuery mode. |
|
Describes how to change the DirectQuery partition. |
|
Set or Change the Preferred Connection Method for DirectQuery |
Describes how to set or change the connection method for models configured for DirectQuery. |
Describes DirectQuery deployment scenarios. |
|
Configure In-Memory or DirectQuery Access for a Tabular Model Database |
Describes some common DirectQuery configurations. |
Explains how to clear the cache of the tabular model. |