DirectQuery 101

DirectQuery is to tabular modeling what ROLAP is to multidimensional modeling. It’s a way to get query results from your model in real time without processing data into the model first. You can get started with DirectQuery by perusing some of our MSDN documentation on DirectQuery. Also, Edward Melomed did a DirectQuery overview webcast.

Here are some fast facts about DirectQuery, in Q&A format:

Q: Why do I have to set DirectQuery modes twice, once on the Model.bim file properties and once in the deployment properties ?

A: You set the DirectQuery mode on the BIM file to turn the designer into DirectQuery editing mode. This changes your modeling environment. Once this mode is on, the designer will enforce for you all of the DirectQuery mode restrictions (single SQL server data source, no calc columns, etc), so there are no nasty surprises for you at deployment time.

The deployment property tells the deployment server which storage mode you want to use for the database. This property allows you to tell the deployment server whether or not to keep a VertiPaq cache for your DirectQuery model, and if so, whether the cache is the default place from which to fetch query results.

The reason why there’s two properties – when editing a DirectQuery model, the workspace database is always in VertiPaq with DirectQuery mode. The grid view always shows data from the VertiPaq cache. There is no way to make the workspace database talk directly to the data source. To make this distinction clear (workspace database does one thing, but deployment database does another), we made two properties.

Q: What is the difference between VertiPaq with DirectQuery and DirectQuery with VertiPaq? Which one should I pick?

A: These two modes are the documentation calls “hybrid modes”. Both hybrid modes support DirectQuery and VertiPaq queries. The difference between VertiPaq with DirectQuery and DirectQuery with VertiPaq is the default place from which the results are fetched. The former uses the VertiPaq cache by default, the latter uses DirectQuery to go directly to the data source by default.

If you have a model where you want to have Crescent use DirectQuery and another MDX issuing client (such as Excel) hit the VertiPaq cache, you must use DirectQuery with VertiPaq mode for Denali. Crescent doesn’t allow you to switch back and forth between DirectQuery mode on the same model.

Q: What is the difference between “VertiPaq with DirectQuery” and InMemoryWithDirectQuery? The query mode I pick in BIDS is not the same as the one I see in SSMS.

A: For CTP3, the query mode in BIDS is not the same as the one you see in SSMS. SSMS exposes the name of the enum exposed through AMO and the DDL, whereas BIDS has “pretty printed” the string and inserted the branding for the in-memory VertiPaq engine. The BIDS strings will change in the future.

The following table shows the mapping between the values for CTP3:

Query Mode in BIDS DirectQueryMode in SSMS DDL
DirectQuery DirectQuery <ddl300_300:DirectQueryMode>     DirectQuery </ddl300_300:DirectQueryMode>
VertiPaq with DirectQuery InMemoryWithDirectQuery <ddl300_300:DirectQueryMode>     InMemoryWithDirectQuery </ddl300_300:DirectQueryMode>
DirectQuery with VertiPaq DirectQueryWithInMemory <ddl300_300:DirectQueryMode>     DirectQueryWithInMemory </ddl300_300:DirectQueryMode>

 

Q: What are the advantages of using one of the hybrid modes over using DirectQuery only mode? What are the disadvantages?

A: One advantage of using a hybrid mode is that you can serve clients that issue MDX queries (most of them) and clients that issue DAX queries (Crescent) from the same model. In Denali, only clients that issue DAX queries can use DirectQuery. Clients such as Excel which issue MDX queries must fetch results from the VertiPaq cache.

One disadvantage – if you have a hybrid mode, you must process the database. If your data doesn’t fit in memory, you have a problem.

Q: What is the connection string property to flip between VertiPaq and DirectQuery modes for models in hybrid mode?

A: The DirectQueryMode connection string property allows you to toggle between modes. This property takes one of three values:

  • Default
  • DirectQuery
  • InMemory

This connection string property doesn’t work in CTP3. Greg Galloway filed a Connect item on this, the issue has been resolved as fixed and the fix should be in the Denali release.

Pretend you are in the future where this connection string property is useful. Let’s also assume you have deployed a model in DirectQuery with VertiPaq mode for the Crescent + Excel scenario described above. The connection string you would use in Excel is something like:

Provider=MSOLAP.5;Integrated Security=SSPI;Persist Security Info=True;Data Source=.;Initial Catalog=DirectQuery test;DirectQueryMode=InMemory;

Q: DirectQuery models can have multiple partitions. Does that mean I can query a model and have it return most of my results from the VertiPaq cache and some of my results directly from the data source?

A: Not in Denali, no. Query results come either from the VertiPaq cache or directly from the data source. They never come from both at the same time.

Q: What is a DirectQuery only partition? Why would I have one?

A: A DirectQuery only partition is a partition that can never be processed. It is always empty. There are two reasons to have a DirectQuery only partition:

  • You have a DirectQuery only model. Although the engine allows you to process partitions in a DirectQuery only model, it makes no sense to do so. You can set the partition to be DirectQuery only so the partition is never processed.
  • You want to have a table with multiple partitions in a hybrid model. The partition definition for the DirectQuery partition would overlap with the partition definitions for the VertiPaq cache. Therefore, to avoid processing errors, the DirectQuery partition is set to DirectQuery only (which shows up as “Never Process” in BIDS)

Q: Can you give me an example of how I would configure a DirectQuery only partition?

A: Sure. I’ll do it step-by-step in the next post.

[Edit 9/15 - the planned post got pre-empted from other user requested content. The post is ready, see Configuring multiple partitions for a DirectQuery model]