Configuring multiple partitions in a DirectQuery with VertiPaq model, step by step
In my DirectQuery 101 post, I promised an example of configuring a model with DirectQuery only partitions. Here it is.
Imagine for a moment that AdventureWorks has turned into a fast moving mega-conglomerate, with transactions written to FactInternetSales all day long for years and years. Your job is to implement a model that requires up to the minute reporting on all data in Crescent, plus reporting on all data in Excel. It’s fine for Excel users to see the data updated nightly. However, Crescent users want to see the data in real time. How would you configure a model that satisfies these requirements?
The solution here: because Crescent users want real-time data access and you have Excel users as well, you need a model running in DirectQuery with VertiPaq (known to the engine as DirectQueryWithInMemory) mode. To satisfy the processing requirements, you must have multiple partitions for the model. The partitions will be designed as follows:
- There is one DirectQuery partition that queries the entire fact table. This partition is never processed.
- There are many partitions for the VertiPaq cache. The historical data is in a large partition which is not frequently processed. Incremental processing happens only on the most recent partition.
So let’s get started by creating a new tabular model and turning DirectQuery mode on. Create a new tabular project. Click the Model.bim file and view its properties. Change the DirectQuery Mode property from Off to On:
We can now construct the model. Let’s import FactInternetSales and its related tables from the AdventureWorksDWDenali data source. Once this is done, create a measure on the model. I like to make the Sum of Sales Amount measure on FactInternetSales using the autosum feature for quick testing.
Now let’s configure our partitions for the FactInternetSales table. Open the Partition Manager, here is what you see:
We have one partition in the table. Because there is only one partition in the table, it is marked as the DirectQuery partition (you can tell by the prefix in the partition name). Let’s add some more partitions and configure the DirectQuery partition appropriately.
To keep things simple, I will create only three partitions for the data that Excel users can see. I will have 2001-2010 data all in one partition, the data from Jan – Aug 2011 in a second partition, and current data (from Sept 2011 – present) in a third partition. Here is what that looks like:
You can see that the three new partitions are intended for use only with the VertiPaq cache. You know this because they are not prefixed with DirectQuery in front of their names, and because the “Set as DirectQuery” button is now enabled.
Now that we’ve configured our VertiPaq partitions, let’s configure the DirectQuery partition. Select the FactInternetSales partition in the Partition Manager. Change the name to “All Sales” to better reflect its purpose. Now let’s change its processing option.
The current value for Processing Option is “Allow partition to be processed” (which maps to the engine’s DirectQueryUsage property value InMemoryWithDirectQuery). We can’t use this processing option any more. This is because we now have overlapping partitions, and processing will either fail or return duplicate rows. Also, there is no need to ever process the “All Sales” partition, because it is only used by Crescent in DirectQuery mode. Excel users will get the data from other partitions. Let’s switch to “Never process this partition” (which maps to the DirectQueryUsage property value of DirectQueryOnly):
With that done, press OK. We have successfully configured our partitions.
Now we're ready to deploy our model. Before we deploy, we must change the deployment mode for the model to DirectQuery with VertiPaq so both Crescent and Excel can query the model. Right click the project file (in my case, TabularProject13.smproj) and select Properties. Change the Query Mode from VertiPaq to DirectQuery with VertiPaq:
With that done, deploy. Now we have a partitioned model for use with both Crescent and Excel. Enjoy.
[Edited 9/22 to correct an error - thanks Teo Lachev for finding it]