DAX: The Language
This is a collaborative blog post by @konstantinos and @smoupre resulting from the forum discussion topic “Aggregating Duration/Time”. http://community.powerbi.com/t5/Desktop/Aggregating-Duration-Time/m-p/13350/highlight/true#M3358
Introduction
For users coming from an Excel background, it is easy to fall into the trap of using DAX in the same manner as Excel’s formula language. This is a natural choice as both Excel’s formula language and DAX are function-heavy languages. However, while DAX cannot be considered a true programming language in the sense of something like C# or Java due to the lack of some of the hallmarks of a true programming language (such as looping), DAX does blur the lines and is much more powerful than Excel’s formula language. Specifically, DAX supports comments and variables within the language itself and the use of these features can vastly improve the maintainability and supportability of a solution.
Scenario
Consider a scenario presented in Correlation, Seasonality and Forecasting with Power BI. In that article multiple “transient” columns and measures were used. These columns and measures were really simply a series of steps used to arrive at the ultimate desired measure, such as Correlation. In the article, the steps for calculating Correlation were given as:
- In forecasting, create new measure: AverageX = AVERAGEX(ALL(forecasting),[Sales])
- In wages, create new measure: AverageY = AVERAGEX(ALL(wages),[Wages])
- In forecasting, create new column: CorrelateX = [Sales] - [AverageX]
- In wages, create new column: CorrelateY = [Wages] - [AverageY]
- In forecasting create new column: CorrelateX*CorrelateX = [CorrelateX] * [CorrelateX]
- In forecasting create new column: CorrelateX*CorrelateY = [CorrelateX] * RELATED(wages[CorrelateY])
- In forecasting create new column: CorrelateY*CorrelateY = RELATED(wages[CorrelateY]) * RELATED(wages[CorrelateY])
- In forecasting, create new measure: SumCorrelateX*CorrelateX = SUMX(ALL(forecasting),[CorrelateX*CorrelateX])
- In forecasting, create new measure: SumCorrelateX*CorrelateY = SUMX(ALL(forecasting),[CorrelateX*CorrelateY])
- In forecasting, create new measure: SumCorrelateY*CorrelateY = SUMX(ALL(forecasting),[CorrelateY*CorrelateY])
- In forecasting, create new measure: Correlation = [SumCorrelateX*CorrelateY] / SQRT([SumCorrelateX*CorrelateX]*[SumCorrelateY*Cor relateY])
As one can see, the ultimate goal is to simply get to a measure called Correlation. All other columns and measures are simply a means to an end. Unfortunately, all of these columns and measures in effect clutter up the workspace and if not hidden from the end user could confuse the end user as to their ultimate purpose. Furthermore, as the author, if attempting to revisit the solution six months from now, one would have to, in effect, reverse engineer the solution by tracing back from Correlation the calculations for all other transient columns and measures.
The Power of DAX
Luckily, the powerful language-like features of DAX mentioned previously, comments and variables, help us improve our solution dramatically.
Instead of the 11 measures and columns created above, we could do this entirely within 2 custom columns and a single measure, with detailed explanations as to what is being calculated and why:
- In forecasting, create the following custom column:
CorrelateX2 =
// Gregory J Deckler - 2/18/2016
// Create a new column that subtracts the Sales from the average of Sales
//
// First, calculate the average of Sales, making sure to remove any filters on forecasting table
VAR AverageX2 = AVERAGEX(ALL(forecasting),[Sales])
// Return the Sales from the current row minus the average of all Sales
RETURN ( [Sales] - AverageX2)
- In wages, create the following custom column:
CorrelateY2 =
// Gregory J Deckler - 2/18/2016
// Create a new column that subtracts the Wages from the average of Wages
//
// First, calculate the average of Wages, making sure to remove any filters on wages table
VAR AverageY2 = AVERAGEX(ALL(wages),[Wages])
// Return the Wages from the current row minus the average of all Wages
RETURN ([Wages] - AverageY2)
- Create the following measure:
Correlation2 =
// Gregory J Deckler - 2/18/2016
// This measure calculates the correlation between Sales and Wages using previously created custom columns
// [CorrelationX2] (Sales) and [CorrelationY2) (Wages)
// [CorrelationX2] and [CorrelationY2] calculate the difference of the current Sales or Wages from the average
//
// First, sum the product of the difference in averages for Sales squared
VAR SumCorrelateX2SumCorrelateX2 = SUMX(ALL(forecasting),[CorrelateX2]*[CorrelateX2])
// Next, sum the product of the difference in averages for Sales multiplied by the related difference in averages for Wages
VAR SumCorrelateX2SumCorrelateY2 = SUMX(ALL(forecasting),[CorrelateX] * RELATED(wages[CorrelateY]))
// Then, sum the product of the difference in averages for Wages squared
VAR SumCorrelateYCorrelateY = SUMX(ALL(forecasting),RELATED(wages[CorrelateY]) * RELATED(wages[CorrelateY]))
// Return the correlation calculated using the formula:
// Sum the product of the difference in averages for Sales multiplied by the related difference in averages for Wages
// divided by the square root of the product
// the sum of the product of the difference in averages for Sales squared multiplied by
// the sum of the product of the difference in averages for Wages squared
RETURN (
SumCorrelateX2SumCorrelateY2 / SQRT(SumCorrelateX2SumCorrelateX2*SumCorrelateYCor<wbr style="box-sizing: inherit; -webkit-user-select: text;">relateY)
)
Let’s break down these formulas to point out the powerful DAX features that you may or may not be aware of and distinguish DAX as a quasi-language much more so than Excel formulas.
Comments
Comments in DAX begin with “//”. Once DAX sees this indicator, everything up until the end of the line (line break, use Alt-Enter) within the formula is considered a comment. This means that we can place comments on a line by themselves or after a particular bit of code, as with the Seconds calculation above where we use ROUNDUP.
Variables
Variables in DAX are created through the use of a function, perhaps rather ignobly designated to the category of “Other”, called VAR. VAR cannot be used independently and must be accompanied by the use of the nearly completely undocumented RETURN function. In fact, RETURN is only documented as part of the documentation on the VAR function: https://msdn.microsoft.com/en-us/library/mt243785.aspx.
Conclusion
DAX includes powerful programming-language-like features such as comments and variables that provide superior functionality over Excel’s formula language. To get equivalent functionality in Excel, one must resort to VBScript.
These powerful constructs within DAX allow for the creation of solutions that are superior in every respect with regards to readability, maintainability and supportability over time. In the final solution, all components and intermediary steps that comprise the final Correlation2 measure are entirely contained within that measure and two custom columns instead of 11 custom columns and measures, allowing one to easily digest exactly how that measure is calculated without the need for disjointed intermediary columns or reverse engineering.