Create a Calculated Column
This topic shows you how to create a calculated column based on data in the DAX sample workbook, and includes information about using AutoComplete when building a formula. For more information about formulas, see Build Formulas for Calculations.
Understanding Calculated Columns
A calculated column is a column that you add to an existing PowerPivot table. Instead of pasting or importing values in the column, you create a DAX formula that defines the column values. The calculated column can be used in PivotTable or PivotChart as you would any other data column.
The formulas in calculated columns are much like the formulas that you create in Excel. Unlike in Excel, however, you cannot create different formulas for different rows in a table; instead, the DAX formula is automatically applied to the entire column.
When a column contains a formula, the value is computed for each row. The results are calculated for the column as soon as you create the formula. Column values are then recalculated as necessary, such as when the underlying data is refreshed.
You can create calculated columns that are based on measures and other calculated columns. For example, you might create one calculated column to extract a number from a string of text, and then use that number in another calculated column.
Creating a Calculated Column
A calculated column is based on data that you have added into an existing table. For example, you might choose to concatenate values, perform addition, extract substrings, or compare the values in other fields. To add a calculated column, you must have already added at least one table in your PowerPivot workbook.
This example demonstrates the use of AutoComplete to build a simple formula for use in a new calculated column. The formula is as follows:
=EOMONTH([StartDate],0])
The formula extracts the month from the StartDate column in the Promotion table in the DAX sample workbook. It then calculates the end of the month value for each row in the Promotion table. The second parameter specifies the number of months before or after the month in StartDate; in this case, 0 means the same month. For example, if the value in the StartDate column is 6/1/2001, the value in the calculated column will be 6/30/2001.
For information about the sample workbook, see Get Sample Data for PowerPivot.
Note
In Windows Vista and Windows 7, features in the PowerPivot window are available on a ribbon, which is discussed in this topic. In Windows XP, features are available from a set of menus. If you are using Windows XP and want to see how the menu commands relate to the ribbon commands, see The PowerPivot UI in Windows XP.
To create a calculated column using AutoComplete |
|
Naming a Calculated Column
By default new calculated columns are added to the right of other columns in a worksheet, and the column is automatically assigned the default name of CalculatedColumn1, CalculatedColumn2, and so forth. You can rearrange and rename columns after they are created. Be aware of the following restrictions on changes to calculated columns:
Each column name must be unique within a table.
Avoid names that have already been used for measures within the same workbook. Although it is possible for a measure and a calculated column to have the same name, if names are not unique you can easily get calculation errors. To avoid accidentally invoking a measure, when referring to a column always use a fully qualified column reference.
When you rename a calculated column, any formulas that rely on the existing column must be updated. Unless you are in manual update mode, updating the results of formulas takes place automatically. However, this operation might take some time.
There are some characters that cannot be used within the names of columns, or in the names of objects in a PowerPivot workbook. For more information, see "Naming Requirements" in DAX Syntax Specification for PowerPivot.
To rename or edit an existing calculated column |
|
Understanding the Performance of Calculated Columns
The formula for a calculated column can be more resource-intensive than the formula used for a measure. One reason is that the result for a calculated column is always calculated for each row in a table, whereas a measure is only calculated for the cells that are used in the PivotTable or PivotChart.
For example, a table with a million rows will always have a calculated column with a million results, and a corresponding effect on performance. However, a PivotTable generally filters data by applying row and column headings; therefore, the measure is calculated only for the subset of data in each cell of the PivotTable.
A formula has dependencies on the objects that are referenced in the formula, such as other columns or expressions that evaluate values. For example, a calculated column that is based on another column, or a calculation that contains an expression with a column reference, cannot be evaluated until the other column is evaluated. By default, automatic refresh is enabled in workbooks; therefore, all such dependencies can affect performance while values are updated and formulas refreshed.
To avoid performance issues when you create calculated columns, follow these guidelines:
Rather than create a single formula that contains many complex dependencies, create the formulas in steps, with results saved to columns, so that you can validate the results and assess performance.
Modification of data will often require that calculated columns be recalculated. You can prevent this by setting the recalculation mode to manual; however, if any values in the calculated column are incorrect the column will be grayed out, until you refresh and recalculate the data.
If you change or delete relationships between tables, formulas that use columns in those tables will become invalid.
If you create a formula that contains a circular or self-referencing dependency, an error will occur.
See Also
Concepts
Other Resources
Key Concepts in DAX