Create and manage aggregations

Completed

When aggregating data, you summarize that data and present it in at a higher level of granularity. For example, you can summarize sales data and group it by date, customer, product, and so on. The aggregation process reduces the table sizes in the semantic model, allowing you to focus on important data and helping to improve the query performance.

Diagram shows theory behind aggregating data.

Your organization might decide to use aggregations in their semantic models for the following reasons:

  • You work with large volumes of data. In this case, aggregations provide better query performance and help you analyze and reveal the insights over this large data. Aggregated data is cached and, therefore, uses a fraction of the resources that are required for detailed data.
  • You experience a slow data refresh. In this case, aggregations help to speed up the refresh process. The smaller cache size reduces the refresh time, so data gets to users faster. Instead of refreshing what could be millions of rows, you refresh a smaller amount of data instead.
  • You have a large semantic model. In this case, aggregations help to reduce and maintain the size of your model.
  • You anticipate future growth of your semantic model. In this case, you can use aggregations as a proactive step toward future proofing your semantic model by lessening the potential for performance and refresh issues and overall query problems.

Continuing with the Tailwind Traders scenario, you have taken several steps to optimize the performance of the semantic model, but the IT team has informed you that the file size is still too large. The file size is currently 1 gigabyte (GB), so you need to reduce it to around 50 megabytes (MB). During your performance review, you identified that the previous developer didn't use aggregations in the semantic model, so you now want to create aggregations for the sales data to reduce the file size and further optimize the performance.

Create aggregations

Before you create aggregations, you should decide on the level of granularity on which you want to create them. In this example, you want to aggregate the sales data at the day level.

When you decide on the grain, the next step is to decide on how you want to create the aggregations. You can create aggregations in different ways and each method will yield the same results, for example:

  • If you have access to the database, you can create a table (or view) and then import it into Power BI Desktop.
  • In Power BI Desktop, you can use Power Query to create the aggregations step-by-step.

In this example, you open a query in Power Query and notice that the data has not been aggregated; it has over 999 rows, as illustrated the following screenshot.

Screenshot displays total number of rows before aggregation.

You want to group by the OrderDate column and summarize the OrderQuantity and SalesAmount columns. Start by selecting Choose Columns on the Home ribbon tab. In the window that opens, select the columns that you want in the aggregation, and then select OK.

Screenshot shows how to choose columns when aggregating data.

When the selected columns display on the page, select the Group By option on the Home ribbon tab. In the window that opens, select the column that you want to group by (OrderDate) and enter a name for the new column (OnlineOrdersCount).

Select the Advanced option, and then select the Add aggregation button to configure another column row. Enter a name for the aggregation column, select the operation of the column, and then select the column to which you want to link the aggregation. Repeat these steps until you have added all the aggregations, and then select OK.

Screenshot shows how to use the Group by button.

It might take a few minutes for a preview of your aggregation to display, but when it does, you'll see how the data has been transformed. The data will be aggregated into each date, and you will be able to see the values for the orders count and the respective sum of the sales amount and order quantity.

Screenshot of the resulting table after aggregation.

Select the Close and Apply button to close Power Query Editor and apply the changes to your semantic model. In Power BI Desktop, on the Home ribbon tab, select Refresh. Observe the screen because a brief message will display the number of rows that your semantic model has loaded. This number of rows should be significantly less than the number that you started with. You can also see this number when you open Power Query Editor again, as illustrated in the following screenshot. In this example, the number of rows was reduced to 30.

Screenshot of the total rows and columns after aggregation.

Remember, you started with 999+ rows. Using aggregation has significantly reduced the number of rows in your semantic model, which means that Power BI has less data to refresh and your model should perform better.

Manage aggregations

You can later manage aggregations in Power BI Desktop to make changes to their behavior, if required.

You can open the Manage Aggregations window from any view in Power BI Desktop. In the Data pane, right-click the table and then select Manage aggregations.

Screenshot shows the manage aggregations window.

For each aggregation column, you can select an option from the Summarization dropdown list and make changes to the selected detail table and column. When you are finished managing the aggregations, select Apply All.

For more detailed information on how to create and manage aggregations, see Use aggregations in Power BI Desktop.