Capacity planning in Power BI embedded analytics

Calculating the type of capacity you need for a Power BI embedded analytics deployment can be complicated. The capacity you need depends on several parameters, some of which are hard to predict.

Some of the things to consider when planning your capacity are:

  • The data models you're using.
  • The number and complexity of required queries.
  • The hourly distribution of your application usage.
  • Data refresh rates.
  • Other usage patterns that are hard to predict.


This article explains how to plan what capacity you need and how to do a load testing assessment for Power BI embedded analytics A-SKUs.

When planning your capacity, take the following steps:

  1. Optimize your performance and resource consumption.
  2. Determine your minimum SKU.
  3. Assess your capacity load.
  4. Set up autoscale.

Optimize your performance and resource consumption

Before you start any capacity planning or load testing assessment, optimize the performance and resource consumption (especially the memory footprint) of your reports and semantic models.

To optimize your performance, follow the guidelines in the following resources:

For a detailed tutorial on optimizing performance, see the Optimize a model for performance in Power BI training module.

Determine your minimum SKU

The following table summarizes all the limitations that are dependent on the capacity size. To determine the minimum SKU for your capacity, check the Max memory (GB) column under the Semantic model header. Also, keep in mind the current limitations.

SKU       Capacity Units (CU)   Power BI SKU   Power BI v-cores
F2        2                     N/A            N/A
F4        4                     N/A            N/A
F8        8                     EM1/A1         1
F16       16                    EM2/A2         2
F32       32                    EM3/A3         4
F64       64                    P1/A4          8
F128      128                   P2/A5          16
F256      256                   P3/A6          32
F512¹     512                   P4/A7          64
F1024¹    1,024                 P5/A8          128
F2048¹    2,048                 N/A            N/A

¹ These SKUs aren't available in all regions. To request using these SKUs in regions where they're not available, contact your Microsoft account manager.
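If you script any part of your planning, it can help to keep the SKU table in a machine-readable form. The following is a minimal sketch that encodes the rows above and looks up the row for a given A-SKU. The names `Sku`, `SKUS`, and `fabric_equivalent` are hypothetical and exist only for this illustration.

```python
from typing import NamedTuple, Optional


class Sku(NamedTuple):
    name: str                     # SKU column, for example "F64"
    capacity_units: int           # Capacity Units (CU) column
    power_bi_sku: Optional[str]   # Power BI SKU column; None where the table says N/A
    v_cores: Optional[int]        # Power BI v-cores column; None where the table says N/A


# Rows copied from the table above.
SKUS = [
    Sku("F2", 2, None, None),
    Sku("F4", 4, None, None),
    Sku("F8", 8, "EM1/A1", 1),
    Sku("F16", 16, "EM2/A2", 2),
    Sku("F32", 32, "EM3/A3", 4),
    Sku("F64", 64, "P1/A4", 8),
    Sku("F128", 128, "P2/A5", 16),
    Sku("F256", 256, "P3/A6", 32),
    Sku("F512", 512, "P4/A7", 64),
    Sku("F1024", 1024, "P5/A8", 128),
    Sku("F2048", 2048, None, None),
]


def fabric_equivalent(power_bi_sku: str) -> Sku:
    """Return the table row whose Power BI SKU column lists the given SKU."""
    wanted = power_bi_sku.upper()
    for sku in SKUS:
        if sku.power_bi_sku and wanted in sku.power_bi_sku.split("/"):
            return sku
    raise KeyError(f"{power_bi_sku} is not listed in the table")
```

For example, `fabric_equivalent("A4")` returns the F64 row, which has 64 CUs and 8 v-cores.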

Assess your capacity load

To test or assess your capacity load:

  1. Create a Premium Power BI Embedded capacity in Azure for the testing. Use a subscription associated with the same Microsoft Entra tenant as your Power BI tenant and a user account that's signed in to that same tenant.

  2. Assign the workspace (or workspaces) you'll use to test to the Premium capacity you created. You can assign a workspace in one of the following ways:

  3. As the capacity admin, install the Microsoft Fabric Capacity Metrics app. Provide the capacity ID and time (in days) to monitor, and then refresh the data.

  4. Use the Power BI Capacity Load Assessment Tool to assess your capacity needs. This GitHub repository also includes a video walk-through. Use this tool carefully: test with up to a few dozen concurrent simulated users, and then extrapolate the results to higher concurrent loads (hundreds or thousands, depending on your needs). For more information, see Assess your capacity load. Alternatively, use other load testing tools, but treat the iFrame as a black box and simulate user activity via JavaScript code.

  5. Use the Microsoft Fabric Capacity Metrics app that you installed in step 3 to monitor the capacity utilization incurred via the load testing tool. Alternatively, you can monitor the capacity by checking the Premium metrics by using alerts in Azure Monitor.

Consider using a larger SKU for your capacity if the CPU utilization that the load test generates on your capacity approaches the capacity limit.
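The extrapolation from a small load test to production concurrency can be sketched as simple arithmetic. The example below is only a rough first estimate under two loudly stated assumptions that the article doesn't confirm: CPU utilization grows linearly with the number of concurrent users, and available CPU scales in proportion to the CU column of the SKU table. The function names and the 80 percent headroom value are hypothetical. Real workloads are often nonlinear, so confirm any estimate with a further load test.

```python
from typing import Optional

# A-SKUs with their Capacity Units, taken from the SKU table in this article.
A_SKU_CUS = [
    ("A1", 8), ("A2", 16), ("A3", 32), ("A4", 64),
    ("A5", 128), ("A6", 256), ("A7", 512), ("A8", 1024),
]


def extrapolate_cpu(measured_cpu_pct: float, tested_users: int, target_users: int) -> float:
    """Estimate CPU percent on the tested SKU at target_users.

    ASSUMPTION: CPU utilization grows linearly with concurrent users.
    """
    if tested_users <= 0:
        raise ValueError("tested_users must be positive")
    return measured_cpu_pct * target_users / tested_users


def smallest_sku_for(
    measured_cpu_pct: float,
    tested_sku_cu: int,
    tested_users: int,
    target_users: int,
    headroom_pct: float = 80.0,  # hypothetical ceiling for planned utilization
) -> Optional[str]:
    """Return the smallest A-SKU that keeps estimated load under headroom_pct.

    ASSUMPTION: available CPU is proportional to the SKU's Capacity Units.
    Returns None if no SKU in the table is large enough.
    """
    estimated_pct = extrapolate_cpu(measured_cpu_pct, tested_users, target_users)
    load_in_cu = tested_sku_cu * estimated_pct / 100.0
    for name, cu in A_SKU_CUS:
        if load_in_cu <= cu * headroom_pct / 100.0:
            return name
    return None
```

For example, if 40 simulated users drive an A1 (8 CU) to 30 percent CPU, the linear estimate for 400 users is 300 percent of an A1, or 24 CU. With 80 percent headroom, `smallest_sku_for(30, 8, 40, 400)` selects A3.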

Set up autoscale

You can use the following autoscaling technique to elastically resize your A-SKU capacity to address its current memory and CPU needs.

  • Use the Capacities Update API to scale the capacity SKU up or down. To see how to use the API to create your own scripts for scaling up and down, see a runbook PowerShell script capacity scale-up sample.

  • Use Monitor alerts to track the following Power BI Embedded capacity metrics:

    • Overload (1 if your capacity's CPU has surpassed 100 percent and is in an overloaded state, otherwise 0)
    • CPU (percentage of CPU utilization)
    • CPU Per Workload if specific workloads (like paginated reports) are used
  • Configure the Monitor alerts so that when these metrics hit the specified values, they trigger a script that scales the capacity up or down.
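To make the first bullet more concrete, the sketch below builds (but doesn't send) an HTTP request for a capacity SKU change. The URL shape, the `api-version` value, and the `PBIE_Azure` tier name reflect one reading of the Azure Resource Manager Capacities Update operation and are assumptions to verify against the current API reference before use. The function name and parameters are hypothetical, and obtaining the bearer token is out of scope here.

```python
import json
import urllib.request

# ASSUMPTION: verify the API version against the current Capacities Update reference.
API_VERSION = "2021-01-01"


def build_scale_request(
    subscription_id: str,
    resource_group: str,
    capacity_name: str,
    sku_name: str,
    bearer_token: str,
) -> urllib.request.Request:
    """Build a PATCH request that asks Azure to change the capacity SKU.

    The request is only constructed here; pass it to urllib.request.urlopen
    (or an HTTP client of your choice) to actually send it.
    """
    url = (
        f"https://management.azure.com/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.PowerBIDedicated/capacities/{capacity_name}"
        f"?api-version={API_VERSION}"
    )
    # ASSUMPTION: "PBIE_Azure" is the tier name used for Power BI Embedded A-SKUs.
    body = json.dumps({"sku": {"name": sku_name, "tier": "PBIE_Azure"}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {bearer_token}",
            "Content-Type": "application/json",
        },
    )
```

Keeping request construction separate from sending makes the script easier to test and lets a runbook log the target SKU before applying the change.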

For example, you can create a rule that invokes the scale-up capacity runbook to update the capacity to a higher SKU if the overload is 1 or if the CPU value reaches 95 percent. You can also create a rule that invokes a scale-down capacity runbook script to update the capacity to a lower SKU if the CPU value drops below 45 or 50 percent.
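The example rules can be expressed as a small decision function. This is a minimal sketch, not the logic of the referenced runbook. The function name is hypothetical, the thresholds default to the example values in the preceding paragraph (using 45 percent for scale-down), and moving one SKU step at a time is an assumption made for illustration.

```python
# A-SKU ladder, ordered from smallest to largest, per the SKU table in this article.
A_SKUS = ["A1", "A2", "A3", "A4", "A5", "A6", "A7", "A8"]


def next_sku(
    current: str,
    overload: int,
    cpu_pct: float,
    scale_up_at: float = 95.0,
    scale_down_below: float = 45.0,
) -> str:
    """Return the SKU the capacity should move to, or the current SKU if no change.

    Scale up one step when the Overload metric is 1 or CPU reaches scale_up_at.
    Scale down one step when CPU drops below scale_down_below.
    The result is clamped to the ends of the ladder.
    """
    i = A_SKUS.index(current)  # raises ValueError for an unknown SKU name
    if overload == 1 or cpu_pct >= scale_up_at:
        return A_SKUS[min(i + 1, len(A_SKUS) - 1)]
    if cpu_pct < scale_down_below:
        return A_SKUS[max(i - 1, 0)]
    return current
```

The gap between the scale-up and scale-down thresholds keeps the capacity from flipping back and forth between two SKUs when utilization hovers near a single threshold.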

You can also invoke scale-up and scale-down runbooks programmatically on demand before and after a semantic model is refreshed. This approach ensures your capacity has enough RAM (GB) for large semantic models that use that capacity.
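One way to structure on-demand scaling around a refresh is a context manager that scales up before the work and always scales back down afterward, even if the refresh fails. In this sketch, `scale` is a hypothetical callable that you supply (for example, a wrapper that invokes your scale runbook), and the SKU names are placeholders.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def scaled_for_refresh(
    scale: Callable[[str], None],  # hypothetical: applies the given SKU to the capacity
    normal_sku: str,
    refresh_sku: str,
) -> Iterator[None]:
    """Scale to refresh_sku for the duration of the block, then back to normal_sku."""
    scale(refresh_sku)
    try:
        yield
    finally:
        # Runs on both success and failure so the larger SKU isn't left running.
        scale(normal_sku)
```

A caller would wrap the refresh like this: `with scaled_for_refresh(scale, "A2", "A4"): run_refresh()`, where `run_refresh` stands for whatever code triggers the semantic model refresh.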