This article explains how the Power BI service delivers high availability and provides business continuity and disaster recovery to its users. After reading this article, you should have a better understanding of how high availability is achieved, under what circumstances Power BI performs a failover, and what to expect from the service when it fails over.
What does "high availability" mean for Power BI?
Power BI is fully managed software as a service (SaaS). Power BI is resilient to infrastructure failures so that users can always access their reports. For information about SLAs, see Licensing Resources and Documents.
Power BI uses Azure availability zones to protect Power BI reports, applications, and data from datacenter failures. Availability zones are automatically applied and used for Power BI. Availability zones are fault-isolated locations within an Azure region that provide three or more distinct and unique locations within an Azure region that have redundant power, cooling, and networking. Availability zones allow Power BI customers to run critical applications with higher availability and fault tolerance to datacenter failures. Availability zones provide customers with the ability to withstand datacenter failures without triggering a Power BI failover.
For more information, see What are Azure regions and availability zones?
What is a Power BI failover?
Power BI maintains multiple instances of each component in Azure datacenters (also known as regions) to guarantee business continuity. If there's an outage, or Power BI becomes inaccessible or inoperable in a region, Power BI fails all its components in that region to a backup instance. The failover restores availability and operability to the Power BI service instance in a new region usually within the same geographic location. For more information, see the Microsoft Trust Center.
A failed-over Power BI service instance supports only read operations, which means the following operations aren't supported during failover: refreshes, report publish operations, dashboard or report modifications, and other operations that require changes to Power BI metadata (for example, inserting a comment in a report). Read operations, such as displaying dashboards and displaying reports (that aren't based on DirectQuery or Live Connect to on-premises data sources) continue to function normally.
How are backup instances kept in sync with my data?
All Power BI service components regularly sync their backup instances. There's a 15-minute targeted point-in-time sync for any content uploaded or changed in Power BI. If there's a failover, Power BI uses Azure storage geo-redundant replication and Azure SQL geo redundant replication to guarantee backup instances exist in other regions, and can be used.
Where are the failover clusters located?
Backup instances reside within the same geographic location (geo) that you select when your organization signs up for Power BI, except where noted in the Microsoft Trust Center. A geo can contain several regions, and Microsoft might replicate data to any of the regions within a specific geo for data resiliency. Microsoft doesn't replicate or move customer data outside the geo. For a mapping of the geos offered by Power BI and the regions within them, see the Microsoft Trust Center.
How does Microsoft decide to fail over?
There are two different systems that indicate when a failover might be required:
- External and internal monitoring probes indicate a lack of availability or inability to operate properly. Indications might be based on outages detected in Power BI components or one or more of the services that Power BI depends on in a region.
- The Microsoft Azure central operations team reports on critical outages in a region.
In both cases, Power BI executive team members decide to fail over. The decision isn't automated. After the decision is made, failover is automatic.
How do I know Power BI is in failover mode?
A notification is posted on the Power BI support page. Notification information includes the major operations that aren't available, including publish, refresh, create dashboard, duplicate dashboard, and permission changes.
How long does it take Power BI to fail over?
For a zone failover, Power BI can complete the failover process in approximately 30 seconds and continues to function without limitations. For a regional failover, Power BI takes approximately 15 minutes to become operational again after the decision is made that a failover is required. The time to identify that a failover is required varies, based on the scenario that caused the failover.
Power BI uses Azure Storage GEO replication to perform the failover. Such replications usually have a return point of 15 minutes, however, Power BI can't guarantee a timeframe. For more information, see Azure storage redundancy.
What happens to workspaces and reports if my Premium capacity becomes unavailable?
If a Premium capacity becomes unavailable, workspaces and reports remain accessible and visible to all.
When does my Power BI instance return to the original region?
Power BI service instances return to their original region when the issue that caused the failover is resolved. Check the Power BI support page: When the issue is resolved, the Power BI team removes the notification that describes the failover. At that point, operations should be back to normal.
Am I responsible for the availability of my Power BI solution?
If the Power BI solution used in your organization involves one of the following elements, you must take measures to guarantee that the solution remains highly available:
- If your organization uses Power BI Premium, ensure that the Premium capacity is sized to meet the load demands of your deployment. To help you plan for and meet this requirement, see the Power BI Premium Planning and Deployment white paper. To help with monitoring, new features are regularly added to the admin portal in Power BI and the Power BI Premium Capacity Metrics app.
- If your organization accesses on-premises data sources by using the on-premises data gateway, you must set up the gateway to support high availability, see Manage on-premises data gateway high availability clusters and load balancing. Use this guidance whether you're refreshing reports in import mode, or accessing data or data models by using DirectQuery or Live Connect.
Do gateways function in failover mode?
No. Data required from on-premises data sources (any reports and dashboards based on Direct Query and Live Connect) doesn't work during a failover. The gateway configuration doesn't change though. When the Power BI instance returns to its original state, the gateways return to normal functions.
If there's an extreme disaster in a primary region that prevents you from restoring a gateway for a considerable duration, the failed-over primary region allows read and write operations, so you can redeploy and configure a gateway against the new region.
You can choose to install a new gateway on a different machine or take over an existing gateway. Taking over the existing gateway should be simpler, because all the data sources associated with the old gateway are carried over to the new one.