Focus on... Azure Planned Maintenance!

"Azure periodically performs updates to improve the #reliability, #performance, and #security of the host infrastructure for virtual machines.  All you need to get up to speed, in one post!"

During a recent customer workshop, as we explored and started to map out their cloud journey, we had a really good discussion about how Microsoft manages the underlying fabric of the Azure platform, compared with how the same tasks are handled in a typical on-premises environment. One advantage customers gain by moving to Azure is that they no longer need to manage, patch, or update the physical infrastructure, or maintain the space, power, servers, storage, network, and other components typically associated with a data centre. Within Azure this maintenance still needs to occur, but it is managed by Microsoft.

Over the past year, a number of announcements have increased the level of transparency to customers about how these updates are performed, allowing customers to better manage the availability of their core services and workloads. I wanted to take the opportunity to collate as many of the currently available resources as possible into this single 'Focus on...' post, so that anyone can quickly get up to speed on how to take advantage of new capabilities, such as the Planned Maintenance and Scheduled Events features, within their own Azure deployments.

> Bookmark this short URL! https://aka.ms/focuson/apm > Last Updated: 23rd January 2018 (periodically updated as a reference / index to relevant resources)

 

>> Introducing... Azure Planned Maintenance!

Azure periodically performs updates to improve the reliability, performance, and security of the host infrastructure for virtual machines. These updates range from patching software components in the hosting environment (like operating system, hypervisor, and various agents deployed on the host), upgrading networking components, to hardware decommissioning. The majority of these updates are performed without any impact to the hosted virtual machines. However, there are cases where updates do have an impact:

  • If the maintenance does not require a reboot, Azure uses in-place migration to pause the VM while the host is updated;
  • If maintenance requires a reboot, you get a notice of when the maintenance is planned. In these cases, you'll also be given a time window where you can start the maintenance yourself, at a time that works for you.

There have been a number of improvements to the planned maintenance experience in Azure, including better visibility and control of maintenance events that impact virtual machine availability - this introductory video covers how to create alerts, discover which virtual machines are scheduled for maintenance, and proactively start the maintenance using the Azure portal, REST API, Azure PowerShell, or Azure CLI.

 

Video: Virtual Machine Planned Maintenance

 

During a communicated window, customers can choose to start maintenance on their virtual machines. If you do not utilize the window, the virtual machines will be rebooted automatically during a scheduled maintenance window (which is visible to you). Starting the maintenance will result in the VM being redeployed to an already-updated host. While doing so, the content of the local (temporary) drive will be lost.
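To make the self-service window a little more concrete, here is a minimal sketch of how a script might decide whether maintenance can be started yet. It assumes you have already retrieved the VM's instance view and pulled out its `maintenanceRedeployStatus` object; the field names follow my understanding of the Compute REST API, while the helper names, the sample values, and the timestamp format are illustrative assumptions rather than anything taken from a real subscription.

```python
from datetime import datetime, timezone


def parse_utc(timestamp):
    """Parse an ISO 8601 UTC timestamp such as '2018-01-10T00:00:00Z'.

    The exact timestamp format is an assumption made for this sketch.
    """
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def can_start_maintenance(status, now):
    """Return True if customer-initiated maintenance is allowed at `now`.

    `status` is a dict shaped like the maintenanceRedeployStatus object
    from a VM instance view; `now` is a timezone-aware datetime.
    """
    if not status.get("isCustomerInitiatedMaintenanceAllowed", False):
        return False
    start = parse_utc(status["preMaintenanceWindowStartTime"])
    end = parse_utc(status["preMaintenanceWindowEndTime"])
    return start <= now <= end


# Hand-written sample for illustration only, not real output.
SAMPLE_STATUS = {
    "isCustomerInitiatedMaintenanceAllowed": True,
    "preMaintenanceWindowStartTime": "2018-01-10T00:00:00Z",
    "preMaintenanceWindowEndTime": "2018-01-17T00:00:00Z",
    "maintenanceWindowStartTime": "2018-01-20T00:00:00Z",
    "maintenanceWindowEndTime": "2018-01-27T00:00:00Z",
}

if __name__ == "__main__":
    during_window = datetime(2018, 1, 12, tzinfo=timezone.utc)
    print(can_start_maintenance(SAMPLE_STATUS, during_window))
```

If the check passes, the maintenance itself would still be started through one of the supported interfaces (portal, REST API, PowerShell, or CLI); the sketch only covers the decision, not the call.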

Native cloud applications running in a cloud service, availability set, or virtual machine scale set are resilient to planned maintenance, since only a single update domain is impacted at any given time.

You may want to use proactive-redeploy in the following cases:

  • Your application runs on a single virtual machine and you need to apply maintenance during off-hours;
  • You need to coordinate the time of the maintenance as part of your SLA;
  • You need more than 30 minutes between each VM restart even within an availability set;
  • You wish to take down the entire application (multiple tiers, multiple update domains) in order to complete the maintenance faster.

Scheduled Events is one of the subservices under Azure Metadata Service that surfaces information regarding upcoming events (for example, a reboot). Scheduled Events gives your application sufficient time to perform preventive tasks to minimize the effect of such events. Being part of the Azure Metadata Service, scheduled events are surfaced using a REST endpoint from within the VM. The information is available via a non-routable IP address so that it is not exposed outside the VM.
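As an illustration of what querying that endpoint from inside a VM might look like, here is a minimal Python sketch. The address and `api-version` reflect my understanding of the documented endpoint at the time of writing, so check the current documentation before relying on them. The helper names (`fetch_scheduled_events`, `disruptive_events`) and the sample document are hypothetical and only there to show the shape of the response.

```python
import json
import urllib.request

# Non-routable metadata address; only reachable from inside the VM.
# The api-version shown is an assumption and may need updating.
ENDPOINT = "http://169.254.169.254/metadata/scheduledevents?api-version=2017-08-01"


def fetch_scheduled_events(timeout=5):
    """Query the Scheduled Events endpoint and return the parsed JSON document."""
    request = urllib.request.Request(ENDPOINT, headers={"Metadata": "true"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def disruptive_events(document, event_types=("Reboot", "Redeploy")):
    """Return the events whose EventType is one of `event_types`."""
    return [
        event
        for event in document.get("Events", [])
        if event.get("EventType") in event_types
    ]


# Hand-written sample shaped like a Scheduled Events response (illustrative only).
SAMPLE = {
    "DocumentIncarnation": 1,
    "Events": [
        {
            "EventId": "sample-1",
            "EventType": "Freeze",
            "ResourceType": "VirtualMachine",
            "Resources": ["vm-a"],
            "EventStatus": "Scheduled",
            "NotBefore": "Mon, 19 Sep 2016 18:29:47 GMT",
        },
        {
            "EventId": "sample-2",
            "EventType": "Reboot",
            "ResourceType": "VirtualMachine",
            "Resources": ["vm-b"],
            "EventStatus": "Scheduled",
            "NotBefore": "Mon, 19 Sep 2016 18:44:47 GMT",
        },
    ],
}

if __name__ == "__main__":
    # Inside an Azure VM you would call fetch_scheduled_events() instead of using SAMPLE.
    for event in disruptive_events(SAMPLE):
        print(event["EventId"], event["EventType"], event["Resources"])
```

An application would typically poll this on a timer and, when a disruptive event appears, drain connections or checkpoint state before the `NotBefore` time.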

 

Video: Using Azure Scheduled Events to Prepare for VM Maintenance

 

>> Documentation

For over 18 months now, docs.microsoft.com has been running as our new unified technical documentation experience; to learn more check out our blog post: /en-us/teamblog/introducing-docs-microsoft-com. For additional documentation on Microsoft products or services, please visit MSDN (https://msdn.microsoft.com/) or TechNet (https://technet.microsoft.com/).

 

Planned Maintenance Documentation /

There are a number of useful articles to be aware of, dependent on the operating system of your virtual machine, as there will be some specific differences in how you can query the metadata service for upcoming scheduled events.

For Windows Virtual Machines:

For Linux Virtual Machines:

An industry-wide, hardware-based security vulnerability was disclosed on January 3 - additional guidance was published to docs.microsoft.com to cover off background and frequently asked questions related to the Azure Planned Maintenance activities:

 

Azure Architecture Center /en-us/azure/architecture/

The Azure Architecture Center is the official centre for guidance, blueprints, patterns, and best practices for building solutions with Microsoft Azure, curated by the Microsoft patterns & practices team. Specifically in the context of mitigating the potential impact of maintenance events, applications should look to take advantage of high availability options, such as availability sets and availability zones (in preview at the time of writing):

There are also a number of Cloud Design Patterns regarding availability and resiliency, which where possible should be architected into your application. Availability defines the proportion of time that the system is functional and working. It will be affected by system errors, infrastructure problems, malicious attacks, and system load. It is usually measured as a percentage of uptime. Cloud applications typically provide users with a service level agreement (SLA), which means that applications must be designed and implemented in a way that maximizes availability.
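To put "percentage of uptime" into numbers, here is a small sketch of the arithmetic. The function names are my own and the 30-day month is an assumption chosen for simplicity; real SLAs define their own measurement periods and exclusions.

```python
MINUTES_PER_30_DAY_MONTH = 30 * 24 * 60  # 43,200 minutes


def availability_percent(uptime_minutes, total_minutes):
    """Availability as a percentage of the total period."""
    return 100.0 * uptime_minutes / total_minutes


def allowed_downtime_minutes(sla_percent, total_minutes):
    """Maximum downtime that still meets the given availability target."""
    return total_minutes * (1 - sla_percent / 100.0)


if __name__ == "__main__":
    # A 99.9% target over a 30-day month leaves roughly 43.2 minutes of downtime.
    print(round(allowed_downtime_minutes(99.9, MINUTES_PER_30_DAY_MONTH), 1))
    # 30 minutes of downtime in that month gives an availability just above 99.9%.
    print(round(availability_percent(MINUTES_PER_30_DAY_MONTH - 30, MINUTES_PER_30_DAY_MONTH), 3))
```

Seen this way, a single unplanned 30-minute reboot consumes most of a 99.9% monthly budget, which is why spreading instances across update domains matters.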

Resiliency is the ability of a system to gracefully handle and recover from failures. The nature of cloud hosting, where applications are often multi-tenant, use shared platform services, compete for resources and bandwidth, communicate over the Internet, and run on commodity hardware means there is an increased likelihood that both transient and more permanent faults will arise. Detecting failures, and recovering quickly and efficiently, is necessary to maintain resiliency.

 

>> Updates & Roadmap

As the Azure platform continues to evolve, be aware of these sites so you can subscribe to the latest updates and feature releases.

 

Azure Blog https://azure.microsoft.com/en-us/blog/

Hear from Azure experts and developers about the latest information, insights, announcements, and news in the Microsoft Azure blog.

 

Azure Updates Blog https://azure.microsoft.com/en-us/updates/

In addition to the Azure Blog, further detail on all updates to Azure is available on the Azure Updates Blog.

  • Preview: Azure Service Health (10th July 2017)
    https://azure.microsoft.com/en-us/updates/azure-service-health-preview/
    "Azure Service Health Preview provides guidance and support when issues in Azure services affect you. It provides timely and personalized information about the impact of service issues and helps you prepare for upcoming planned maintenance."

 

Azure Roadmap https://azure.microsoft.com/en-us/roadmap/

As Azure continues to grow, you will want to stay informed. The product roadmap is the place to find out what’s new and what’s coming next. Let us know what you think by providing feedback and voting on items. You can also subscribe to notifications, so you’ll always be in the know.

 

>> Podcasts

Listening to podcasts can be a great way to keep up to date, especially when you're out and about, for example in the car on the way to work. While much of the Channel 9 content is also available in audio format, there are a small number of podcasts that have touched on Planned Maintenance in the past.

 

Microsoft Cloud Show https://www.microsoftcloudshow.com/

Whether you are new to the cloud, old hat, or just starting to consider what the cloud can do for you, this podcast is the place to find all the latest and greatest news and information on what's going on in the cloud universe. Join long time Microsoft aficionados and SharePoint experts Andrew Connell and Chris Johnson as they dissect the noise and distil it down, read between the lines and offer expert opinion on what is really going on. Just the information … no marketing … no BS, just two dudes telling you how they see it.

 

>> Presentations

Throughout the year, Microsoft hosts a number of public events that allow both in-person and online attendance. Common to all of them is on-demand access to recordings of most, if not all, of the sessions presented. These sessions are often given by the engineering teams working closely on the Azure platform itself, or by experienced architects working deep in the field, implementing Azure services to solve customers' business challenges.

 

Ignite 2017 - 25th to 29th September 2017 https://myignite.microsoft.com/videos/

Microsoft Ignite brings together the best of previously individual conferences (Microsoft Management Summit; Microsoft Exchange, SharePoint, Lync, Project, and TechEd conferences) into a single annual event, last held 25th to 29th September 2017. It showcases the company’s enterprise products and services while providing incredibly valuable IT training. It also provides plentiful opportunities for IT professionals to get together for collaboration and networking.

 

Tuesdays with Corey
https://channel9.msdn.com/Shows/Tuesdays-With-Corey

Corey Sanders answers your questions about Microsoft Azure - Virtual Machines, Web Sites, Mobile Services, Dev/Test etc. If you have a question, Corey will find the answer!

 

Azure Friday
https://channel9.msdn.com/Shows/Azure-Friday

Join Scott Hanselman as he engages one-on-one with the engineers who build the services that power Microsoft Azure as they demo capabilities, answer Scott's questions, and share their insights. Follow us at: friday.azure.com.

  • Using Azure Scheduled Events to Prepare for VM Maintenance (27th July 2017)
    https://channel9.msdn.com/Shows/Azure-Friday/Using-Azure-Scheduled-Events-to-Prepare-for-VM-Maintenance
    "Eric Radzikowski joins Scott Hanselman on Azure Friday to discuss how developers can increase application availability by using Azure Scheduled Events to prepare for virtual machine maintenance."
  • Azure Service Health (15th September 2017)
    https://channel9.msdn.com/Shows/Azure-Friday/Azure-Service-Health
    "Dushyant Gill joins Scott Hanselman to talk about Azure Service Health. When issues in Azure services affect your business-critical resources, Azure Service Health notifies you and your teams, helps you understand the impacts of the issue, and keeps you updated as the issue is resolved. It also helps you prepare for planned maintenance and changes that could affect the availability of your resources."
  • Virtual Machine Planned Maintenance (18th September 2017)
    https://channel9.msdn.com/Shows/Azure-Friday/Virtual-Machine-Planned-Maintenance
    "Ziv Rafalovich joins Scott Hanselman to talk about improvements to the planned maintenance experience in Azure, including better visibility and control of maintenance events that impact virtual machine availability. Learn how to create alerts, discover which virtual machines are scheduled for maintenance, and proactively start the maintenance using the Azure portal, REST API, Azure PowerShell, or Azure CLI."

 

Microsoft Azure on YouTube https://www.youtube.com/channel/UC0m-80FnNY2Qb7obvTL_2fA

Supporting videos and material are also posted independently to YouTube.

  • Azure Service Health - Planned Maintenance (3rd November 2017)
    https://www.youtube.com/watch?v=vgYqm-Y74y8
    "Stay informed and prepare for maintenance activities in Azure to minimize their impact on business critical applications."

 

>> Code Samples

Various sample and introductory code snippets to take advantage of Planned Maintenance and Scheduled Events functionality.

 

Azure Code Samples https://azure.microsoft.com/en-us/resources/samples/

Learn to interact with Azure services through code. A number of code samples are published via the Azure Code Samples library.

All Azure Code samples are available via GitHub.

Additionally, the following code sample can be found directly on GitHub:

 

Azure Quickstart Templates https://azure.microsoft.com/en-us/resources/templates/

Deploy Azure resources through the Azure Resource Manager with community contributed templates to get more done. Deploy, learn, fork and contribute back. With Resource Manager, you can create a template (in JSON format) that defines the infrastructure and configuration of your Azure solution. By using a template, you can repeatedly deploy your solution throughout its lifecycle and have confidence your resources are deployed in a consistent state.

All Azure Quickstart Templates are available via GitHub.

 

>> Community

There are a large number of users of Azure out in the community, with many taking the time to document and share their experiences of using the Azure services. I've included a selection of individuals and articles here, but please let me know if you've found and can recommend other good resources.

 

Bob Rouderbush https://roudybob.blog/

As a Cloud Solution Architect here at Microsoft, Bob Rouderbush maintains a personal blog on roudybob.blog.

 

Daniel Petri https://www.petri.com/

Launched by Daniel Petri in 1999, the Petri IT Knowledgebase has served as a leading content and community resource for IT professionals and system administrators for more than 15 years.

 

CUGC - Citrix User Group Community https://www.mycugc.org/

For the users, by the users, CUGC are dedicated to helping members and businesses excel. Members are technology professionals interested in maximizing the value of Citrix and partner products.

  • NetScaler HA on Microsoft Azure “Planned Maintenance” (19th October 2017)
    https://www.mycugc.org/blog/netscaler-ha-on-microsoft-azure-planned-maintenance
    "Citrix NetScaler High Availability on Microsoft Azure has never been an easy subject, especially after Microsoft supported multi IP/NICs on Azure Virtual Machines a couple of months ago. The debate still rages on today about how NetScaler HA should be configured, nevertheless, a recent announcement by Microsoft on a New Planned Maintenance Experience for Azure Virtual Machines could change all that. Let's discuss NetScaler HA options on Azure before diving into the New Planned Maintenance Experience 'Proactive-Redeploy.'"

 

Bert Wolters https://www.azureman.com/

The personal blog of Bert Wolters, MVP, currently working as a Technical Consultant at inspark in The Netherlands.