Stability and Change - Why and How to adopt WaaS.

Hello everyone,

Today I am writing to you after my visit to DevOps Days in Singapore.
The intention of this article is to give you a very brief overview of the current changes in IT that you really should be aware of:

  • Why is Windows 10 not just another operating system like its predecessors, and why should you really be aware of this?
  • What does Digital Transformation mean and how does Windows 10 come into play?
  • How do Windows as a Service and Modern IT come into play here?
  • How can you adopt Windows as a Service and Windows 10 from a long-term perspective?

IT Measurement:

First of all, let us think about how IT performance can be measured.
For this I quote the current 'State of DevOps Report 2017' (left), as well as the Forrester report 'The Digital Business Imperative' (right):



Okay, let us recap - the left statement is clearly aimed at the DevOps area - especially when you build your own tools. The essence of the right statement is definitely: "Digitize Your Business Strategy" and not just "to add an app here or a site there".

Both statements come down to two very essential points:

"Stability" has always been tied to IT performance, because it has always been one of the most important requirements for IT. You know sayings like "never touch a running system". IT just needs to work. But then - why "Change", and what is actually meant by "Change"?

When doing your Digital Transformation, you want to digitize your business strategy and create real business value.
In a sentence: You do not want IT to be just a cost factor; you want to create value with it.

This is driven by bringing in these "Changes" to adopt Modern IT, DevOps, Modern Apps, Everything as a Service and many more. "Change" includes being up to date, having the most current applications, drivers and OS, and being able to adopt new technologies and services just as they arrive. "Change" also helps to ensure a very secure environment, because you regularly apply patches and updates and also use the current security features. "Change" in this slide is a requirement for everything it takes to put the value-leveraging technologies, features and services in place.

Therefore, IT is required to adopt an increasing number of changes at a faster pace, while still ensuring stability. But what does "stability" actually mean? Historically, it means that you don't encounter any issues at any time. But in DevOps we know that you also have to bring some additional aspects into play - like "remediation time". It's not only about preventing issues, but also about being able to remediate occurring issues very fast.
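To make the point about remediation time concrete, here is a minimal sketch using the common steady-state availability formula, MTBF / (MTBF + MTTR). The formula choice and all numbers (hours between failures, hours to remediate) are illustrative assumptions and do not come from the reports quoted above.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Share of time a system is usable, given the mean time between
    failures (MTBF) and the mean time to remediate (MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical baseline: one issue every 30 days, 8 hours to remediate.
baseline = availability(mtbf_hours=720, mttr_hours=8)

# Option A: prevent issues, so they occur only half as often.
fewer_failures = availability(mtbf_hours=1440, mttr_hours=8)

# Option B: keep the failure rate, but remediate twice as fast.
faster_fixes = availability(mtbf_hours=720, mttr_hours=4)

print(f"baseline:       {baseline:.4f}")
print(f"fewer failures: {fewer_failures:.4f}")
print(f"faster fixes:   {faster_fixes:.4f}")
```

Under these assumptions, halving the remediation time improves availability exactly as much as halving the number of failures, which is why fast remediation counts as part of "stability".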


Windows as a Service

Now, with this information in mind, let us turn to Windows as a Service, which addresses exactly all of this:

Windows 10, with its release pace of two times per year, ensures a current and completely patched system, which results in an increased security level. In addition, it continuously delivers new features for security and administration, and it also allows you to adopt new CPU architectures with their features and to leverage performance and security at the hardware layer. With Edge, the UWP, MDM and MFA, Windows addresses modern websites, applications and services, which ensures secure authentication and manageability anywhere in the world just over the Internet. And, because you have a current OS with current technologies in place, you always have the possibility and agility to jump on emerging technologies without long introduction times.

In a sentence: Windows 10 is the operating system for Modern IT.  


Why is Microsoft doing this?

I have very often been confronted with questions like 'Why is Microsoft doing this?'.

But - is it actually 'us' making the change?

Take a look at IT: Development moved to agile approaches 10 years ago. Every other operating system has been following the same approach for some years now, or even since the beginning. We are speaking about DevOps, NoOps, serverless applications and many more. Hackers are improving day by day, enhancing their attack frameworks with artificial intelligence, and - in fact - as we can see with all the ransomware attacks going on and being much too successful, the adoption of "Evergreen" still seems to be very poor among our customers. At the same time, GDPR is about to come into effect and many customers are still figuring out what GDPR actually means.
In fact, we are the very last ones to move our OS onto the train of continuous "Change", and it needs to be there, because IT is moving there.


Long-term aspects for Windows as a Service

One of the challenges, which you might have read about in my previous articles but have definitely encountered yourself, is adopting the two Feature Updates per year and transforming your processes and technologies to reduce the recurring amount of work to an absolute minimum.

I will show you here how this can be achieved:

In this slide you can find three main areas:

  • Information-based Analysis
  • Proactive Testing
  • Reactive Testing

-> Information-based Analysis gives you safety and control with a high certainty.
By continuously having insights into your environment, you know about errors before you encounter them. You should use this approach to find and fix the biggest issues and to always be in control of your environment.

-> Proactive Testing gives you safety and control with a high certainty.
Manual testing can be even better than just working with the information from Upgrade Readiness, because you can walk through checklists and ensure stability. But in a testing environment applications can behave differently, and you have to invest a lot of resources. Many customers also find automated testing very promising, but remind yourself that many errors cannot be found even by automated testing. You should use proactive tests for LOB applications, upgrade tests and (maybe) applications that are known to be problematic or are self-developed.

-> Reactive Testing returns the most accurate results with the least effort.
Reactive testing is very unconventional in IT today. Why? Because the perception of stability has been completely different in the past 20 years. However, with the information provided above, and speaking about Windows 10, reactive testing creates only very small and controlled downtimes. You define the impacted computer collections, and after encountering an issue you can just retrieve the information about the blocking application(s) and roll back the Windows environment of the affected users to the previous version. Afterwards you can just pause the deployment of the upgrade, fix the error, and then turn the deployment on again.
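The reactive loop described above (deploy to a collection, roll back affected machines, pause, fix, resume) can be sketched as a small simulation. This is only an illustration of the control flow, not a real deployment tool: the `Device` and `Rollout` classes, the device and application names, and the version strings are all hypothetical, and in reality the blocking applications are of course not known up front.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    name: str
    apps: set
    version: str = "1709"  # illustrative starting version


@dataclass
class Rollout:
    target: str
    blocking_apps: set  # apps that break on the target version (simulated)
    paused: bool = False
    log: list = field(default_factory=list)

    def deploy_ring(self, ring):
        """Upgrade one collection; roll back broken devices, pause on issues."""
        if self.paused:
            raise RuntimeError("deployment is paused; fix the issue first")
        found = set()
        for device in ring:
            if device.version == self.target:
                continue  # already upgraded in an earlier pass
            previous = device.version
            device.version = self.target
            broken = device.apps & self.blocking_apps
            if broken:
                device.version = previous  # roll back only this device
                found |= broken
                self.log.append(f"{device.name}: rolled back, blocked by {sorted(broken)}")
        if found:
            self.paused = True  # stop before the next collection is touched
        return found

    def fix_and_resume(self, fixed_apps):
        """Mark the blocking apps as fixed and turn the deployment on again."""
        self.blocking_apps -= set(fixed_apps)
        self.paused = False


pilot = [Device("pc-01", {"office", "av-agent"}), Device("pc-02", {"office"})]
broad = [Device("pc-03", {"av-agent"}), Device("pc-04", {"lob-app"})]

rollout = Rollout(target="1803", blocking_apps={"av-agent"})
found = rollout.deploy_ring(pilot)  # pc-01 is rolled back, the rollout pauses
rollout.fix_and_resume(found)       # e.g. after updating the blocking app
rollout.deploy_ring(pilot)          # retry the pilot collection
rollout.deploy_ring(broad)          # then continue with the next collection
```

In this sketch the issue surfaces in the small pilot collection, so only one machine is rolled back and the larger collection is never exposed to the blocking application.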

Many of my customers start here:

Cloud services - or more specifically, Upgrade Readiness - are not adopted, and reactive testing is simply not allowed.

But as the arrow shows, the intention needs to be to push the higher percentages towards the bottom right corner in the long term. You could achieve this by focusing the proactive testing approach only on the most important applications, and at the same time enabling Upgrade Readiness and establishing a lightweight approach for reactive testing (just in case). You will need to tune your recurring processes and manual steps as I described in my previous articles.

Then, the next step could look like this:

At this point you continue to work on your processes and try to automate every task that needs to take place with every release, thereby reducing the human workload. With the coming releases you will see that the problems with the Feature Updates aren't that painful, and you can continue moving to the reactive approach and focus the manual workload on the important and necessary things:

This will reduce your effective resource costs and allow you to push the transformation in your environment. However, keep in mind that there are no "Best Practices". Something that worked for one customer can be completely ineffective for others. You have to find a layout that works well for you, and you should never stop improving it.

In addition, if you are not allowed to use Upgrade Readiness by any means, this will greatly increase the amount of human resources needed, because its percentages need to be split between the two other areas. The same applies to Reactive Testing. Keep in mind that Proactive Testing in terms of resources is the worst area to invest the percentages in.
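The resource argument can be illustrated with a small back-of-the-envelope calculation. The function name, the portfolio size of 1000 applications and the 99% compatibility rate are illustrative assumptions, as is the simplification that incompatible applications are spread evenly across the portfolio (targeting the riskiest applications proactively would leave even fewer for the reactive side).

```python
def workload(total_apps: int, compat_rate: float, proactive_share: float):
    """Return (apps tested up front, incompatible apps expected to surface
    reactively) for a given share of the portfolio tested proactively."""
    incompatible = total_apps - round(total_apps * compat_rate)
    tested_up_front = round(total_apps * proactive_share)
    # Assumption: proactive testing catches the same share of incompatible apps.
    expected_reactive = incompatible * (1 - proactive_share)
    return tested_up_front, expected_reactive


for share in (1.0, 0.5, 0.1):
    tested, reactive = workload(total_apps=1000, compat_rate=0.99,
                                proactive_share=share)
    print(f"proactive share {share:.0%}: test {tested} apps up front, "
          f"expect about {reactive:.0f} to surface reactively")
```

With these numbers, finding every incompatible application proactively means testing the whole portfolio, while a small proactive share leaves only a handful of applications to handle reactively.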

Before finishing, I hope that you find this information useful, and I would be very happy if you could share your thoughts and improvements.

How is WaaS being adopted in your company and how well (or badly) are you doing?

And yes - I know that it is hard.
In a transformation, the easiest way to fail is to handle it as an incremental change.
And - the transformation will come. Either you will bring the transformation, or the transformation will come to you.
(Trust me - the last part will come with a lot of pain in its backpack.)

 

All the best,


David das Neves

Premier Field Engineer, EMEA, Germany
Windows Client, PowerShell, Security

Comments

  • Anonymous
    January 10, 2018
    Very practically written article. The way change and stability are explained here is just amazing.
  • Anonymous
    March 12, 2018
    Perhaps this guidance works in a controlled lab scenario. But in real life, many of us are finding this advice difficult to follow and still find success. Take for example this quote from the article: "Keep in mind that Proactive Testing in terms of resources is the worst area to invest the percentages in." In real life, reactive testing would be the worst area to invest the percentages in. This is because reactive testing causes the user churn and ties up multiple resources (for example, the user’s productivity, the user’s Win10 device, the service desk, and possibly third tier support). Users were more productive in the days before WaaS. Back in those days, they had a stable and reliable computer running Windows 7 SP1 where their applications just worked and disruptions were kept at a minimum. Another quote from the article describes challenges: "... transforming your processes and technologies to reduce the recurring amount of work to an absolute minimum." Yes, this is a significant challenge. It was much less time consuming rolling out service packs for previous generation OSes (Win7 and WinXP) versus trying to keep up with Win10's bi-annual changes, which is a full time job for us. With Win10, we reinstall the operating system over again twice a year. We still proactively test our most important applications and often need to delay rolling out the latest feature upgrades for months because of it. Reactive testing is unacceptable because it involves us crossing our fingers hoping that nothing has broken when feature upgrades are pushed out.
    • Anonymous
      March 12, 2018
      Hi John, Thank you for your comment. First, let me explain that this model does work in real life, but you have to prepare for it. I have customers in all stages - from 'just started' up to 'doing the fine tuning'. In the beginning it is always hard, but at some point it will become just a recurring process like quality updates. This process, which is run every 6 months, should be automated and planned in advance as much as you can. Take a look here: https://blogs.msdn.microsoft.com/daviddasneves/2017/08/12/automating-windows-as-a-service/ You need to prepare a proper app compat approach, which means that you categorize your applications by importance / risk. Take a read here: https://blogs.msdn.microsoft.com/cjacks/2016/09/12/windows-10-app-compat-strategy/ In addition, you don't want to use the reactive approach against big collections. A 'ring', which we very often explain, is a dedicated collection of machines at a specified time. You actually know this technique from your updates in Windows 7, where you apply your updates to a test group in advance. A feature update can be treated somewhat similarly, but you need to cover all the variations, such as your applications, network segments or OUs. The better you create your rings, the better it will work out. In addition, your risk value tells you which applications are very likely to break. Normally these are security applications like AV, endpoint protection, encryption software and everything else that works with filter drivers. It is very unlikely that you will find applications on your own that are not working, for many reasons: we don't touch the Win32 API too much anymore, the upgrade comes with a validation db, and there are many other technical reasons. You start this process every 6 months as soon as possible. If you need to wait 1-3 months for a blocking application, this will not interrupt the process at all. All the other information is in the automation article.
      I will share an extensive slide deck with you later to show you the complete approach. It is actually a very easy calculation. If you have a compatibility rate of 99% and you want to identify all the incompatible applications, you will need to test all of them. In reactive testing you will only work on the applications that created problems and have not been mitigated by proactive testing and information-based analysis. For the users themselves it is an interruption of at most 30 minutes, and the earlier you find incompatible apps, the better. That said, you could work with Upgrade Readiness. It is very unlikely to find applications that are not found via the Upgrade Readiness tool and that worked on one Windows 10 version but stopped working in the next one. Windows 7 to Windows 10 is a normal migration / project. Windows 10 to Windows 10 is a process, which can be automated like this. Take also a look here: http://aka.ms/w10links Keep also in mind that upgraded machines that have blocking problems can be rolled back in a short time frame. Hope this helps. All the best, David