April 2017
Volume 32 Number 4
[Containers]
Bringing Docker To Windows Developers with Windows Server Containers
By Taylor Brown | April 2017
For the last couple of years, Docker and containers have been among the hottest topics in dev circles and in enterprises around the world. The release of Windows Server 2016 last fall added a lot to the conversation by opening containers to Windows developers. How did the world of Windows and Docker come together? It started during the gorgeous Puget Sound summer of 2014, as the Windows Base team embarked on a new project that would ultimately become Windows Server Containers. This is the story behind the code, and a glimpse into what it was like to build one of the top new features in Windows Server 2016.
History of Containers and the Root of Docker
In 2013, containers quickly started generating interest at the keyboard of Solomon Hykes, who at the time was the CTO and founder of a Platform-as-a-Service (PaaS) startup, DotCloud. Hykes took a set of relatively obscure and difficult-to-use Linux kernel features and brought them together under an open source tool he called Docker. He wasn’t intentionally trying to become the king of containers; he was looking for a solution to a problem that plagued DotCloud: How could developers deliver code that ran the same way on DotCloud’s servers as it did in their own development environments?
A real problem for services like DotCloud stemmed from the extensive and diverse set of software applications customers wanted to deploy—software built with different development processes, different patch cycles and requirements, written in different languages (both code and spoken), and with different dependencies. Hardware virtualization—virtual machines (VMs)—was the best tool available, but it presented challenges in shipping software from developer laptops to production. Either you had to take fully configured VMs from the developer, which made scalability and management difficult, or you had to build deployment tools and scripts that took stock VMs and installed the developer’s applications, which wasn’t very flexible and could be fragile.
Hykes believed Docker was the answer to this problem and, looking back, he was on to something. He wasn’t the first to look to containers, though; in fact, it was the needs of a different cloud provider that kick-started the whole idea—Google. In 2006, a Linux kernel patch submitted by Rohit Seth, an engineer at Google, added support for grouping processes together under a common set of resource controls, in a feature he called cgroups. Seth’s description of that patch starts off with: “Commodity HW is becoming more powerful. This is giving opportunity to run different workloads on the same platform for better HW resource utilization” (bit.ly/2mhatrp). Although cgroups solved the problem of resource isolation, they didn’t solve the problem of inconsistent software distribution, which is why Docker uses not only cgroups but also another slice of Linux technology: namespaces.
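To make the cgroups idea concrete, here’s a minimal sketch (in Go, the language Docker itself is written in) of the cgroup v1 filesystem interface those resource controls are exposed through. It assumes a Linux host with the memory controller mounted at /sys/fs/cgroup/memory and root privileges; the group name and the 64 MB limit are purely illustrative:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
)

func main() {
	// Assumes a cgroup v1 memory controller mounted at /sys/fs/cgroup/memory
	// and that this program is running as root on a Linux host.
	cg := filepath.Join("/sys/fs/cgroup/memory", "demo")
	if err := os.MkdirAll(cg, 0o755); err != nil {
		panic(err)
	}
	// Cap the group at 64 MB of memory.
	if err := os.WriteFile(filepath.Join(cg, "memory.limit_in_bytes"),
		[]byte("67108864"), 0o644); err != nil {
		panic(err)
	}
	// Move the current process into the group; children it spawns inherit the limit.
	pid := strconv.Itoa(os.Getpid())
	if err := os.WriteFile(filepath.Join(cg, "cgroup.procs"),
		[]byte(pid), 0o644); err != nil {
		panic(err)
	}
	fmt.Println("process", pid, "now runs under the 64 MB memory cgroup")
}
```

Every process added to the group’s cgroup.procs file, and every child it spawns, is accounted against that single limit, which is exactly the grouping behavior Seth’s patch introduced.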
Namespaces were introduced into the Linux kernel in 2002, providing a way to control which resources a process can see and what those resources are called. Namespaces are quite different from access controls, because the process isn’t denied access; it simply doesn’t know the hidden resources exist, or that the ones it can see are virtualized versions. A simple example of this is the process list: there could be 20 processes running on a server, yet a process running within a namespace might see only five of them, with the rest hidden from view. Another example is a process that thinks it’s reading from the root directory when the directory has in fact been virtualized from a separate location. The combination of cgroups, namespaces and copy-on-write (CoW) file-system technologies, brought together in an easy-to-use open source product, became the foundation of Docker.
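Here’s a hedged sketch of that idea in Go: the program re-executes itself inside new PID, mount and UTS namespaces, and the child believes it is PID 1 even though the host sees it as an ordinary process. This is only illustrative; it requires a Linux host and root privileges, and a real container runtime would also remount /proc, pivot the root file system and apply cgroup limits:

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"syscall"
)

func main() {
	if len(os.Args) > 1 && os.Args[1] == "child" {
		// Inside the new namespaces: this process sees itself as PID 1,
		// while the host sees it under a completely different PID.
		fmt.Println("child thinks its PID is", os.Getpid())
		return
	}
	// Re-run this same binary inside fresh PID, mount and UTS namespaces.
	cmd := exec.Command("/proc/self/exe", "child")
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
	cmd.SysProcAttr = &syscall.SysProcAttr{
		Cloneflags: syscall.CLONE_NEWPID | syscall.CLONE_NEWNS | syscall.CLONE_NEWUTS,
	}
	if err := cmd.Run(); err != nil {
		panic(err)
	}
}
```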
By mid-2013, the Docker toolset that Hykes and his team built began to take off, becoming one of the top trending projects on GitHub and formally launching the Docker brand. Hykes’ focus shifted from DotCloud to Docker and he ultimately spun off the DotCloud business while remaining the CTO of Docker Inc.
Windows Server Containers
During the same period that Docker was gaining notice in Linux circles, the Windows Base team had been looking at ways to isolate and increase the efficiency of Microsoft Azure services that executed customer or third-party code. A Microsoft research prototype code-named “Drawbridge” provided one avenue of investigation; the project had built a process isolation container leveraging a library OS (bit.ly/2aCOQxP). Unfortunately, Drawbridge had limitations relating to maintainability, performance and application compatibility, making it ill-suited as a general-purpose solution. Another even earlier prototype technology referred to as server silos initially seemed worth investigating. Silos expanded on the existing Windows Job Objects approach, which provides process grouping and resource controls (similar to cgroups in Linux) (bit.ly/2lK1AbI). What the server silos prototype added was an isolated execution environment that included file system, registry and object namespaces (similar to namespaces in Linux). The server silos prototype had been shelved years earlier in favor of VMs but would be reimagined as the foundation of Windows Server Containers.
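For readers who haven’t used them, Windows Job Objects are exposed through Win32 APIs such as CreateJobObject and AssignProcessToJobObject. The sketch below (using the golang.org/x/sys/windows wrappers, to stay consistent with the earlier examples) shows only the grouping step; it’s illustrative rather than anything taken from the silos prototype, and real code would go on to call SetInformationJobObject to apply CPU, memory or process-count limits to the job:

```go
//go:build windows

package main

import (
	"fmt"

	"golang.org/x/sys/windows"
)

func main() {
	// Create an anonymous job object; processes assigned to it can be
	// grouped, limited and terminated as a single unit.
	job, err := windows.CreateJobObject(nil, nil)
	if err != nil {
		panic(err)
	}
	defer windows.CloseHandle(job)

	// Place the current process into the job. Child processes it creates
	// are pulled into the same job by default.
	proc, err := windows.GetCurrentProcess()
	if err != nil {
		panic(err)
	}
	if err := windows.AssignProcessToJobObject(job, proc); err != nil {
		panic(err)
	}
	fmt.Println("current process is now governed by the job object")
	// A real resource-control scenario would next call SetInformationJobObject
	// to set limits that apply to everything in the job.
}
```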
The server silo prototype code hadn’t been looked at in years. It didn’t even compile, let alone function; it had been written to prove the technique was viable in Windows and was far from production-ready. The team had a choice—start over from scratch or attempt to resurrect the prototype and build from there. We chose the latter. When the prototype was first developed, only a small team of developers was involved, proving that the technology was viable; now the full force of the Windows engineering team was behind the project. Architects and engineers from across Windows were drafted to help. The storage team built the file system virtualization; the networking team built the network isolation; the kernel team built the memory management and scheduling abstractions; and so on.
Some big architectural questions remained; in particular, how would we handle system processes? In Linux, a container often runs just a single process that shares the system services in the kernel with the host and other containers. However, to improve serviceability and security, Windows has been moving code out of the kernel and into user mode processes for many years. This presented an issue for the team: Either we could share all the system services, which would require changing every one of them to make it container-aware, or we could start a new copy of the user mode system services in each container. This was a difficult decision. On one side, we worried about the density and startup time impact of starting new instances of all the user mode services in each container. On the other, we worried about the complexity and ongoing cost of updating all the system services in Windows, both for us and for developers outside of Windows. In the end we landed on a mix of the two approaches—a select set of services was made container-aware, but most services run in each container.
The impact on density was minimal because the containers share read-only memory with each other and the host, so only the private memory is per-container. Startup time, however, was a significant challenge, and it called this decision into question many times; when we first demonstrated Windows Server Containers in the keynote of Build 2015, it took several seconds to start, in large part because of the startup time of the system services. But the Windows Server performance team was on the case. They profiled, analyzed and worked with teams across Windows to make their services faster and reduce dependencies to improve parallelism. The result of this effort not only made container startup faster but actually improved Windows startup time, as well. (If your Xbox or Surface started booting faster last year, you can thank containers.) Container startup went from seven or eight seconds to under a second in less than a year, and the effort to reduce it further continues even today.
Hyper-V Isolation
Often, the first question I get regarding Hyper-V isolation is something like, “Don’t containers provide isolation already? So why do I need Hyper-V?” Containers do provide isolation, and for most scenarios that isolation is likely completely sufficient. The risk, however, is that if attackers are able to compromise the kernel, they could potentially break out of the container and impact other containers or the host. With kernel exploits being relatively common in Windows (typically several per year), the risk for services like Azure Automation or Azure Machine Learning that consume and execute end-user or third-party code on a shared infrastructure is too high to rely on kernel isolation alone. Teams building and operating these types of services either had to manage the density and startup cost of full VMs or build different security and isolation techniques. What was needed was a general-purpose isolation mechanism that could safely run hostile, multi-tenant code: Windows Server Containers with Hyper-V isolation.
The team was already hard at work on Windows Server Containers, which provided a great experience and management model for the teams building those services. By coupling that technology with the well-tested isolation of Hyper-V, we could provide the security required. However, we needed to solve the startup time and density challenges traditionally associated with VMs.
Hyper-V, like most virtualization platforms, was designed to run guests with a variety of OSes, both old and new, and to behave as much like physical hardware as possible. To achieve those goals, most virtualization platforms chose to emulate common hardware. As virtualization became commonplace, however, OSes were “enlightened” (specifically modified to operate well as a guest VM) such that much of the emulation was no longer required. A good example of this is Hyper-V Generation 2 VMs, which discard emulation in favor of improved startup time and performance, but still behave as if the guest were running directly on hardware (bit.ly/2lPpdAg).
For containers, we had a different need and different goals: We didn’t need to run any older OSes and we knew exactly what the workload inside the VM was going to be—a container. So we built a new type of VM, one that was designed to run a container. To address the need for a fast startup time we built cloning technology. This was always a challenge for traditional VMs because the OS becomes specialized with things like hostnames and identity, which can’t easily be changed without a reboot. But because containers have their own hostname and identity, that was no longer an issue. Cloning also helped with the density challenge, but we had to go further: We needed memory sharing.
There are two approaches to sharing memory. You can look for memory that’s common across multiple VMs and effectively de-duplicate it (though memory randomization technology in most kernels makes this difficult). Or you can follow the same approach the kernel does, separating read-only (public) memory from read-write (private) memory. The latter typically requires that the memory managers in guest VMs interact with each other, which runs counter to the isolation requirement. However, by changing the way the VMs boot and access files, we found a way in which the host doesn’t have to trust the guest and the guests don’t have to trust each other. Instead of booting from and accessing files on a virtual hard disk, the VM boots and accesses its files directly from the host file system. This means the host can provide the same sharing of read-only (public) memory across containers. This was the key to improving density by several orders of magnitude, and it put us on a path to continue improving density for many years to come.
The other value we discovered with Hyper-V isolation is that, because each container runs on its own kernel, developers building containerized applications on their Windows 10 machines can still run the server kernel inside the container, ensuring their applications work the same way in production as they do on the development machine. Thus, with the Windows 10 Anniversary Update, we enabled Windows Server Containers with Hyper-V isolation and worked with Docker on Docker for Windows to take full advantage of the new technology for developers.
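As a rough usage sketch (not anything specific from the article), the snippet below shells out to the Docker CLI from Go to run the same Windows container image under both isolation modes; the --isolation flag is how Docker on Windows selects between process-isolated and Hyper-V-isolated containers, while the image tag and the assumption of a Windows Server host with Docker installed are purely illustrative:

```go
package main

import (
	"os"
	"os/exec"
)

func main() {
	// Run the same Windows container image twice: once with process
	// (shared-kernel) isolation and once inside a lightweight Hyper-V
	// utility VM. Only the --isolation flag changes.
	for _, isolation := range []string{"process", "hyperv"} {
		cmd := exec.Command("docker", "run", "--rm",
			"--isolation="+isolation,
			"mcr.microsoft.com/windows/nanoserver:ltsc2022", // illustrative image tag
			"cmd", "/c", "ver")
		cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
		if err := cmd.Run(); err != nil {
			panic(err)
		}
	}
}
```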
Docker and Windows Server Containers
One question remained: how would users interact with this new platform technology? In the Linux world, Docker had been garnering praise and was quickly becoming the de facto standard for container management. Why not enable users to work with Windows Server Containers the same way? That fall I flew down to San Francisco to meet with Docker, unsure what the company would think of a Windows-based container and whether it would be interested in building on top of Windows at all. I was in for a surprise: Solomon thought the Windows container idea was great! But would the company build on top of it? That conversation changed the face of the project completely. Solomon simply said, “You know Docker is open source, you can add the code to make it work on Windows and we’ll help,” and we did just that. Since then, John Howard, a software engineer on the Hyper-V team, has become a maintainer of the Docker project and, in fact, has climbed to fourth place among its all-time code contributors (bit.ly/2lAmaZX). Figure 1 shows the basic architecture of containers and Docker across Windows and Linux.
Figure 1 Comparing the Basic Architecture of Containers and Docker Across Windows and Linux
Bringing It All Together
Four months ago at Microsoft Ignite, we launched Windows Server 2016 and announced an expanded partnership with Docker, under which Docker provides its commercially supported Docker Engine to Windows Server customers at no additional charge. Since then, it’s been a whirlwind of activity. Customers like Tyco have been using Docker and Windows Server Containers to revolutionize the way they build software and to modernize existing applications, all on the same platform (bit.ly/2dWqIFM). Visual Studio 2017 has fully integrated tooling for Windows and Linux containers, including F5 debugging, and Visual Studio Code has Dockerfile and Compose support baked right in. Both Azure and Amazon’s container services have added support for Windows Server Containers, and well over 1 million Windows-based container images have been pulled from Docker Hub. For end-to-end security and orchestration, Docker Datacenter gives developers and sysadmins a platform to build, ship and run distributed applications anywhere. With Docker, organizations shrink application delivery from months to minutes, move workloads frictionlessly between datacenters and the cloud, and achieve 20 times greater efficiency in their use of computing resources.
When I took on containers I knew it was going to be a high-stress project. I knew it was going to take some long nights, some working weekends and a lot of effort, but it has been worth it, because it’s helping millions of developers build more apps faster. I also knew it was going to be a lot of fun and that it had the opportunity to really change the way people developed and ran applications on Windows. It’s been more fun than I could have ever expected and, while it was also more work than I anticipated, I wouldn’t trade this experience for anything. I recall one weekend early in the project, looking out my office window at a gorgeous, sunny summer day as I worked and thinking to myself, “I sure hope people are gonna use this stuff …”
Taylor Brown is a principal program management lead in the Windows and Devices Group at Microsoft. As a member of the Base Windows engineering team, he’s responsible for Windows Server developer strategy, focusing specifically on container technologies, including Windows Server Containers. Brown started his career in Windows working on the 1394/FireWire stack for Windows Server 2003, then worked on ACPI/power management for Windows Server 2003 SP1 before joining the newly formed virtual machine team. Since then he has contributed to every VM technology shipped by Microsoft, including Virtual PC, Virtual Server and every version of Hyper-V, and he is a recognized industry expert in virtualization technologies. Reach him at taylorb@microsoft.com.
Thanks to the following technical expert for reviewing this article: David Holladay