About Early Technology Adoption (Dogfooding)
The Microsoft.com Engineering Operations (MSCOM Ops) team works closely with several Microsoft product groups, including the Windows Server, SQL Server, and Internet Information Services (IIS) teams. www.microsoft.com is currently the fifth largest Web presence on the Internet. We recognize that if we can successfully run pre-release builds of Microsoft products in our www.microsoft.com environments, we can significantly increase the probability that our final products will also perform well in our customers' enterprise environments.
This article discusses why the MSCOM Ops team dogfoods (adopts early builds of) Microsoft Web platform products. It also describes how we implement this process through our three phases of early product adoption:
· Investigation
· Rollout and Reporting
· Evangelism
One of the most difficult things to do in any test environment is to run stress tests that accurately simulate the live traffic volume that production servers handle on a day-to-day basis. We address this by building redundancy and scalability into the architecture of our production environments, and by implementing an early adoption process that uses our production servers as early test beds for pre-release versions of products. This practice has minimal impact on the users who visit our site.
Benefits of Early Product Adoption
As early adopters of pre-release versions of Microsoft products, the MSCOM Ops team strives to achieve:
· Higher product quality when we release to market (RTM).
· Useful adoption information that we can showcase on our Microsoft.com Operations Tech Center site.
· Realistic data about the performance, reliability, and availability of our site.
· Substance behind the feature list.
· Release criteria that are based on real-life scenarios and results.
Higher Product Quality at RTM
We identify as many bugs as possible in the pre-release version of a product so that the Product team can resolve them prior to RTM. This process helps ensure a more stable and reliable RTM version of the product for customers.
Useful Adoption Information in our Tech Center
As part of our early adoption efforts, the MSCOM Ops team is committed to developing product-specific white papers, articles, blogs, and “How We Do It” webcasts that we make available to the IT Professional community on our Microsoft.com Operations Tech Center site.
Realistic Data about Performance, Reliability, and Availability
With our robust architecture, we have the ability to place servers that are running the latest product build into production to gather performance, reliability, availability, and other system data from actual traffic. This is invaluable data that we provide to the Product team to help raise the overall quality of the product at RTM.
Substance behind the Feature List
Each new version of the product contains features that help ensure a positive end-user experience. By working with each new version, our systems engineers (SEs) gain the hands-on experience that allows us to provide real-life scenarios and feedback to the product group. Our SEs also verify that the product functions as advertised and report any bugs they encounter. This process further helps to improve product quality at RTM.
Release Criteria Based on Real-life Scenarios and Results
If the product does not work well in the www.microsoft.com production environment, it is not ready to be shipped to our customers. For the majority of our product releases, running successfully in the www.microsoft.com production environment is a major shipping criterion.
The Three Phases of the Early Product Adoption Process
We separate our dogfooding efforts into three main phases of early product adoption. These phases are illustrated in Figure 1.
Figure 1. The Three Phases of Early Adoption
Each of our three phases of early product adoption is characterized by specific deliverables. Ownership of each phase is assigned to specific members of Microsoft.com Ops who have the experience and expertise necessary to complete the requirements of the particular phase. Team members include Debug team members, SEs (who are either Web or database focused), and program managers (PMs).
Table 1 describes who is responsible for each phase of the early adoption process and their responsibilities.
Table 1. Early Adoption Phase Ownership and Responsibilities
Early Adoption Phase | Owner | Responsibilities
Investigation | MSCOM Engineering Operations Debug team | Install product build on single out-of-production server.
Rollout and Reporting | MSCOM Engineering Operations SEs, database administrators (DBAs), and Early Adoption PM | Coordinate early adoption as widely as possible.
Evangelism | MSCOM Engineering Operations Customer Connections PM | Provide first-hand experience and knowledge to customers.
The Investigation Phase
The first thing we do when we investigate a new product for adoption is to assign a Debug team resource to the product group. The Debug team resource must have the appropriate focus and the technical skills necessary to provide efficient, in-depth analysis and credible feedback to the product group. For example, when we assist the IIS Product team in releasing a management utility, we assign a resource who has in-depth experience managing computers that run IIS as the primary point of contact for the investigation phase. When we investigate a new version of the Windows Server operating system, we assign someone with experience at the driver level because many problems surface in that area.
After the appropriate point person is assigned, we take a single server out of production and install the product build on that server. After we install the new build, we run testing and verification against this single computer. We then return the server to production to gather data from the real production load. We constantly monitor the server to ensure that it does not have a negative impact on end users. We then analyze the error logs on the server and investigate the root cause of any problems that we observe. We provide this feedback to the Product team, which incorporates it, along with fixes for the bugs that we have reported, into a new product build.
We continue the cycle that is illustrated in Figure 2 until the product is deemed to be of high enough quality for general use within the MSCOM Operations environment. The timeline and number of cycles necessary to reach this point vary by product.
Figure 2. The Investigation Phase Cycle
The Product Group provides us with new builds to deploy in the investigation phase. The frequency and number of builds vary by product and by the overall stability of the builds.
Deployment to Single Server
The Senior SE or Technical Architect on the Debug team identifies a single production server that is removed from production and used as the investigation test bed for each new build.
Testing and Verification
The Debug team uses multiple tools, such as TinyGet and the Web Capacity Analysis Tool (WCAT), and also performs manual tests in the investigation phase. We use these tools to compare results to the existing product that is currently deployed in production, and to test specific scenarios that are co-defined by the Product and Debug team members.
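The comparison against the product that is currently in production can be sketched as follows. This is a minimal illustration in Python, not the Debug team's actual tooling: the function name, the metric names, the sample values, and the 10 percent tolerance are all assumptions made for the example. In practice the numbers would come from whatever the load tools report for the same scenario on both builds.

```python
def find_regressions(baseline, candidate, tolerance=0.10):
    """Return metrics where the candidate build is worse than the baseline.

    Both arguments map a metric name to a value where lower is better
    (for example, latency or error rate). The tolerance is a hypothetical
    allowed relative increase, not a figure from the article.
    """
    regressions = {}
    for name, base_value in baseline.items():
        if name not in candidate:
            continue  # metric was not measured on the candidate build
        new_value = candidate[name]
        if new_value > base_value * (1 + tolerance):
            regressions[name] = (base_value, new_value)
    return regressions


# Hypothetical results from the same scenario run against both builds.
baseline = {"avg_latency_ms": 120.0, "error_rate": 0.002}
candidate = {"avg_latency_ms": 150.0, "error_rate": 0.002}

for name, (old, new) in find_regressions(baseline, candidate).items():
    print(f"{name}: {old} -> {new}")
```

A relative tolerance is used here so that one threshold can apply to metrics with different units; a real comparison would likely set a separate threshold for each scenario.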
The Debug team places the server deemed as the v-next, or next version, investigation server back into production to begin taking real load. We closely monitor both the server and applications hosted on the server to ensure that we are not negatively impacting the end-user experience. Monitoring the production load allows us to capture real production data (such as traffic, volume, performance, and availability) that are nearly impossible to recreate in a lab environment.
After the v-next investigation server is removed from production, the Debug team members can crawl the server logs and investigate any problems that may have occurred.
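As a rough illustration of what crawling the logs can involve, the sketch below counts server-side errors (HTTP 5xx responses) for each URL. It assumes that the logs are in the W3C extended log file format, which IIS can write, and that the sc-status and cs-uri-stem fields are being logged. The function name and the sample lines are invented for the example and are not taken from the MSCOM Ops tools.

```python
from collections import Counter


def count_server_errors(log_lines):
    """Count HTTP 5xx responses per URL in W3C extended format log lines."""
    fields = []
    errors = Counter()
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The directive names the columns used by the entries that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue  # skip blank lines, other directives, and headerless data
        entry = dict(zip(fields, line.split()))
        if entry.get("sc-status", "").startswith("5"):
            errors[entry.get("cs-uri-stem", "?")] += 1
    return errors


# Invented sample entries for illustration only.
sample = [
    "#Fields: date time cs-uri-stem sc-status",
    "2006-01-01 00:00:01 /default.aspx 200",
    "2006-01-01 00:00:02 /search.aspx 500",
    "2006-01-01 00:00:03 /search.aspx 503",
]
print(count_server_errors(sample).most_common())
```

Reading the column names from the #Fields directive, instead of hard-coding positions, keeps the sketch working when the set of logged fields differs between servers.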
Feedback to Product Team
When necessary, the Debug team files bugs against the current build with the Product team, and provides overall health reports until we receive a build that meets the necessary criteria to move into the rollout and reporting phase.
The Rollout and Reporting Phase
After the product has passed the investigation phase, our primary goal is to coordinate early adoption as widely as possible on Web servers and back-end database servers across the properties that we host. We do this by partnering with the application development teams that host in our environment to identify target servers and develop the migration plans.
By widely adopting the product across the multiple server farms at www.microsoft.com, we increase the value of the feedback that we provide to the Product team by hosting various application solutions and architectures on the new product. In effect, we become an extended test team for the Product team by ensuring that more real-life application hosting scenarios are run, and that bugs are identified and resolved by the Product team long before the product is released to our customers.
The Rollout Component
It is the role of the Early Adoption PM to:
· Coordinate the early agreements with the application development teams that host in our environment, and communicate the product release candidate (RC) and RTM schedule.
· Identify target servers and application releases in which the early product adoption migration can occur with minimal impact to end users.
· Create the overall development and test schedule of these application development teams.
· Work with the MSCOM Operations SEs and application development teams to coordinate the migration schedule.
· Identify and help mitigate risks and potential blocking issues.
· Ensure execution of the plan by the SEs.
It is the role of the individual SEs and DBAs to install the new product on the designated target servers, often by leveraging the expertise of the primary Debug resource to assist with the initial installation to get the product up and running. After initial installation is completed, SEs and DBAs develop installation scripts or procedures to help with the migration efforts for all the target servers for which they are responsible in the data center. The SEs and DBAs provide these scripts or procedures to our partner teams and assist them in building their development and test environments. SEs and DBAs also provide input into the overall migration schedule. The PM then defines and coordinates the schedule, which is executed by the individual SEs.
Figure 3 illustrates the process flow that the SEs and DBAs oversee through each beta and RC to RTM. The timeline and number of cycles necessary to reach this point vary by product.
Figure 3. The Rollout and Reporting Phase Cycle – SE and DBA Roles
Pre-release Builds through RTM
In each product cycle, the product team releases beta and then RC builds of the product, until the product reaches RTM. MSCOM Ops deploys each of these pre-release builds to verify its functionality in the production environment.
Perform Initial Installation
For each pre-release build, our SEs and DBAs perform an initial manual installation on a single server or a small number of servers. This gives the SEs and DBAs a chance to identify any problems with the installation before rolling the build out across the entire production environment.
Develop Installation Scripts
After the initial manual installation, the SEs and DBAs develop installation scripts to help them deploy the software across multiple servers.
Install to Target Servers
The SEs and DBAs work in partnership with the Early Adoption PM and the application development teams who host applications on the MSCOM Ops servers to identify target servers for the dogfooding effort. After the installation scripts are completed, the SEs and DBAs deploy the software across all target servers running in the Data Center environments. We deploy to 80 Web servers on the www.microsoft.com Web farm.
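One way to move from a small initial installation to all target servers is to deploy in waves of increasing size. The sketch below shows that idea only; the article does not describe how MSCOM Ops orders its deployments. The function name, the host names, and the wave sizes are hypothetical.

```python
def plan_waves(servers, first_wave=1, growth=2):
    """Split servers into rollout waves that grow by a fixed factor."""
    if first_wave < 1 or growth < 1:
        raise ValueError("first_wave and growth must be at least 1")
    waves = []
    start = 0
    size = first_wave
    while start < len(servers):
        # The last wave simply takes whatever servers remain.
        waves.append(servers[start:start + size])
        start += size
        size *= growth
    return waves


# Hypothetical host names for an 80-server Web farm.
servers = [f"web{n:02d}" for n in range(1, 81)]
for number, wave in enumerate(plan_waves(servers), start=1):
    print(f"wave {number}: {len(wave)} servers")
```

Starting with a single server and doubling the wave size limits how many servers are exposed if a problem appears early, at the cost of a few extra deployment steps.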
Monitoring and Investigation
In the rollout phase, the SEs and DBAs monitor the servers in production, crawl logs, and investigate any problems that they find. This is similar to the process followed by the Debug team members in the investigation phase.
Feedback to Product Team
When necessary, the SEs and DBAs file bugs against the current pre-release build with the Product team, and provide overall health reports whose data helps the Product team determine whether they are meeting the quality bar for RTM. This process continues through each beta and RC build as well as the final RTM deployment of the product in our environment.
Figure 4 illustrates the process flow that the Early Adoption PM oversees through each beta release, RC, and RTM. The timeline and number of cycles necessary to reach this point vary by product.
Figure 4. The Rollout and Reporting Phase Cycle – Early Adoption PM Role
Pre-release Builds through RTM
In each product cycle, the product group releases beta and then RC builds of the product, until the product reaches RTM. It is the responsibility of the Early Adoption PM to understand and communicate the overall product schedule including each pre-release build through the RTM release of the product.
Early Adoption Agreements
All of the servers within the MSCOM Operations environment host applications and content developed by groups across Microsoft. It is the role of the Early Adoption PM to work with the application owner teams hosting on our primary environments (such as www.microsoft.com and msdn.microsoft.com) to ensure that we develop shared goals and agree on the target servers and schedule for deployment with each pre-release build. Advanced planning and agreements help to minimize surprises and negative impacts to the development projects that are already in progress.
Identify Risks and Blocking Issues
The Early Product Adoption PM is responsible for identifying potential risks and blocking issues, and for developing mitigation and contingency plans, for each deployment of a pre-release build of the software. For example, if an application development team is in the middle of driving toward an application launch, there may be increased risks associated with early adoption efforts (such as resource constraints, change control, and end-user impact) if there is associated downtime. These specific issues are typically mitigated by understanding and communicating the overall schedules and by identifying target deployment dates that reduce these risks.
When possible, we try to coordinate an early adoption effort in line with the application development cycle and release schedule to minimize churn, test cycles, and ad hoc changes to the production servers hosting these applications. When coupling is not possible, the agreement is made up front, risks are identified, and mitigation and contingency plans are put in place to ensure that the early adoption process succeeds.
Develop and Execute Deployment Plan
The Early Product Adoption PM works in partnership with the MSCOM Ops SEs and DBAs and the application development teams to develop step-by-step deployment plans, and coordinates the overall execution of each upgrade with each pre-release build through RTM.
The Early Product Adoption PM develops standard reports to meet the needs of various audiences and provides these reports on a predictable basis depending on the audience. For more information about standard reports, see "The Reporting Component" later in this article.
Feedback to Product Team
When necessary, the Early Product Adoption PM files bugs against the current pre-release build with the Product team, and provides overall early adoption project health reports. This process continues through each beta and RC build and the final RTM deployment of the product in our environment.
The Reporting Component
Reporting is a key component of the rollout and reporting phase. As a part of our dogfooding deployment efforts, we must keep upper management and other concerned groups (for example, product teams, tactical deployment teams, and Marketing) up to date with regular, predictable, and consistent reports. The reporting deliverables vary by audience. For example, our audience includes:
· Product teams
· Tactical deployment teams
· Senior management
· Marketing
Product Team Reporting Requirements
Product groups have the following reporting requirements:
· Status and progress against shared target goals with MSCOM Engineering Operations
· Technical bug descriptions
· Product-specific risks
· Blocking issues and mitigation plans
· Overall schedule
· Features being used
· Availability and performance data, and a comparison of the dogfood product being rolled out with the current product in production
Tactical Deployment Team Reporting Requirements
Tactical deployment teams have the following reporting requirements:
· Status and progress toward shared target goals between MSCOM Ops and the Application Development team
· Project-specific risks
· Blocking issues and mitigation plans
· Pre-release build expiration dates
· Overall release plan and schedule, and progress of each
Senior Management Reporting Requirements
Senior management has the following reporting requirements:
· High-level summaries of shared goals between Product, Operations, and Application Development teams
· Status of target goals
· Bug totals and summaries by category
· Highlights and lowlights
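The bug totals and summaries by category in the list above could be produced with a small roll-up like the following. This is an illustrative sketch: the record layout, category names, and status values are assumptions, and a real report would draw on the team's bug-tracking system.

```python
from collections import Counter


def summarize_bugs(bugs):
    """Return total and still-open bug counts for each category."""
    totals = Counter(bug["category"] for bug in bugs)
    still_open = Counter(
        bug["category"] for bug in bugs if bug["status"] != "resolved"
    )
    return {
        category: {"total": totals[category], "open": still_open[category]}
        for category in sorted(totals)
    }


# Hypothetical bug records for illustration only.
bugs = [
    {"category": "setup", "status": "resolved"},
    {"category": "performance", "status": "active"},
    {"category": "performance", "status": "resolved"},
    {"category": "reliability", "status": "active"},
]

for category, counts in summarize_bugs(bugs).items():
    print(f"{category}: {counts['total']} total, {counts['open']} open")
```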
Marketing Reporting Requirements
Marketing has the following reporting requirements:
· Historical view of rollout milestones
· Benefits realized by upgrading to v-next from the previous release of the product
· Highlights and lowlights
· Overall schedule
The reporting process allows us to demonstrate the business value of our dogfooding efforts and how we constantly work towards achieving greater efficiency in the process and overall product quality for our customers.
The Evangelism Phase
We take full advantage of these dogfooding efforts to share our first-hand experience and knowledge of running the product in an enterprise environment with our customers. The Customer Connections PM is responsible for communicating, or evangelizing, this information to our customers through a variety of media, including:
· Executive Briefing presentations
· Tech Center content
· MSCOM Operations Forum
We use these information channels to communicate with our customers both before and after a product is released. IT professionals can use this information to better understand the benefits, risks, "gotchas", tips and tricks, and best practices that the Microsoft.com Ops team has discovered. The Customer Connections PM works with the Debug, SE, and DBA resources to have them produce content based directly on their hands-on experience. This is invaluable information to both internal partners and external customers who are evaluating an upcoming Microsoft Web platform product for adoption in their companies.
Figure 5 illustrates the process flow that the Customer Connection PM oversees through each beta release, RC, RTM, and beyond. The timeline and number of cycles necessary to reach this point vary by product.
Figure 5. The Rollout and Reporting Phase Cycle – Customer Connection PM Role
Early Product Adoption Cycle
As each product enters into the Early Product Adoption Investigation phase, the Customer Connection PM closely engages with both the Product and Debug team members to communicate features, schedule, benefits, risks, and so on.
Identify Topics and Content Owners
With each product, there are several topics of interest to our customers. For example, our customers often ask questions such as:
“What are the top ten features from an MSCOM Ops perspective?”
“How do you migrate from version n to the latest version?”
The Customer Connection PM works with the Debug, SE and DBA subject matter experts to identify these topics and begin developing the content to support the overall product launch.
Coordinate Content Creation
The Customer Connection PM coordinates the overall content creation, delivery vehicles, schedule, and so on to ensure we have the right content at the right time to better support the overall product launch.
The Customer Connection PM publishes the completed content to our various communication vehicles such as the Microsoft.com Operations Tech Center, blogs, podcasts, and so on.
The Customer Connection PM is responsible for coordinating executive briefings between MSCOM Operations and Microsoft external enterprise customers such as the Boeing Company, Washington Mutual, and Safeco. The Customer Connection PM establishes the agenda, builds the overall presentation deck and content, and brings in appropriate subject matter experts, depending on who the customer is and the focus of the topics.
In this article, we discussed why the Microsoft.com Engineering Operations team dogfoods Microsoft server products. We described how we implement this process through the three phases of early product adoption: investigation, rollout and reporting, and evangelism.