Scaling Up

Applies To: Windows Server 2003, Windows Server 2003 with SP1

Scaling up means adding hardware, such as RAM or CPUs, to a Web server to increase the number of sites a server can host. Scaling up can be an inexpensive way to boost performance, especially if you host static-content Web sites.

Shortly after moving the server with four 550-MHz processors into production, the Contoso Marketing department finalized plans for a major television advertisement campaign. The marketing director informed the administrators that site traffic would likely double in the next few weeks. The administrators needed to determine whether the server with four 550-MHz processors could handle sustained traffic loads of 600 requests per second while staying within the 30 percent CPU restriction. The cycles-per-request analysis is as follows:

  • 4 x 550 megacycles/second = 2,200 megacycles/second.

  • 2,200,000,000 cycles/second divided by 893 requests/second = 2,463,606 cycles/request (on average). Rounded up, the server performed 2.5 million cycles/request while the CPU was at 100 percent saturation.

  • 600 requests/second x 2.5 million cycles/request = 1,500 megacycles/second.

  • 1,500 megacycles/second divided by the 30 percent CPU saturation maximum (1,500/0.3) = 5,000 megacycles/second.
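The arithmetic in the list above can be reproduced with a short script. This is only a sketch of the calculation: the figures come from the text, while the variable names are illustrative and not part of any tool the administrators used.

```python
# Figures taken from the analysis above; the names are illustrative assumptions.
PROCESSORS = 4
MHZ_PER_PROCESSOR = 550
MEASURED_RPS = 893        # requests/second measured at 100 percent CPU
TARGET_RPS = 600          # expected sustained load after the campaign
CPU_CEILING = 0.30        # maximum allowed CPU utilization

# Total capacity of the server in megacycles/second.
capacity_mcps = PROCESSORS * MHZ_PER_PROCESSOR

# Average cost of one request, in cycles, at full saturation.
cycles_per_request = capacity_mcps * 1_000_000 / MEASURED_RPS

# The text rounds the per-request cost up to 2.5 million cycles.
ROUNDED_CYCLES_PER_REQUEST = 2_500_000

# Megacycles/second consumed by the target load.
demand_mcps = TARGET_RPS * ROUNDED_CYCLES_PER_REQUEST / 1_000_000

# Capacity needed so that the demand is only 30 percent of the total.
required_mcps = demand_mcps / CPU_CEILING

print("capacity (megacycles/second):", capacity_mcps)
print("cycles/request:", round(cycles_per_request))
print("demand (megacycles/second):", demand_mcps)
print("required capacity (megacycles/second):", required_mcps)
```

Changing `TARGET_RPS` or `CPU_CEILING` shows how sensitive the required capacity is to the traffic forecast and to the utilization limit.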

As a result of this analysis, the administrators realized they needed a server that could perform 5,000 megacycles/second in order to use 1,500 megacycles/second at 30 percent CPU. The server with four 550-MHz processors could only achieve 2,200 megacycles/second. Consequently, they would need to scale up, scale out, or both.

To determine the results of scaling up, the administrators tested a server with eight 900-MHz processors. This new server was capable of 7,200 megacycles/second, which covered their current needs.
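As a rough check of this comparison, the sketch below computes how many megacycles/second each server can devote to requests while staying at or below 30 percent CPU. It assumes, as the paragraph above implicitly does, that the per-request cost measured on the four-processor server carries over unchanged to the new hardware. The function name `usable_megacycles` is hypothetical.

```python
def usable_megacycles(processors, mhz_each, cpu_ceiling=0.30):
    """Megacycles/second available while staying under the CPU ceiling."""
    return processors * mhz_each * cpu_ceiling


# Demand from the analysis: 600 requests/second at 2.5 million cycles/request.
DEMAND_MCPS = 1_500

servers = [
    ("four 550-MHz processors", 4, 550),
    ("eight 900-MHz processors", 8, 900),
]

for name, processors, mhz_each in servers:
    usable = usable_megacycles(processors, mhz_each)
    verdict = "sufficient" if usable >= DEMAND_MCPS else "insufficient"
    print(f"{name}: {usable:.0f} usable megacycles/second ({verdict})")
```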

The administrators migrated the application and performed their WCAT stress test again. In an effort to saturate the CPU at 100 percent, the administrators ran the test with 48, then 72, and finally 120 virtual clients. CPU utilization never exceeded 70 percent. The administrators checked Task Manager and found that network utilization hovered around 11 percent. At this point the administrators determined that there was a bottleneck somewhere in the system. For the purpose of their tests, however, the result of 1,904 requests/second with an average response time of 630 milliseconds was sufficient test data.

The server with four 550-MHz processors performed 893 requests/second. The server with eight 900-MHz processors performed approximately 1,900 requests/second. Measured in megacycles per request, the installation scaled at a ratio of roughly 1 to 1.5 [(7200/1900)/(2200/893) ≈ 1.54] when moving from four to eight processors. The administrators understood this to be an acceptable scaling ratio. Realistically, the overhead in system resources and issues such as lock contention do not allow a system to scale linearly, so they did not expect to see a scale ratio of 1 to 2 when moving from four to eight processors. According to their calculations, the administrators determined that the server with eight 900-MHz processors, barring any unforeseen problems, should scale and perform satisfactorily for their application.
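The bracketed expression in the paragraph above compares the megacycles spent per request on each server. A minimal sketch for evaluating it follows, using the rounded figure of 1,900 requests/second. The helper name `megacycles_per_request` is an assumption made for illustration.

```python
def megacycles_per_request(total_megacycles, requests_per_second):
    """Average megacycles available per request at the measured throughput."""
    return total_megacycles / requests_per_second


# Four 550-MHz processors at 893 requests/second.
old_cost = megacycles_per_request(4 * 550, 893)

# Eight 900-MHz processors at roughly 1,900 requests/second.
new_cost = megacycles_per_request(8 * 900, 1900)

# Ratio of per-request cost on the new server to the old one.
scaling_ratio = new_cost / old_cost

print(f"old server: {old_cost:.2f} megacycles/request")
print(f"new server: {new_cost:.2f} megacycles/request")
print(f"scaling ratio: 1 to {scaling_ratio:.2f}")
```

A ratio of exactly 1 would mean each request cost the same on both servers, which is the linear scaling the administrators did not expect to see.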