Chapter 3 – Risks Addressed Through Performance Testing
Performance Testing Guidance for Web Applications
J.D. Meier, Carlos Farre, Prashant Bansode, Scott Barber, and Dennis Rea
Microsoft Corporation
September 2007
Objectives
- Understand how speed, scalability, and stability are viewed from a performance-testing perspective.
- Learn how performance testing can be used to mitigate risks related to speed, scalability, and stability.
- Learn about the aspects of these risks that performance testing does not adequately address.
Overview
Performance testing is indispensable for managing certain significant business risks. For example, if your Web site cannot handle the volume of traffic it receives, your customers will shop somewhere else. Beyond identifying the obvious risks, performance testing can be a useful way of detecting many other potential problems. While performance testing does not replace other types of testing, it can reveal information relevant to usability, functionality, security, and corporate image that is difficult to obtain in other ways.
Many businesses and performance testers find it valuable to think of the risks that performance testing can address in terms of three categories: speed, scalability, and stability.
How to Use This Chapter
Use this chapter to learn about typical performance risks, the performance test types related to those risks, and proven strategies to mitigate those risks. To get the most from this chapter:
- Use the “Summary Matrix” section to understand the different types of tests and the risks they can mitigate.
- Use the various risk type sections to understand related strategies that can help you determine the best testing approach for your particular situation.
Summary Matrix of Risks Addressed by Performance Testing Types
| Performance test type | Risk(s) addressed |
|---|---|
| Capacity | |
| Component | |
| Endurance | |
| Investigation | |
| Load | |
| Smoke | |
| Spike | |
| Stress | |
| Unit | |
| Validation | |
Risks (rows) by performance test types (columns):

| Risks | Capacity | Component | Endurance | Investigation | Load | Smoke | Spike | Stress | Unit | Validation |
|---|---|---|---|---|---|---|---|---|---|---|
| Speed-related risks | | | | | | | | | | |
| User satisfaction | | | X | X | X | | | X | | X |
| Synchronicity | | X | X | X | X | | X | X | X | |
| Service Level Agreement (SLA) violation | | | X | X | X | | | | | X |
| Response time trend | | X | X | X | X | X | | | X | |
| Configuration | | | X | X | X | X | | X | | X |
| Consistency | | X | X | X | X | | | | X | X |
| Scalability-related risks | | | | | | | | | | |
| Capacity | X | X | X | X | X | | | | | X |
| Volume | X | X | X | X | X | | | | | X |
| SLA violation | | | X | X | X | | | | | X |
| Optimization | X | X | | X | | | | | X | |
| Efficiency | X | X | | X | | | | | X | |
| Future growth | X | X | | X | X | | | | | X |
| Resource consumption | X | X | X | X | X | X | X | X | X | X |
| Hardware / environment | X | X | X | X | X | | X | X | | X |
| Service Level Agreement (SLA) violation | X | X | X | X | X | | | | | X |
| Stability-related risks | | | | | | | | | | |
| Reliability | | X | X | X | X | | X | X | X | |
| Robustness | | X | X | X | X | | X | X | X | |
| Hardware / environment | | | X | X | X | | X | X | | X |
| Failure mode | | X | X | X | X | | X | X | X | X |
| Slow leak | | X | X | X | X | | | | X | |
| Service Level Agreement (SLA) violation | | X | X | X | X | | X | X | X | X |
| Recovery | | X | | X | | | X | X | X | X |
| Data accuracy and security | | X | X | X | X | | X | X | X | X |
| Interfaces | | X | X | X | X | | | X | X | X |
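When planning a test effort, the matrix can be treated as a simple lookup table. The following Python sketch copies only three rows of the matrix (User satisfaction, Slow leak, and Recovery) and ranks test types by how many of the selected risks each one addresses. The names `MATRIX_EXCERPT` and `rank_test_types` are illustrative and not part of the original guidance, and counting covered risks is only one possible way to prioritize.

```python
from collections import Counter

# Excerpt of the summary matrix: risk -> test types marked with an X.
# Only three rows are copied here; extend the dictionary as needed.
MATRIX_EXCERPT = {
    "User satisfaction": {"Endurance", "Investigation", "Load", "Stress", "Validation"},
    "Slow leak": {"Component", "Endurance", "Investigation", "Load", "Unit"},
    "Recovery": {"Component", "Investigation", "Spike", "Stress", "Unit", "Validation"},
}


def rank_test_types(risks, matrix=MATRIX_EXCERPT):
    """Rank test types by the number of the given risks they address."""
    counts = Counter()
    for risk in risks:
        counts.update(matrix[risk])
    # Most-covering test types first; ties are broken alphabetically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


ranked = rank_test_types(["User satisfaction", "Slow leak", "Recovery"])
for test_type, covered in ranked:
    print(f"{test_type}: addresses {covered} of the selected risks")
```

A ranking like this is only a starting point for discussion; the risk-type sections that follow describe what each kind of test should actually exercise.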
Speed-Related Risks
Speed-related risks are not confined to end-user satisfaction, although that is what most people think of first. Speed is also a factor in certain business- and data-related risks. Some of the most common speed-related risks that performance testing can address include:
- Is the application fast enough to satisfy end users?
- Is the business able to process and utilize data collected by the application before that data becomes outdated? (For example, end-of-month reports are due within 24 hours of the close of business on the last day of the month, but it takes the application 48 hours to process the data.)
- Is the application capable of presenting the most current information (e.g., stock quotes) to its users?
- Is a Web Service responding within the maximum expected response time before an error is thrown?
Speed-Related Risk-Mitigation Strategies
The following strategies are valuable in mitigating speed-related risks:
- Ensure that your performance requirements and goals represent the needs and desires of your users, not someone else’s.
- Compare your speed measurements against previous versions and competing applications.
- Design load tests that replicate actual workload at both normal and anticipated peak times.
- Conduct performance testing with data types, distributions, and volumes similar to those used in business operations during actual production (e.g., number of products, orders in pending status, size of user base). You can allow data to accumulate in databases and file servers, or generate additional data volume, before executing the load test.
- Use performance test results to help stakeholders make informed architecture and business decisions.
- Solicit representative feedback about users’ satisfaction with the system while it is under peak expected load.
- Include time-critical transactions in your performance tests.
- Ensure that at least some of your performance tests are conducted while periodic system processes are executing (e.g., downloading virus-definition updates, or during weekly backups).
- Measure speed under various conditions, load levels, and scenario mixes. (Users value consistent speed.)
- Validate that all of the correct data was displayed and saved during your performance test. (For example, a user updates information, but the confirmation screen still displays the old information because the transaction has not completed writing to the database.)
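Two of the strategies above, checking measurements against a response-time goal and comparing them with a previous version, can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the sample values, the 2.5-second goal, and the choice of the 90th percentile (nearest-rank method) are all assumptions made for the example.

```python
import math


def percentile(samples, pct):
    """Return the pct-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def speed_report(current, baseline, goal_seconds, pct=90):
    """Compare the current build's response times with a goal and a baseline."""
    cur = percentile(current, pct)
    base = percentile(baseline, pct)
    return {
        "current": cur,
        "baseline": base,
        "meets_goal": cur <= goal_seconds,   # goal comes from user-focused requirements
        "regressed": cur > base,             # slower than the previous version
    }


# Hypothetical response times in seconds, for illustration only.
baseline_run = [0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 2.0]
current_run = [0.9, 1.0, 1.1, 1.2, 1.4, 1.5, 1.7, 1.9, 2.4, 3.1]

report = speed_report(current_run, baseline_run, goal_seconds=2.5)
print(report)
```

Reporting both flags separately matters: a build can still meet its goal while being noticeably slower than the version users are accustomed to.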
Scalability-Related Risks
Scalability risks concern not only the number of users an application can support, but also the volume of data the application can contain and process, as well as the ability to identify when an application is approaching capacity. Common scalability risks that can be addressed via performance testing include:
- Can the application provide consistent and acceptable response times for the entire user base?
- Can the application store all of the data that will be collected over the life of the application?
- Are there warning signs to indicate that the application is approaching peak capacity?
- Will the application still be secure under heavy usage?
- Will functionality be compromised under heavy usage?
- Can the application withstand unanticipated peak loads?
Scalability-Related Risk-Mitigation Strategies
The following strategies are valuable in mitigating scalability-related risks:
- Compare measured speeds under various loads. (Keep in mind that end users neither know nor care how many other people are using the application at the same time.)
- Design load tests that replicate actual workload at both normal and anticipated peak times.
- Conduct performance testing with data types, distributions, and volumes similar to those used in business operations during actual production (e.g., number of products, orders in pending status, size of user base). You can allow data to accumulate in databases and file servers, or generate additional data volume, before executing the load test.
- Use performance test results to help stakeholders make informed architecture and business decisions.
- Favor performance tests that map directly to real-world requirements.
- When you find a scalability limit, incrementally reduce the load and retest. This helps you identify a metric that reliably indicates the application is approaching that limit, early enough for you to apply countermeasures.
- Validate the functional accuracy of the application under various loads by checking database entries created or validating content returned in response to particular user requests.
- Conduct performance tests beyond expected peak loads and observe behavior by having representative users and stakeholders access the application manually during and after the performance test.
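The idea of finding a scalability limit and then backing off to identify an early-warning indicator can be sketched as follows. This is an illustrative Python example under stated assumptions: the step results are invented, CPU utilization is used as the candidate indicator only for the sake of the example, and `LoadStep` and `find_limit_and_warning` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class LoadStep:
    users: int            # simulated concurrent users in this test step
    p90_seconds: float    # 90th-percentile response time observed
    cpu_percent: float    # candidate indicator metric (assumed for this sketch)


def find_limit_and_warning(steps, goal_seconds, margin_steps=1):
    """Return (limit_step, warning_step), or None if no limit can be determined.

    The limit is the last step that met the goal before the first failure.
    The warning step lies margin_steps below the limit; its indicator value
    is a candidate alert threshold.
    """
    ordered = sorted(steps, key=lambda s: s.users)
    first_fail = next(
        (i for i, s in enumerate(ordered) if s.p90_seconds > goal_seconds),
        None,
    )
    if first_fail is None or first_fail == 0:
        return None  # never failed, or failed even at the lowest load
    limit_index = first_fail - 1
    warning_index = max(0, limit_index - margin_steps)
    return ordered[limit_index], ordered[warning_index]


# Hypothetical results from a stepped load test, for illustration only.
steps = [
    LoadStep(100, 0.8, 35.0),
    LoadStep(200, 1.0, 52.0),
    LoadStep(300, 1.4, 68.0),
    LoadStep(400, 2.1, 81.0),
    LoadStep(500, 4.9, 97.0),
]

limit, warning = find_limit_and_warning(steps, goal_seconds=2.5)
print(f"Limit: {limit.users} users; consider alerting above {warning.cpu_percent}% CPU")
```

Whether a metric is a reliable indicator still has to be confirmed by retesting; a single run like this only suggests a threshold worth validating.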
Stability-Related Risks
Stability is a blanket term that encompasses such areas as reliability, uptime, and recoverability. Although stability risks are commonly addressed with high-load, endurance, and stress tests, stability issues are sometimes detected during the most basic performance tests. Some common stability risks addressed by means of performance testing include:
- Can the application run for long periods of time without data corruption, slowdown, or servers needing to be rebooted?
- If the application does go down unexpectedly, what happens to partially completed transactions?
- When the application comes back online after scheduled or unscheduled downtime, will users still be able to see/do everything they expect?
- When the application comes back online after unscheduled downtime, does it resume at the correct point? In particular, does it avoid attempting to resume cancelled transactions?
- Can combinations of errors or repeated functional errors cause a system crash?
- Are there any transactions that cause system-wide side effects?
- Can one leg of the load-balanced environment be taken down while the environment continues to provide uninterrupted service to users?
- Can the system be patched or updated without taking it down?
Stability-Related Risk-Mitigation Strategies
The following strategies are valuable in mitigating stability-related risks:
- Make time for realistic endurance tests.
- Conduct stress testing with the key scenarios. Monitor key performance indicators (network, disk, processor, memory) and business indicators such as the number of orders lost and user login failures.
- Conduct stress testing with data feeds that replicate business operations in an actual production environment (e.g., number of products, orders in pending status, size of user base). You can allow data to accumulate in databases and file servers, or generate additional data volume, before executing the stress test. This will allow you to replicate critical errors such as database or application deadlocks and other stress failure patterns.
- Take a server offline during a test and observe functional, performance, and data-integrity behaviors of the remaining systems.
- Execute identical tests immediately before and after a system reboot. Compare the results. You can use an identical approach for recycling services or processes.
- Include error or exception cases in your performance test scenarios (for example, users trying to log on with improper credentials).
- Apply a patch to the system during a performance test.
- Force a backup and/or virus definition update during a performance test.
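Endurance tests are most useful when their measurements are checked for trends rather than inspected by eye. As one hedged example, the Python sketch below fits a least-squares line to memory samples collected during an endurance run and flags steady growth that could indicate a slow leak. The sample values and the 5 MB-per-hour tolerance are assumptions chosen for illustration; a real threshold depends on the application.

```python
def slope_per_hour(samples):
    """Least-squares slope of (elapsed_hours, value) samples."""
    n = len(samples)
    if n < 2:
        raise ValueError("at least two samples are required")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        raise ValueError("samples must span more than one point in time")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    return sxy / sxx


def looks_like_slow_leak(samples, max_growth_per_hour):
    """Flag a run whose metric grows faster than the tolerated rate."""
    return slope_per_hour(samples) > max_growth_per_hour


# Hypothetical (elapsed hours, memory in MB) samples from an endurance test.
memory_samples = [
    (0, 512), (1, 530), (2, 549), (3, 566), (4, 585),
    (5, 603), (6, 620), (7, 640), (8, 657),
]

growth = slope_per_hour(memory_samples)
print(f"Memory growth: {growth:.1f} MB/hour")
print("Possible slow leak:", looks_like_slow_leak(memory_samples, max_growth_per_hour=5.0))
```

The same slope calculation can be applied to response times to spot gradual slowdown, or to results gathered immediately before and after a reboot to see whether the trend resets.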
Summary
Almost all application- and business-related risks can be addressed through performance testing, including user satisfaction and the application’s ability to achieve business goals.
Generally, the risks that performance testing addresses are categorized in terms of speed, scalability, and stability. Speed is typically an end-user concern, scalability is a business concern, and stability is a technical or maintenance concern.
Identifying project-related risks and the associated mitigation strategies where performance testing can be employed is almost universally viewed as a valuable and time-saving practice.