I Want You To Guess Accurately

When estimating performance and capacity, there are no guarantees except for the guarantees that you build for yourself. This means that, despite all the documentation, no one, no piece of paper, will ever tell you how your environment is going to behave in the real world. The systems involved are too complex for those types of performance assertions.

As a result, what you see in much of the published performance documentation falls into one of two categories - rules of thumb, and examples. Rules of thumb are things like, "Plan for 0.75 IOPS per gigabyte of content", and examples are things like, "With a CPU of X and Y gigabytes of RAM, we were able to hit Z requests per second."

None of these is a guarantee, and none of them describes your environment. All are based on bold assumptions that may not apply to you.
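To make that concrete, here is a small sketch of what a rule of thumb actually is: one multiplication with an assumption baked into it. The function name `estimate_iops` and the 2,000 GB content size are hypothetical, chosen only for illustration; the 0.75 ratio is the example figure quoted above, not a recommendation.

```python
def estimate_iops(content_gb: float, iops_per_gb: float = 0.75) -> float:
    """Starting-point IOPS estimate from a per-gigabyte rule of thumb."""
    return content_gb * iops_per_gb


content_gb = 2000  # hypothetical amount of content, in gigabytes

# The estimate you get if the rule of thumb happens to fit your workload.
baseline = estimate_iops(content_gb)

# The estimate you get if your workload really needs twice the assumed ratio.
# (The 1.5 figure is an assumption made up for this comparison.)
pessimistic = estimate_iops(content_gb, iops_per_gb=1.5)

print(f"baseline: {baseline:.0f} IOPS, pessimistic: {pessimistic:.0f} IOPS")
```

The arithmetic is trivial, which is the point: the answer is only as good as the ratio you plugged in, and only measuring your own environment tells you what that ratio really is.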

Rules of thumb and examples are fantastic for giving you a place to start your architecture planning, but they should not be the end of the road. Do not assume that you can estimate the peak requests per second (RPS) required for your solution, find an example solution that purportedly supports that RPS level, build something just like it, and "just be okay". That might work for smaller, less critical solutions, but if your environment is intended to support mission-critical workloads - such that you'd go through all the rigmarole of planning it in detail and establishing a disaster recovery plan for it in the first place - then chances are you need a good understanding of the performance characteristics of the environment.

What I find interesting is that people will often get to this point - where they've got an initial architecture together - and begin looking for external validation that their design will meet their needs. It's then that someone will point out that their lack of load testing leaves a high degree of uncertainty in the design, and that they should stop and test the designed infrastructure to see if it actually does meet their needs. The reaction is frequently to start oversizing portions of the solution, with excuses like, "We don't have time for load testing in our project plan! We have no funding allocated for that!"

Frequently they are right - they don't have the time. They didn't include load testing in their plan from the very beginning; they expected to develop requirements, map to a case study or example, use a few rules of thumb, size, and be done with it. Wrong. You have to load test. It's not hard. There are examples. It does take time, though - and now would be a great time in the project to make that time.
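To show how small a first load test can be, here is a minimal sketch using only the Python standard library. It is not a replacement for a real load-testing tool, and the names `run_load_test` and `fake_request` are hypothetical. The `fake_request` function just sleeps for 10 ms as a stand-in; in practice you would swap in a call that issues a real request against your own environment.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import quantiles


def run_load_test(send_request, total_requests=200, concurrency=10):
    """Call send_request() total_requests times from a pool of worker threads.

    Returns the request count, error count, achieved requests per second,
    and the 95th-percentile latency of the successful calls.
    """
    def timed_call(_):
        start = time.perf_counter()
        try:
            send_request()
            ok = True
        except Exception:
            ok = False  # count failures instead of aborting the run
        return ok, time.perf_counter() - start

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed_call, range(total_requests)))
    elapsed = time.perf_counter() - wall_start

    latencies = [latency for ok, latency in results if ok]
    errors = sum(1 for ok, _ in results if not ok)
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    p95 = quantiles(latencies, n=20)[-1] if len(latencies) >= 2 else None
    return {
        "requests": total_requests,
        "errors": errors,
        "rps": total_requests / elapsed,
        "p95_seconds": p95,
    }


def fake_request():
    """Stand-in for a real request; replace with a call to your own system."""
    time.sleep(0.01)


if __name__ == "__main__":
    print(run_load_test(fake_request, total_requests=50, concurrency=5))
```

Raising `concurrency` step by step and watching where requests per second stops climbing, or where latency and errors start to rise, gives you measured numbers for your own environment instead of someone else's example.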

As for funding, it's not uncommon for people to spend so much on extra, possibly unneeded hardware that load testing the initial design would have been considerably less expensive. Testing tells you how much additional hardware you actually need, as opposed to what you think you might need, maybe. I've seen people get to this point and literally double the amount of hardware.

Don't do that. Load test. It'll be okay. You'll be a better person for it.