Scenarios and Underpants Gnomes

Data driven quality requires a certain kind of thinking. It took me a while to understand the right thought process. I kept getting caught up in the details. What were we building and what would its use look like? These are valid questions, but there are more important ones to ask. The question is not whether the product is being successfully used, but how it is affecting user behavior. If we know what a happy user looks like and we see that behavior, we have a successful product.

As I wrote in What Is Quality?, true quality is the fitness of a particular form (program) for a function (what the user wants to accomplish). True data driven quality should measure this fitness function. A truly successful product will maximize this function. The key to doing this is to understand what job the user needs the product to accomplish and then measure whether that job is being done in an optimal way. It is important to understand the pain the customer is experiencing and then visualize what the world would look like if that pain were relieved. If we can measure that alleviation, we know we have a successful product.

There is a key part of the process I did not mention. What does the product do that alleviates the pain the customer is experiencing? This is unimportant. In fact, it is best not to know. Wait, you might think. Clearly it is important. If the product does nothing, the situation will not change and the customer will remain in pain. That is true, but that is also putting the cart before the horse. Knowing how the product intends to operate can cloud our judgment about how to measure it. We will be tempted to use confirmatory metrics instead of experimental ones. We will measure what the product does and not what it accomplishes. Just as test driven development requires that the tests be written before the code, data driven quality demands that the metrics be designed before the features.

One way to accomplish this is through what can be called a scenario. This term is used for many things so let me be specific about my use. A scenario takes a particular form. It asks what problem the user is having and what alleviation of that pain looks like. It treats the solution as a black box.

  1. Customer Pain
  2. Magic Happens
  3. World Without Pain

I say "Magic Happens" because at this stage, it doesn't matter how things get better, only that they do. This reminds me of an old South Park episode featuring the Underpants Gnomes. In it, a group of gnomes has a brilliant business plan. They will gather underwear, do something with it, and then profit!


Their pain is a lack of money and an overabundance of underwear. Their success is more money (and fewer underpants?). To measure the success of their venture, it is not necessary to understand how they will generate profits from the underpants. It will suffice to measure their profits. Unfortunately for the gnomes, there may be no magic which can turn underwear into profit.

Let's walk through a real-world example.

  1. Customer Pain: When I start my news app, the news is outdated. I must wait for updated news to be retrieved. Sometimes I close the app immediately because the data is stale.
  2. Magic Happens
  3. World Without Pain: When I start the app, the news is current. I do not need to wait for data to be retrieved. Today's news is waiting for me.

What metrics might we use to measure this? We likely cannot measure the user's satisfaction with the content directly, but we can measure the freshness of the news. We could measure the time it takes to get updated content on the screen. Does this go down? We could tag the news content with timestamps and measure the median age of news when the app starts. Does the median age decrease? We could measure how often a user closes the app within the first 15 seconds of it starting up. Are fewer users rage quitting the app? We might even be able to monitor overall use of the app. Is median user activity going up?
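To make these metrics concrete, here is a minimal sketch of how they might be computed from per-launch telemetry. The Session record and its field names are hypothetical, as is the scenario_metrics helper; real telemetry will look different. The 15-second threshold comes from the rage-quit question above.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Session:
    """One app launch. All field names here are hypothetical."""
    seconds_to_fresh_content: float  # launch until updated news is on screen
    newest_item_age_minutes: float   # age of the news shown at launch
    session_length_seconds: float    # launch until the user closes the app


def scenario_metrics(sessions, early_quit_seconds=15.0):
    """Summarize outcome metrics for the stale-news scenario."""
    if not sessions:
        raise ValueError("need at least one session")
    # Count launches where the user left before the threshold.
    early_quits = sum(
        1 for s in sessions if s.session_length_seconds < early_quit_seconds
    )
    return {
        "median_seconds_to_fresh_content": median(
            s.seconds_to_fresh_content for s in sessions
        ),
        "median_news_age_minutes": median(
            s.newest_item_age_minutes for s in sessions
        ),
        "early_quit_rate": early_quits / len(sessions),
    }
```

Note that nothing in this sketch refers to caching, prefetching, or server speed. It describes only what the user experiences.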

It is not necessary to understand whether the solution involves improving server response times, caching content, using OS features to prefetch the content while the app is not active, or something else. These are all part of the "magic happens" stage. We can and should experiment with several ideas to see which improve the situation the most. The key here is to measure how these ideas affect user behavior and user perception, not how often the prefetch APIs are called or whether server speeds are increased.
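As a rough illustration of comparing ideas by outcome alone, the sketch below ranks experiment variants by how often users quit early. The variant names, the input shape, and both function names are assumptions made for this example. It also ignores sample size and statistical significance, which a real experiment would need to account for.

```python
def early_quit_rate(session_lengths, threshold_seconds=15.0):
    """Fraction of sessions that ended before the threshold."""
    quits = sum(1 for length in session_lengths if length < threshold_seconds)
    return quits / len(session_lengths)


def rank_variants(sessions_by_variant):
    """Order variants from lowest to highest early-quit rate.

    The variant name is treated as an opaque label. The ranking never
    looks at how a variant works, only at what users did.
    """
    rates = {
        name: early_quit_rate(lengths)
        for name, lengths in sessions_by_variant.items()
    }
    return sorted(rates.items(), key=lambda item: item[1])


# Hypothetical session lengths, in seconds, for three variants.
observed = {
    "baseline": [5, 10, 200, 300],
    "variant_a": [5, 100, 200, 300],
    "variant_b": [100, 200, 300, 400],
}
ranking = rank_variants(observed)
```

The labels could stand for prefetching, caching, or faster servers. The ranking would be computed the same way in every case.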