Measure code performance and establish baselines
Measuring code performance and establishing baselines is a critical step in the optimization process. It provides a reference point to evaluate the results of any changes you make.
Establish a performance baseline
A performance baseline is a set of measurements that captures how the code performs before you make any changes. It's your point of reference. Without a baseline for comparison, you don't know whether your updates improve code performance or make things worse.
Here are some key aspects to measure in a baseline:
Execution time (latency): How long does it take to execute a certain operation or process? This value can be measured for a small function (for example, it takes 5 milliseconds to sort 10,000 items) or an end-to-end scenario (for example, the service responds to a request in 250 milliseconds).
Throughput: How many operations can be done in a given time? This value is relevant for server applications (requests per second), batch processing (records processed per minute), etc.
Resource usage: Memory consumption (working set, allocations, garbage collection frequency), CPU usage (does it peg a core at 100%? for how long?), disk I/O, network I/O, etc., depending on what the code does.
Scalability characteristics: How do the above metrics change as input size grows or as load (concurrent users) increases? You might measure that a function takes 0.1 milliseconds for 100 items and 10 milliseconds for 10,000 items. That indicates how it scales (in this case, roughly linear with number of items). This kind of baseline helps you understand scalability and set goals. For example, maybe you want 100k items to sort in under 200 milliseconds, which might mean you need a more efficient algorithm or parallel sorting.
When establishing a baseline:
Make the environment as consistent as possible. Run tests on the same machine, under similar conditions (close background programs, use release build, etc.). Performance can vary run-to-run; minimizing noise gives more reliable data.
Use release builds and realistic settings. Debug builds often run slower due to extra checks and lack of optimizations. If you're measuring algorithm speed, compile in Release mode with optimizations on. Similarly, if your code is multi-threaded or async, test it in a scenario that resembles production (for example, the same number of threads or same dataset size).
Automate or script the measurement. You should run your code multiple times. It helps to have a simple benchmark harness or use tools that automate it. For example, for a small function, you might write a loop that calls it, measures the elapsed time, and computes an average (see the sketch after this list).
Record the numbers. Keep a record of the baseline metrics (time, memory, etc.). If you have a unit test or log output that prints performance info, save that. For nontrivial cases, using a spreadsheet or simple table to compare "before vs after" is useful.
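Here's a minimal sketch of such a harness. The helper name, the warm-up call, and the iteration count are illustrative choices, not a prescribed API:

using System;
using System.Diagnostics;

static class SimpleBenchmark
{
    // Runs an action repeatedly and reports the average elapsed time.
    public static double MeasureAverageMs(Action action, int iterations = 10)
    {
        action(); // Warm-up run so JIT compilation doesn't skew the timings

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        sw.Stop();

        return sw.Elapsed.TotalMilliseconds / iterations;
    }
}

// Usage (ProcessData is whatever operation you're baselining):
// double avgMs = SimpleBenchmark.MeasureAverageMs(() => ProcessData());
// Console.WriteLine($"ProcessData() average: {avgMs:F2} milliseconds");

The returned average becomes the number you record as your baseline for that operation.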
Baseline measurement using Stopwatch
C# offers the System.Diagnostics.Stopwatch class, which is useful for measuring the execution time of code blocks.
For example, suppose you have a ProcessData() method that you suspect is slow. You can time a call to it like this:
// Requires: using System.Diagnostics;
var sw = Stopwatch.StartNew();
ProcessData();
sw.Stop();
Console.WriteLine($"ProcessData() completed in {sw.ElapsedMilliseconds} milliseconds");
Imagine that, across several runs, this code consistently prints values of approximately 120 milliseconds. That figure is your baseline for ProcessData() under the test conditions you used.
Baseline for memory usage
The GC.GetTotalMemory method retrieves the number of bytes currently believed to be allocated on the managed heap. You can capture memory usage by calling it before and after an operation. Passing forceFullCollection: true forces a garbage collection before the measurement, so the reading isn't inflated by dead objects that haven't been collected yet.
For example, suppose you have the following code:
long memoryBefore = GC.GetTotalMemory(forceFullCollection: true);
ProcessData();
long memoryAfter = GC.GetTotalMemory(forceFullCollection: true);
Console.WriteLine($"Memory used by ProcessData(): {memoryAfter - memoryBefore} bytes");
If this code prints "Memory used by ProcessData(): 50000 bytes", that's how much memory the operation allocated (net of anything the garbage collector reclaimed in between). This value can be part of your baseline, especially if you're concerned about memory or garbage collection overhead.
Contextual baselines
The baseline often isn't a single number. You might have a baseline at a certain scale and then test at a larger scale. For example:
- Sorting 10k items takes 50 milliseconds (baseline at 10k).
- Sorting 100k items takes 800 milliseconds.
This result indicates that performance degrades as input grows, which is expected. It also suggests the algorithm might be worse than O(n log n), because increasing the item count by a factor of 10 increases the time by a factor of 16. A quick way to gather these numbers is to time the same operation at several input sizes, as in the following sketch.
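This sketch uses Array.Sort as a stand-in for whatever operation you're characterizing; the input sizes and the fixed random seed are illustrative choices:

using System;
using System.Diagnostics;
using System.Linq;

int[] sizes = { 1_000, 10_000, 100_000 };
var rng = new Random(42); // Fixed seed so every run sorts the same data

foreach (int size in sizes)
{
    // Build a fresh unsorted array at this size.
    int[] data = Enumerable.Range(0, size).Select(_ => rng.Next()).ToArray();

    var sw = Stopwatch.StartNew();
    Array.Sort(data); // Stand-in for the operation you're characterizing
    sw.Stop();

    Console.WriteLine($"{size} items: {sw.Elapsed.TotalMilliseconds:F2} milliseconds");
}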
Performance measurement tools and techniques
Understanding performance requires both measurement and analysis of algorithmic characteristics. Before diving into measurement tools, it's important to understand key performance concepts:
Understanding algorithmic complexity
Time complexity describes how an algorithm's runtime grows as input size increases, typically expressed using Big O notation:
- O(1) - Constant time: Performance doesn't change with input size (for example, dictionary lookups).
- O(log n) - Logarithmic time: Performance grows slowly as input increases (for example, binary search).
- O(n) - Linear time: Performance grows proportionally with input (for example, iterating through a list once).
- O(n²) - Quadratic time: Performance grows with the square of input size (for example, nested loops).
Space complexity measures how memory usage grows with input size, using the same Big O notation. Understanding these concepts helps identify why certain code patterns become bottlenecks at scale.
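To see the difference these complexity classes make in practice, here's a minimal sketch comparing lookups in a List<int> (an O(n) linear scan) against a HashSet<int> (an O(1) hash lookup). The collection size and repeat count are arbitrary:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

const int lookups = 10_000;
var list = Enumerable.Range(0, 100_000).ToList(); // O(n) per lookup
var set = new HashSet<int>(list);                 // O(1) per lookup

// A single lookup is too fast to time reliably, so repeat it many times.
var sw = Stopwatch.StartNew();
for (int i = 0; i < lookups; i++) { list.Contains(99_999); }
sw.Stop();
Console.WriteLine($"List.Contains x{lookups}: {sw.ElapsedMilliseconds} milliseconds");

sw.Restart();
for (int i = 0; i < lookups; i++) { set.Contains(99_999); }
sw.Stop();
Console.WriteLine($"HashSet.Contains x{lookups}: {sw.ElapsedMilliseconds} milliseconds");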
Recognizing performance patterns and anti-patterns
Common performance anti-patterns include:
- N+1 query problems: Making one query to get a list, then N more queries for related data.
- Inefficient data structures: Using lists for frequent lookups instead of dictionaries or hash sets.
- Premature string concatenation: Building large strings using += instead of StringBuilder (illustrated in the sketch after this list).
- Synchronous operations: Blocking threads with synchronous I/O when asynchronous operations would be better.
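As a concrete illustration of the string concatenation anti-pattern, this sketch times += in a loop against StringBuilder; the iteration count is arbitrary:

using System;
using System.Diagnostics;
using System.Text;

const int iterations = 10_000;

// Anti-pattern: += creates a brand-new string on every iteration.
var sw = Stopwatch.StartNew();
string concatenated = string.Empty;
for (int i = 0; i < iterations; i++)
{
    concatenated += i;
}
sw.Stop();
Console.WriteLine($"String +=: {sw.ElapsedMilliseconds} milliseconds ({concatenated.Length} chars)");

// Better: StringBuilder appends into a growable buffer.
sw.Restart();
var builder = new StringBuilder();
for (int i = 0; i < iterations; i++)
{
    builder.Append(i);
}
string built = builder.ToString();
sw.Stop();
Console.WriteLine($"StringBuilder: {sw.ElapsedMilliseconds} milliseconds ({built.Length} chars)");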
Caching strategies can dramatically improve performance by storing frequently accessed data in memory, avoiding expensive recomputation or I/O operations.
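Here's a minimal in-memory caching sketch, with ComputeExpensiveValue as a hypothetical placeholder for the expensive work. For production code, a purpose-built cache such as Microsoft.Extensions.Caching.Memory adds expiration and size limits:

using System.Collections.Generic;

class ResultCache
{
    private readonly Dictionary<int, string> _cache = new();

    public string GetOrCompute(int key)
    {
        // Serve repeated requests from memory instead of recomputing.
        if (_cache.TryGetValue(key, out string? cached))
        {
            return cached;
        }

        string result = ComputeExpensiveValue(key);
        _cache[key] = result;
        return result;
    }

    private static string ComputeExpensiveValue(int key)
    {
        // Hypothetical placeholder for an expensive computation or I/O call.
        return key.ToString();
    }
}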
From simple to sophisticated, here are tools and methods to measure performance:
Manual timing with Stopwatch
Manually instrumenting code with Stopwatch, or even with DateTime.UtcNow differences, can give quick insight into how long a specific code block takes. The ProcessData() example earlier in this unit uses this technique.
Logging and counters
Logging is ad-hoc but accessible and often sufficient for a first investigation. You can sprinkle timing logs around code sections (for example, log how long a database query took, how long file parsing took, etc.).
Event logging and performance metrics
Adding strategic logging can reveal performance patterns and help identify bottlenecks:
- Operation counts: Log how many items were processed, database queries executed, or cache hits/misses occurred.
- Timing breakdowns: Measure different phases of operations (for example, "Database query: 50 milliseconds, Processing: 20 milliseconds, Serialization: 10 milliseconds").
- Resource utilization: Track memory allocations, thread pool usage, or connection pool metrics.
- Performance metrics: Monitor latency (response time), throughput (operations per second), and scalability characteristics.
For instance, if your log shows "Fetched 1000 records from database" when you expected to see 100 records, the discrepancy might indicate an N+1 query problem or inefficient query logic. Similarly, logging cache hit rates can help evaluate caching strategy effectiveness.
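Here's a sketch of the timing-breakdown pattern from the list above, using Console.WriteLine as a stand-in for your logging framework. FetchRecords, Process, and Serialize are hypothetical stubs for the real phases:

using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical stubs standing in for the real work of each phase.
static List<int> FetchRecords() => new() { 1, 2, 3 };
static List<int> Process(List<int> records) => records;
static string Serialize(List<int> records) => string.Join(",", records);

var sw = Stopwatch.StartNew();
var records = FetchRecords();
Console.WriteLine($"Database query: {sw.ElapsedMilliseconds} ms ({records.Count} records)");

sw.Restart();
var processed = Process(records);
Console.WriteLine($"Processing: {sw.ElapsedMilliseconds} ms");

sw.Restart();
string payload = Serialize(processed);
Console.WriteLine($"Serialization: {sw.ElapsedMilliseconds} ms ({payload.Length} chars)");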
Built-in profilers (Visual Studio Diagnostics)
Visual Studio (Enterprise edition, and to some extent Community) has profiling tools (CPU Usage, Memory Usage, Performance Profiler). You can run your application under the profiler, and it shows you, for example, a breakdown of CPU time by function or a list of objects on the heap and who allocated them.
- The CPU profiler typically produces a call tree, where you can see which methods consumed the most CPU time.
- The memory profiler can take snapshots to show how memory usage grows, and what types of objects are taking space.
Using a profiler is often as simple as clicking "Start Diagnostic" and choosing the kind of profile.
.NET CLI tools
For .NET Core and .NET 5+, Microsoft provides command-line tools like dotnet-counters, dotnet-trace, dotnet-dump, dotnet-gcdump.
- dotnet-counters can display real-time performance metrics of a running app (GC collections, exceptions, thread pool usage, etc.).
- dotnet-trace can collect a trace of an app's execution, which you can analyze to see which methods were running.
These tools are more advanced, but invaluable for deep dives or production profiling where you can't attach a GUI profiler.
Microbenchmarking libraries
If you want to compare two implementations of a function (for example, your original version versus a proposed optimization) in a rigorous way, BenchmarkDotNet is a popular library. It runs the functions many times, warms up the just-in-time compiler, measures precisely, and reports statistics (such as mean and standard deviation). It's designed for micro-benchmarks: small, isolated code paths measured with high precision.
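Here's a minimal sketch of a BenchmarkDotNet comparison. It assumes the BenchmarkDotNet NuGet package and a Release build; the two sum implementations are illustrative stand-ins for your original and optimized versions:

using System.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

[MemoryDiagnoser] // Also report allocations, not just timing
public class SumBenchmarks
{
    private readonly int[] _numbers = Enumerable.Range(0, 1_000).ToArray();

    [Benchmark(Baseline = true)]
    public int SumWithLinq() => _numbers.Sum();

    [Benchmark]
    public int SumWithLoop()
    {
        int total = 0;
        foreach (int n in _numbers) { total += n; }
        return total;
    }
}

public class Program
{
    public static void Main() => BenchmarkRunner.Run<SumBenchmarks>();
}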
Performance/load tests
For larger scenarios (web applications, services), you might write a load test or use a tool (like JMeter, k6, or Visual Studio load test) to simulate many requests or large inputs. This approach can reveal throughput and stability under stress, and help identify bottlenecks that only appear at scale.
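For a quick, informal version of this idea, you can sketch a small concurrent load generator with HttpClient and Task.WhenAll. The URL and request count below are placeholders; dedicated tools handle ramp-up, reporting, and distributed load far better:

using System;
using System.Diagnostics;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;

const int totalRequests = 100;
var client = new HttpClient();
var url = "http://localhost:5000/api/items"; // Placeholder endpoint

var sw = Stopwatch.StartNew();

// Fire all requests concurrently and wait for them to complete.
var tasks = Enumerable.Range(0, totalRequests).Select(_ => client.GetAsync(url));
var responses = await Task.WhenAll(tasks);

sw.Stop();

int successes = responses.Count(r => r.IsSuccessStatusCode);
Console.WriteLine($"{successes}/{totalRequests} succeeded " +
    $"({totalRequests / sw.Elapsed.TotalSeconds:F1} requests/second)");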
System monitoring
Monitor overall system behavior to identify resource bottlenecks and scalability issues:
- CPU usage patterns: High CPU usage might indicate algorithmic inefficiencies or computational bottlenecks.
- Memory consumption: Growing memory usage could suggest memory leaks, inefficient data structures, or excessive object allocation.
- I/O metrics: High disk or network I/O might indicate inefficient data access patterns or poor caching.
- Garbage collection: Frequent garbage collection can impact performance, especially in high-throughput applications.
Tools like Task Manager, dotnet-counters, or PerfMon on Windows can provide system-level insights. Understanding these metrics helps correlate code-level performance with system resource utilization.
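You can also observe garbage collection from inside the process itself. Here's a sketch using GC.CollectionCount, with ProcessData standing in for the operation under test (as in the earlier examples):

using System;

int gen0Before = GC.CollectionCount(0);
int gen1Before = GC.CollectionCount(1);
int gen2Before = GC.CollectionCount(2);

ProcessData(); // The hypothetical operation you're measuring

Console.WriteLine($"Gen 0 collections: {GC.CollectionCount(0) - gen0Before}");
Console.WriteLine($"Gen 1 collections: {GC.CollectionCount(1) - gen1Before}");
Console.WriteLine($"Gen 2 collections: {GC.CollectionCount(2) - gen2Before}");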
Monitor improvements and avoid regressions
After establishing a baseline, you can start implementing code improvements. Make improvements in a cycle: update the code, then measure to see whether the change had the desired effect.
When you make a change, always remeasure using the same method as your baseline. Compare the new measurements to the baseline.
Here are some recommended guidelines for this process:
Change one thing at a time (if possible): So you know what caused any improvement or regression.
Run the same test/measurement as baseline: Use the same procedure and compare metrics directly.
If improvement achieved: Great, consider if it meets the goal. If you needed a 2x speedup and you got 1.5x, you might iterate further.
If no improvement or worse performance: Investigate why. It could be your change didn't address the real bottleneck or traded one cost for another.
Check for side effects: Side effects can include unexpected correctness or performance issues. For example, you might optimize CPU performance but witness a large increase in memory usage. Is that side effect acceptable?
Automate regression detection: Writing performance unit tests or benchmarks that run before and after changes is a common technique for regression detection. While not every team writes performance tests, it's not a bad idea to have a small suite of them (especially for critical paths) that ensures new changes don't drastically slow things down; a sketch of one such test follows.
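This sketch uses xUnit, and the 500 millisecond budget is an illustrative threshold derived from the ~120 millisecond baseline measured earlier. Generous headroom helps avoid flaky failures on busy or slower machines:

using System.Diagnostics;
using Xunit;

public class PerformanceRegressionTests
{
    [Fact]
    public void ProcessData_CompletesWithinBudget()
    {
        ProcessData(); // Warm-up so JIT compilation doesn't count against the budget

        var sw = Stopwatch.StartNew();
        ProcessData();
        sw.Stop();

        // Baseline was ~120 ms; the generous budget reduces false failures.
        Assert.True(sw.ElapsedMilliseconds < 500,
            $"ProcessData() took {sw.ElapsedMilliseconds} ms (budget: 500 ms)");
    }

    private static void ProcessData()
    {
        // Hypothetical stand-in for the operation under test.
    }
}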
Use GitHub Copilot to assist with measurements
GitHub Copilot can help you set up performance measurement techniques:
- Ask for timing examples: "How can I measure the execution time of a C# method?" GitHub Copilot can suggest Stopwatch with sample code.
- Get profiling guidance: "What tools can I use to profile a .NET application?" GitHub Copilot can list profiling tools and their use cases.
- Generate benchmark code: GitHub Copilot can help create BenchmarkDotNet classes or other measurement harnesses.
GitHub Copilot serves as a quick reference for implementing measurement techniques correctly.
Summary
Measuring code performance and establishing baselines is a critical step in the optimization process. By systematically capturing execution time, resource usage, and scalability characteristics, you create a reference point to evaluate the impact of your changes. Utilizing various measurement techniques—from simple timing with Stopwatch to advanced profiling tools—enables you to identify bottlenecks accurately. Always ensure that performance improvements don't compromise correctness or introduce new issues. With a disciplined approach to measurement and iteration, you can effectively enhance your code's performance while maintaining its integrity.