Trace .NET applications with PerfCollect
This article applies to: ✔️ .NET Core 2.1 SDK and later versions
When you encounter performance problems on Linux, you can collect a trace with perfcollect to gather detailed information about what was happening on the machine at the time of the problem.
perfcollect is a bash script that uses Linux Trace Toolkit: next generation (LTTng) to collect events written from the runtime or any EventSource, and perf to collect CPU samples of the target process.
Prepare your machine
Follow these steps to prepare your machine to collect a performance trace with perfcollect.
Note
If you're capturing from inside a container, your container must have the appropriate capabilities. The minimal required capabilities are PERFMON and SYS_PTRACE. If the capture fails with the minimal set, add the SYS_ADMIN capability to the container. For more information on tracing applications inside containers using PerfCollect, see Collect diagnostics in containers.
Download perfcollect.
curl -OL https://aka.ms/perfcollect
Make the script executable.
chmod +x perfcollect
Install tracing prerequisites - these are the actual tracing libraries.
sudo ./perfcollect install
This will install the following prerequisites on your machine:
perf: the Linux Performance Events subsystem and companion user-mode collection/viewer application. perf is part of the Linux kernel source, but is not usually installed by default.
LTTng: Used to capture event data emitted at run time by CoreCLR. This data is then used to analyze the behavior of various runtime components such as the GC, JIT, and thread pool.
Recent versions of .NET Core and the Linux perf tool support automatic resolution of method names for framework code.
For native runtime libraries (such as libcoreclr.so), perfcollect resolves symbols when it converts the data, but only if the symbols for these binaries are present. See the Get symbols for the native runtime section for details.
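After running the install step, it can be useful to confirm that the prerequisites are actually on the PATH before starting a collection. The following is a minimal sketch, not part of perfcollect itself: check_tools is a hypothetical helper name, and it assumes the prerequisites are exposed as the perf and lttng commands.

```shell
# Report which of the given tools are available on PATH.
# Prints one line per tool and returns non-zero if any is missing.
# check_tools is a hypothetical helper, not a perfcollect command.
check_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Example (assumes the command names perf and lttng):
#   check_tools perf lttng || echo "run: sudo ./perfcollect install"
```

The helper only checks for the commands; it does not verify versions or kernel support.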
Collect a trace
Have two shells available - one for controlling tracing, referred to as [Trace], and one for running the application, referred to as [App].
[Trace] Start collection.
sudo ./perfcollect collect sampleTrace
Expected Output:
Collection started. Press CTRL+C to stop.
[App] Set up the application shell with the following environment variables - this enables tracing configuration of CoreCLR.
export DOTNET_PerfMapEnabled=1
export DOTNET_EnableEventLog=1
Note
When executing the app with .NET 7, you must also set DOTNET_EnableWriteXorExecute=0 in addition to the preceding environment variables. For example:
export DOTNET_EnableWriteXorExecute=0
Note
.NET 6 standardizes on the prefix DOTNET_ instead of COMPlus_ for environment variables that configure .NET run-time behavior. However, the COMPlus_ prefix will continue to work. If you're using a previous version of the .NET runtime, you should still use the COMPlus_ prefix for environment variables.
[App] Run the app - let it run as long as you need to in order to capture the performance problem. The trace can be as short as you like, provided it covers the window of time in which the performance problem you want to investigate occurs.
dotnet run
[Trace] Stop collection - hit CTRL+C.
^C
...STOPPED.
Starting post-processing. This may take some time.
Generating native image symbol files
...SKIPPED
Saving native symbols
...FINISHED
Exporting perf.data file
...FINISHED
Compressing trace files
...FINISHED
Cleaning up artifacts
...FINISHED
Trace saved to sampleTrace.trace.zip
The compressed trace file is now stored in the current working directory.
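The environment setup for the [App] shell can be wrapped in a small helper so that the prefix matches the runtime version in use. This is a sketch under stated assumptions: set_tracing_env is a hypothetical function name, and the optional argument simply selects between the DOTNET_ and COMPlus_ prefixes described in the note above.

```shell
# Export the two tracing variables with the chosen prefix.
# Usage: set_tracing_env [prefix]   (prefix defaults to DOTNET_)
# set_tracing_env is a hypothetical helper name.
set_tracing_env() {
  local prefix="${1:-DOTNET_}"
  export "${prefix}PerfMapEnabled=1"
  export "${prefix}EnableEventLog=1"
}

# Example for .NET 6 and later:
#   set_tracing_env
#   dotnet run
# Example for earlier runtimes:
#   set_tracing_env COMPlus_
#   dotnet run
```

The variables must be set in the same shell that launches the app, because the runtime reads them at startup.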
View a trace
There are a number of options for viewing the trace that was collected. Traces are best viewed using PerfView on Windows, but they can be viewed directly on Linux using PerfCollect itself or TraceCompass.
Use PerfCollect to view the trace file
You can use perfcollect itself to view the trace that you collected. To do this, use the following command:
./perfcollect view sampleTrace.trace.zip
By default, this will show the CPU trace of the application using perf.
To look at the events that were collected via LTTng, you can pass in the flag -viewer lttng to see the individual events:
./perfcollect view sampleTrace.trace.zip -viewer lttng
This uses the babeltrace viewer to print the event payloads:
# [01:02:18.189217659] (+0.020132603) ubuntu-xenial DotNETRuntime:ExceptionThrown_V1: { cpu_id = 0 }, { ExceptionType = "System.Exception", ExceptionMessage = "An exception happened", ExceptionEIP = 139875671834775, ExceptionHRESULT = 2148734208, ExceptionFlags = 16, ClrInstanceID = 0 }
# [01:02:18.189250227] (+0.020165171) ubuntu-xenial DotNETRuntime:ExceptionCatchStart: { cpu_id = 0 }, { EntryEIP = 139873639728404, MethodID = 139873626968120, MethodName = "void [helloworld] helloworld.Program::Main(string[])", ClrInstanceID = 0 }
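Because the babeltrace output is plain text with one event per line, standard text tools can summarize it. The following sketch counts events by name; count_runtime_events is a hypothetical helper, and the pattern is an assumption based on the provider:event format shown in the sample output above.

```shell
# Read babeltrace-style text on stdin and print a count per event name,
# most frequent first. count_runtime_events is a hypothetical helper name.
# Assumes event names look like "DotNETRuntime:EventName", as in the
# sample output; other providers would need a different pattern.
count_runtime_events() {
  grep -o 'DotNETRuntime[A-Za-z]*:[A-Za-z0-9_]*' | sort | uniq -c | sort -rn
}

# Example:
#   ./perfcollect view sampleTrace.trace.zip -viewer lttng | count_runtime_events
```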
Use PerfView to open the trace file
To see an aggregate view of both the CPU samples and the events, you can use PerfView on a Windows machine.
Copy the trace.zip file from Linux to a Windows machine.
Download PerfView from https://aka.ms/perfview.
Run PerfView.exe
PerfView.exe <path to trace.zip file>
PerfView will display the list of views that are supported based on the data contained in the trace file.
For CPU investigations, choose CPU stacks.
For detailed GC information, choose GCStats.
For per-process/module/method JIT information, choose JITStats.
If there is not a view for the information you need, you can try looking for the events in the raw events view. Choose Events.
For more information on how to interpret views in PerfView, see help links in the view itself, or from the main window in PerfView, choose Help->Users Guide.
Note
Events written via the System.Diagnostics.Tracing.EventSource API (including the events from the framework) won't show up under their provider name. Instead, they are written as EventSourceEvent events under the Microsoft-Windows-DotNETRuntime provider, and their payloads are JSON serialized.
Note
If you observe [unknown] /memfd:doublemapper frames in method names and callstacks, set DOTNET_EnableWriteXorExecute=0 before running the app that you're tracing with perfcollect.
Use TraceCompass to open the trace file
Eclipse TraceCompass is another option you can use to view the traces. TraceCompass works on Linux machines as well, so you don't need to move your trace over to a Windows machine. To use TraceCompass to open your trace file, you will need to unzip the file.
unzip sampleTrace.trace.zip
perfcollect saves the LTTng trace it collected in CTF format in a subdirectory of the lttngTrace directory. Specifically, the CTF files will be located in a directory that looks like lttngTrace/auto-20201025-101230/ust/uid/1000/64-bit/.
You can open the CTF trace file in TraceCompass by selecting File -> Open Trace and then selecting the metadata file.
For more details, refer to the TraceCompass documentation.
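Since the timestamped directory name differs for every collection, it can be easier to search for the metadata file than to type the path. A minimal sketch, assuming the unzipped trace contains an lttngTrace directory as described above; find_ctf_metadata is a hypothetical helper name.

```shell
# Print the path of every CTF metadata file under the given directory
# (defaults to the current directory). Only files below a path that
# contains "lttngTrace" are reported.
# find_ctf_metadata is a hypothetical helper name.
find_ctf_metadata() {
  find "${1:-.}" -type f -name metadata -path '*lttngTrace*'
}

# Example, run from the directory where the trace was unzipped:
#   find_ctf_metadata .
```

The printed path is the file to select in the File -> Open Trace dialog.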
Get symbols for the native runtime
Most of the time you are interested in your own code, which perfcollect resolves by default. Sometimes it is useful to see what is going on inside the .NET DLLs, and sometimes what is going on in the native runtime libraries (typically libcoreclr.so) is interesting. perfcollect will resolve the symbols for these when it converts its data, but only if the symbols for these native libraries are present (and are beside the library they are for).
The dotnet-symbol global tool downloads these symbols. To use dotnet-symbol to get native runtime symbols:
Install dotnet-symbol:
dotnet tool install -g dotnet-symbol
Download the symbols. If your installed version of the .NET Core runtime is 2.1.0, the command to do this is:
mkdir mySymbols
dotnet symbol --symbols --output mySymbols /usr/share/dotnet/shared/Microsoft.NETCore.App/2.1.0/lib*.so
Copy the symbols to the correct place.
sudo cp mySymbols/* /usr/share/dotnet/shared/Microsoft.NETCore.App/2.1.0
If this cannot be done because you do not have write access to the appropriate directory, you can use perf buildid-cache to add the symbols.
After this, you should get symbolic names for the native libraries when you run perfcollect.
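The commands above hard-code version 2.1.0. To find the directory for the runtime that is actually installed, the output of dotnet --list-runtimes can be parsed. This is a sketch rather than an official procedure: runtime_dirs is a hypothetical helper name, and it assumes each output line has the form "name version [path]" with no spaces in the path.

```shell
# Read `dotnet --list-runtimes` output on stdin and print the full
# directory of each Microsoft.NETCore.App runtime, one per line.
# runtime_dirs is a hypothetical helper name. Assumes lines of the form
# "Microsoft.NETCore.App 2.1.0 [/usr/share/dotnet/shared/Microsoft.NETCore.App]"
# and install paths without spaces.
runtime_dirs() {
  awk '$1 == "Microsoft.NETCore.App" {
         dir = $3
         gsub(/^\[|\]$/, "", dir)   # strip the surrounding brackets
         print dir "/" $2
       }'
}

# Example:
#   dotnet --list-runtimes | runtime_dirs
```

Each printed directory can then be used in place of the hard-coded path in the dotnet symbol and cp commands.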
Collect in a Docker container
For more information on how to use perfcollect in container environments, see Collect diagnostics in containers.
Learn more about collection options
You can specify the following optional flags with perfcollect to better suit your diagnostic needs.
Collect for a specific duration
When you want to collect a trace for a specific duration, use the -collectsec option followed by the number of seconds to collect for.
Collect threadtime traces
Specifying -threadtime with perfcollect lets you collect per-thread CPU usage data. This lets you analyze where every thread was spending its CPU time.
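The duration and thread-time options can be combined in a single invocation. The wrapper below is a sketch, not part of perfcollect: collect_for and the PERFCOLLECT override are hypothetical names, and it assumes options follow the trace name, as in the collect command shown earlier.

```shell
# Collect a trace for a fixed number of seconds, passing any extra
# options (for example -threadtime) through to perfcollect.
# Usage: collect_for <trace-name> <seconds> [extra options...]
# collect_for and PERFCOLLECT are hypothetical names; PERFCOLLECT lets
# you substitute another command, such as echo for a dry run.
collect_for() {
  local name="$1" seconds="$2"
  case "$seconds" in
    ''|*[!0-9]*)
      echo "seconds must be a whole number" >&2
      return 2
      ;;
  esac
  shift 2
  # Intentionally unquoted so "sudo ./perfcollect" splits into two words.
  ${PERFCOLLECT:-sudo ./perfcollect} collect "$name" -collectsec "$seconds" "$@"
}

# Example: a 30-second thread-time trace
#   collect_for sampleTrace 30 -threadtime
```

Validating the duration up front avoids starting a privileged collection with a malformed argument.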
Collect traces for managed memory and garbage collector performance
The following options let you specifically collect the GC events from the runtime.
perfcollect collect -gccollectonly
Collect only a minimal set of GC collection events. This is the least verbose GC eventing collection profile, with the lowest impact on the target app's performance. This command is analogous to the PerfView.exe /GCCollectOnly collect command in PerfView.
perfcollect collect -gconly
Collect more verbose GC collection events with JIT, Loader, and Exception events. This requests more verbose events (such as the allocation information and GC join information) and will have more impact on the target app's performance than the -gccollectonly option. This command is analogous to the PerfView.exe /GCOnly collect command in PerfView.
perfcollect collect -gcwithheap
Collect the most verbose GC collection events, which also track heap survival and movements. This enables in-depth analysis of GC behavior but incurs a high performance cost, as each GC can take more than two times longer. Make sure you understand the performance implications of this option before using it in production environments.