Analyze CPU Usage in a Windows Universal App
Note
This article applies to Visual Studio 2015.
Applies to Windows and Windows Phone
When you need to investigate performance issues in your app, a good place to start is understanding how it uses the CPU. The CPU Usage tool shows you where the CPU is spending time executing code. To focus on specific scenarios, CPU Usage can be run on its own or together with the XAML UI Responsiveness tool in a single diagnostic session.
Note
The CPU Usage tool cannot be used with Windows Phone Silverlight 8.1 apps.
This walkthrough takes you through collecting and analyzing CPU usage for a simple Windows Universal XAML app.
Create the CpuUseDemo project
CpuUseDemo is an app that was created to demonstrate how to collect and analyze CPU usage data. The buttons generate a number by calling a method that selects the maximum value from multiple calls to a function. The called function creates a very large number of random values and then returns the last one. The data is displayed in a text box.
Create a new C# Windows Universal app project named CpuUseDemo using the BlankApp template.
Replace MainPage.xaml with the MainPage.xaml code listed at the end of this article.
Replace MainPage.xaml.cs with the MainPage.xaml.cs code listed at the end of this article.
Build the app and try it out. The app is simple enough to show you some common cases of CPU Usage data analysis.
Collect CPU usage data
In Visual Studio, set the deployment target to Simulator and the solution configuration to Release.
Running the app in the simulator lets you switch easily between the app and the Visual Studio IDE.
Running this app in Release mode gives you a better view of the actual performance of your app.
On the Debug menu, choose Performance Profiler....
In the Performance and Diagnostics hub, choose CPU Usage and then choose Start.
When the app starts, click Get Max Number. Wait about a second after the output is displayed, then choose Get Max Number Async. Waiting between button clicks makes it easier to isolate the button click routines in the diagnostic report.
After the second output line appears, choose Stop Collection in the Performance and Diagnostics hub.
The CPU Usage tool analyzes the data and displays the report.
Analyze the CPU Usage report
CPU utilization timeline graph
The CPU utilization graph shows the CPU activity of the app as a percentage of all CPU time from all the processor cores on the device. The data for this report was collected on a dual-core machine. The two large spikes represent the CPU activity of the two button clicks. GetMaxNumberButton_Click runs synchronously on a single core, so it makes sense that the method's graph height never exceeds 50%. GetMaxNumberAsyncButton_Click runs asynchronously across both cores, so it again looks right that its spike gets closer to using all of the CPU resources on both cores.
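The 50% ceiling follows from simple arithmetic: the graph divides the busy core time by the capacity of all cores. The following is a minimal sketch of that calculation. The `CpuMath` class and `UtilizationPercent` method are hypothetical names used only for illustration; they are not part of the CPU Usage tool or any Visual Studio API.

```csharp
using System;

public static class CpuMath
{
    // Percentage of total CPU capacity in use when `busyCores` cores are
    // fully busy on a machine with `totalCores` cores.
    // Hypothetical helper for illustration only.
    public static double UtilizationPercent(double busyCores, int totalCores)
    {
        if (totalCores <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(totalCores));
        }

        // A workload cannot keep more cores busy than the machine has.
        return 100.0 * Math.Min(busyCores, totalCores) / totalCores;
    }
}
```

Under these assumptions, `CpuMath.UtilizationPercent(1, 2)` gives the 50% ceiling of a single-threaded method on a dual-core machine, and `CpuMath.UtilizationPercent(2, 2)` gives the 100% ceiling that the asynchronous version can approach.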
Select timeline segments to view details
Use the selection bars on the Diagnostic session timeline to focus on the GetMaxNumberButton_Click data:
The Diagnostic session timeline now displays the time spent in the selected segment (a bit more than 2 seconds in this report) and filters the call tree to those methods that ran in the selection.
Now select the GetMaxNumberAsyncButton_Click segment.
This method completes about a second faster than GetMaxNumberButton_Click, but the meaning of the call tree entries is less obvious.
The CPU Usage call tree
To get started understanding call tree information, reselect the GetMaxNumberButton_Click segment, and look at the call tree details.
Call tree structure
Description |
---|
The top-level node in CPU Usage call trees is a pseudo-node. |
In most apps, when the Show External Code option is disabled, the second-level node is an [External Code] node that contains the system and framework code that starts and stops the app, draws the UI, controls thread scheduling, and provides other low-level services to the app. |
The children of the second-level node are the user-code methods and asynchronous routines that are called or created by the second-level system and framework code. |
Child nodes of a method contain data only for the calls of the parent method. When Show External Code is disabled, app methods can also contain an [External Code] node. |
External Code
External code consists of functions in system and framework components that are executed by the code you write. External code includes functions that start and stop the app, draw the UI, control threading, and provide other low-level services to the app. In most cases, you won’t be interested in external code, and so the CPU Usage call tree gathers the external functions of a user method into one [External Code] node.
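To make the grouping concrete, the sketch below collapses each run of consecutive non-user frames in a call path into a single [External Code] entry. It only illustrates the idea; it is not how the CPU Usage tool is implemented, and the `StackCollapser` class, the `isUserCode` predicate, and any frame names you pass in are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;

public static class StackCollapser
{
    private const string ExternalCode = "[External Code]";

    // Replaces every run of consecutive non-user frames with one
    // "[External Code]" entry and keeps user frames unchanged.
    // `isUserCode` is a caller-supplied (hypothetical) predicate.
    public static List<string> Collapse(
        IEnumerable<string> frames, Func<string, bool> isUserCode)
    {
        var result = new List<string>();
        foreach (var frame in frames)
        {
            if (isUserCode(frame))
            {
                result.Add(frame);
            }
            else if (result.Count == 0 || result[result.Count - 1] != ExternalCode)
            {
                // Start a new external group only if the previous entry
                // is not already an external group.
                result.Add(ExternalCode);
            }
        }
        return result;
    }
}
```

For example, a path of two framework frames, then `MainPage::GetNumber`, then two more framework frames collapses to three entries: [External Code], `MainPage::GetNumber`, [External Code].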
When you want to view the call paths of external code, choose Show External Code from the Filter view list and then choose Apply.
Be aware that many external code call chains are deeply nested, so that the width of the Function Name column can exceed the display width of all but the largest of computer monitors. When this happens, function names are shown as […]:
Use the search box to find a node that you are looking for, then use the horizontal scroll bar to bring the data into view:
Call tree data columns
Property | Description |
---|---|
Total CPU (%) | The percentage of the app's CPU activity in the selected time range that was used by calls to the function and the functions called by the function. Note that this is different from the CPU Utilization timeline graph, which compares the total activity of the app in a time range to the total available CPU capacity. |
Self CPU (%) | The percentage of the app's CPU activity in the selected time range that was used by the calls to the function, excluding the activity of functions called by the function. |
Total CPU (ms) | The number of milliseconds spent in calls to the function in the selected time range and the functions that were called by the function. |
Self CPU (ms) | The number of milliseconds spent in calls to the function in the selected time range, excluding the functions that were called by the function. |
Module | The name of the module containing the function, or the number of modules containing the functions in an [External Code] node. |
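The relationship between the Total and Self columns can be sketched with a small tree model. This is an illustration of the definitions in the table, not the tool's implementation; the `CallNode` type and its members are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model of one call tree node, for illustration only.
public sealed class CallNode
{
    public string Name { get; }
    public double SelfMs { get; }   // Self CPU (ms): time in this function only
    public List<CallNode> Children { get; } = new List<CallNode>();

    public CallNode(string name, double selfMs)
    {
        Name = name;
        SelfMs = selfMs;
    }

    // Adds a callee and returns this node so calls can be chained.
    public CallNode Add(CallNode child)
    {
        Children.Add(child);
        return this;
    }

    // Total CPU (ms): this function plus everything it calls.
    public double TotalMs => SelfMs + Children.Sum(c => c.TotalMs);

    // Percentages are relative to the app's CPU activity in the selected
    // time range, represented here by the root node's total.
    public double TotalPercent(CallNode root) => 100.0 * TotalMs / root.TotalMs;
    public double SelfPercent(CallNode root) => 100.0 * SelfMs / root.TotalMs;
}
```

With made-up numbers, a click handler that spends 100 ms in its own code and calls a method that spends 900 ms has Total CPU of 1000 ms and Self CPU of 100 ms. If that handler accounts for all of the app's activity in the selection, its Total CPU (%) is 100 and its Self CPU (%) is 10.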
Asynchronous functions in the CPU Usage call tree
When the compiler encounters an asynchronous method, it creates a hidden class to control the method’s execution. Conceptually, the class is a state machine that includes a list of compiler-generated functions that call operations of the original method asynchronously, and the callbacks, scheduler, and iterators required to run them correctly. When the original method is called by a parent method, the runtime removes the method from the execution context of the parent, and runs the methods of the hidden class in the context of the system and framework code that control the app’s execution. The asynchronous methods are often, but not always, executed on one or more different threads. This code is shown in the CPU Usage call tree as children of the [External Code] node immediately below the top node of the tree.
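The following hand-written class is a simplified approximation of such a state machine, for a method equivalent to `async Task<int> AddOneAsync(Task<int> source) { return (await source) + 1; }`. It is a conceptual sketch only: the code the compiler actually generates is more elaborate, and `AddOneStateMachine`, `Start`, and `Completion` are hypothetical names.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

// Simplified, hand-written stand-in for a compiler-generated state machine.
public sealed class AddOneStateMachine
{
    private int _state = -1;                    // -1: not started, 0: awaiting
    private readonly Task<int> _source;
    private TaskAwaiter<int> _awaiter;
    private readonly TaskCompletionSource<int> _result =
        new TaskCompletionSource<int>();

    private AddOneStateMachine(Task<int> source)
    {
        _source = source;
    }

    public Task<int> Completion => _result.Task;

    // Each call advances the method to its next suspension point or its end.
    public void MoveNext()
    {
        try
        {
            if (_state == -1)
            {
                _awaiter = _source.GetAwaiter();
                if (!_awaiter.IsCompleted)
                {
                    // Suspend: ask to be resumed when the task finishes.
                    // The resumption is invoked by framework code, not by
                    // the original caller.
                    _state = 0;
                    _awaiter.OnCompleted(MoveNext);
                    return;
                }
            }

            // Either the task was already complete or we were resumed.
            int value = _awaiter.GetResult();
            _result.SetResult(value + 1);
        }
        catch (Exception ex)
        {
            _result.SetException(ex);
        }
    }

    public static Task<int> Start(Task<int> source)
    {
        var machine = new AddOneStateMachine(source);
        machine.MoveNext();
        return machine.Completion;
    }
}
```

Because the resumed call to `MoveNext` is made by a continuation rather than by the method that called `Start`, a profiler attributes that work to a call path rooted in framework code, which is consistent with the call tree placement described above.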
To see this in our example, reselect the GetMaxNumberAsyncButton_Click segment in the timeline.
The first two nodes under [External Code] are the compiler-generated methods of the state machine class. The third is the call to the original method. Expanding the generated methods shows you what’s going on.
- MainPage::GetMaxNumberAsyncButton_Click does very little; it manages a list of the task values, computes the maximum of the results, and displays the output.
- MainPage+<GetMaxNumberAsyncButton_Click>d__3::MoveNext shows you the activity required to schedule and launch the 48 tasks that wrap the call to GetNumberAsync.
- MainPage::<GetNumberAsync>b__b shows you the activity of the tasks that call GetNumber.
Next steps
The CpuUseDemo app is not the most brilliant of apps, but you can extend its utility by using it to experiment with asynchronous operation and other tools in the Performance and Diagnostics hub.
- Note that MainPage::<GetNumberAsync>b__b spends more time in [External Code] than it does executing the GetNumber method. Much of this time is the overhead of the asynchronous operations. Try increasing the number of tasks (set in the NUM_TASKS constant of MainPage.xaml.cs) and reducing the number of iterations in GetNumber (change the MIN_ITERATIONS value). Run the collection scenario and compare the CPU activity of MainPage::<GetNumberAsync>b__b to that in the original CPU Usage diagnostic session. Then try reducing the tasks and increasing the iterations.
- Users often don’t care about the real performance of your app; they do care about the perceived performance and responsiveness of the app. The XAML UI Responsiveness tool shows you details of activity on the UI thread that affect perceived responsiveness.
  Create a new session in the Performance and Diagnostics hub, and add both the XAML UI Responsiveness tool and the CPU Usage tool. Run the collection scenario. If you’ve read this far, the report probably doesn’t tell you anything that you haven’t already figured out, but the differences in the UI Thread utilization timeline graph for the two methods are striking. In complex, real-world apps, the combination of tools can be very helpful.
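If you want a quick way to try the tasks-versus-iterations experiment outside the app, a small harness along the following lines may help. It is a sketch under stated assumptions rather than part of CpuUseDemo: the `TaskOverheadExperiment` class and its methods are hypothetical, it keeps the total number of iterations fixed while varying the task count (a variation on the experiment above), and the timings it reports depend entirely on your machine.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class TaskOverheadExperiment
{
    // Calls Random.Next `iterations` times and returns the last value,
    // in the same spirit as GetNumber in the demo app.
    public static int Spin(int iterations, int seed)
    {
        var rand = new Random(seed);
        var result = 0;
        for (var i = 0; i < iterations; i++)
        {
            result = rand.Next();
        }
        return result;
    }

    // Splits `totalIterations` evenly across `numTasks` tasks, waits for
    // all of them, and returns the elapsed wall-clock time in milliseconds.
    public static long Measure(int numTasks, int totalIterations)
    {
        var perTask = totalIterations / numTasks;
        var stopwatch = Stopwatch.StartNew();
        var tasks = Enumerable.Range(0, numTasks)
            .Select(i => Task.Run(() => Spin(perTask, i)))
            .ToArray();
        Task.WaitAll(tasks);
        stopwatch.Stop();
        return stopwatch.ElapsedMilliseconds;
    }
}
```

Comparing, for example, `Measure(4, 40000000)` with `Measure(4000, 40000000)` gives a rough sense of how much time goes to scheduling rather than to the loop itself. Profile the same calls with the CPU Usage tool to see where that time appears in the call tree.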
MainPage.xaml
<Page
    x:Class="CpuUseDemo.MainPage"
    xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
    xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
    xmlns:local="using:CpuUseDemo"
    xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
    xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
    mc:Ignorable="d">
    <Page.Resources>
        <Style TargetType="TextBox">
            <Setter Property="FontFamily" Value="Lucida Console" />
        </Style>
    </Page.Resources>
    <Grid Background="{ThemeResource ApplicationPageBackgroundThemeBrush}">
        <Grid.RowDefinitions>
            <RowDefinition Height="Auto" />
            <RowDefinition Height="*" />
        </Grid.RowDefinitions>
        <StackPanel Grid.Row="0" Orientation="Horizontal" Margin="0,40,0,0">
            <Button Name="GetMaxNumberButton" Click="GetMaxNumberButton_Click" Content="Get Max Number" />
            <Button Name="GetMaxNumberAsyncButton" Click="GetMaxNumberAsyncButton_Click" Content="Get Max Number Async" />
        </StackPanel>
        <StackPanel Grid.Row="1">
            <TextBox Name="TextBox1" AcceptsReturn="True" />
        </StackPanel>
    </Grid>
</Page>
MainPage.xaml.cs
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Runtime.InteropServices.WindowsRuntime;
using Windows.Foundation;
using Windows.Foundation.Collections;
using Windows.UI.Xaml;
using Windows.UI.Xaml.Controls;
using Windows.UI.Xaml.Controls.Primitives;
using Windows.UI.Xaml.Data;
using Windows.UI.Xaml.Input;
using Windows.UI.Xaml.Media;
using Windows.UI.Xaml.Navigation;
using Windows.Foundation.Diagnostics;
using System.Threading;
using System.Threading.Tasks;
using System.Collections.Concurrent;

// The Blank Page item template is documented at https://go.microsoft.com/fwlink/?LinkId=234238

namespace CpuUseDemo
{
    /// <summary>
    /// An empty page that can be used on its own or navigated to within a Frame.
    /// </summary>
    public sealed partial class MainPage : Page
    {
        public MainPage()
        {
            this.InitializeComponent();
        }

        const int NUM_TASKS = 48;
        const int MIN_ITERATIONS = int.MaxValue / 1000;
        const int MAX_ITERATIONS = MIN_ITERATIONS + 10000;

        long m_totalIterations = 0;
        readonly object m_totalItersLock = new object();

        private void GetMaxNumberButton_Click(object sender, RoutedEventArgs e)
        {
            GetMaxNumberAsyncButton.IsEnabled = false;
            lock (m_totalItersLock)
            {
                m_totalIterations = 0;
            }
            List<int> tasks = new List<int>();
            for (var i = 0; i < NUM_TASKS; i++)
            {
                var result = 0;
                result = GetNumber();
                tasks.Add(result);
            }
            var max = tasks.Max();
            var s = GetOutputString("GetMaxNumberButton_Click", NUM_TASKS, max, m_totalIterations);
            TextBox1.Text += s;
            GetMaxNumberAsyncButton.IsEnabled = true;
        }

        private async void GetMaxNumberAsyncButton_Click(object sender, RoutedEventArgs e)
        {
            GetMaxNumberButton.IsEnabled = false;
            GetMaxNumberAsyncButton.IsEnabled = false;
            lock (m_totalItersLock)
            {
                m_totalIterations = 0;
            }
            var tasks = new ConcurrentBag<Task<int>>();
            for (var i = 0; i < NUM_TASKS; i++)
            {
                tasks.Add(GetNumberAsync());
            }
            await Task.WhenAll(tasks.ToArray());
            var max = 0;
            foreach (var task in tasks)
            {
                max = Math.Max(max, task.Result);
            }
            var func = "GetMaxNumberAsyncButton_Click";
            var outputText = GetOutputString(func, NUM_TASKS, max, m_totalIterations);
            TextBox1.Text += outputText;
            this.GetMaxNumberButton.IsEnabled = true;
            GetMaxNumberAsyncButton.IsEnabled = true;
        }

        private int GetNumber()
        {
            var rand = new Random();
            var iters = rand.Next(MIN_ITERATIONS, MAX_ITERATIONS);
            var result = 0;
            lock (m_totalItersLock)
            {
                m_totalIterations += iters;
            }
            // we're just spinning here
            // and using Random to frustrate compiler optimizations
            for (var i = 0; i < iters; i++)
            {
                result = rand.Next();
            }
            return result;
        }

        private Task<int> GetNumberAsync()
        {
            return Task<int>.Run(() =>
            {
                return GetNumber();
            });
        }

        string GetOutputString(string func, int cycles, int max, long totalIters)
        {
            var fmt = "{0,-35}Tasks:{1,3} Maximum:{2, 12} Iterations:{3,12}\n";
            return String.Format(fmt, func, cycles, max, totalIters);
        }
    }
}