Diagnostics and performance monitoring for Reliable Actors
The Reliable Actors runtime emits EventSource events and performance counters. These provide insights into how the runtime is operating and help with troubleshooting and performance monitoring.
EventSource events
The EventSource provider name for the Reliable Actors runtime is "Microsoft-ServiceFabric-Actors". Events from this event source appear in the Diagnostics Events window when the actor application is being debugged in Visual Studio.
Examples of tools and technologies that help in collecting and/or viewing EventSource events are PerfView, Azure Diagnostics, Semantic Logging, and the Microsoft TraceEvent Library.
Keywords
All events that belong to the Reliable Actors EventSource are associated with one or more keywords. This enables filtering of events that are collected. The following keyword bits are defined.
Bit | Description |
---|---|
0x1 | Set of important events that summarize the operation of the Fabric Actors runtime. |
0x2 | Set of events that describe actor method calls. For more information, see the introductory topic on actors. |
0x4 | Set of events related to actor state. For more information, see the topic on actor state management. |
0x8 | Set of events related to turn-based concurrency in the actor. For more information, see the topic on concurrency. |
Performance counters
The Reliable Actors runtime defines the following performance counter categories.
Category | Description |
---|---|
Service Fabric Actor | Counters specific to Azure Service Fabric actors, e.g. time taken to save actor state |
Service Fabric Actor Method | Counters specific to methods implemented by Service Fabric actors, e.g. how often an actor method is invoked |
Each of the above categories has one or more counters.
The Windows Performance Monitor application that is available by default in the Windows operating system can be used to collect and view performance counter data. Azure Diagnostics is another option for collecting performance counter data and uploading it to Azure tables.
Performance counter instance names
A cluster that has a large number of actor services or actor service partitions will have a large number of actor performance counter instances. The performance counter instance names can help in identifying the specific partition and actor method (if applicable) that the performance counter instance is associated with.
Service Fabric Actor category
For the category Service Fabric Actor
, the counter instance names are in the following format:
ServiceFabricPartitionID_ActorsRuntimeInternalID
ServiceFabricPartitionID is the string representation of the Service Fabric partition ID that the performance counter instance is associated with. The partition ID is a GUID, and its string representation is generated through the Guid.ToString
method with format specifier "D".
ActorRuntimeInternalID is the string representation of a 64-bit integer that is generated by the Fabric Actors runtime for its internal use. This is included in the performance counter instance name to ensure its uniqueness and avoid conflict with other performance counter instance names. Users should not try to interpret this portion of the performance counter instance name.
The following is an example of a counter instance name for a counter that belongs to the Service Fabric Actor
category:
2740af29-78aa-44bc-a20b-7e60fb783264_635650083799324046
In the example above, 2740af29-78aa-44bc-a20b-7e60fb783264
is the string representation of the Service Fabric partition ID, and 635650083799324046
is the 64-bit ID that is generated for the runtime's internal use.
Service Fabric Actor Method category
For the category Service Fabric Actor Method
, the counter instance names are in the following format:
MethodName_ActorsRuntimeMethodId_ServiceFabricPartitionID_ActorsRuntimeInternalID
MethodName is the name of the actor method that the performance counter instance is associated with. The format of the method name is determined based on some logic in the Fabric Actors runtime that balances the readability of the name with constraints on the maximum length of the performance counter instance names on Windows.
ActorsRuntimeMethodId is the string representation of a 32-bit integer that is generated by the Fabric Actors runtime for its internal use. This is included in the performance counter instance name to ensure its uniqueness and avoid conflict with other performance counter instance names. Users should not try to interpret this portion of the performance counter instance name.
ServiceFabricPartitionID is the string representation of the Service Fabric partition ID that the performance counter instance is associated with. The partition ID is a GUID, and its string representation is generated through the Guid.ToString
method with format specifier "D".
ActorRuntimeInternalID is the string representation of a 64-bit integer that is generated by the Fabric Actors runtime for its internal use. This is included in the performance counter instance name to ensure its uniqueness and avoid conflict with other performance counter instance names. Users should not try to interpret this portion of the performance counter instance name.
The following is an example of a counter instance name for a counter that belongs to the Service Fabric Actor Method
category:
ivoicemailboxactor.leavemessageasync_2_89383d32-e57e-4a9b-a6ad-57c6792aa521_635650083804480486
In the example above, ivoicemailboxactor.leavemessageasync
is the method name, 2
is the 32-bit ID generated for the runtime's internal use, 89383d32-e57e-4a9b-a6ad-57c6792aa521
is the string representation of the Service Fabric partition ID, and 635650083804480486
is the 64-bit ID generated for the runtime's internal use.
List of events and performance counters
Actor method events and performance counters
The Reliable Actors runtime emits the following events related to actor methods.
Event name | Event ID | Level | Keyword | Description |
---|---|---|---|---|
ActorMethodStart | 7 | Verbose | 0x2 | Actors runtime is about to invoke an actor method. |
ActorMethodStop | 8 | Verbose | 0x2 | An actor method has finished executing. That is, the runtime's asynchronous call to the actor method has returned, and the task returned by the actor method has finished. |
ActorMethodThrewException | 9 | Warning | 0x3 | An exception was thrown during the execution of an actor method, either during the runtime's asynchronous call to the actor method or during the execution of the task returned by the actor method. This event indicates some sort of failure in the actor code that needs investigation. |
The Reliable Actors runtime publishes the following performance counters related to the execution of actor methods.
Category name | Counter name | Description |
---|---|---|
Service Fabric Actor Method | Invocations/Sec | Number of times that the actor service method is invoked per second |
Service Fabric Actor Method | Average milliseconds per invocation | Time taken to execute the actor service method in milliseconds |
Service Fabric Actor Method | Exceptions thrown/Sec | Number of times that the actor service method threw an exception per second |
Concurrency events and performance counters
The Reliable Actors runtime emits the following events related to concurrency.
Event name | Event ID | Level | Keyword | Description |
---|---|---|---|---|
ActorMethodCallsWaitingForLock | 12 | Verbose | 0x8 | This event is written at the start of each new turn in an actor. It contains the number of pending actor calls that are waiting to acquire the per-actor lock that enforces turn-based concurrency. |
The Reliable Actors runtime publishes the following performance counters related to concurrency.
Category name | Counter name | Description |
---|---|---|
Service Fabric Actor | # of actor calls waiting for actor lock | Number of pending actor calls waiting to acquire the per-actor lock that enforces turn-based concurrency |
Service Fabric Actor | Average milliseconds per lock wait | Time taken (in milliseconds) to acquire the per-actor lock that enforces turn-based concurrency |
Service Fabric Actor | Average milliseconds actor lock held | Time (in milliseconds) for which the per-actor lock is held |
Actor state management events and performance counters
The Reliable Actors runtime emits the following events related to actor state management.
Event name | Event ID | Level | Keyword | Description |
---|---|---|---|---|
ActorSaveStateStart | 10 | Verbose | 0x4 | Actors runtime is about to save the actor state. |
ActorSaveStateStop | 11 | Verbose | 0x4 | Actors runtime has finished saving the actor state. |
The Reliable Actors runtime publishes the following performance counters related to actor state management.
Category name | Counter name | Description |
---|---|---|
Service Fabric Actor | Average milliseconds per save state operation | Time taken to save actor state in milliseconds |
Service Fabric Actor | Average milliseconds per load state operation | Time taken to load actor state in milliseconds |
Events related to actor replicas
The Reliable Actors runtime emits the following events related to actor replicas.
Event name | Event ID | Level | Keyword | Description |
---|---|---|---|---|
ReplicaChangeRoleToPrimary | 1 | Informational | 0x1 | Actor replica changed role to Primary. This implies that the actors for this partition will be created inside this replica. |
ReplicaChangeRoleFromPrimary | 2 | Informational | 0x1 | Actor replica changed role to non-Primary. This implies that the actors for this partition will no longer be created inside this replica. No new requests will be delivered to actors already created within this replica. The actors will be destroyed after any in-progress requests are completed. |
Actor activation and deactivation events and performance counters
The Reliable Actors runtime emits the following events related to actor activation and deactivation.
Event name | Event ID | Level | Keyword | Description |
---|---|---|---|---|
ActorActivated | 5 | Informational | 0x1 | An actor has been activated. |
ActorDeactivated | 6 | Informational | 0x1 | An actor has been deactivated. |
The Reliable Actors runtime publishes the following performance counters related to actor activation and deactivation.
Category name | Counter name | Description |
---|---|---|
Service Fabric Actor | Average OnActivateAsync milliseconds | Time taken to execute OnActivateAsync method in milliseconds |
Actor request processing performance counters
When a client invokes a method via an actor proxy object, it results in a request message being sent over the network to the actor service. The service processes the request message and sends a response back to the client. The Reliable Actors runtime publishes the following performance counters related to actor request processing.
Category name | Counter name | Description |
---|---|---|
Service Fabric Actor | # of outstanding requests | Number of requests being processed in the service |
Service Fabric Actor | Average milliseconds per request | Time taken (in milliseconds) by the service to process a request |
Service Fabric Actor | Average milliseconds for request deserialization | Time taken (in milliseconds) to deserialize actor request message when it is received at the service |
Service Fabric Actor | Average milliseconds for response serialization | Time taken (in milliseconds) to serialize the actor response message at the service before the response is sent to the client |