Events
CycleCloud 8.0 generates events when certain changes happen (e.g., when a node is created, or a cluster is deleted). Some events are instantaneous (e.g., deleting a cluster), and some events represent transitions (e.g., creating a node which implies creating a VM). In these cases, the event is sent at the end of the transition, whether successful or not.
CycleCloud can be configured to publish to an Event Grid topic by connecting it in the CycleCloud Settings page in CycleCloud. Event Grid event subscriptions can be attached to the topic to route the events to a destination, such as a Storage Queue, where a program can consume events and process them.
Event objects
The events are in the standard Event Grid schema. All CycleCloud-specific elements are in the data
property on the event.
Name | Type | Description |
---|---|---|
eventId | String | Uniquely identifies the event |
eventTime | String | The time of this event (yyyy-MM-ddTHH:mm:ss.SSSZZ) |
eventType | String | The kind of state transition that happened (e.g., Microsoft.CycleCloud.NodeCreated ) |
subject | String | The affected resource (see Event Subject) |
dataVersion | String | The schema in use for data (currently "1") |
In addition, there are several custom properties in data
for almost all events:
Property | Type | Description |
---|---|---|
status | Status (String) | Whether this transition was successful or not |
reason | Reason (String) | Why this event was initiated |
message | String | A human-readable summary of this event |
errorCode | String | The code for this operation if it failed or was unavailable. Note this may come directly from Azure calls and may not be present for all failures |
Cluster events
CycleCloud sends events when clusters are modified. Cluster events contain the following common properties in data
:
Property | Type | Description |
---|---|---|
clusterName | String | The name of the cluster |
Microsoft.CycleCloud.ClusterStarted
This event is fired when a cluster is started.
Microsoft.CycleCloud.ClusterTerminated
This event is fired when a cluster is terminated.
Microsoft.CycleCloud.ClusterDeleted
This event is fired when a cluster is deleted.
Microsoft.CycleCloud.ClusterSizeIncreased
This event is fired when nodes are added to the cluster. There is one event for each set of nodes added. (Nodes in a set all have the same definition.)
Property | Type | Description |
---|---|---|
nodesRequested | Integer | How many nodes were requested for this set |
nodesAdded | Integer | How many nodes were actually added to the cluster |
nodeArray | String | The nodearray these nodes were created from |
subscriptionId | String | The subscription ID for this node's resources |
region | String | The location of this node |
vmSku | String | The SKU (i.e., machine type) for the VM |
priority | String | The VM pricing model in effect (either "regular" or "spot") |
placementGroupId | String | The placement group these nodes are in, if any |
Node events
CycleCloud sends events when nodes change state. Node events contain additional information in the data
property:
Property | Type | Description |
---|---|---|
status | Status (String) | Whether this event was successful or not |
clusterName | String | The name of the cluster this node is in. Names are not unique over time |
nodeName | String | The name of the node that is affected. Names are not unique over time |
nodeId | String | The id of this node. Node ids are unique over time, and once a node is deleted the id will not be reused |
nodeArray | String | The name of the nodearray this node was created from |
resourceId | String | The Azure resource for the VM, if there was one created |
subscriptionId | String | The subscription ID for this node's resources |
region | String | The location of this node |
vmSku | String | The SKU (i.e., Machine Type) for the VM |
priority | String | The VM pricing model in effect (either "regular" or "spot") |
placementGroupId | String | The placement group this node is in, if any |
retryCount | Integer | How many times this specific action was attempted previously (see Retry Count) |
timing | (Object) | A map of stages in this event and their durations (see Timing) |
Microsoft.CycleCloud.NodeAdded
This event is fired for each node that is added to a cluster. (To get an event for a set of nodes added at once, see ClusterSizeIncreased.) This is sent when the node first appears in the UI, so it does not have any timing information.
Microsoft.CycleCloud.NodeCreated
This event is fired each time a node is started for the first time (i.e., a VM is created for it). This event contains the following timing information:
Create
: The total time to create the node. This includes creating the VM and configuring the VM.CreateVM
: How long it took to create the VM.Configure
: How long it took to install software and configure the node.
Microsoft.CycleCloud.NodeDeallocated
This event is fired each time a node is deallocated. This event contains the following timing information:
Deallocate
: The total time to deallocate the node.DeallocateVM
: How long it took to deallocate the VM.
Microsoft.CycleCloud.NodeStarted
This event is fired each time a node is re-started from a deallocated state. This event contains the following timing information:
Start
: The total time it took to restart the deallocated node.StartVM
: How long it took to start the deallocated VM.
Microsoft.CycleCloud.NodeTerminated
This event is fired each time a node is terminated and its VM is deleted. This event contains the following timing information:
Terminate
: The total time it took to terminate the node.DeleteVM
: How long it took to delete the VM.
Subject
Each event has a "subject" which can be used for filtering in Event Grid. Events in CycleCloud have subjects in the following pattern:
/sites/SITENAME
: for events specific to a given CycleCloud installation/sites/SITENAME/clusters/CLUSTERNAME
: for cluster-level events/sites/SITENAME/clusters/CLUSTERNAME/nodes/NODENAME
: for node-level events
This allows "scoping" an Event Grid subscription to a specific prefix to collect a subset of events. This can be used in conjunction with Event Type filtering.
Status
Succeeded
: the operation was successful.Failed
: the operation was not successful. There is often areason
and/orerrorCode
set.Canceled
: the operation was canceled.
Reason
Some events have a reason that they were initiated. Unless otherwise indicated, these are set on the ClusterSizeIncreased
, NodeAdded
, NodeCreated
, NodeDeallocated
, NodeStarted
, and NodeTerminated
events.
Autoscaled
: the node was modified in response to an autoscaling request made through the APIUserInitiated
: the operation was done directly through the UI or CLISystem
: the operation was initiated by CycleCloud (e.g., by default, execute nodes are automatically removed from the cluster when terminated)SpotEvicted
: the event was triggered because a spot VM was evicted (NodeTerminated events only)VMDisappeared
: the event was triggered because a non-spot VM disappeared (NodeTerminated events only)AllocationFailed
: the VM could not be allocated due to placement or capacity constraints (NodeTerminated/NodeDeallocated events only, with the status indicating the result of the terminate/deallocate operation)
Note
The reason
is set on NodeTerminated events to indicate why the node was terminated.
When a node fails to be created due to capacity, it fails with the specific error code from Azure (of which there are several).
The node is then automatically terminated, and the reason for the termination is AllocationFailed
.
When a running spot VM is evicted, the create operation had already succeeded.
The node is then automatically terminated and the reason given for the termination event is SpotEvicted
.
Timing
Some events contain timing information. The timing
entry in data
is an object with keys corresponding to stages of the event, and values as total seconds. Each event may have multiple timing stages associated with it. For instance, suppose a node is added to a cluster, started, and terminated:
- T1: User adds a node. A
NodeAdded
event is sent, with no timing. - T2: The create-VM operation fails, so
NodeCreated
is sent with a status of Failed and the following timing information:Create
: T2-T1CreateVM
: T2-T1
- T3: User clicks Retry
- T4: The Create-VM operation succeeds, so the node starts installing software.
- T5. The software installs successfully, so
NodeCreated
is sent with a status of Succeeded and the following timing information:Create
: (T5-T3)CreateVM
: (T4-T3)Configure
: (T5-T4)
- T6: User clicks Terminate.
- T7: The delete-VM operation succeeds, so
NodeTerminated
is sent with a state of Succeeded and the following timing information:Started
: T6-T5Terminate
: T7-T6DeleteVM
: T7-T6
Previous State Timing
The first time a node transitions to a state (whether successfully or not), it has no previous state. When the target state is then changed after that point, the time spent in the previous state is included in the event for the new target state. Note that this is only included if it reached the previous state successfully. Thus these timing entries measure the length of time for the following:
Started
: prior to this event, the node had been running (i.e., green)Deallocated
: prior to this event, the node had been deallocatedTerminated
: prior to this event, the node had been off
This can be used, for instance, to track how long a spot VM was running before it was evicted.
Retry Count
Some operations can be retried in CycleCloud if they fail. These operations are reflected in the NodeCreated
, NodeDeallocated
, NodeStarted
, and NodeTerminated
events. These events contain an optional retryCount
property on the event's data
property indicating how many times prior to this the operation was attempted. This property is included on subsequent retries, whether those attempts succeeded or failed.