5: Making Tailspin Surveys More Elastic
Retired Content |
---|
This content and the technology described is outdated and is no longer being maintained. For more information, see Transient Fault Handling. |
On this page: |
---|
The Premise | Goals and Requirements | Overview of the Autoscaling Solution | Using the Autoscaling Application Block in Tailspin Surveys - Features of the Autoscaling Application Block, Hosting the Autoscaling Application Block in Tailspin Surveys, Scale Groups in Tailspin Surveys | Autoscaling Rules in Tailspin Surveys | Collecting Autoscaling History Data in Tailspin Surveys | An Autoscaling Configuration UI | Notifying Operators by SMS When a Scaling Operation Takes Place | Inside the Implementation | Enabling the Autoscaling Application Block to Read from the .cscfg File | Tailspin's Service Information Definition | Tailspin's Autoscaling Rules - Tailspin Surveys Constraint Rules, Tailspin Surveys Reactive Scaling Rules, Tailspin Surveys Reactive Throttling Rules, Tailspin Surveys Operands | Collecting Performance Counter Data from Tailspin Surveys | Implementing Throttling Behavior | Editing and Saving Rules - Discovering the Location of the Rules Store, Reading and Writing to the Rules Store, Creating Valid Autoscaling Rules, Validating Target Names in the Rule Definitions | Editing and Saving the Service Information | Visualizing the Autoscaling Actions | Implementing a Custom Action - Integrating a Custom Action with the Autoscaling Application Block, Integrating a Custom Action with the Tailspin Surveys Rule Editor | Implementing Custom Operands - Integrating a Custom Operand with the Autoscaling Application Block, Integrating a Custom Operand with the Tailspin Surveys Rule Editor | Configuring Logging in Tailspin Surveys | Setup and Physical Deployment | Certificates and Tailspin Surveys Deployment - Deploying a Service Certificate to Enable SSL, Deploying the Management Certificate to Enable Scaling Operations | Deploying Tailspin Surveys in Multiple Geographic Locations - Data Transfer Costs, Role Instances, Configuration Differences, Application Differences | More Information |
This chapter walks you through the changes that Tailspin made when it added the Autoscaling Application Block to the Surveys application. These changes made it possible to automate the processes of adding and removing role instances as well as to manage the application's resource requirements in response to changes in demand for the application. The chapter also shows how Tailspin configured different sets of autoscaling rules for different elements of the Surveys application to meet their different autoscaling requirements and describes how Tailspin plans to monitor and refine its autoscaling rules.
The Premise
The number of customers using the Tailspin Surveys application continues to grow, with customers creating and publishing surveys all around the world. The number of surveys with large numbers of respondents is also increasing. Tailspin has noticed that there is an increasing number of bursts in demand associated with these large surveys, with the bursts occurring very shortly after the customer publishes the survey. Tailspin cannot predict when these bursts will occur, or in which geographic location they will occur. Tailspin has also noticed that there are overall bursts in demand at particular times in particular geographic locations. For some of these bursts Tailspin understands the cause, such as an upcoming holiday season; for others, Tailspin does not yet understand what triggered the burst in demand.
The Tailspin operators have always carefully monitored the traffic and the utilization levels of their systems. When needed, they manually adjust the number of web and worker role instances to accommodate the change in demand. However, they find it difficult to determine the correct number of role instances to have active at specific times. To ensure high performance and availability for the Surveys application, they usually start up more servers than they might need. However, when a large burst in traffic occurs, it can take Tailspin's operators some time to react and increase the server capacity, especially for non-US based data centers. Also, operators at Tailspin have sometimes been slow to shut down servers when a burst in activity is over.
The result of this manual process is that there are often too many active role instances, both during times of normal activity and during bursts of activity, which increases the operational costs of the Surveys application. Also, when an unpredicted burst in traffic occurs, it can take too long to add new role instances, which results in poor performance and a negative user experience.
Goals and Requirements
Tailspin wants to make its application more elastic, so that the number of servers can grow and shrink automatically as demand varies. This will reduce the costs of running the Surveys application in the Microsoft Azure™ technology platform and also reduce the number of ad hoc, manual tasks for Tailspin's operators.
Tailspin wants to set explicit boundaries on the number of active role instances, to keep the operational costs within a predictable range and to ensure that the Azure SLA applies to the Surveys application.
In the past, Tailspin has encountered some very sudden bursts in demand; for example, when customers have offered a reward to the first respondents who complete a survey. Tailspin is concerned that new role instances cannot be started fast enough to meet these types of activity bursts. In this type of scenario, Tailspin would like to be able to immediately begin degrading some of the non-essential functionality of the application so that the UI response times are maintained until the new role instances have started and can take on part of the load. Tailspin would also like its operators to be notified by an SMS message when certain scaling operations are taking place.
Tailspin can already predict when some bursts in demand will occur based on data that it has collected in the past. Tailspin wants to be able to pre-emptively timetable the addition and removal of role instances so that there is no lag when such a burst in demand occurs.
Overall, Tailspin wants to reduce its operational costs while still providing the maximum performance for its end users. This means using the minimum number of role instances required to perform a task at the required level of performance. When the values for certain performance counters, such as CPU usage, exceed a predefined threshold, the system should add more role instances until the values have dropped to more acceptable levels. When the values for these performance counters drop enough, the system should remove role instances again.
Tailspin must be able to control the autoscaling parameters by defining rules. A rule defines which instances will be scaled, which metrics to monitor, and what thresholds to use for scaling up and down. For example, a rule might add a new role instance when CPU utilization, averaged across the running instances, reaches 80%. It should also be possible to use rules to set the minimum and maximum number of role instances for a certain timeframe. These rules should be configurable through a user interface.
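The kind of rule described here can be sketched in a few lines. The following Python is only an illustrative model of the decision, not the Autoscaling Application Block's implementation; the function name, its parameters, and the default bounds are assumptions chosen for the example.

```python
from statistics import mean


def desired_instance_count(cpu_by_instance, current, minimum=2, maximum=5,
                           scale_up_at=80.0):
    """Return the instance count after applying one hypothetical reactive rule.

    cpu_by_instance: recent CPU utilization (percent) of each running instance.
    The rule adds one instance when the average reaches the threshold. The
    result is always kept within the [minimum, maximum] bounds.
    """
    target = current
    if cpu_by_instance and mean(cpu_by_instance) >= scale_up_at:
        target = current + 1
    # The configured bounds always win over the reactive decision.
    return max(minimum, min(maximum, target))


if __name__ == "__main__":
    print(desired_instance_count([85.0, 90.0, 78.0], current=3))  # 4
    print(desired_instance_count([95.0] * 5, current=5))          # 5 (capped)
```

Keeping the bounds check as the last step mirrors the requirement that minimum and maximum values limit whatever the threshold-based rule decides.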
It is difficult to determine the right number of role instances for a particular task, even with an automatic scaling solution. Tailspin wants to have access to detailed logging information that records the autoscaling activities that have taken place. For example, Tailspin wants to know when a role instance was added or removed and what triggered that action. Tailspin plans to analyze this data and use the results to refine the autoscaling behavior of the Surveys application.
Tailspin wants to get an overview of the resource utilization over time. For example, it wants to know how many role instances were active at a certain point in time, what their CPU utilization was, and how many concurrent users were active at that time. Tailspin wants to be able to analyze when bursts in overall demand occurred in particular geographic locations so that it can plan for these events in the future. Tailspin also plans to use this information to fine-tune its pricing strategy based on more detailed analysis of predicted levels of demand.
Tailspin has deployed the Surveys application to multiple Azure data centers to ensure that the Surveys application is available in a data center located close to the majority of people who complete surveys. Tailspin would like to use a single management application from which Tailspin can create and edit the various autoscaling rules used in the different geographical locations, and also monitor the autoscaling behavior in all the data centers. At the same time, Tailspin wants to minimize the costs of this distributed autoscaling solution.
Overview of the Autoscaling Solution
This section describes how the Autoscaling Application Block helped the Tailspin Surveys application become more elastic.
Using the Autoscaling Application Block in Tailspin Surveys
This section describes how the Surveys application uses the features of the Autoscaling Application Block.
For more information about autoscaling and how the Autoscaling Application Block works, see Chapter 4, "Autoscaling and Microsoft Azure," in this guide.
Features of the Autoscaling Application Block
The Autoscaling Application Block provides two types of autoscaling rules: constraint rules and reactive rules. Tailspin uses both types of rules in the Surveys application.
Tailspin uses constraint rules to specify minimum and maximum numbers of role instances for both its worker and web roles. Tailspin can set minimum values to ensure that it meets its SLA commitments and use the maximum values to ensure that its costs are kept under control. It uses a number of constraint rules to set different maximum and minimum values at different times of the day when it can predict changes in the workload for the Surveys application.
Tailspin uses reactive rules to enable the Surveys application to respond to sudden changes in demand. These rules are based on metrics that Tailspin has identified for the Surveys application. Some of these metrics are standard performance counters; some require custom data that is specific to the Tailspin Surveys application.
In addition to rules that automatically change the number of role instances in the Surveys application, Tailspin also uses reactive rules to modify the behavior of the application. These rules are designed to help the surveys application respond quickly to sudden bursts in demand before new role instances can be started.
Tailspin operators also regularly review all of the diagnostics data that is collected from the Surveys application in order to ensure that the current rules provide the optimum behavior, that Tailspin's SLAs are being met, and that there are no unnecessary role instances running. Tailspin operators use this data to refine the set of rules in order to better meet the SLA and cost requirements of the Surveys application.
For more information about the functionality and usage of the Autoscaling Application Block, see Chapter 4, "Autoscaling and Microsoft Azure."
Hosting the Autoscaling Application Block in Tailspin Surveys
Tailspin decided to host the Autoscaling Application Block in a web role in Azure. It also considered hosting the Autoscaling Application Block in an on-premises application. Tailspin did not identify any benefits from hosting the Autoscaling Application Block in an on-premises application; it has no plans to integrate the Autoscaling Application Block with any existing on-premises logging or diagnostic tools. The following diagram shows some of the high-level components of the Tailspin Surveys application that will take part in the autoscaling process.
Note
The Tailspin autoscaling hosted service includes the autoscaling management web role and the autoscaling worker role. These two roles are typically developed together, and it simplifies the deployment of the Surveys application to place them in the same hosted service.
If your management website has more responsibilities than managing the autoscaling process, you should consider using a separate hosted service for each role so they can be developed and deployed in isolation.
Figure 1
Components that take part in autoscaling in Tailspin Surveys
In the Tailspin Surveys application, the public web role that people use to submit survey answers and the worker role will benefit from scaling; both have been affected by high usage levels in the past. Tailspin does not anticipate that the tenant web role will be affected by high levels of demand because tenants use it to design new surveys. However, Tailspin will add some autoscaling rules in case of any unexpected bursts in demand.
Scale Groups in Tailspin Surveys
The Tailspin Surveys application only has two role types that Tailspin wants to scale automatically: the public web role and the worker role responsible for statistics calculation and data export. These two roles have different usage patterns and need different autoscaling rules, so there is no benefit in grouping them together in a scale group. Similarly, the instances of each role type that run in different Azure data centers also need to be scaled independently of each other. Therefore, Tailspin does not use the scale groups feature of the Autoscaling Application Block.
Autoscaling Rules in Tailspin Surveys
The following table shows the initial set of constraint rules that Tailspin identified for use in the Surveys application.
Description | Rank | Timetable | Role | Maximum instances | Minimum instances |
---|---|---|---|---|---|
Default constraints for all roles | 0 | All day, every day | Tailspin worker role | 5 | 2 |
Default constraints for all roles | 0 | All day, every day | Tailspin public web role | 5 | 2 |
Default constraints for all roles | 0 | All day, every day | Tailspin tenant web role | 5 | 2 |
Additional instances for public web role during peak hours | 5 | 08:00 – 17:50 (starts at 08:00 and lasts 9 hours 50 minutes) on Monday, Tuesday, Wednesday, Thursday, Friday | Tailspin public web role | 6 | 3 |
The following table shows the initial set of reactive rules that Tailspin identified for use in the Surveys application.
Description | Target | Action |
---|---|---|
Look at number of rejected requests and CPU usage for the public website. If both become too large, then scale up the number of instances. | Tailspin Surveys public web role | Add one instance |
Look at number of rejected requests and CPU usage for the public website. If both drop to acceptable levels, then reduce the number of instances. | Tailspin Surveys public web role | Remove one instance |
When there is a burst in activity on the tenant website and the worker role, then enable throttling. This should reduce the load so that the site remains workable. If the load remains high, then the scaling rules will kick in and increase the number of instances. | Tailspin Surveys worker role, Tailspin Surveys tenant web role | Throttling mode "ReduceSystemLoad" |
When the burst in activity is over, then disable throttling. | Tailspin Surveys worker role, Tailspin Surveys tenant web role | Throttling mode "Normal" |
When there are many tenants or surveys and there is high CPU usage, then add more capacity to the public website. | Tailspin Surveys public web role | Add one instance |
When there are a normal number of tenants and surveys, then decrease the number of instances. | Tailspin Surveys public web role | Remove one instance |
Look at number of rejected requests and CPU usage for the tenant website. If both become large enough, then scale up the number of instances. | Tailspin Surveys tenant web role | Add one instance |
Look at number of rejected requests and CPU usage for the tenant website. If the load is acceptable, then reduce the number of instances. | Tailspin Surveys tenant web role | Remove one instance |
These tables show the initial set of rules that Tailspin used in the production version of the Surveys application. Tailspin will monitor and evaluate these rules to determine if they are producing optimal scaling behavior. Tailspin expects to make the following types of adjustment to the set of rules to improve their effectiveness.
- Modify the threshold values that determine when to scale up or down and when to enable or disable throttling.
- Add additional constraint rules to handle other predictable changes in activity.
- Change the metrics and timespans that the reactive rules use to trigger scaling actions.
- Use different rule sets in different Azure data centers to reflect different usage patterns in different geographic locations.
From the initial set of rules, Tailspin identified a set of metrics that it must configure the application block to collect. The following table shows the initial set of metrics that Tailspin identified for the reactive rules in the Surveys application.
Description | Aggregate function | Timespan | Source | Metric |
---|---|---|---|---|
CPU usage in the Surveys public web role | Average | 20 minutes | Tailspin public web role | \Processor(_Total)\% Processor Time |
CPU usage in the Surveys worker role | Average | 5 minutes | Tailspin worker role | \Processor(_Total)\% Processor Time |
CPU usage in the Surveys tenant web role | Average | 20 minutes | Tailspin tenant web role | \Processor(_Total)\% Processor Time |
CPU usage in the Surveys tenant web role | Average | 5 minutes | Tailspin tenant web role | \Processor(_Total)\% Processor Time |
Rejected ASP.NET requests in the Surveys public web role | Average | 10 minutes | Tailspin public web role | \ASP.NET\Requests Rejected |
Rejected ASP.NET requests in the Surveys tenant web role | Average | 10 minutes | Tailspin tenant web role | \ASP.NET\Requests Rejected |
Number of surveys submitted | Average | 10 minutes | Tailspin | The number of active surveys in the Surveys application |
Number of tenants | Average | 10 minutes | Tailspin | The number of registered tenants |
Number of instances of the Surveys public web role | Last | 8 minutes | Tailspin public web role | Role instance count |
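Each metric in the table pairs an aggregate function with a timespan. As a rough illustration of what that pairing means, the following Python sketch aggregates timestamped samples over a trailing window. It is a simplified model, not the application block's data collection code; the `aggregate` function and the sample data are assumptions made for the example.

```python
from datetime import datetime, timedelta


def aggregate(samples, now, timespan, function="Average"):
    """Aggregate (timestamp, value) samples that fall within [now - timespan, now]."""
    window = [value for stamp, value in sorted(samples)
              if now - timespan <= stamp <= now]
    if not window:
        return None  # no data points were collected in the window
    if function == "Average":
        return sum(window) / len(window)
    if function == "Last":
        return window[-1]  # most recent sample in the window
    raise ValueError(f"unsupported aggregate function: {function}")


if __name__ == "__main__":
    now = datetime(2012, 1, 2, 12, 0)
    # Hypothetical CPU samples taken 25, 15, and 5 minutes ago.
    cpu = [(now - timedelta(minutes=m), v)
           for m, v in [(25, 10.0), (15, 70.0), (5, 90.0)]]
    print(aggregate(cpu, now, timedelta(minutes=20)))          # 80.0
    print(aggregate(cpu, now, timedelta(minutes=20), "Last"))  # 90.0
```

A longer timespan smooths out short spikes, which is one reason the same counter can appear in the table with both a 5-minute and a 20-minute window.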
Collecting Autoscaling History Data in Tailspin Surveys
Tailspin knows that usage patterns for the Surveys application change over time in different geographical locations. Tailspin is also aware that through careful analysis of the way the Surveys application is used, it can identify usage patterns.
Analyzing past behavior helps you to optimize your autoscaling rules.
Tailspin prefers to be proactive in the way that it approaches autoscaling, so it favors constraint rules over reactive rules. In this way it can try to ensure that it has the right number of instances active so that it can meet its SLA commitments without the need to add new instances in response to changes in workload. Therefore, every month Tailspin reviews the log data collected from the Autoscaling Application Block to try to identify any new patterns or changes in existing patterns. If there are any changes to usage patterns, it either modifies existing constraint rules or adds new ones.
Tailspin still maintains a set of reactive rules so that the Surveys application can respond to any unanticipated changes in its workload. Tailspin also analyzes when and why these reactive rules ran to make sure that they are performing the optimum scaling operations.
An Autoscaling Configuration UI
Although it is possible for administrators to edit the XML file that contains Tailspin's autoscaling rules directly, this is a potentially error-prone process. If used, a schema-aware XML editor may handle some of the validation issues, but some values in the rules definition file refer to entries in the service information definition and any errors in the references will not be detected by the XML validation. In addition, the administrators would also need to upload the rules XML file to the correct storage location for the application block to be able to load and use the new rules definitions. Because of these challenges, Tailspin decided to build a web-hosted rules editor that would handle all of the validation issues and be able to save the rules to the correct location.
There are similar issues associated with the XML file that contains the Surveys application's service information definition. Tailspin anticipates that administrators will need to edit the scale group definitions in this file, and wants administrators to be able to perform this task through the same UI that they use for editing rules.
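The cross-file reference problem described above is the kind of check that schema validation cannot perform. The following Python sketch shows one way such a check could work: it collects the role aliases from a service information document and reports any rule `target` that does not match one. This is an illustration under simplifying assumptions (it ignores scale groups and queue aliases), not the code of Tailspin's rule editor, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix such as '{uri}role' down to 'role'."""
    return tag.rsplit("}", 1)[-1]


def unknown_targets(rules_xml, service_info_xml):
    """Return rule targets that are not defined as role aliases."""
    service = ET.fromstring(service_info_xml)
    known = {el.attrib["alias"] for el in service.iter()
             if local_name(el.tag) == "role" and "alias" in el.attrib}
    rules = ET.fromstring(rules_xml)
    targets = {el.attrib["target"] for el in rules.iter() if "target" in el.attrib}
    return sorted(targets - known)


if __name__ == "__main__":
    service_info = """<serviceModel><roles>
        <role alias="SurveyWorkers" roleName="Tailspin.Workers.Surveys" />
        <role alias="PublicWebSite" roleName="Tailspin.Web.Survey.Public" />
    </roles></serviceModel>"""
    rules = """<rules><constraintRules><rule name="r" rank="0"><actions>
        <range target="PublicWebSite" min="2" max="5" />
        <range target="PublicWebsite" min="2" max="5" />
    </actions></rule></constraintRules></rules>"""
    print(unknown_targets(rules, service_info))  # ['PublicWebsite']
```

In the usage example, the second target differs from the alias only by letter case, which is well-formed XML but still an invalid reference.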
Notifying Operators by SMS When a Scaling Operation Takes Place
Sending notifications by SMS when performing a scaling action is not one of the built-in features of the Autoscaling Application Block. Tailspin decided to create a custom action to send SMS notifications. Tailspin can add this custom action to selected reactive rules so that its operators are always notified when significant scaling operations are taking place.
Although the Autoscaling Application Block already includes a feature that can send notifications when it performs a scaling action, the built-in feature uses email and Tailspin prefers to use SMS messages to notify its operators.
Inside the Implementation
This section describes some of the details of how Tailspin hosted the Autoscaling Application Block and modified the Surveys application to work with the application block. If you are not interested in the details, you can skip to the next section.
You may find it useful to have the Tailspin solution open in Microsoft® Visual Studio® development system while you read this section so that you can refer to the code directly.
For instructions about how to install the Tailspin Surveys application, see Appendix B, "Tailspin Surveys Installation Guide."
Enabling the Autoscaling Application Block to Read from the .cscfg File
The Autoscaling Application Block reads connection string information from the .cscfg file in order to access Azure storage. It does this by using the FromConfigurationSetting method in the CloudStorageAccount class. For this to work, the Surveys application must set up a configuration setting publisher at startup. The following code from the Global.asax.cs file in the Tailspin.Web.Management project shows this.
CloudStorageAccount.SetConfigurationSettingPublisher(
(s, p) => p(RoleEnvironment.GetConfigurationSettingValue(s)));
Markus Says: | |
---|---|
You should set up a configuration setting publisher for each role that the Autoscaling Application Block is configured to monitor and scale. |
For more information, see "CloudStorageAccount.SetConfigurationSettingPublisher Method" on MSDN.
Tailspin's Service Information Definition
The following code snippet shows the default service information definition for the Tailspin Surveys application. Tailspin defines the contents of this file once for the initial deployment; it does not anticipate changing anything after the Autoscaling Application Block is running.
<serviceModel ...>
<subscriptions>
<!--
Todo when installing the RI for the first time:
Update your subscription ID and Certificate Thumbprint
-->
<subscription name="TailspinSubscription"
subscriptionId="[Enter subscription id here]"
certificateThumbprint="[Enter certificate thumbprint here]"
certificateStoreName="My"
certificateStoreLocation="LocalMachine">
<services>
<service dnsPrefix="Tailspin-Surveys" slot="Staging" scalingMode="Scale">
<roles>
<role alias="SurveyWorkers"
roleName="Tailspin.Workers.Surveys"
wadStorageAccountName="TailspinStorage" />
<role alias="PublicWebSite"
roleName="Tailspin.Web.Survey.Public"
wadStorageAccountName="TailspinStorage" />
<role alias="TenantWebSite"
roleName="Tailspin.Web"
wadStorageAccountName="TailspinStorage" />
</roles>
</service>
</services>
<storageAccounts>
<!--
Todo when installing the RI for the first time:
Update the connection string to your storage account
-->
<storageAccount alias="TailspinStorage"
connectionString="[Enter connection string here]">
<queues>
<queue alias="SurveyAnswerStoredQueue"
queueName="surveyanswerstored" />
<queue alias="SurveyTransferQueue"
queueName="surveytransfer" />
</queues>
</storageAccount>
</storageAccounts>
</subscription>
</subscriptions>
<scaleGroups />
<stabilizer scaleUpCooldown="00:10:00" scaleDownCooldown="00:10:00"
notificationsCooldown="00:30:00">
<role roleAlias="PublicWebSite" scaleUpCooldown="00:08:00"
scaleDownCooldown="00:15:00" />
</stabilizer>
</serviceModel>
Note
If you are installing the Tailspin Surveys application, you must edit this file to add the information that is specific to your Azure account. For more information, see the section "Setup and Physical Deployment" in this chapter. A sample service information definition file is included in the "Sample Stores" folder in the Visual Studio solution.
The role and queue aliases are used in the autoscaling rules for Tailspin Surveys.
Jana Says: | |
---|---|
Tailspin is only using the Autoscaling Application Block to scale three roles, so it is not using scale groups. If you use scale groups in your application, you define them in the service information definition file. |
The stabilizer element shows the cool-down periods configured by Tailspin. These include global settings (10 minutes for both scaling up and scaling down) and specific settings for the public website. Tailspin has extended to 15 minutes the amount of time that must elapse after a scaling operation before the public website can be scaled down. Scaling up the public website can happen slightly earlier than for the other roles: after 8 minutes instead of 10.
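To make the effect of these settings concrete, the following Python sketch models a cool-down check using the values from the stabilizer element. It is a simplified illustration, assuming that the cool-down is measured from the last scaling operation on the role; the dictionaries and the `can_scale` function are hypothetical and are not part of the application block.

```python
from datetime import datetime, timedelta

# Values copied from the stabilizer element in the service information definition.
GLOBAL_COOLDOWN = {"up": timedelta(minutes=10), "down": timedelta(minutes=10)}
ROLE_COOLDOWN = {
    "PublicWebSite": {"up": timedelta(minutes=8), "down": timedelta(minutes=15)},
}


def can_scale(role, direction, last_scaled, now):
    """Return True when the cool-down for this role and direction has elapsed."""
    cooldown = ROLE_COOLDOWN.get(role, GLOBAL_COOLDOWN)[direction]
    return now - last_scaled >= cooldown


if __name__ == "__main__":
    last = datetime(2012, 1, 2, 12, 0)
    nine_minutes_later = last + timedelta(minutes=9)
    print(can_scale("PublicWebSite", "up", last, nine_minutes_later))    # True
    print(can_scale("SurveyWorkers", "up", last, nine_minutes_later))    # False
    print(can_scale("PublicWebSite", "down", last, nine_minutes_later))  # False
```

Roles without their own entry fall back to the global settings, which matches how the stabilizer element lists an override only for the public website.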
Tailspin's Autoscaling Rules
The following code snippets show a default set of rules that Tailspin used when it first started using the Autoscaling Application Block. Tailspin plans to evaluate the effectiveness of these rules in scaling the Surveys application, and will make changes when it has collected sufficient data to be able to analyze the autoscaling behavior in its production environment. Tailspin has built a web-based rule editor to enable operators to edit the rules more easily. For more information about Tailspin's web-based rule editor, see the section "Editing and Saving Rules" in this chapter.
Tailspin Surveys Constraint Rules
The following code snippet shows the initial set of constraint rules that Tailspin configured for the Surveys application. There is a default rule that sets default values for all of the roles in Tailspin Surveys; it has a rank of zero. The second rule raises the minimum and maximum instance counts for the Tailspin Surveys public web role in anticipation of greater levels of activity during peak hours in the work week.
<rules ...>
<constraintRules>
<rule name="Default constraints for all roles"
description="This rule sets the default constraints for all web and worker
roles. The minimum values guard our SLA, by ensuring there will never be
less than these instances. The maximum values guard our wallet, by ensuring
there will never be more than the configured number of instances."
enabled="true" rank="0">
<timetable startTime="00:00:00" duration="1.00:00:00" utcOffset="+00:00">
<daily />
</timetable>
<actions>
<range target="SurveyWorkers" min="2" max="5" />
<range target="PublicWebSite" min="2" max="5" />
<range target="TenantWebSite" min="2" max="5" />
</actions>
</rule>
<rule name="Additional instances for public web during peak hours"
description="Our testing has indicated that there will be additional load
during peak hours. To accommodate for that additional load, there will be
additional instances for the public website. These peaks occur during
working hours and early evenings. By providing a higher
rank, this rule takes precedence over the default rule."
enabled="true" rank="5">
<timetable startTime="08:00:00" duration="09:50:00" utcOffset="+00:00">
<weekly days="Monday Tuesday Wednesday Thursday Friday" />
</timetable>
<actions>
<range target="PublicWebSite" min="3" max="6" />
</actions>
</rule>
</constraintRules>
...
</rules>
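The interaction between the timetable and the rank can be modeled in a short Python sketch. It copies the two rules above into plain data, computes each window as the start time plus the duration (so the peak-hours rule runs from 08:00 until 17:50), and picks the highest-ranked active rule for a target. This is an illustration only: it ignores the UTC offset and windows that cross midnight, and the data layout and function names are assumptions rather than the application block's API.

```python
from datetime import datetime, time, timedelta

WORK_WEEK = {0, 1, 2, 3, 4}  # Monday to Friday, as returned by datetime.weekday()

RULES = [
    {"name": "Default constraints for all roles", "rank": 0,
     "start": time(0, 0), "duration": timedelta(days=1), "days": None,
     "ranges": {"SurveyWorkers": (2, 5), "PublicWebSite": (2, 5),
                "TenantWebSite": (2, 5)}},
    {"name": "Additional instances for public web during peak hours", "rank": 5,
     "start": time(8, 0), "duration": timedelta(hours=9, minutes=50),
     "days": WORK_WEEK,
     "ranges": {"PublicWebSite": (3, 6)}},
]


def is_active(rule, now):
    """True when `now` falls inside the rule's timetable window for that day."""
    if rule["days"] is not None and now.weekday() not in rule["days"]:
        return False
    start = datetime.combine(now.date(), rule["start"])
    return start <= now < start + rule["duration"]


def effective_range(target, now):
    """Return (min, max) from the highest-ranked active rule that names the target."""
    active = [r for r in RULES if target in r["ranges"] and is_active(r, now)]
    if not active:
        return None
    return max(active, key=lambda r: r["rank"])["ranges"][target]


if __name__ == "__main__":
    monday_morning = datetime(2012, 1, 2, 9, 0)   # a Monday
    saturday = datetime(2012, 1, 7, 10, 0)        # a Saturday
    print(effective_range("PublicWebSite", monday_morning))  # (3, 6)
    print(effective_range("PublicWebSite", saturday))        # (2, 5)
    print(effective_range("SurveyWorkers", monday_morning))  # (2, 5)
```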
Tailspin Surveys Reactive Scaling Rules
The following snippet shows how Tailspin initially defined the reactive scaling rules for the Surveys application. The first pair of rules defines how the Surveys public web role should scale up or down based on the number of rejected ASP.NET requests and CPU utilization levels. The second pair of rules defines how the Surveys public web role should scale up or down based on the number of tenants and active surveys. The third pair of rules defines how the Surveys tenant web role should scale up or down based on the number of rejected ASP.NET requests and CPU utilization levels.
Poe Says: | |
---|---|
Notice how the reactive rules are paired; one specifies when to scale up, and one specifies when to scale down. |
<rules ...>
...
<reactiveRules>
...
<rule name="Public Web - Heavy Demand (Increase)"
description="Look at number of rejected requests and CPU for the public
website. If both become too large, then scale up the number of
instances."
enabled="true">
<actions>
<scale target="PublicWebSite" by="1" />
</actions>
<when>
<all>
<greater operand="PublicWeb_AspNetRequestsRejected_Avg_10m" than="5" />
<greater operand="PublicWeb_CPU_Avg_20m" than="80" />
</all>
</when>
</rule>
<rule name="Public Web - Normal Demand (Reduce)"
description="Look at number of rejected requests and CPU for the public
website. If both drop to acceptable levels, then reduce the number of
instances."
enabled="true">
<actions>
<scale target="PublicWebSite" by="-1" />
</actions>
<when>
<all>
<lessOrEqual operand="PublicWeb_AspNetRequestsRejected_Avg_10m" than="1" />
<lessOrEqual operand="PublicWeb_CPU_Avg_20m" than="40" />
</all>
</when>
</rule>
<rule name="PublicWeb - Many Tenants Or Surveys (Increase)"
description="When there are many tenants or surveys and the CPU usage is
high, then we'll need more capacity in the public website.
This rule demonstrates
the use of the custom operands, called ActiveSurveyCount and TenantCount.
Using the load simulation page, you can easily add and remove tenants and
surveys to test the load on the system."
enabled="true">
<actions>
<scale target="PublicWebSite" by="1" />
</actions>
<when>
<all>
<any>
<greaterOrEqual operand="Tailspin_ActiveSurveyCount_Avg_10m"
than="50 * PublicWeb_InstanceCount_Last" />
<greaterOrEqual operand="Tailspin_TenantCount_Avg_10m"
than="50 * PublicWeb_InstanceCount_Last" />
</any>
<greater operand="PublicWeb_CPU_Avg_20m" than="50"/>
</all>
</when>
</rule>
<rule name="PublicWeb - Normal Tenants And Surveys (Decrease)"
description="When there are a normal number of tenants and surveys, then
decrease the number of instances."
enabled="true">
<actions>
<scale target="PublicWebSite" by="-1" />
</actions>
<when>
<all>
<less operand="Tailspin_TenantCount_Avg_10m"
than="30 * PublicWeb_InstanceCount_Last" />
<less operand="Tailspin_ActiveSurveyCount_Avg_10m"
than="30 * PublicWeb_InstanceCount_Last" />
</all>
</when>
</rule>
<rule name="TenantWeb - Heavy demand (Increase)"
description="Look at number of rejected requests and CPU for the tenant
website. If both become too large, then scale up the number of
instances."
enabled="true">
<actions>
<scale target="TenantWebSite" by="1" />
</actions>
<when>
<all>
<greaterOrEqual operand="TenantWeb_AspNetRequestsRejected_avg_10m"
than="5" />
<greaterOrEqual operand="TenantWeb_CPU_Avg_20m" than="80" />
</all>
</when>
</rule>
<rule name="TenantWeb - Normal Demand (Decrease)"
description="Look at number of rejected requests and CPU for the tenant
website. If the load is acceptable, then reduce the number of instances."
enabled="true">
<actions>
<scale target="TenantWebSite" by="-1" />
</actions>
<when>
<all>
<lessOrEqual operand="TenantWeb_AspNetRequestsRejected_avg_10m" than="2" />
<lessOrEqual operand="TenantWeb_CPU_Avg_20m" than="60" />
</all>
</when>
</rule>
...
</reactiveRules>
...
</rules>
Note
Tailspin has not assigned a rank to any of the reactive rules.
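The nested all and any elements in these rules combine as ordinary Boolean logic. The following sketch restates the conditions of the two rules that scale the PublicWebSite target on the custom operands as plain C#, so that the thresholds are easier to check by hand. It is not part of the Surveys solution or of the Autoscaling Application Block: the class and method names are hypothetical, and the parameters stand in for the operand aliases used in the rules.

```csharp
using System;

// Hypothetical helper for illustration only; the block evaluates the XML rules itself.
public static class PublicWebRuleSketch
{
    // Mirrors: all( any(surveys >= 50 * n, tenants >= 50 * n), cpu > 50 )
    public static bool ShouldIncrease(
        double activeSurveysAvg, double tenantsAvg, double cpuAvg, int instanceCount)
    {
        double threshold = 50 * instanceCount;
        bool demandIsHigh = activeSurveysAvg >= threshold || tenantsAvg >= threshold;
        return demandIsHigh && cpuAvg > 50;
    }

    // Mirrors: all( tenants < 30 * n, surveys < 30 * n )
    public static bool ShouldDecrease(
        double activeSurveysAvg, double tenantsAvg, int instanceCount)
    {
        double threshold = 30 * instanceCount;
        return tenantsAvg < threshold && activeSurveysAvg < threshold;
    }
}
```

With two instances, the increase threshold is 100 and the decrease threshold is 60, so an average of 80 active surveys and 80 tenants satisfies neither condition. Because both thresholds scale with the instance count, small fluctuations around one threshold do not cause the two rules to fire alternately.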
Tailspin Surveys Reactive Throttling Rules
Tailspin uses throttling rules to dynamically change the behavior of the Surveys tenant web role and worker role. The rules use CPU utilization to determine when to enable and when to disable throttling in the Surveys application.
For more information about how Tailspin implemented the throttling behavior in the Surveys application, see the section "Implementing Throttling Behavior," later in this chapter.
Ed Says: | |
---|---|
Throttling behavior is triggered by using the changeSetting action. |
<rules ...>
...
<reactiveRules>
...
<rule name="TenantWeb &amp; Survey Worker - Burst - Throttle"
description="When there is a burst in activity on the tenant website
and the worker role, then enable throttling. This should reduce
the load so that the site remains workable. If the load remains high,
then the scaling rules will kick in and increase the number of
instances. Throttling in Tailspin does the following:
* Disable exporting of values to Microsoft SQL Server in the worker role.
* Only allow paying tenants to the tenant site. Tenants on a trial
subscription cannot enter."
enabled="true">
<actions>
<changeSetting target="SurveyWorkers" settingName="ThrottlingMode"
value="ReduceSystemLoad" />
<changeSetting target="TenantWebSite" settingName="ThrottlingMode"
value="ReduceSystemLoad" />
</actions>
<when>
<all>
<greaterOrEqual operand="TenantWeb_CPU_Avg_5m" than="90" />
<greaterOrEqual operand="SurveyWorkers_CPU_Avg_5m" than="90" />
</all>
</when>
</rule>
<rule name="TenantWeb &amp; Survey Worker - Burst - Stop throttling"
description="When there is no burst in activity, then disable throttling."
enabled="true">
<actions>
<changeSetting target="TenantWebSite" settingName="ThrottlingMode"
value="Normal" />
<changeSetting target="SurveyWorkers" settingName="ThrottlingMode"
value="Normal" />
</actions>
<when>
<any>
<lessOrEqual operand="SurveyWorkers_CPU_Avg_5m" than="50" />
<lessOrEqual operand="TenantWeb_CPU_Avg_5m" than="50" />
</any>
</when>
</rule>
...
</reactiveRules>
...
</rules>
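The two throttling rules use different thresholds for switching throttling on and off. The following sketch, which uses hypothetical class and method names and is not code from the Surveys solution, restates the two conditions as a single function that returns the value the ThrottlingMode setting would have after the rules run. It assumes that when neither rule matches, no changeSetting action is produced and the setting keeps its current value.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class ThrottlingRulesSketch
{
    public const string Normal = "Normal";
    public const string ReduceSystemLoad = "ReduceSystemLoad";

    public static string NextMode(
        string currentMode, double tenantWebCpuAvg5m, double surveyWorkersCpuAvg5m)
    {
        // "Throttle" rule: <all> of the two conditions must hold.
        if (tenantWebCpuAvg5m >= 90 && surveyWorkersCpuAvg5m >= 90)
        {
            return ReduceSystemLoad;
        }

        // "Stop throttling" rule: <any> of the two conditions is enough.
        if (surveyWorkersCpuAvg5m <= 50 || tenantWebCpuAvg5m <= 50)
        {
            return Normal;
        }

        // Neither rule matches, so the setting is left as it is.
        return currentMode;
    }
}
```

The two conditions cannot both be true at once, so the rules do not conflict. Between the thresholds (for example, 70 percent CPU on both roles) the current mode is kept, which helps prevent throttling from being switched on and off in quick succession.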
Tailspin Surveys Operands
In addition to using the built-in performance counter operands, Tailspin created two custom operands, activeSurveysOperand and tenantCountOperand. These operands enable a rule to use the number of surveys that have at least a specified number of answers, and the number of tenants.
<rules ...>
...
<operands>
<roleInstanceCount alias="PublicWeb_InstanceCount_Last" timespan="00:08:00"
aggregate="Last" role="PublicWebSite" />
<performanceCounter alias="PublicWeb_AspNetRequestsRejected_Avg_10m"
timespan="00:10:00" aggregate="Average" source="PublicWebSite"
performanceCounterName="\ASP.NET\Requests Rejected" />
<performanceCounter alias="PublicWeb_CPU_Avg_20m" timespan="00:20:00"
aggregate="Average" source="PublicWebSite"
performanceCounterName="\Processor(_Total)\% Processor Time" />
<activeSurveysOperand alias="Tailspin_ActiveSurveyCount_Avg_10m"
timespan="00:10:00" aggregate="Average" minNumberOfAnswers="0"
xmlns="http://Tailspin/ActiveSurveys" />
<tenantCountOperand alias="Tailspin_TenantCount_Avg_10m" timespan="00:10:00"
aggregate="Average" xmlns="http://Tailspin/TenantCount" />
<performanceCounter alias="TenantWeb_AspNetRequestsRejected_avg_10m"
timespan="00:10:00" aggregate="Average" source="TenantWebSite"
performanceCounterName="\ASP.NET\Requests Rejected" />
<performanceCounter alias="TenantWeb_CPU_Avg_20m" timespan="00:20:00"
aggregate="Average" source="TenantWebSite"
performanceCounterName="\Processor(_Total)\% Processor Time" />
<performanceCounter alias="SurveyWorkers_CPU_Avg_5m" timespan="00:05:00"
aggregate="Average" source="SurveyWorkers"
performanceCounterName="\Processor(_Total)\% Processor Time" />
<performanceCounter alias="TenantWeb_CPU_Avg_5m" timespan="00:05:00"
aggregate="Average" source="TenantWebSite"
performanceCounterName="\Processor(_Total)\% Processor Time" />
</operands>
</rules>
For more information about how Tailspin implemented the custom operands, see the section "Implementing Custom Operands" in this chapter.
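Each operand definition combines a timespan attribute with an aggregate attribute. The following sketch illustrates what the Average and Last aggregates mean for a window of samples. It is an illustration only: the SketchSample type and the method names are hypothetical, and the exact window boundaries that the Autoscaling Application Block applies are an assumption here (samples newer than now minus the timespan, up to and including now).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample type for this sketch; the block stores its own data point entities.
public sealed class SketchSample
{
    public SketchSample(DateTime timestampUtc, double value)
    {
        this.TimestampUtc = timestampUtc;
        this.Value = value;
    }

    public DateTime TimestampUtc { get; private set; }
    public double Value { get; private set; }
}

public static class OperandAggregationSketch
{
    // Assumption: the window is (now - timespan, now]. Results are ordered oldest first.
    private static List<SketchSample> Window(
        IEnumerable<SketchSample> samples, DateTime nowUtc, TimeSpan timespan)
    {
        DateTime start = nowUtc - timespan;
        return samples
            .Where(s => s.TimestampUtc > start && s.TimestampUtc <= nowUtc)
            .OrderBy(s => s.TimestampUtc)
            .ToList();
    }

    // aggregate="Average": the mean of the values in the window, or null if it is empty.
    public static double? Average(
        IEnumerable<SketchSample> samples, DateTime nowUtc, TimeSpan timespan)
    {
        List<SketchSample> window = Window(samples, nowUtc, timespan);
        return window.Count == 0 ? (double?)null : window.Average(s => s.Value);
    }

    // aggregate="Last": the most recent value in the window, or null if it is empty.
    public static double? Last(
        IEnumerable<SketchSample> samples, DateTime nowUtc, TimeSpan timespan)
    {
        List<SketchSample> window = Window(samples, nowUtc, timespan);
        return window.Count == 0 ? (double?)null : window[window.Count - 1].Value;
    }
}
```

For samples taken 12, 8, and 3 minutes ago with the values 90, 40, and 60, Average over 00:10:00 uses the last two samples and returns 50, while Last over 00:08:00 returns 60. A longer timespan smooths out short spikes at the cost of reacting more slowly, which may be why the scaling rules use 10 and 20 minute averages while the throttling rules use 5 minute averages.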
Collecting Performance Counter Data from Tailspin Surveys
The reactive rules that Tailspin uses for the Surveys application use performance counter data from the public web role and worker role. The Autoscaling Application Block expects to find this performance counter data in the Azure Diagnostics table named WADPerformanceCountersTable in Azure storage. Tailspin modified the public web and worker role in the Surveys application to save the performance counter data that the application block uses to evaluate the reactive rules.
Jana Says: | |
---|---|
You must remember to modify your application to collect the performance counter data that your reactive rules use and to transfer the performance counter data to Azure storage. |
The following code sample from the WebRole class in the Tailspin public web role configures the role to collect and save performance counter data.
...
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.Diagnostics;
using Microsoft.WindowsAzure.ServiceRuntime;
public class WebRole : RoleEntryPoint
{
public override bool OnStart()
{
var config = DiagnosticMonitor.GetDefaultInitialConfiguration();
var cloudStorageAccount =
CloudStorageAccount.Parse(
RoleEnvironment.GetConfigurationSettingValue(
"Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString"));
// Transfer the performance counter data to storage every minute
config.PerformanceCounters.ScheduledTransferPeriod = TimeSpan.FromMinutes(1);
// Add the perf counters
config.PerformanceCounters.DataSources.Add(
new PerformanceCounterConfiguration
{
CounterSpecifier = @"\Processor(_Total)\% Processor Time",
SampleRate = TimeSpan.FromSeconds(30)
});
config.PerformanceCounters.DataSources.Add(
new PerformanceCounterConfiguration
{
CounterSpecifier = @"\Process(aspnet_wp)\% Processor Time",
SampleRate = TimeSpan.FromSeconds(30)
});
config.PerformanceCounters.DataSources.Add(
new PerformanceCounterConfiguration
{
CounterSpecifier = @"\Process(aspnet_wp)\Private Bytes",
SampleRate = TimeSpan.FromSeconds(30)
});
config.PerformanceCounters.DataSources.Add(
new PerformanceCounterConfiguration
{
CounterSpecifier =
@"\.NET CLR Exceptions(_Global_)\# of Exceps Thrown / sec",
SampleRate = TimeSpan.FromSeconds(30)
});
config.PerformanceCounters.DataSources.Add(
new PerformanceCounterConfiguration
{
CounterSpecifier = @"\ASP.NET\Requests Rejected",
SampleRate = TimeSpan.FromSeconds(30)
});
config.PerformanceCounters.DataSources.Add(
new PerformanceCounterConfiguration
{
CounterSpecifier = @"\ASP.NET\Worker Process Restarts",
SampleRate = TimeSpan.FromSeconds(30)
});
config.PerformanceCounters.DataSources.Add(
new PerformanceCounterConfiguration
{
CounterSpecifier = @"\Memory\Available Mbytes",
SampleRate = TimeSpan.FromSeconds(30)
});
// Diagnostics Infrastructure logs
config.DiagnosticInfrastructureLogs.ScheduledTransferPeriod =
System.TimeSpan.FromMinutes(1);
config.DiagnosticInfrastructureLogs.ScheduledTransferLogLevelFilter =
LogLevel.Verbose;
// Windows Event Logs
config.WindowsEventLog.DataSources.Add("System!*");
config.WindowsEventLog.DataSources.Add("Application!*");
config.WindowsEventLog.ScheduledTransferPeriod = TimeSpan.FromMinutes(1);
config.WindowsEventLog.ScheduledTransferLogLevelFilter = LogLevel.Warning;
// Azure Trace Logs
config.Logs.ScheduledTransferPeriod = TimeSpan.FromMinutes(1);
config.Logs.ScheduledTransferLogLevelFilter = LogLevel.Warning;
// Crash Dumps
CrashDumps.EnableCollection(true);
// IIS Logs
config.Directories.ScheduledTransferPeriod = TimeSpan.FromMinutes(10);
DiagnosticMonitor diagMonitor =
DiagnosticMonitor.Start(cloudStorageAccount, config);
return base.OnStart();
}
}
The Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString setting in the configuration file (.cscfg) determines the Azure storage account to use for the performance counter data.
Ed Says: | |
---|---|
This must be the same storage account that Tailspin configures for the dataPointsStoreAccount setting in the Autoscaling Application Block configuration. |
Implementing Throttling Behavior
Tailspin uses reactive rules to scale and throttle the Tailspin Surveys application. To implement the throttling behavior, Tailspin modified the Surveys application to change its behavior when a reactive rule changes the throttling mode. The following code snippet shows example reactive rules that request throttling actions in the Surveys application. These two example rules assign values to the configuration setting ThrottlingMode in Azure roles.
<rule name="TenantWeb &amp; Survey Worker - Burst - Throttle"
description="..."
enabled="true">
<actions>
<changeSetting target="SurveyWorkers" settingName="ThrottlingMode"
value="ReduceSystemLoad" />
<changeSetting target="TenantWebSite" settingName="ThrottlingMode"
value="ReduceSystemLoad" />
</actions>
<when>
...
</when>
<rank>0</rank>
</rule>
<rule name="TenantWeb &amp; Survey Worker - Burst - Stop throttling"
description="..."
enabled="true">
<actions>
<changeSetting target="TenantWebSite" settingName="ThrottlingMode"
value="Normal" />
<changeSetting target="SurveyWorkers" settingName="ThrottlingMode"
value="Normal" />
</actions>
<when>
...
</when>
<rank>0</rank>
</rule>
These settings must exist in the target role service definitions. The following snippet shows this in the Tailspin.Surveys.Cloud service definition file (*.csdef).
<WorkerRole name="Tailspin.Workers.Surveys">
<ConfigurationSettings>
...
<Setting name="ThrottlingMode" />
</ConfigurationSettings>
...
</WorkerRole>
The Surveys application uses the throttling mode value to change the behavior of the application. For example, in the Tailspin Surveys worker role the QueueHandler and BatchProcessingQueueHandler classes check the value of the setting before processing any messages. The following code sample shows how the TransferSurveysToSqlAzureCommand class checks the configuration setting.
public bool CanRun
{
get
{
return !this.configurationSettings.ConfigurationSettingEquals(
AzureConstants.ConfigurationSettings.ThrottlingMode, "ReduceSystemLoad");
}
}
The Tailspin tenant web role also uses the setting to inform users when the application is being throttled. The Index.aspx file in the Tailspin.Web.Areas.Survey.Views.Surveys folder reads the configuration value, and the view then displays a message when the throttling mode is set to ReduceSystemLoad.
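The pattern described above, reading the ThrottlingMode setting immediately before doing optional work, can be sketched independently of the Surveys classes. The ThrottlingGateSketch class below is hypothetical and is not the Tailspin implementation. It receives the setting reader as a delegate so that the sketch stays self-contained; in an Azure role, that delegate could be RoleEnvironment.GetConfigurationSettingValue. The ordinal, case-sensitive comparison is also an assumption.

```csharp
using System;

// Hypothetical helper for illustration only.
public sealed class ThrottlingGateSketch
{
    public const string SettingName = "ThrottlingMode";
    public const string ReduceSystemLoad = "ReduceSystemLoad";

    private readonly Func<string, string> readSetting;

    public ThrottlingGateSketch(Func<string, string> readSetting)
    {
        if (readSetting == null)
        {
            throw new ArgumentNullException("readSetting");
        }

        this.readSetting = readSetting;
    }

    // Reads the setting on every call, so a changed value is seen at the next check.
    public bool IsThrottled
    {
        get
        {
            return string.Equals(
                this.readSetting(SettingName), ReduceSystemLoad, StringComparison.Ordinal);
        }
    }

    // Runs the optional work only when the role is not throttled.
    // Returns true if the work ran, and false if it was skipped.
    public bool TryRun(Action work)
    {
        if (this.IsThrottled)
        {
            return false;
        }

        work();
        return true;
    }
}
```

Because the value is not cached, work that was skipped while the setting was ReduceSystemLoad runs again on the first check after a rule sets it back to Normal.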
Editing and Saving Rules
This section describes how Tailspin built its web-hosted autoscaling rule editor so that it saves the rules to the correct location, ensures that the rule definitions for the Surveys application comply with the Autoscaling Application Block rules schema, and handles cross-validation with the Surveys application's service information definition.
Discovering the Location of the Rules Store
To be able to load and save the Tailspin Surveys autoscaling rules to the correct location, the application must read the Autoscaling Application Block configuration from the web.config file of the web role that hosts the application block. The following code sample from the SharedContainerBootstrapper class in the Tailspin.Shared project shows how this is done.
private static IConfigurationFileAccess
CreateServiceInformationModelConfigurationFileAccess(IUnityContainer container)
{
AutoscalingSettings settings =
(AutoscalingSettings)ConfigurationManager
.GetSection("autoscalingConfiguration");
BlobServiceInformationStoreData serviceInformationStoreData =
(BlobServiceInformationStoreData)settings.ServiceInformationStores
.Get(settings.ServiceInformationStoreName);
return new BlobConfigurationFileAccess(
new AzureStorageAccount(serviceInformationStoreData.StorageAccount),
serviceInformationStoreData.BlobContainerName,
serviceInformationStoreData.BlobName,
serviceInformationStoreData.MonitoringRate,
container.Resolve<ILogger>());
}
Reading and Writing to the Rules Store
The Autoscaling Application Block includes the RuleSetSerializer class that uses instances of the RuleSetElement class to deserialize from and serialize to the rules store. The LoadRuleSet and SaveCurrentRuleSet methods in the RuleSetModelStore class in the AutoScaling/Rules folder in the Tailspin.Shared project illustrate how the Surveys application uses the RuleSetSerializer class.
using
Microsoft.Practices.EnterpriseLibrary.WindowsAzure.Autoscaling.Rules.Configuration;
...
private readonly IConfigurationFileAccess fileAccess;
private readonly RuleSetModelToXmlElementConverter ruleSetModelToXmlElementConverter;
...
private RuleSetModel currentRuleSet;
private RuleSetSerializer serializer;
public RuleSetModelStore(
RuleSetModelToXmlElementConverter ruleSetModelToXmlElementConverter,
[Dependency("RuleSetModel")] IConfigurationFileAccess fileAccess,
RetryManager retryManager)
{
...
this.ruleSetModelToXmlElementConverter = ruleSetModelToXmlElementConverter;
this.fileAccess = fileAccess;
this.CreateSerializer();
this.LoadRuleSet();
}
private void CreateSerializer()
{
var allExtensions = new IRuleSerializationExtension[]
{
new AssemblyRuleSerializationExtension(
typeof(ActiveSurveysOperandElement).Assembly.FullName)
};
this.serializer = new RuleSetSerializer(
allExtensions.SelectMany(e => e.CustomActionDefinitions),
allExtensions.SelectMany(e => e.CustomParameterDefinitions));
}
private void LoadRuleSet()
{
string fileContent = this.GetFileContent();
RuleSetElement ruleSetElement;
if (string.IsNullOrEmpty(fileContent))
{
ruleSetElement = new RuleSetElement();
}
else
{
ruleSetElement = this.serializer.Deserialize(new StringReader(fileContent));
}
this.currentRuleSet = this.ruleSetModelToXmlElementConverter
.ConvertElementToModel(ruleSetElement);
}
...
public void SaveCurrentRuleSet()
{
lock (this.syncRoot)
{
var writer = new StringWriter();
RuleSetElement element =
this.ruleSetModelToXmlElementConverter
.ConvertModelToElement(this.currentRuleSet);
this.serializer.Serialize(writer, element);
this.SetFileContent(writer.ToString());
}
}
Creating Valid Autoscaling Rules
Tailspin uses the classes in the Microsoft.Practices.EnterpriseLibrary.WindowsAzure.Autoscaling.Rules.Configuration namespace in the Autoscaling Application Block to ensure that the Surveys application rule editor creates autoscaling rules that are valid according to the autoscaling rules schema. For example, the ConstraintRuleToXmlElementConverter class converts between the ConstraintRuleModel class used by the Tailspin Surveys rule editor and the ConstraintRuleElement class that the Autoscaling Application Block uses. For additional examples, see the other converter classes in the Tailspin.Shared.AutoScaling.Rules.XmlElementConverters namespace.
Markus Says: | |
---|---|
It is easier to bind the Tailspin model classes to the UI than to bind the Autoscaling Application Block element classes to the UI. |
Validating Target Names in the Rule Definitions
Target names in rule actions are aliases for roles and scale groups. The source name of some operands in the rule definitions is also an alias for a role. These aliases are defined in the service information definition for your application. In the rule editor in Tailspin Surveys, the text box where the user enters the target name supports auto-completion based on a list of role aliases and scale group names from the service information definition. The following code sample from the _ConstraintRuleActionEditor.cshtml file in the Tailspin.Web.Management project shows how the UI element is constructed.
<tr>
<td>@Html.HiddenFor(m => m.Id)@Html.EditorFor(m => m.Target, "_AutoComplete",
new { @class = "targetTextBox", url = @Url.Action("GetTargets", "Home",
new { Area = "ServiceInformation" }),
placeholder = "{Target Name}" })@Html.ValidationMessageFor(m => m.Target)</td>
...
</tr>
The following code sample from the HomeController class in the Tailspin.Web.Management.Areas.ServiceInformation.Controllers namespace shows the GetTargets method that is invoked from the view above.
public ActionResult GetTargets()
{
var roleAliases =
this.serviceInformationModelStore.GetAllRoles().Select(r => r.Alias);
var scaleGroups =
this.serviceInformationModelStore.GetScaleGroups().Select(r => r.Alias);
return this.Json(roleAliases.Union(scaleGroups).Where(
s => !string.IsNullOrEmpty(s)), JsonRequestBehavior.AllowGet);
}
Editing and Saving the Service Information
The implementation of the service information authoring features described in this section is very similar to the implementation of the rule editing and saving features in Tailspin Surveys.
The ServiceInformationModelStore class discovers the location of the service information store by querying the application block's configuration file, and enables the Surveys application to read and write the service information. The application block does not include a custom XML serializer class for the service information, so the ServiceInformationModelStore class uses an XmlSerializer instance to handle the deserialization and serialization to XML.
The service information model classes in the Tailspin.Shared.AutoScaling.ServiceInformation namespace provide their own conversion to the element classes in the Microsoft.Practices.EnterpriseLibrary.WindowsAzure.Autoscaling.ServiceModel.Configuration namespace of the Autoscaling Application Block. Tailspin uses the element classes in the block to ensure that the Surveys application creates a valid service information XML document to save to the service information store.
Visualizing the Autoscaling Actions
The operators at Tailspin want to be able to see the autoscaling operations that have occurred in the Surveys application in order to help them understand how the Autoscaling Application Block is working. Tailspin has developed a number of visualization charts that plot the number of role instances and show the maximum and minimum constraint values over time.
To create these charts, Tailspin needs two kinds of information: the current and recent number of role instances for every role that the block is managing, and the maximum and minimum constraint values that were in effect during the period shown on the chart.
The Autoscaling Application Block records the number of role instances for each of the Azure roles that it is managing as data points in the data points store. This store is a table in Azure storage.
Whenever the block evaluates its autoscaling rules, it writes a log message that includes the maximum and minimum instance counts that were permitted at the time of the evaluation and details of any scaling actions that the reactive rules suggested.
The following code sample from the GraphController class in the Tailspin.Web.Management.Areas.Monitoring.Controllers namespace shows how the management website retrieves the instance count values from the data points store to plot on a chart.
private void AddInstanceCountSeries(Chart chart, DateTimeFilter dateTimeFilter,
string sourceName, string sourceAlias)
{
IEnumerable<DataPoint> dataPoints = this.dataPointsStore.Get(
sourceAlias,
"RoleInstanceCount",
"RoleInstanceCount",
dateTimeFilter.GetEffectiveStartDate(),
dateTimeFilter.GetEffectiveEndDate());
Series series = chart.Series.Add(sourceName);
series.ChartType = SeriesChartType.StepLine;
series.ToolTip = "TimeStamp = #VALX{d} \n Number of instances = #VALY{d}";
series.ChartArea = ChartArea;
series.BorderWidth = 5;
series.Color = this.GetRoleColor(sourceAlias);
foreach (DataPoint dp in dataPoints)
{
series.Points.AddXY(dp.DataTimestamp.DateTime.ToLocalTime(), dp.Value);
}
if (!dataPoints.Any())
{
series.Name += " (No matching datapoints found)";
}
AddEmptyStartEndPoints(series, dateTimeFilter);
this.RememberMaximum(chart, dataPoints.MaxOrZero(m => m.Value));
}
In this example, the sourceName string is the name of the role, and Tailspin uses a DateTimeFilter object to specify the range of data points to retrieve. The Get method is provided by the AzureStorageDataPointStore class in the Autoscaling Application Block.
The following code sample from the GraphController class in the Tailspin.Web.Management.Areas.Monitoring.Controllers namespace shows how the management website retrieves the maximum and minimum permitted instance count values from the log messages that the block writes, so that it can plot them on a chart.
private void AddMinMaxSeries(Chart chart, DateTimeFilter dateTimeFilter, string sourceName, string sourceAlias)
{
IEnumerable<WADLogsTableEntity> minMaxLogMessages =
this.logDataStore.Get(
dateTimeFilter.GetEffectiveStartDate(),
dateTimeFilter.GetEffectiveEndDate(),
Constants.Scaling.Events.RequestForConfigurationChange);
List<MinMaxInstanceCountDataPoint> minMaxLogMessagesForRole =
minMaxLogMessages.SelectMany(
l => this.CreateMinMaxModels(l, sourceAlias)).ToList();
Series minSeries = chart.Series.Add(string.Empty);
minSeries.ChartArea = ChartArea;
minSeries.ChartType = SeriesChartType.StackedArea;
minSeries.IsVisibleInLegend = false;
minSeries.Color = Color.Transparent;
foreach (MinMaxInstanceCountDataPoint minMaxLogMessage in
minMaxLogMessagesForRole)
{
minSeries.Points.AddXY(minMaxLogMessage.EventDateTime.ToLocalTime(),
minMaxLogMessage.MinInstanceCount);
}
Series maxSeries = chart.Series.Add("Minimum and Maximum instance count");
maxSeries.ChartArea = ChartArea;
maxSeries.ChartType = SeriesChartType.StackedArea;
maxSeries.Color = Color.FromArgb(98, 0, 73, 255); // Transparent blue
foreach (MinMaxInstanceCountDataPoint minMaxLogMessage in
minMaxLogMessagesForRole)
{
var index = maxSeries.Points.AddXY(
minMaxLogMessage.EventDateTime.ToLocalTime(),
minMaxLogMessage.MaxInstanceCount - minMaxLogMessage.MinInstanceCount);
maxSeries.Points[index].ToolTip = string.Format(
"Min Instance Count = {0}\nMax Instance Count = {1}",
minMaxLogMessage.MinInstanceCount, minMaxLogMessage.MaxInstanceCount);
}
if (!minMaxLogMessagesForRole.Any())
{
minSeries.Name += " (No matching datapoints found)";
maxSeries.Name += " (No matching datapoints found)";
}
this.RememberMaximum(chart, minMaxLogMessagesForRole.MaxOrZero(
m => m.MaxInstanceCount));
}
In this example, the sourceName string is again the name of the role, and Tailspin again uses a DateTimeFilter object, this time to specify the range of log messages to retrieve. Tailspin implemented this Get method in the AzureStorageWadLogDataStore class in the Tailspin.Web.Management.Areas.Monitoring.Models namespace.
Poe Says: | |
---|---|
If you regularly purge old log data from the Azure diagnostics log tables, this will limit how far back you can view this data on the chart. |
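The AddMinMaxSeries method draws the permitted range as a band by stacking two area series: a transparent series whose height is the minimum instance count, and a visible series whose height is the difference between the maximum and the minimum. The following sketch isolates that arithmetic. The class and method names are hypothetical and do not appear in the Surveys solution.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class MinMaxBandSketch
{
    // Height of the transparent base series: the band starts at the minimum.
    public static int BaseHeight(int minInstanceCount)
    {
        return minInstanceCount;
    }

    // Height of the visible series that is stacked on top of the base series.
    public static int BandHeight(int minInstanceCount, int maxInstanceCount)
    {
        if (maxInstanceCount < minInstanceCount)
        {
            throw new ArgumentOutOfRangeException(
                "maxInstanceCount", "The maximum must not be less than the minimum.");
        }

        return maxInstanceCount - minInstanceCount;
    }

    // Upper edge of the band as it appears on the chart.
    public static int TopEdge(int minInstanceCount, int maxInstanceCount)
    {
        return BaseHeight(minInstanceCount)
            + BandHeight(minInstanceCount, maxInstanceCount);
    }
}
```

For a minimum of 2 and a maximum of 6, the base height is 2 and the band height is 4, so the visible area spans 2 to 6 on the chart. Plotting the maximum itself in the stacked series would instead place the upper edge at 8.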
Tailspin used the ASP.NET charting controls to plot the charts and to enable clickable behavior. Users can click on the charts to discover more detail about the data behind the points.
Markus Says: | |
---|---|
If you want to learn more about the way that the Tailspin Surveys application renders the charts, take a look at the classes in the Tailspin.Web.Management.Areas.Monitoring namespace. |
Implementing a Custom Action
This section describes how Tailspin implemented a custom action that it can use alongside existing scaling actions to notify operators by an SMS message when important scaling operations are taking place. The Autoscaling Application Block provides an extension point for creating custom actions. Tailspin must also ensure that its rule editing UI can load and save the custom action definitions to the rules store.
Markus Says: | |
---|---|
For Tailspin, adding a custom action requires two sets of related changes. The first is to ensure that the Autoscaling Application Block knows about the custom action; the second is to ensure that the rule editing UI knows about the custom action. |
Integrating a Custom Action with the Autoscaling Application Block
Actions are a part of reactive autoscaling rules that the application block reads from its rules store. Tailspin uses the default blob XML rules store, so Tailspin must provide a way for the application block to deserialize its custom action from the XML document.
The following snippet shows how Tailspin might add a custom action in the rules store.
Note
Tailspin is currently not using this custom action.
<reactiveRules>
<rule ...>
...
<actions>
<smsAction xmlns="http://Tailspin/SendSMS"
phoneNumber="+8888" message="Alert, reactive rule..."/>
</actions>
</rule>
</reactiveRules>
Markus Says: | |
---|---|
If Tailspin operators edited rules in an XML editor, Tailspin could add validation and IntelliSense® behavior to the editor if it created an XML schema for the http://Tailspin/SendSMS namespace. |
Tailspin first created the class shown in the following code sample to perform the deserialization of the custom action from the XML rules store. Notice how the attributes are used to identify the XML element, attributes, and namespace.
[XmlRoot(ElementName = "smsAction", Namespace = "http://Tailspin/SendSMS")]
public class SendSmsActionElement : ReactiveRuleActionElement
{
[XmlAttribute("phoneNumber")]
public string PhoneNumber { get; set; }
[XmlAttribute("message")]
public string Message { get; set; }
public override ReactiveRuleAction CreateAction()
{
return new SendSmsAction
{
Message = this.Message,
PhoneNumber = this.PhoneNumber
};
}
}
The CreateAction method returns a SendSmsAction instance that performs the custom action. The following code snippet shows the SendSmsAction class, which extends the ReactiveRuleAction class.
public class SendSmsAction : ReactiveRuleAction
{
public SendSmsAction()
{
}
public string PhoneNumber { get; set; }
public string Message { get; set; }
public override IEnumerable<RuleEvaluationResult> GetResults(
ReactiveRule forRule, IRuleEvaluationContext context)
{
return new[]
{
new SendSmsActionResult(forRule)
{
Message = this.Message, PhoneNumber = this.PhoneNumber
}
};
}
}
The rules evaluator in the block calls the GetResults method of all the actions for the current rule, and then calls the Execute method on each RuleEvaluationResult object that is returned. The following code snippet shows the SendSmsActionResult class (the ExecuteActionResult class extends the RuleEvaluationResult class).
public class SendSmsActionResult : ExecuteActionResult
{
private readonly ISmsSender smsSender;
public SendSmsActionResult(Rule sourceRule)
: base(sourceRule)
{
this.smsSender =
EnterpriseLibraryContainer.Current.GetInstance<ISmsSender>();
}
public string PhoneNumber { get; set; }
public string Message { get; set; }
public override string Description
{
get
{
return string.Format("Sends an SMS to number: '{0}' with message: '{1}'",
this.PhoneNumber, this.Message);
}
}
public override void Execute(IRuleEvaluationContext context)
{
this.smsSender.Send(this.PhoneNumber, this.Message);
}
}
The block uses the Description property when it logs the sending of the SMS message.
Ed Says: | |
---|---|
If you throw an exception in the Execute method, it must be of type ActionExecutionException. |
Finally, Tailspin used the Enterprise Library configuration tool to tell the Autoscaling Application Block about the custom action.
<autoscalingConfiguration ... >
...
<rulesStores>
<add name="Blob Rules Store" type=... >
<extensionAssemblies>
<add name="Tailspin.Shared" />
</extensionAssemblies>
</add>
</rulesStores>
...
</autoscalingConfiguration>
The extensionAssemblies element adds the name of the assembly that contains the classes that define the custom action.
Integrating a Custom Action with the Tailspin Surveys Rule Editor
The Tailspin Surveys rule editor allows administrators to edit the autoscaling rules for the Surveys application in a web UI. This editor can read and save rule definitions to the rules store that the Autoscaling Application Block uses. The block treats the rules store as a read-only store, but the block includes a RuleSetSerializer class, which provides support for saving rule definitions to the store.
Tailspin configures the RuleSetSerializer instance that the rule editor uses with the details of the custom action and operand. The following code snippet from the RuleSetModelStore class shows how the two extensions (the custom action and operand) are loaded.
Markus Says: | |
---|---|
The example loads the assembly containing the custom ActiveSurveysOperandElement class. This assembly also contains the custom SendSmsActionElement class. The extension is loaded explicitly in code, because the management website is not hosting the Autoscaling Application Block and so cannot use the configuration setting to load it. |
private RuleSetSerializer serializer;
public RuleSetModelStore(
RuleSetModelToXmlElementConverter ruleSetModelToXmlElementConverter,
[Dependency("RuleSetModel")] IConfigurationFileAccess fileAccess,
RetryManager retryManager)
{
...
this.CreateSerializer();
...
}
private void CreateSerializer()
{
var allExtensions = new IRuleSerializationExtension[]
{
new AssemblyRuleSerializationExtension(
typeof(ActiveSurveysOperandElement).Assembly.FullName)
};
this.serializer = new RuleSetSerializer(
allExtensions.SelectMany(e => e.CustomActionDefinitions),
allExtensions.SelectMany(e => e.CustomParameterDefinitions));
}
After the extensions are added to the serializer, the rules editor can load and save rules that include the custom actions and operands that Tailspin has created.
You do not need to load the assembly that contains your extensions programmatically in the project that hosts the Autoscaling Application Block because the application block already contains code that will load extension assemblies based on the entries in the autoscalingConfiguration section of the configuration file. In the Tailspin Surveys solution, the Tailspin.Workers.Autoscaling worker role hosts the Autoscaling Application Block and thus loads the extensions automatically; however, the Tailspin.Web.Management web role (which does not host the Autoscaling Application Block) must load the extensions programmatically.
Implementing Custom Operands
The process for creating a custom operand is very similar to the process for creating a custom action. Tailspin implemented two custom operands that enable the rules to use the number of active surveys and the current number of tenants as metrics in a reactive rule.
Ed Says: | |
---|---|
For a custom action, you must extend the ReactiveRuleAction and ExecuteActionResult classes; for a custom operand, you must provide an implementation of the IDataPointsCollector interface. |
The Autoscaling Application Block provides an extension point for creating custom operands. Tailspin must also ensure that its rule editing UI can load and save the custom operands to the rules store.
Integrating a Custom Operand with the Autoscaling Application Block
Operands are a part of reactive autoscaling rules that the application block reads from its rules store. Tailspin uses the default blob XML rules store, so Tailspin must provide a way for the application block to deserialize its custom operand from the XML document.
The following snippet shows an example of Tailspin's activeSurveysOperand custom operand in the rules store.
<operands>
...
<activeSurveysOperand alias="Tailspin_ActiveSurveyCount_Avg_10m"
timespan="00:10:00" aggregate="Average" minNumberOfAnswers="0"
xmlns="http://Tailspin/ActiveSurveys" />
...
</operands>
Markus Says: | |
---|---|
If Tailspin operators edited rules in an XML editor, Tailspin could add validation and IntelliSense behavior to the editor if it created XML schemas for the http://Tailspin/ActiveSurveys and http://Tailspin/TenantCount namespaces. |
Tailspin first created the class shown in the following code sample to perform the deserialization of the activeSurveysOperand custom operand from the XML rules store. Notice how the attributes are used to identify the XML element, attributes, and namespace.
[XmlRoot(ElementName = "activeSurveysOperand",
    Namespace = "http://Tailspin/ActiveSurveys")]
public class ActiveSurveysOperandElement : DataPointsParameterElement
{
    // Optional attribute used to filter the surveys that are counted.
    [XmlAttribute("minNumberOfAnswers")]
    public int MinNumberOfAnswers { get; set; }

    protected override string DataPointName
    {
        get
        {
            return this.DataPointType;
        }
    }

    protected override string DataPointType
    {
        get
        {
            return "Number of Active Surveys";
        }
    }

    protected override string SourceName
    {
        get
        {
            return "Tailspin";
        }
    }

    // Returns a factory that creates the collector for this operand.
    protected override Func<IServiceInformationStore,
        IEnumerable<IDataPointsCollector>> GetCollectorsFactory()
    {
        var samplingRate =
            ActiveSurveysDataPointsCollector.DefaultPerformanceCounterSamplingRate;

        return (sis) =>
            new[]
            {
                new ActiveSurveysDataPointsCollector(
                    EnterpriseLibraryContainer.Current.GetInstance<ISurveyStore>(),
                    EnterpriseLibraryContainer.Current
                        .GetInstance<ISurveyAnswersSummaryStore>(),
                    samplingRate,
                    this.MinNumberOfAnswers,
                    this.SourceName,
                    this.DataPointType,
                    this.DataPointName)
            };
    }
}
The MinNumberOfAnswers property defines an optional attribute that Tailspin uses to filter the list of surveys that it is counting. For example, if Tailspin sets the minNumberOfAnswers attribute of the operand to 5000, then the activeSurveysOperand will only count surveys that currently have at least 5000 answers collected.
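To illustrate how the attribute might be used, the following fragment sketches a reactive rule that adds an instance when the 10-minute average number of active surveys with at least 5000 answers exceeds a threshold. It assumes the block's default XML rules schema. The rule name, rank, threshold, operand alias, and the SurveyWorkers target are hypothetical values chosen for illustration; only the activeSurveysOperand element, its attributes, and its namespace come from the Tailspin rules store shown earlier.

```xml
<rules xmlns="http://schemas.microsoft.com/practices/2011/entlib/autoscaling/rules">
  <reactiveRules>
    <!-- Hypothetical rule: the name, rank, threshold, and target are illustrative only. -->
    <rule name="ScaleOutOnManyBusySurveys" rank="10" enabled="true"
          description="Add an instance when many large surveys are active">
      <when>
        <greater operand="Tailspin_BusySurveyCount_Avg_10m" than="20" />
      </when>
      <actions>
        <scale target="SurveyWorkers" by="1" />
      </actions>
    </rule>
  </reactiveRules>
  <operands>
    <!-- Counts only surveys that have collected at least 5000 answers. -->
    <activeSurveysOperand alias="Tailspin_BusySurveyCount_Avg_10m"
        timespan="00:10:00" aggregate="Average" minNumberOfAnswers="5000"
        xmlns="http://Tailspin/ActiveSurveys" />
  </operands>
</rules>
```

Because the filter is an attribute of the operand rather than of the rule, two operands with different minNumberOfAnswers values can coexist in the same rules store under different aliases.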
The GetCollectorsFactory method instantiates an ActiveSurveysDataPointsCollector object that performs the custom data collection operation. The following code snippet shows the ActiveSurveysDataPointsCollector class, which implements the IDataPointsCollector interface. This class is responsible for collecting the data points. The Collect method uses the FilterSurveys method to retrieve only surveys that have at least the minimum number of answers specified by the minNumberOfAnswers attribute in the rules store.
public class ActiveSurveysDataPointsCollector : IDataPointsCollector
{
    private readonly ISurveyStore surveyStore;
    private readonly ISurveyAnswersSummaryStore surveyAnswersSummaryStore;
    private readonly TimeSpan samplingRate;
    private readonly int minimumNumberOfAnswers;
    private readonly string sourceName;
    private readonly string dataPointType;
    private readonly string dataPointName;

    public ActiveSurveysDataPointsCollector(ISurveyStore surveyStore,
        ISurveyAnswersSummaryStore surveyAnswersSummaryStore,
        TimeSpan samplingRate, int minNumberOfAnswers, string sourceName,
        string dataPointType, string dataPointName)
    {
        this.surveyStore = surveyStore;
        this.surveyAnswersSummaryStore = surveyAnswersSummaryStore;
        this.samplingRate = samplingRate;
        this.minimumNumberOfAnswers = minNumberOfAnswers;
        this.sourceName = sourceName;
        this.dataPointType = dataPointType;
        this.dataPointName = dataPointName;
    }

    public static TimeSpan DefaultPerformanceCounterSamplingRate
    {
        get { return TimeSpan.FromMinutes(2); }
    }

    public TimeSpan SamplingRate
    {
        get { return this.samplingRate; }
    }

    public string Key
    {
        get
        {
            return string.Format(CultureInfo.InvariantCulture,
                "{0}|{1}", this.minimumNumberOfAnswers, this.samplingRate);
        }
    }

    public IEnumerable<DataPoint> Collect(DateTimeOffset collectionTime)
    {
        IEnumerable<Survey> surveys;
        try
        {
            surveys = this.surveyStore.GetActiveSurveys(FilterSurveys).ToList();
        }
        catch (StorageClientException ex)
        {
            // Wrap storage failures in the exception type that the block expects.
            throw new DataPointsCollectionException(
                "Could not retrieve surveys", ex);
        }

        // Return a single data point that holds the number of matching surveys.
        return new[]
        {
            new DataPoint
            {
                CreationTime = collectionTime,
                Source = this.sourceName,
                Type = this.dataPointType,
                Name = this.dataPointName,
                Value = surveys.Count(),
                DataTimestamp = collectionTime
            }
        };
    }

    private bool FilterSurveys(string tenantname, string slugname)
    {
        // A minimum of zero means that every active survey is counted.
        if (this.minimumNumberOfAnswers == 0)
        {
            return true;
        }

        var answersSummary =
            this.surveyAnswersSummaryStore.GetSurveyAnswersSummary(
                tenantname, slugname);
        if (answersSummary == null)
        {
            return false;
        }

        // Count surveys that have at least the minimum number of answers.
        return answersSummary.TotalAnswers >= this.minimumNumberOfAnswers;
    }
}
Ed Says: | |
---|---|
Notice how the Collect method can throw an exception of type DataPointsCollectionException. Any exceptions thrown in this method must be of this type. |
Finally, Tailspin used the Enterprise Library configuration tool to tell the Autoscaling Application Block about the assembly that contains the custom operand. Because the custom operand and custom action are in the same assembly, there is only a single entry in the extensionAssemblies element.
<autoscalingConfiguration ... >
  ...
  <rulesStores>
    <add name="Blob Rules Store" type=... >
      <extensionAssemblies>
        <add name="Tailspin.Shared" />
      </extensionAssemblies>
    </add>
  </rulesStores>
  ...
</autoscalingConfiguration>
Integrating a Custom Operand with the Tailspin Surveys Rule Editor
Tailspin integrates the custom operand with the rule editor in exactly the same way as it integrates the custom action. Because the custom operand and custom action are in the same assembly, the CreateSerializer method in the RuleSetModelStore class only adds a single extension assembly.
Configuring Logging in Tailspin Surveys
The Autoscaling Application Block allows you to choose between logging implementations. Because the Autoscaling Application Block is hosted in an Azure worker role and Tailspin does not require any of the additional features offered by the Enterprise Library Logging Application Block, Tailspin uses the logging infrastructure defined in the System.Diagnostics namespace. The following snippet from the configuration for the Azure worker role that hosts the Autoscaling Application Block shows the logging configuration for Tailspin Surveys. The autoscalingConfiguration section selects the system diagnostics logging infrastructure for the Autoscaling Application Block, and the system.diagnostics section configures the logging sources for the log messages from the block.
<autoscalingConfiguration loggerName="Source Logger" ...>
  <loggers>
    <add name="Source Logger" type="Microsoft.Practices.EnterpriseLibrary
      .WindowsAzure.Autoscaling.Logging.SystemDiagnosticsLogger,
      Microsoft.Practices.EnterpriseLibrary.WindowsAzure.Autoscaling" />
  </loggers>
  ...
</autoscalingConfiguration>
...
<system.diagnostics>
  <sources>
    <source name="Autoscaling General" switchValue="All">
      <listeners>
        <add name="AzureDiag" />
        <remove name="Default" />
      </listeners>
    </source>
    <source name="Autoscaling Updates" switchValue="All">
      <listeners>
        <add name="AzureDiag" />
        <remove name="Default" />
      </listeners>
    </source>
  </sources>
  <sharedListeners>
    <add type="Microsoft.WindowsAzure.Diagnostics.DiagnosticMonitorTraceListener,
      Microsoft.WindowsAzure.Diagnostics, Version=1.0.0.0, Culture=neutral,
      PublicKeyToken=31bf3856ad364e35"
      name="AzureDiag"/>
  </sharedListeners>
  <trace>
    <listeners>
      <add type="Microsoft.WindowsAzure.Diagnostics.DiagnosticMonitorTraceListener,
        Microsoft.WindowsAzure.Diagnostics, Version=1.0.0.0, Culture=neutral,
        PublicKeyToken=31bf3856ad364e35"
        name="AzureDiagnostics">
        <filter type="" />
      </add>
    </listeners>
  </trace>
</system.diagnostics>
Note
The values of the type attributes are shown split over multiple lines for readability. In the actual configuration file, each type attribute value must not contain any line breaks.
Setup and Physical Deployment
This section discusses considerations you should take into account when deploying the Tailspin Surveys application.
Certificates and Tailspin Surveys Deployment
When you deploy the Tailspin Surveys application you must also deploy a number of certificates. This section describes the role of the certificates, where they are deployed, and how to obtain or generate suitable certificates. This section focuses on the certificates used directly by Tailspin Surveys and the Autoscaling Application Block. For more information about the certificates used by the simulated issuers that handle claims-based identity management, see the guide "Developing Applications for the Cloud" on MSDN.
When you deploy the Tailspin Surveys application, there are two certificates that you must deploy. One certificate enables Tailspin Surveys to use an HTTPS endpoint, and the other certificate is used by the Autoscaling Application Block to make Azure Service Management API calls to the Tailspin Surveys hosted service. The block uses these API calls to collect data from Tailspin Surveys and to make scaling requests.
Deploying a Service Certificate to Enable SSL
The Dependency Checker tool that you use to install the Tailspin Surveys solution on your local development machine includes a sample localhost certificate that you can use to enable HTTPS when you deploy the Surveys application to Azure. Both the Tailspin Surveys tenant website and management website use HTTPS endpoints. The following snippet from the service definition file (.csdef) for the Tailspin.Web role shows the certificate and endpoint definitions.
<WebRole name="Tailspin.Web" ...>
  ...
  <Certificates>
    <Certificate name="localhost_ssl" storeLocation="LocalMachine" storeName="My" />
  </Certificates>
  <Endpoints>
    <InputEndpoint name="HttpsIn" protocol="https" port="443"
        certificate="localhost_ssl" />
  </Endpoints>
  ...
</WebRole>
The service configuration file identifies the certificate to use by its thumbprint.
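As an illustration, the matching entry in the service configuration file (.cscfg) might look like the following fragment. The thumbprint shown is a placeholder, not the thumbprint of the sample certificate; the name attribute must match the certificate name declared in the service definition file.

```xml
<Role name="Tailspin.Web">
  <Certificates>
    <!-- Placeholder thumbprint: replace it with the thumbprint of the certificate you upload. -->
    <Certificate name="localhost_ssl"
                 thumbprint="0000000000000000000000000000000000000000"
                 thumbprintAlgorithm="sha1" />
  </Certificates>
</Role>
```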
Note
The localhost certificate included with the Tailspin Surveys solution is for demonstration purposes only and should not be used in a production environment.
You must upload the service certificate that you plan to use to secure your HTTPS endpoints to the certificate store in your Azure portal, and ensure that the thumbprint of the certificate that you upload matches the thumbprint in the service configuration file (.cscfg).
For more information about obtaining an SSL certificate, see "How to Obtain an SSL Certificate."
For more information about configuring HTTPS endpoints in Azure web roles, see "How to Configure an SSL Certificate on an HTTPS Endpoint."
Deploying the Management Certificate to Enable Scaling Operations
In the Tailspin Surveys application, the Autoscaling Application Block is hosted in a separate worker role from the main Surveys application. The Autoscaling Application Block uses the Azure Service Management API to perform scaling actions on the Tailspin Surveys roles, and this API is secured using a management certificate. This section describes how Tailspin created and deployed this management certificate.
Tailspin uses a standard X.509 v3 certificate with a key length of 2048 bits for the management certificate. To generate this self-signed certificate, Tailspin ran the following command in the Visual Studio command prompt window to create the certificate and install it in the local certificate store.
makecert -r -pe -n "CN= Tailspin Management Certificate" -b 05/10/2010 -e 12/22/2012 -ss my -sr localmachine -sky exchange -sp "Microsoft RSA SChannel Cryptographic Provider" -sy 12
Tailspin then uploaded the public key to the Management Certificates folder in the Azure subscription that hosts the Tailspin Surveys application, and the private key to the Service Certificates folder in the hosted service that hosts the Autoscaling Application Block. This enables the Autoscaling Application Block to secure the Azure Service Management API calls that it makes to the subscription that hosts the Tailspin Surveys application.
Poe Says: | |
---|---|
You can use the Certificates snap-in in the Microsoft Management Console (MMC) to export a file that contains the public key (.cer) and a file that contains the private key (.pfx). |
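The block finds the management certificate through its service information store. The following fragment is a minimal sketch of how a subscription entry might reference the certificate by thumbprint, assuming the block's default XML service information schema. The subscription name, subscription ID, thumbprint, DNS prefix, role alias, storage account alias, and connection string are all placeholders, and the store location assumes that the certificate is installed in the LocalMachine store of the worker role that hosts the block.

```xml
<serviceModel xmlns="http://schemas.microsoft.com/practices/2011/entlib/autoscaling/serviceModel">
  <subscriptions>
    <!-- Placeholder values throughout: substitute your own subscription details. -->
    <subscription name="TailspinSubscription"
                  subscriptionId="00000000-0000-0000-0000-000000000000"
                  certificateThumbprint="0000000000000000000000000000000000000000"
                  certificateStoreLocation="LocalMachine"
                  certificateStoreName="My">
      <services>
        <service dnsPrefix="tailspinsurveys" slot="Production">
          <roles>
            <role alias="Tailspin.Web" roleName="Tailspin.Web"
                  wadStorageAccountName="TailspinStorage" />
          </roles>
        </service>
      </services>
      <storageAccounts>
        <storageAccount alias="TailspinStorage"
            connectionString="DefaultEndpointsProtocol=https;AccountName=[account];AccountKey=[key]" />
      </storageAccounts>
    </subscription>
  </subscriptions>
</serviceModel>
```

The thumbprint in this entry must identify the same certificate whose public key was uploaded as a management certificate to the target subscription.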
For more information on management and service certificates in Azure, see "Managing Certificates in Azure."
Deploying Tailspin Surveys in Multiple Geographic Locations
The sample version of the Tailspin Surveys application is designed to deploy to a single data center where the Autoscaling Application Block can scale the application in and out by adding and removing role instances. This represents the first phase of Tailspin's plan to roll out autoscaling to all the locations where the Tailspin Surveys application is currently deployed; these locations are the North Central US Data Center, the West Europe Data Center, and the Southeast Asia Data Center. Tailspin wants to be able to manage the autoscaling behavior in all the data centers from a single, centralized management application.
Figure 2 shows the current architecture in the sample solution in which Tailspin uses the Autoscaling Application Block to manage the Surveys application in a single data center.
Figure 2
Tailspin Surveys deployed to a single data center
Although Tailspin could use the same architecture in the other data centers, this would mean that each data center has its own management website. Tailspin wants to use a single management website to gain a consolidated view of the complete autoscaling infrastructure of the Tailspin Surveys application.
Tailspin considered two alternative architectures for its autoscaling infrastructure. Figure 3 shows the first alternative where the Autoscaling Application Block and management web application are hosted in the US data center.
Figure 3
Option 1: deploying the Autoscaling Application Block centrally
Figure 4 shows the second alternative, in which the Autoscaling Application Block is deployed in each data center and the management web application is still deployed in the US data center.
Figure 4
Option 2: deploying the Autoscaling Application Block in each data center
Both of these alternatives achieve Tailspin's goal of managing the autoscaling infrastructure from a central management application, but there are a number of trade-offs to consider between the two alternatives. Some of these trade-offs are summarized below.
Data Transfer Costs
Although both alternatives will involve data transfers from the remote data centers to the US data center, in option 1, all of the performance counter metrics that the application block collects from the Azure diagnostics tables are transferred to the US data center and stored in the data points store. In option 2, all of the performance counter data is stored in a local data points store. However, any metric data that the management application uses for displaying charts and reports still has to be brought over the network.
Tailspin anticipates that the data transfer costs will be lower if it adopts option 2. Option 2 will also reduce the time taken to transfer data to the data points store and minimize the risk of transient network conditions affecting the autoscaling process.
Bharath Says: | |
---|---|
You can use Azure storage analytics to gain deeper insight into your data usage. For more information, see "Storage Analytics Overview." |
Role Instances
Both alternatives need only a single role instance for the management application. Tailspin does not anticipate heavy usage of this application, so it can use a small instance.
In option 1, there is a single instance of the worker role that hosts the Autoscaling Application Block running in the US data center. Tailspin estimates that it can use either a small or medium-sized role instance in this scenario.
In option 2, there is a single instance of the worker role that hosts the Autoscaling Application Block running in each data center. Tailspin estimates that it can use a small role instance for this worker role in each data center.
Beth Says: | |
---|---|
Tailspin plans to use only single instances of the autoscaling roles because it does not require the Azure SLA guarantees for these roles. |
Option 2 will use more role instances than option 1.
Configuration Differences
Option 1 stores all the service information and autoscaling rules in stores in the US data center. If Tailspin is to use different rules in each data center, it must adopt naming conventions that distinguish the roles in the different data centers, and the rules and operands that apply to those roles. With option 2, each data center has its own rules store and service information.
In both cases, it is possible to use different rules in each data center if the autoscaling requirements differ. With option 1, Tailspin must take more care to make the rules manageable by adopting suitable naming conventions.
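To make the naming concern concrete, the following fragment sketches one possible convention for option 1, in which every rule name and role alias carries a data center suffix. It assumes the block's default XML rules schema. All of the names and instance ranges are hypothetical; the convention that Tailspin would adopt is not specified here.

```xml
<rules xmlns="http://schemas.microsoft.com/practices/2011/entlib/autoscaling/rules">
  <constraintRules>
    <!-- Hypothetical names: the suffix identifies the data center that a rule applies to. -->
    <rule name="Default_WestEurope" rank="1" enabled="true"
          description="Default instance range for the West Europe deployment">
      <actions>
        <range target="Tailspin.Web_WestEurope" min="2" max="6" />
      </actions>
    </rule>
    <rule name="Default_SoutheastAsia" rank="1" enabled="true"
          description="Default instance range for the Southeast Asia deployment">
      <actions>
        <range target="Tailspin.Web_SoutheastAsia" min="2" max="4" />
      </actions>
    </rule>
  </constraintRules>
</rules>
```

With option 2, the same rules could use unsuffixed names because each rules store describes only one data center.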
Application Differences
The existing management web application will work unchanged with option 1 as it is designed to work with a single service information store and a single rules store. It would not be too difficult for Tailspin to enhance the management website to work with multiple information stores and rules stores as required by option 2.
The existing custom operands will not work with option 1 because they are not designed to work with multiple instances of the Tailspin Surveys application; there is currently no way to configure them to collect data from a specific instance of the Surveys application. The custom operands will work unchanged with option 2 because each instance of the Autoscaling Application Block manages a single instance of Tailspin Surveys.
Tailspin has decided to go ahead and implement option 2. In this model, each data center is self-contained with the Tailspin Surveys application and the Autoscaling Application Block. This makes it easier for Tailspin to manage the different autoscaling requirements of each application block, and minimizes the quantity of data that is moved between data centers. Tailspin will enhance the web-based autoscaling management application to support this scenario.
More Information
For more information about autoscaling and how the Autoscaling Application Block works, see Chapter 4, "Autoscaling and Microsoft Azure," in this guide.
For instructions about how to install the Tailspin Surveys application, see Appendix B, "Tailspin Surveys Installation Guide."
For more information about the certificates used by the simulated issuers that handle claims-based identity management, see the guide Developing Applications for the Cloud, 2nd Edition on MSDN:
https://msdn.microsoft.com/en-us/library/ff966499.aspx
For more information about the CloudStorageAccount.SetConfigurationSettingPublisher method, see CloudStorageAccount.SetConfigurationSettingPublisher Method on MSDN: https://msdn.microsoft.com/en-us/library/microsoft.windowsazure.cloudstorageaccount.setconfigurationsettingpublisher.aspx
For more information about obtaining an SSL certificate, see "How to Obtain an SSL Certificate" on MSDN:
https://go.microsoft.com/fwlink/?LinkID=234634
For more information about configuring HTTPS endpoints in Azure web roles, see "How to Configure an SSL Certificate on an HTTPS Endpoint" on MSDN:
https://go.microsoft.com/fwlink/?LinkID=234623
For more information on management and service certificates in Azure, see "Managing Certificates in Azure" on MSDN:
https://go.microsoft.com/fwlink/?LinkID=234616
You can use Azure storage analytics to gain deeper insight into your data usage. For more information, see "Storage Analytics Overview" on MSDN:
https://go.microsoft.com/fwlink/?LinkID=234635
Last built: June 7, 2012