Service Metering Guidance
You may need to meter the use of applications or services in order to plan future requirements; to gain an understanding of how they are used; or to bill users, organization departments, or customers. This is a common requirement, particularly in large corporations and for independent software vendors and service providers.
Why is Metering Important?
Metering is the process of measuring and recording the usage of an entire application, individual parts of an application, or specific services and resources. For example, you may want to record the time a user or customer spends using an application or service, the number of queries against a database, the number of times a specific service is accessed, the processing time for requests, and more. You might also want to measure the amount of storage used by each user or customer, or the total size of data transfers.
It is also useful to meter specific scenarios or use cases, such as selecting a product and placing an order or performing a complex business operation. This requires end-to-end mapping of the operation so that all metered components of it can be combined to give an overall metric that provides useful business information.
Many cloud hosting environments, including Microsoft Azure, do not expose metering information other than the standard billing details accessible to the account owner. It may seem easy to use this billing information to gauge the usage of features, but the details are not broken down in a way that allows you to identify individual applications (or users).
If you need to implement metering for your applications and services, you must create custom mechanisms to achieve this. Typically the instrumentation you add to your applications can provide much of the base data you require. For example, you can use performance counters to measure the average and peak values for the number of times a specific operation is performed, the volume of data moved in or out of the application, or the average time a specific process takes to execute. See Instrumentation and Telemetry Guidance elsewhere in this guide for more information.
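As a rough illustration of the base data that instrumentation can supply, the following sketch accumulates the count, average, and peak for one metered quantity. The OperationMeter class and its member names are hypothetical and are not part of any platform API; a real application would more likely rely on performance counters or a telemetry library.

```python
from dataclasses import dataclass


@dataclass
class OperationMeter:
    """Accumulates count, total, and peak for one metered quantity (hypothetical helper)."""
    count: int = 0
    total: float = 0.0
    peak: float = 0.0

    def record(self, value: float) -> None:
        # One sample, for example the duration in seconds of a single request.
        self.count += 1
        self.total += value
        self.peak = max(self.peak, value)

    @property
    def average(self) -> float:
        # Avoid dividing by zero before any sample has been recorded.
        return self.total / self.count if self.count else 0.0


# Illustrative request durations in seconds (made-up values).
meter = OperationMeter()
for duration in (0.2, 0.4, 0.9):
    meter.record(duration)
```

Keeping only running totals, rather than every sample, limits the overhead that metering adds to the application.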
Scenarios for Metering
When designing metering systems you must consider not only why you want to implement metering, but also the scenario in which it will operate. The appropriate choices for the metering methods, and the items that are metered, differ based on factors such as business requirements, application type, and the customer or user base. The following sections include some examples.
Metering for Forward Planning
Metering can provide valuable information about the way that an application is used, and can identify trends that indicate future requirements such as storage and compute resources. This information is also useful for deciding which features of an application are the most popular, as well as identifying relationships between features and resource usage. For example, metering may indicate that only a very small percentage of users take advantage of one feature of an application, but another feature is very popular and the load at peak periods is affecting response times.
Metering can also provide trend data, such as the average rate of growth of storage used by the application and the cost of this storage per user. This may be useful in directing development effort towards improving storage methods, or moving to a different type of store that can provide additional capacity or reduce storage costs.
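The trend figures mentioned above can be derived from periodic storage samples. The sketch below assumes month-end totals in gigabytes and a flat price per gigabyte; the function names, sample values, and price are all illustrative assumptions, not real billing data.

```python
def average_growth_per_period(samples):
    """Mean change between consecutive storage samples (same unit as the samples)."""
    if len(samples) < 2:
        return 0.0
    # The sum of consecutive differences telescopes to last minus first.
    return (samples[-1] - samples[0]) / (len(samples) - 1)


def storage_cost_per_user(total_gb, price_per_gb, user_count):
    """Storage cost for the period divided evenly across users."""
    if user_count <= 0:
        raise ValueError("user_count must be positive")
    return total_gb * price_per_gb / user_count


# Illustrative month-end totals in GB (made-up values).
monthly_gb = [100.0, 120.0, 150.0, 190.0]
growth = average_growth_per_period(monthly_gb)
cost = storage_cost_per_user(monthly_gb[-1], price_per_gb=0.05, user_count=38)
```

With these sample values the average growth is 30 GB per month, and the storage cost works out to roughly 0.25 currency units per user.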
Metering for Internal Business Use
When planning for metering business use in a large organization, the primary requirement is to be able to identify each item at the required level of granularity. The data you log for each function can include the current user ID or name or a department name, depending on the purpose of the metering. For example, in an organization that needs to bill individual departments for the use of an application, the metering granularity need only be at the department level. However, if at some point you need to identify which user in a department is using specific features, the logs must also include the user ID or name.
Consider using a structured or semantic logging approach so that the data from the log entry can be easily extracted (see the Instrumentation and Telemetry Guidance for more information). It may also be possible to use data from the built-in infrastructure logs. For example, the IIS request log entries may contain a user ID in the query string.
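One possible shape for such structured entries is a JSON record per metered event, which can then be aggregated at whatever granularity is required. This is a minimal sketch: the field names (feature, department, user_id) and the helper functions are assumptions made for illustration, not a prescribed schema.

```python
import json
from collections import Counter
from datetime import datetime, timezone


def metering_event(feature, department, user_id=None):
    """Build one structured metering record as a JSON string."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "feature": feature,
        "department": department,
    }
    # The user ID is optional: department-level billing does not need it.
    if user_id is not None:
        record["user_id"] = user_id
    return json.dumps(record)


def usage_by_department(lines):
    """Count metering events per department from JSON log lines."""
    return Counter(json.loads(line)["department"] for line in lines)


log = [
    metering_event("report.export", "finance", user_id="u17"),
    metering_event("report.export", "finance"),
    metering_event("search", "sales", user_id="u42"),
]
totals = usage_by_department(log)
```

Because every entry carries named fields, the same log can later be regrouped by user or by feature without changing the instrumentation.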
Metering for Software as a Service (SaaS) Vendors
If the application is designed to serve different customers, such as a multi-tenant application, you may want to implement metering both for forward planning, such as partitioning the tenants and data, and in order to bill customers for the features and services they actually use—especially if the users consume expensive resources. For example, you may want to bill customers for resources such as processing time, storage, or bandwidth. However, it is important to understand that there is a difference between how the platform is metered and billed, and how a SaaS vendor typically bills a user.
Many vendors immediately think that they should directly pass on all the costs to customers. However, detailed usage can be difficult to measure in a multi-tenant solution that shares many resources. Customers are also likely to find such a billing model difficult to understand, which makes it hard for them to predict their costs. This approach may also fail to accurately match all vendor costs, such as development and maintenance costs, with income from customers. Instead, it is worth considering alternative approaches:
- Pay-per-use plans where customers are billed based on the resources they use, but with an overhead that covers the vendor’s development, maintenance, and other fixed and ongoing costs. Specific instrumentation must be included in the application to support metering for billing. This plan has the advantage that it relates the customers’ costs to their usage, but it may not be financially capable of supporting the vendor’s investments during the early lifetime of the application when there are few customers. It may also result in complex bills that are hard for customers to understand, making it difficult for them to predict their ongoing costs.
- Fixed fee plans where customers pay a regular amount that covers all the vendor’s fixed and ongoing costs. To make this more attractive to customers it may be possible to offer different levels of functionality or support, so as to maximize income from a range of small and large customers. One advantage of this plan is that it does not require specific instrumentation to support metering for billing purposes, but the application should still incorporate sufficient instrumentation for monitoring and debugging.
- Fixed fee plans with additional bolt-on features, where the customer is billed a fixed amount for using the application, and can opt for the availability of additional metered features such as extra storage or a higher limit on the number of requests per minute. This requires instrumentation to be included in the application that measures the usage of each bolt-on feature, and prevents the customer exceeding the preset limits.
- Combination plans where there is a fixed monthly fee with additional metered charges based on the usage of specific features, services, or resources in the application. This does require specific instrumentation to be included in the application to support metering for billing, but has the advantage that heavy use, especially of expensive resources, is accounted for and the vendor is protected against unexpected costs that might result. For example, a customer that exceeds a preset quota for storage could be charged extra for each additional gigabyte of data stored.
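The storage example in the combination plan above could be calculated along the following lines. This is only a sketch: the function name, the fee and price values, and the choice to round a partial gigabyte up to a whole one are all assumptions, and a real billing policy might differ.

```python
import math
from decimal import Decimal


def combination_plan_charge(fixed_fee, quota_gb, used_gb, price_per_extra_gb):
    """Fixed fee plus a charge for each whole or partial gigabyte over the quota."""
    # Assumption: a partial gigabyte over the quota is billed as a full one.
    extra_gb = max(0, math.ceil(used_gb - quota_gb))
    return fixed_fee + price_per_extra_gb * extra_gb


# Illustrative values: 100 GB quota, 112.4 GB used, so 13 extra GB are billed.
over_quota = combination_plan_charge(
    Decimal("50.00"), quota_gb=100, used_gb=112.4, price_per_extra_gb=Decimal("0.10")
)
within_quota = combination_plan_charge(
    Decimal("50.00"), quota_gb=100, used_gb=80.0, price_per_extra_gb=Decimal("0.10")
)
```

Monetary amounts use Decimal rather than binary floating point so that the totals do not pick up rounding artifacts.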
For more information about building multi-tenant applications, see the p&p guide Developing Multi-tenant Applications for the Cloud on MSDN. The chapter Hosting a Multi-Tenant Application on Microsoft Azure discusses billing and costing in a multi-tenant application. Additional information and sample code are available from Multi-Tenant Metering for Microsoft Azure on the ISV Developer blog, the associated Cloud Ninja Metering Block on CodePlex, and Meter and Autoscale Multi-Tenant Applications in Microsoft Azure on the Azure Insider blog.
Considerations for Metering
When planning to implement metering in your application, consider the following points:
- Why you want to include metering. It may be to plan future requirements; to gain an understanding of how applications are used; to enforce quotas; to bill customers or users; to understand your costs for specific operations; or to identify areas that could be optimized to increase profitability. These decisions should be driven by business requirements. A common challenge, especially for organizations unfamiliar with moving to or building in the cloud, is poorly defined business requirements. If it is not clear why you need to gather metering information, you are unlikely to collect the data you really need.
- The cost of collecting the metrics, and the balance between the value they provide and the impact on application operation. If metering code cannot be incorporated into existing components or roles, and you need to increase the number of instances of components or roles just to carry out metering, you might increase costs beyond the savings or income available from metering. For example, the cost of measuring storage transactions for small tenants may exceed the small fraction of the total operating costs that are incurred by the transactions. One possible solution is to use shared metering components for multiple applications, but you must be aware of any related security issues that this may introduce.
- The robustness of the metering system. If the metering system fails—even partially—or logs are lost, this may have a major impact on vendor profitability. One approach is to regularly checkpoint the logs and save intermediate totals elsewhere, perhaps using a background scheduled task. This is particularly useful where there are many small value transactions. Event log analysis utilities may be able to detect and even restart a failed metering system.
- Taking advantage of surrogate metrics, and metering on an end-to-end scenario or use case basis. For example, count the number of orders placed by a tenant instead of trying to measure transaction size, data storage size, and other intermediate operational factors. This simplifies the metering mechanism and reduces the load on the application while still providing a useful billing metric.
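Two of the points above, using a surrogate metric and checkpointing intermediate totals, can be combined in a small sketch. Here the surrogate metric is the number of orders per tenant, and the totals are saved to a local JSON file; the OrderMeter class, the file location, and the tenant names are hypothetical, and a production system would more likely checkpoint to durable shared storage from a scheduled background task.

```python
import json
import os
import tempfile
from collections import Counter


class OrderMeter:
    """Counts orders per tenant and checkpoints the totals to a JSON file."""

    def __init__(self, checkpoint_path):
        self.checkpoint_path = checkpoint_path
        self.totals = Counter()
        if os.path.exists(checkpoint_path):
            # Resume from the last saved totals after a restart.
            with open(checkpoint_path, encoding="utf-8") as f:
                self.totals.update(json.load(f))

    def record_order(self, tenant_id):
        # Surrogate metric: one order placed, regardless of its size.
        self.totals[tenant_id] += 1

    def checkpoint(self):
        # Write to a temporary file and then rename it, so that a crash
        # mid-write does not corrupt the previous checkpoint.
        directory = os.path.dirname(self.checkpoint_path) or "."
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(self.totals, f)
        os.replace(tmp_path, self.checkpoint_path)


path = os.path.join(tempfile.mkdtemp(), "orders.json")
meter = OrderMeter(path)
meter.record_order("tenant-a")
meter.record_order("tenant-a")
meter.record_order("tenant-b")
meter.checkpoint()
restored = OrderMeter(path)  # simulates the metering component restarting
```

Any orders recorded after the most recent checkpoint would still be lost in a failure, so the checkpoint interval determines the maximum amount of unbilled usage at risk.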
The following examples explore some of the ways that service metering may be relevant in different scenarios:
- Project document storage. Clients upload and store project documents for team collaboration. The application issues clients with a Shared Access Signature (SAS) token that they use to access the storage. To control costs, the application enforces a quota on the storage size by regularly checking the total amount used by each client. If it exceeds the quota, the application will no longer issue that client with a SAS that enables uploads. In this scenario the majority of the cost is storage, and so bandwidth usage and transaction counts are ignored. Important points are to consider how often to collect and save this metering data (daily, hourly, or when an upload occurs), and how long to keep each value. For operational purposes you need only the current size of each client’s storage, but for identifying trends you might want to keep details of the storage use by each client at regular intervals over a longer period.
- Processing and compute. Analysis of a complex model in an engineering application may take between one and 60 minutes to complete, depending on the complexity of the model and the data required. The code is instrumented to record the time it takes to complete a successful analysis of a model, identified by the client ID, and this is used to track the total processing time per client for either quota enforcement or for highly accurate billing purposes. However, a simpler approach, if there is less variation in the processing time for each model, might be to count the total number of analyses performed by each client. Billing could then be based on a standard charge for each analysis performed. The average time taken for each analysis can be calculated from the totals, compared across all clients, and used to determine the individual average cost. The result would be monitored over time to detect changes.
- Web application usage. The application could be instrumented to track the number of requests and the time online for each user. If the load on the application has a direct relationship to the number of concurrent users, these counts or time periods can be used to allocate charges to a tenant or a department. This information can be used for billing purposes, or to better understand how customers use the application. This information could also be used when determining how to rebalance and partition tenants in a multi-tenant application.
- Data transfer. IIS web logs contain an entry for each request/response, and this entry contains information such as the number of bytes transferred as well as the time taken for the request to be processed. The tenant or client ID will often be a part of these entries. It may be included in the query string, or it may be part of the host name if the tenant is in a subdomain or uses a custom domain. Instrumentation is not necessary to collect the information; instead, existing log analysis tools can perform the required analysis. However, this technique will probably not be able to provide information about the internal operations within the application unless they are identified in the request path.
- Storage. Database storage in a multi-tenant solution can represent a significant cost. If each tenant has a separate database it is relatively easy to meter costs, but a shared database approach requires the application to monitor and meter database actions for each tenant, or even for each client. Typically, a large proportion of the data access and storage volume is represented by only a subset of the tables, reducing the amount of instrumentation required and the consequent log size. Alternatively, the metering code can simply count the number of rows in relevant tables, and associate a value or weighting with each table based on this.
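The data transfer example above can be sketched as a small log analysis routine. It assumes that the web logs use the W3C extended format with a #Fields directive, that the sc-bytes, cs-bytes, and cs-host fields have been enabled for logging, and that the tenant is identified by the first label of the host name. The host names and values in the sample are made up.

```python
from collections import defaultdict


def bytes_by_tenant(log_lines):
    """Sum sc-bytes and cs-bytes per tenant, taking the tenant from the first host label."""
    fields = []
    totals = defaultdict(int)
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The directive names the columns of the entries that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        entry = dict(zip(fields, line.split()))
        tenant = entry["cs-host"].split(".")[0]
        totals[tenant] += int(entry["sc-bytes"]) + int(entry["cs-bytes"])
    return dict(totals)


sample = [
    "#Fields: date time cs-uri-stem sc-status sc-bytes cs-bytes time-taken cs-host",
    "2014-05-01 10:00:00 /orders 200 5120 300 47 contoso.example.com",
    "2014-05-01 10:00:02 /orders 200 2048 256 31 fabrikam.example.com",
    "2014-05-01 10:00:05 /reports 200 10240 280 120 contoso.example.com",
]
transfer = bytes_by_tenant(sample)
```

Because the routine reads only the existing request logs, it adds no load to the application itself, but it can see only what appears in each request entry.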
Related Patterns and Guidance
The following guidance may also be relevant to your scenario when implementing metering in your cloud-hosted applications:
- Instrumentation and Telemetry Guidance. Service metering is typically implemented by adding instrumentation to applications, and for large solutions by using telemetry to collect and communicate this information to analysis tools. The Instrumentation and Telemetry Guidance explores the process of gathering remote diagnostics information that is collected by instrumentation in applications.
- The chapter Hosting a Multi-Tenant Application on Microsoft Azure in the p&p guide Developing Multi-tenant Applications for the Cloud on MSDN.
- The article and sample code from Multi-Tenant Metering for Microsoft Azure on the ISV Developer blog.
- The Cloud Ninja Metering Block on CodePlex.
- The article Meter and Autoscale Multi-Tenant Applications in Microsoft Azure on the Azure Insider blog.