Workload optimization

Article
10/30/2024

This article helps you understand the Workload optimization capability within the FinOps Framework and how to implement that in the Microsoft Cloud.

Definition

Workload optimization refers to the process of ensuring cloud services are utilized and tuned to maximize business value and minimize wasteful usage and spending.

Review how services get used and ensure each is maximizing return on investment. Evaluate and implement best practices and recommendations.

Every cost should have direct or indirect traceability back to business value. Eliminate fully "optimized" resources that aren't contributing to business value.

Review your resource usage patterns and determine if they can be scaled down or even shutdown (to stop billing) during off-peak hours. To reduce costs, consider cheaper alternatives. Avoid unnecessary usage and costs that don't contribute to the mission, which in turn increases return on investment and profitability.

Getting started

When you first start working with a service or managing costs in the cloud, prioritize using native tools within the portal to drive efficiency and optimize costs.

Review and implement Cloud Adoption Framework costing best practices.
Review and implement Azure Well-Architected Framework cost optimization guidance.
Review and implement Azure Advisor cost recommendations.
- Azure Advisor gives you high-confidence recommendations based on your usage. Azure Advisor is always the best place to start when looking to optimize any workload.
- Consider subscribing to Azure Advisor alerts to get notified when there are new cost recommendations.
Review your usage and purchase commitment discounts when it makes sense.
Take advantage of Azure Hybrid Benefit for Windows, Linux, and SQL Server.
Familiarize yourself with the services you use, how you're charged, and what service-specific cost optimization options you have.
- You can discover the services you use from the Azure portal All resources page or from the Services view in Cost analysis.
- To learn how each service charges you, explore the Azure pricing pages and Azure pricing calculator. Use them to identify options that might reduce costs. For example, shared infrastructure and commitment discounts.
- Review service documentation to learn about any cost-related features that could help you optimize your environment or improve cost visibility. Some examples:
  - Choose spot VMs for low priority, interruptible workloads.
  - Avoid cross-region data transfer.
Determine if services can be paused or stopped to stop incurring charges.
- Some services support autostop natively, like Microsoft Dev Box, Azure DevTest Labs, Azure Lab Services, and Azure Load Testing.
- If you use a service that supports being stopped, but not autostopping, consider using a lightweight flow in Power Automate or Logic Apps.
- If the service can't be stopped, review alternatives to determine if there are any options that can be stopped to stop billing.
- Pay close attention to noncompute charges that might continue to be billed when a resource is stopped so you're not surprised. Storage is a common example of a cost that continues to be charged even if a compute resource that was using the storage is no longer running.
Does the service support serverless compute?
- Serverless compute tiers can reduce costs when not active. Some examples: Azure SQL Database, Azure SignalR Service, Cosmos DB, Synapse Analytics, Azure Databricks.
Review service documentation to learn about any cost-related features that could help you optimize your environment or improve cost visibility. Some examples:
- Choose spot VMs for low priority, interruptible workloads.
- Avoid cross-region data transfer.
Determine if services support autoscaling.
- If the service supports autoscaling, configure it to scale based on your application's needs.
- Autoscaling can work with autostop behavior for maximum efficiency.
To avoid unnecessary costs, consider automatically stopping and manually starting nonproduction resources during work hours.
- Avoid automatically starting nonproduction resources that aren't used every day.
- If you choose to autostart, be aware of vacations and holidays where resources might get started automatically but not be used.
- Consider tagging manually stopped resources. To ensure all resources are stopped, Save a query in Azure Resource Graph or a view in the All resources list and pin it to the Azure portal dashboard.
Consider architectural models such as containers and serverless to only use resources when they're needed, and to drive maximum efficiency in key services.
Use the Cost optimization workbook to evaluate resource utilization, like idle and unused resources.

🏗️ Building on the basics

At this point, you implemented all the basic cost optimization recommendations and tuned applications to meet the most fundamental best practices. As you move beyond the basics, consider the following points:

Automate cost recommendations using Azure Resource Graph
Stay abreast of emerging technologies, tools, and industry best practices to further optimize resource utilization.
Automate the process of automatically scaling or stopping resources that don't support it or have more complex requirements.
- Consider using automation services, like Azure Automation or Azure Functions.
Assign an "Env" or Environment tag to identify which resources are for development, testing, staging, production, etc.
- Prefer assigning tags at a subscription or resource group level. Then enable the tag inheritance policy for Azure Policy and Cost Management tag inheritance to cover resources that don't emit tags with usage data.
- Consider setting up automated scripts to stop resources with specific up-time profiles (for example, stop developer VMs during off-peak hours if they weren't used in 2 hours).
- Document up-time expectations based on specific tag values and what happens when the tag isn't present.
- Use Azure Policy to track compliance with the tag policy.
- Use Azure Policy to enforce specific configuration rules based on environment.
- Consider using "override" tags to bypass the standard policy when needed. To ensure accountability, track the cost and report them to stakeholders.
Consider establishing and tracking KPIs for low-priority workloads, like development servers.
Consider deploying other tools to help you optimizing your environment, for example, the Azure Optimization Engine available on FinOps toolkit provided by Microsoft.

Learn more at the FinOps Foundation

This capability is a part of the FinOps Framework by the FinOps Foundation, a non-profit organization dedicated to advancing cloud cost management and optimization. For more information about FinOps, including useful playbooks, training and certification programs, and more, see the Workload optimization capability article in the FinOps Framework documentation.

You can also find related videos on the FinOps Foundation YouTube channel:

Related FinOps capabilities:

Share via

Workload optimization

Definition

Getting started

🏗️ Building on the basics

Learn more at the FinOps Foundation

Feedback

Additional resources

Share via

Workload optimization

Definition

Getting started

🏗️ Building on the basics

Learn more at the FinOps Foundation

Related content

Feedback

Additional resources