Pricing and billing for Azure SRE Agent

Learn how Azure SRE Agent billing works and what to expect on your Azure bill.

Billing has two components: always-on flow (fixed) and active flow (variable, token-based). Active flow measures the large language model (LLM) tokens that your agent consumes. Each token type is metered at a fixed Azure Agent Unit (AAU) rate based on your agent's configured model.

You can monitor consumption in the portal at Settings > Agent consumption.

How billing works

Azure SRE Agent charges are based on AAUs, a standardized measure of agentic processing that's used across all prebuilt Azure agents. Your monthly bill combines two types of charges: always-on flow and active flow.

Always-on flow (fixed cost)

When you create an agent, it's billed at a fixed rate for as long as it exists.

Component	Rate
Always-on flow	Four AAUs per agent hour

Always-on flow doesn't mean that the agent is actively processing work. It represents the baseline cost of keeping your agent provisioned and available. Always-on billing continues from agent creation until the agent is deleted.
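As a rough illustration of the fixed charge, the following Python sketch multiplies the four-AAU hourly rate by the number of hours an agent exists. The function name `always_on_aaus`, its `agents` parameter, and the 30-day month are assumptions made for this example. They aren't part of the product.

```python
# Rate taken from the always-on flow table: four AAUs per agent hour.
ALWAYS_ON_AAUS_PER_AGENT_HOUR = 4


def always_on_aaus(hours, agents=1):
    """Return the always-on AAUs for `agents` agents that each exist for `hours` hours."""
    if hours < 0 or agents < 0:
        raise ValueError("hours and agents must be non-negative")
    return ALWAYS_ON_AAUS_PER_AGENT_HOUR * hours * agents


# One agent that exists for an assumed 30-day month (720 hours).
print(always_on_aaus(24 * 30))  # prints 2880
```

Because the charge depends only on how long the agent exists, the result is the same whether or not the agent did any work during those hours.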

Active flow (variable cost)

Whenever your agent is doing work, it consumes active flow AAUs. Examples include a user asking a question interactively, an automation triggering a task, or an async operation running in the background. Any time that the agent is actively processing counts as active flow, regardless of how the work was initiated.

How tokens become AAUs

Every time that your agent does work, it consumes LLM tokens. The following table describes the four token types. Each type is metered separately at its own AAU rate.

Token type	What it measures
Input	Tokens sent to the model (prompts, tool results, and context).
Output	Tokens generated by the model (responses and reasoning).
Cache read	Tokens served from prompt cache (repeated context).
Cache write	Tokens written to prompt cache for future reuse.

Your total active flow AAUs for a task equal the sum of AAUs across all four token types.

AAU rates by model

The following table shows the number of AAUs consumed per 1 million tokens.

Model	Input	Output	Cache read	Cache write
Claude Opus 4.6	100 AAUs	500 AAUs	10 AAUs	125 AAUs
GPT 5.3 Codex	35 AAUs	280 AAUs	3.5 AAUs	—
GPT 5.2	35 AAUs	280 AAUs	3.5 AAUs	—

Rates are per 1 million tokens and are effective April 15, 2026. More models and providers might be added in the future. AAU rates are set by Azure and might be updated as new models are released.

Key details

Only processing time counts: The time that the agent spends waiting for your response isn't billed as active flow.
Active flow resets monthly: The consumption counter for your AAUs resets at the beginning of each calendar month.
Provider is set at agent level: The model provider (Anthropic, OpenAI, and others) is configured in your agent's settings. The corresponding model determines your AAU rates.

Active flow by task type

The number of tokens consumed, and therefore the number of AAUs billed, depends on the complexity of the task. More complex tasks require more LLM reasoning steps, tool calls, and data processing, which means more tokens.

The following table shows how token consumption translates to AAUs across common scenarios.

Scenario	Input tokens	Output tokens	Cache read	Cache write	Claude Opus 4.6 AAUs	GPT 5.3 Codex AAUs	Example
Quick question	~20K	~2K	~15K	~5K	~3.8	~1.6	"Show me recent alerts."
Incident investigation	~200K	~15K	~150K	~50K	~35.3	~13.7	Automated incident from Azure Monitor.
Full remediation	~500K	~40K	~400K	~100K	~86.5	~33.9	"Diagnose and fix the failing deployment."

How the math works

The following table works through the Claude Opus 4.6 figure for the quick question row in the preceding table.

Token type	Tokens	Rate per 1M	AAUs
Input	20K	100	2.0
Output	2K	500	1.0
Cache read	15K	10	0.15
Cache write	5K	125	0.625
Total			3.775 AAUs
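The same arithmetic can be sketched in Python. The rates and token counts are copied from the tables in this article. The names `RATES_PER_MILLION` and `active_flow_aaus` are hypothetical and exist only for this illustration.

```python
# AAUs per 1 million tokens for Claude Opus 4.6, copied from the rates table.
RATES_PER_MILLION = {
    "input": 100.0,
    "output": 500.0,
    "cache_read": 10.0,
    "cache_write": 125.0,
}


def active_flow_aaus(tokens, rates):
    """Return the AAUs for each token type, plus a 'total' entry with their sum."""
    breakdown = {
        kind: count * rates[kind] / 1_000_000 for kind, count in tokens.items()
    }
    breakdown["total"] = sum(breakdown.values())
    return breakdown


# Token counts from the quick question row of the scenario table.
quick_question = {
    "input": 20_000,
    "output": 2_000,
    "cache_read": 15_000,
    "cache_write": 5_000,
}

for kind, aaus in active_flow_aaus(quick_question, RATES_PER_MILLION).items():
    print(f"{kind:<12}{aaus:.3f}")
```

To estimate another scenario or model, swap in the token counts and the rate row that apply. The sketch assumes that every token type in `tokens` has a rate in `rates`.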

Tip

To keep active flow costs predictable, set a monthly AAU allocation limit in Settings > Agent consumption.

Monitor your costs

In the SRE Agent portal

Go to Settings > Agent consumption to view your usage:

Monthly AAU limit: Shows your combined always-on and active flow allocation.
Total active flow consumption: Shows a progress bar that compares your current usage to your limit.
Daily active flow consumption: Shows a bar chart of your AAU usage per day for the current month.
Token usage breakdown: Shows your total tokens consumed by category (input, output, cache read, and cache write) so that you can see exactly where your AAUs are going.

Set an active flow spending limit

Select Change AAU allocation to set a monthly active flow AAU limit (minimum 500, maximum 1,000,000 AAUs). This limit applies to active flow only. Always-on billing continues for as long as the agent exists.

When your agent reaches the active flow limit, it becomes unavailable for chat and actions until the next month. Always-on charges continue for the rest of the month.
You can increase or decrease the allocation at any time.
Increases take effect immediately. If you raise the limit above current consumption, chat and actions resume right away.
Decreases below current consumption take effect next month. Until then, the agent runs in always-on flow only.
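The limit rules above can be modeled with a small Python sketch. The `ActiveFlowBudget` class and its method names are hypothetical and don't correspond to any Azure API. The sketch assumes that the agent is available for chat and actions only while the month's consumption is below the limit.

```python
from dataclasses import dataclass

# Allowed range for the monthly active flow limit, as stated in this article.
MIN_LIMIT = 500
MAX_LIMIT = 1_000_000


@dataclass
class ActiveFlowBudget:
    """Hypothetical model of a monthly active flow AAU limit."""

    limit: int
    consumed: float = 0.0

    def __post_init__(self):
        self._validate(self.limit)

    @staticmethod
    def _validate(limit):
        if not MIN_LIMIT <= limit <= MAX_LIMIT:
            raise ValueError(f"limit must be between {MIN_LIMIT} and {MAX_LIMIT}")

    def set_limit(self, new_limit):
        """Change the allocation; availability follows from the new value."""
        self._validate(new_limit)
        self.limit = new_limit

    @property
    def available(self):
        """Chat and actions are available only while consumption is below the limit."""
        return self.consumed < self.limit

    def start_new_month(self):
        """The consumption counter resets at the beginning of each calendar month."""
        self.consumed = 0.0


budget = ActiveFlowBudget(limit=500, consumed=500)
print(budget.available)  # prints False: the limit is reached
budget.set_limit(1_000)
print(budget.available)  # prints True: the increase takes effect immediately
```

In this model, lowering the limit below current consumption leaves the agent unavailable until `start_new_month` resets the counter, which mirrors the always-on-only state described above.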

Billing impact by action

Action	Active flow	Always-on	To resume next month
Set budget limit (hit limit)	Stops	Still billed	Resets automatically at the start of the month.
Stop agent	Stops	Still billed	Manually select Start in Settings > Basics.
Delete agent	Stops	Stops	Create a new agent.

In Microsoft Cost Management

For detailed billing breakdowns across multiple agents and resources, use Microsoft Cost Management in the Azure portal.

Cost optimization tips

Strategy	Impact	How to do it
Add context to your agent.	Wastes fewer tokens.	Add skills, knowledge, and documents so that the agent stays grounded and concise. Persistent memory from past interactions improves efficiency over time.
Filter incidents with response plans.	Reduces unnecessary work.	Use response plans to filter Azure Monitor alerts by severity, service, or keyword. The agent investigates only incidents that match.
Batch work with scheduled tasks.	Makes fewer runs.	Schedule tasks to run daily or weekly instead of polling continuously.
Test in chat before automating.	Avoids wasted runs.	Try your prompt in chat or the playground first. A misconfigured automation runs repeatedly and wastes AAUs.
Stop idle agents.	Eliminates active flow.	Go to Settings > Basics and select Stop. The agent keeps its configuration but stops all active flow. Always-on cost continues until the agent is deleted.
Delete unused agents.	Eliminates all costs.	On the Azure SRE Agent webpage, open the agent and go to Settings > Basics > Delete agent. All billing stops immediately.

Frequently asked questions

How does the agent compute AAUs from tokens?

Every time that your agent performs work, it tracks the LLM tokens consumed across all four token types and meters them at the AAU rates for your configured model. You can see your AAU consumption in Settings > Agent consumption.

Does the provider I choose affect my costs?

Yes. The model provider (Anthropic, OpenAI, and others) is set at the agent level and determines which AAU rates apply. Different models have different rates. For current rates, see the AAU rates table.

Which model should I choose?

Claude Opus 4.6 has higher AAU rates but typically produces more thorough investigations with fewer reasoning steps. For complex incident investigations and root cause analysis, Opus often reaches a conclusion in fewer tool calls, which can offset the higher per-token rate.

GPT models are a good choice for simpler, high-volume tasks like scheduled compliance checks where cost efficiency matters more than depth. You can change your model provider at any time in Settings > Basics and compare results.
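To get a feel for how much more efficient a higher-rate model must be to offset its rates, the following Python sketch computes a break-even point from the full remediation row of the scenario table. The function name `break_even_token_fraction` is hypothetical. The sketch assumes that both models use the same mix of token types and that AAUs scale linearly with token volume.

```python
# Active flow AAUs for the "Full remediation" row of the scenario table.
OPUS_AAUS = 86.5       # Claude Opus 4.6
GPT_CODEX_AAUS = 33.9  # GPT 5.3 Codex


def break_even_token_fraction(higher_rate_aaus, lower_rate_aaus):
    """Return the share of the token volume at which the higher-rate model costs the same.

    Both arguments are the AAUs that each model consumes for the same token volume.
    """
    if higher_rate_aaus <= 0:
        raise ValueError("higher_rate_aaus must be positive")
    return lower_rate_aaus / higher_rate_aaus


fraction = break_even_token_fraction(OPUS_AAUS, GPT_CODEX_AAUS)
print(f"{fraction:.0%}")  # prints 39%
```

Under these assumptions, Claude Opus 4.6 would need to finish the task with roughly 39% of the tokens that GPT 5.3 Codex uses to consume the same number of AAUs. Real token mixes differ between models, so treat the figure as an estimate and compare actual usage in Settings > Agent consumption.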

Do I get charged when the agent is waiting for me to respond?

No. Only the time that the agent spends actively processing a task counts as active flow. If the agent asks for your approval and waits, the waiting time isn't billed.

What counts as active flow?

Any time that the agent is actively doing work counts as active flow, such as the following examples:

Interactive prompts: A user asking the agent a question in chat.
Automation: Scheduled tasks, incident response plans, or other automated triggers.
Async operations: Background investigations, report generation, or remediation tasks.

In all cases, the agent meters tokens consumed as AAUs.

What happens if I stop my agent?

A stopped agent can't monitor your resources or respond to prompts, but it still incurs the fixed always-on cost. No active flow AAUs are consumed while the agent is stopped. To stop your agent, go to Settings > Basics and select Stop. To resume, select Start from the same page. To stop all billing entirely, delete the agent.

Can one agent handle multiple workloads?

Yes. A single agent can monitor multiple resources within its configured scope. Consolidating workloads under one agent reduces always-on costs compared to deploying separate agents.

Is there a free tier?

No. Azure SRE Agent charges begin at agent creation. For current rates, see the Azure pricing calculator.

Is pricing the same in all regions?

For current pricing in your region, check the Azure pricing calculator.

Last updated on 2026-04-22