
Clarification on Flex Consumption Billing: Minimum 1000ms rule for Sequential vs. Concurrent requests

yuriy@mquark.com 100 Reputation points
2025-11-26T20:33:57.28+00:00

Hello Azure Support,

I am planning a migration to the Azure Functions Flex Consumption plan and need a definitive clarification on how the 1,000 ms minimum execution charge applies to specific traffic patterns (Sequential vs. Concurrent).

My Configuration:

  • Plan: Flex Consumption (On-Demand mode, "Always Ready" set to 0).
  • Instance Memory: 2048 MB.
  • Function Duration: Each request takes exactly 200 ms to process.

I need to confirm which billing logic applies to the following two scenarios:


Scenario 1: Sequential / Non-Overlapping Requests

  • Pattern: Request A arrives. It finishes (200ms duration). The instance sits idle/warm for 2.5 seconds. Then Request B arrives.
  • Question: How is the bill calculated for these 2 requests?
  • Interpretation A (Per-Request Reset): The billing meter stops after Request A (200ms) and rounds it up to 1000ms. It starts fresh for Request B and rounds it up to 1000ms.
    • Total Billed: 2,000 ms.
  • Interpretation B (Session/Keep-Alive): Because the instance remained allocated/warm during the gap, the meter treats this as one continuous session.
    • Total Billed: 400 ms (or rounded once to 1000ms total).
  • Interpretation C (Continuous Billing): The meter runs continuously through the gap because the instance is allocated.
    • Total Billed: 200 ms + 2,500 ms (gap) + 200 ms = 2,900 ms.

Scenario 2: Concurrent / Overlapping Requests

  • Pattern: 5 requests arrive simultaneously. The function app is configured with high concurrency, so the single instance processes all 5 requests in parallel.
  • Wall-Clock Time: The instance stays "Active" (processing at least one request) for a total of 300 ms before all 5 requests are completed and it goes idle.
  • Question: Does the 1,000 ms minimum apply per request or per active instance block?
  • Interpretation D1 (Per Request - Expensive): The minimum applies to every unique execution context.
    • Calculation: 5 requests × 1,000 ms minimum = 5,000 ms billed.
  • Interpretation D2 (Per Instance - Optimized): The minimum applies to the wall-clock time the instance was active. Since the instance was active for 300 ms (which is < 1,000 ms), I am billed the 1-second minimum once.
    • Calculation: 1 active block × 1,000 ms minimum = 1,000 ms billed.

Could you please confirm which Interpretations are correct for Scenario 1 and Scenario 2?

Thank you.

Cost Management

A Microsoft offering that enables tracking of cloud usage and expenditures for Azure and other cloud providers.


Answer accepted by question author

SUNOJ KUMAR YELURU 18,256 Reputation points MVP Volunteer Moderator
2025-11-27T02:56:37.8133333+00:00

Hello @yuriy@mquark.com

Scenario 1: Sequential / Non-Overlapping Requests:

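Interpretation A (Per-Request Reset) is the correct billing logic for Scenario 1: each 200 ms execution is rounded up to the 1,000 ms minimum on its own, and the idle gap between the two requests is not billed. Total Billed: 2,000 ms.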

Scenario 2: Concurrent / Overlapping Requests:

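Interpretation D2 (Per Instance - Optimized) is the correct billing logic for Scenario 2: the minimum applies once to the 300 ms the instance was active, not to each of the 5 concurrent requests. Total Billed: 1,000 ms.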
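
To make the arithmetic behind both results easy to check, here is a minimal Python sketch of the billing model described above. It is an illustration, not Azure's actual metering code: the helper names (`active_blocks`, `billed_ms`, `billed_gb_seconds`) and the individual request timings are hypothetical, and it assumes that overlapping requests on one instance merge into a single active block, that each block is rounded up to the 1,000 ms minimum, and that usage is expressed in GB-seconds of instance memory.

```python
MIN_BILLABLE_MS = 1000  # minimum billable period cited in the question


def active_blocks(requests):
    """Merge (start_ms, end_ms) request intervals into blocks during which
    the instance is processing at least one request (assumed model)."""
    blocks = []
    for start, end in sorted(requests):
        if blocks and start <= blocks[-1][1]:
            # Overlaps the current block: extend it instead of opening a new one.
            blocks[-1][1] = max(blocks[-1][1], end)
        else:
            blocks.append([start, end])
    return [(s, e) for s, e in blocks]


def billed_ms(requests, minimum_ms=MIN_BILLABLE_MS):
    """Round each active block up to the minimum and sum the results."""
    return sum(max(end - start, minimum_ms) for start, end in active_blocks(requests))


def billed_gb_seconds(requests, instance_memory_mb=2048):
    """Convert billed milliseconds to GB-seconds for the given instance size."""
    return billed_ms(requests) / 1000 * instance_memory_mb / 1024


# Scenario 1: two 200 ms requests separated by a 2.5 s idle gap.
scenario_1 = [(0, 200), (2700, 2900)]

# Scenario 2: five requests arriving together; the last finishes at 300 ms.
# The individual end times are made up for illustration.
scenario_2 = [(0, 200), (0, 220), (0, 250), (0, 280), (0, 300)]

print(billed_ms(scenario_1), billed_gb_seconds(scenario_1))
print(billed_ms(scenario_2), billed_gb_seconds(scenario_2))
```

Under these assumptions Scenario 1 yields two separate blocks and Scenario 2 yields one. The sketch does not model the case where a second request arrives less than 1,000 ms after the previous block started, which neither scenario covers.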


If the Answer is helpful, please click Accept Answer and Up-Vote, so that it can help others in the community looking for help on similar topics.
