Clarification on Flex Consumption Billing: Minimum 1000ms rule for Sequential vs. Concurrent requests

Question

yuriy@mquark.com 100

Hello Azure Support,

I am planning a migration to the Azure Functions Flex Consumption plan and need a definitive clarification on how the 1,000 ms minimum execution charge applies to specific traffic patterns (Sequential vs. Concurrent).

My Configuration:

Plan: Flex Consumption (On-Demand mode, "Always Ready" set to 0).
Instance Memory: 2048 MB.
Function Duration: Each request takes exactly 200 ms to process.

I need to confirm which billing logic applies to the following two scenarios:

Scenario 1: Sequential / Non-Overlapping Requests

Pattern: Request A arrives. It finishes (200ms duration). The instance sits idle/warm for 2.5 seconds. Then Request B arrives.
Question: How is the bill calculated for these 2 requests?
Interpretation A (Per-Request Reset): The billing meter stops after Request A (200ms) and rounds it up to 1000ms. It starts fresh for Request B and rounds it up to 1000ms.
- Total Billed: 2,000 ms.
Interpretation B (Session/Keep-Alive): Because the instance remained allocated/warm during the gap, the meter treats this as one continuous session.
- Total Billed: 400 ms (or rounded once to 1000ms total).
Interpretation C (Continuous Billing): The meter runs continuously through the gap because the instance is allocated.
- Total Billed: 200 ms + 2,500 ms (gap) + 200 ms = 2,900 ms.
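
The three interpretations can be compared numerically. The following is only a sketch under the figures given in this post (200 ms per request, a 1,000 ms minimum, and the idle gap taken as the 2.5 seconds stated in the pattern). The function names are hypothetical and none of this is an official billing formula.

```python
MIN_BILLED_MS = 1_000  # minimum charge quoted in the question


def per_request_reset(durations_ms):
    """Interpretation A: each request is rounded up to the minimum on its own."""
    return sum(max(d, MIN_BILLED_MS) for d in durations_ms)


def single_session(durations_ms):
    """Interpretation B (rounded-once variant): sum first, apply the minimum once."""
    return max(sum(durations_ms), MIN_BILLED_MS)


def continuous(durations_ms, gaps_ms):
    """Interpretation C: the meter also runs through the idle gaps."""
    return sum(durations_ms) + sum(gaps_ms)


durations = [200, 200]  # Request A and Request B
gaps = [2_500]          # idle time between them, as stated in the pattern

print(per_request_reset(durations))  # 2000
print(single_session(durations))     # 1000
print(continuous(durations, gaps))   # 2900
```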

Scenario 2: Concurrent / Overlapping Requests

Pattern: 5 requests arrive simultaneously. The function app is configured with high concurrency, so the single instance processes all 5 requests in parallel.
Wall-Clock Time: The instance stays "Active" (processing at least one request) for a total of 300 ms before all 5 requests are completed and it goes idle.
Question: Does the 1,000 ms minimum apply per request or per active instance block?
Interpretation D1 (Per Request - Expensive): The minimum applies to every unique execution context.
- Calculation: 5 requests × 1,000 ms minimum = 5,000 ms billed.
Interpretation D2 (Per Instance - Optimized): The minimum applies to the wall-clock time the instance was active. Since the instance was active for 300 ms (which is < 1,000 ms), I am billed the 1,000 ms minimum once.
- Calculation: 1 active block × 1,000 ms minimum = 1,000 ms billed.
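
The difference between D1 and D2 comes down to whether the minimum is applied to each request or to the merged wall-clock interval during which the instance is busy. Below is a minimal sketch of both readings. The individual request end times are illustrative (five requests starting together, the slowest finishing at 300 ms), the helper names are hypothetical, and this models the interpretations in the question rather than any documented meter.

```python
MIN_BILLED_MS = 1_000  # minimum charge quoted in the question


def billed_per_request(intervals):
    """D1: apply the minimum to every request separately."""
    return sum(max(end - start, MIN_BILLED_MS) for start, end in intervals)


def billed_per_active_block(intervals):
    """D2: merge overlapping requests into active blocks, apply the minimum per block."""
    total = 0
    block_start = block_end = None
    for start, end in sorted(intervals):
        if block_end is None or start > block_end:
            # The previous block (if any) has ended; bill it and open a new one.
            if block_end is not None:
                total += max(block_end - block_start, MIN_BILLED_MS)
            block_start, block_end = start, end
        else:
            # Request overlaps the current block; extend the block if needed.
            block_end = max(block_end, end)
    if block_end is not None:
        total += max(block_end - block_start, MIN_BILLED_MS)
    return total


# (start_ms, end_ms) for five simultaneous requests; end times are made up.
requests = [(0, 200), (0, 220), (0, 250), (0, 280), (0, 300)]

print(billed_per_request(requests))       # 5000
print(billed_per_active_block(requests))  # 1000
```

Under this model, two non-overlapping requests such as `[(0, 200), (2700, 2900)]` form two separate active blocks, so `billed_per_active_block` returns 2000 for them.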

Could you please confirm which Interpretations are correct for Scenario 1 and Scenario 2?

Thank you.

Answer accepted by question author

Answer 1

SUNOJ KUMAR YELURU 18,256 MVP Volunteer Moderator

Hello @yuriy@mquark.com

Scenario 1: Sequential / Non-Overlapping Requests:

Interpretation A (Per-Request Reset) is the correct billing logic for Scenario 1. Each 200 ms request is rounded up to the 1,000 ms minimum separately. Total Billed: 2 × 1,000 ms = 2,000 ms

Scenario 2: Concurrent / Overlapping Requests:

Interpretation D2 (Per Instance - Optimized) is the correct billing logic for Scenario 2. The 300 ms active period is below the minimum, so the single active block is billed at 1,000 ms. Total Billed: 1,000 ms
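
To relate these billed durations to the 2048 MB instance size from the question, the duration can be multiplied by the instance memory. This is a rough sketch that assumes execution time is metered in GB-seconds (instance memory in GB × billed seconds) and that 2048 MB counts as 2 GB. Both are assumptions not confirmed in this thread, the function name is hypothetical, and no price is applied.

```python
def gb_seconds(billed_ms, memory_mb=2048):
    """Assumed metering: memory in GB multiplied by billed time in seconds."""
    return (memory_mb / 1024) * (billed_ms / 1000)


print(gb_seconds(2_000))  # Scenario 1 (2,000 ms billed): 4.0
print(gb_seconds(1_000))  # Scenario 2 (1,000 ms billed): 2.0
```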
