
What are the implications of disabling buffering in Application Gateway?

Alejo Villores 20 Reputation points
2026-03-09T18:05:22.7066667+00:00

Hi,

We are evaluating whether to disable response/request buffering in Azure Application Gateway, which sits in front of different resource types (AKS, Service Fabric and WebApps) and we want to understand the potential risks before making this change in production.

The documentation says the following:

"We strongly recommend that you test and evaluate the performance before rolling this out on the production gateways."

I would like to know whether disabling buffering has implications such as:

  • Does disabling buffering affect WAF inspection, SSL termination, or routing behavior?
  • Are there specific workload types (e.g., large file uploads, streaming, slow backends) where disabling buffering is particularly risky?

I'm looking forward to your response.

Thank you.

Alejo.

Azure Application Gateway

An Azure service that provides a platform-managed, scalable, and highly available application delivery controller as a service.


Answer accepted by question author
  1. Marcin Policht 85,990 Reputation points MVP Volunteer Moderator
    2026-03-09T18:19:43.4566667+00:00

    Disabling request or response buffering in Azure Application Gateway changes how the gateway handles the HTTP body between the client and the backend. With buffering enabled, the gateway reads the entire request or response body before forwarding it. With buffering disabled, the gateway streams the data between client and backend as it arrives. This mainly affects flow control, inspection behavior, and error handling.
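    As a concrete illustration, buffering on the v2 SKUs is controlled through the gateway's global configuration. A hedged sketch with Azure CLI, assuming a Standard_v2 gateway named appgw-demo in resource group rg-demo (both placeholders) and assuming the globalConfiguration.enableRequestBuffering / enableResponseBuffering property names as I recall them from the ARM schema:

```shell
# Placeholders: appgw-demo / rg-demo. Assumes a Standard_v2 SKU, since
# request buffering cannot be disabled on WAF-enabled gateways.
az network application-gateway update \
  --name appgw-demo \
  --resource-group rg-demo \
  --set globalConfiguration.enableRequestBuffering=false \
        globalConfiguration.enableResponseBuffering=false

# Inspect the resulting configuration to confirm the change took effect.
az network application-gateway show \
  --name appgw-demo \
  --resource-group rg-demo \
  --query globalConfiguration
```

    As the documentation you quoted advises, test this on a non-production gateway first and verify the property names against the current ARM reference for your API version.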

    As far as I recall, request buffering cannot be disabled when the Application Gateway is running the WAF_v2 SKU. The Web Application Firewall requires access to the full request body to evaluate its rule set, including rules that inspect POST payloads and other request content, so the platform enforces request buffering when WAF is enabled. In practice, this means request streaming is not supported on WAF-enabled gateways and the request body must be fully buffered before being sent to the backend. Effectively, disabling request buffering is only possible on the Standard_v2 SKU or on gateways where WAF is not enabled.

    Response buffering is independent of this behavior because WAF primarily inspects inbound requests rather than outbound responses. As a result, response buffering can still be disabled even when WAF is enabled.

    SSL termination is not affected by buffering settings. TLS negotiation, certificate handling, and decryption still occur at the gateway in the same way. The gateway continues to terminate TLS connections and forward decrypted HTTP traffic to the backend according to the configured HTTP settings. Buffering only changes how the HTTP payload is handled after decryption.

    Routing behavior is also unchanged. Listener selection, host and path-based routing, rewrite rules, and backend pool selection rely on request metadata such as the host header, URI path, and other headers. These elements are available immediately when the request arrives and do not require the body to be buffered, so disabling either buffering setting does not alter routing decisions.

    Large uploads are one scenario where request buffering would normally be a concern because buffering requires the gateway to receive the entire payload before sending it to the backend. This increases latency before the backend begins processing and increases temporary memory or disk usage on the gateway. However, because request buffering cannot be disabled when WAF is enabled, large upload workloads behind WAF-enabled gateways will still pass through the full buffering process.

    Streaming responses are a common case where disabling response buffering is beneficial. Examples include server-sent events, long-running report generation that streams incremental output, media streaming, or APIs designed to deliver partial results as they become available. When response buffering is enabled, the gateway waits for the entire backend response before returning it to the client, which prevents real-time streaming and increases perceived latency. Disabling response buffering allows the gateway to forward backend data to the client as soon as it is received.
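    A minimal Python sketch of why this matters (the chunked "backend" and the two proxies are hypothetical stand-ins, not Application Gateway internals): a buffering proxy delays time-to-first-byte until the whole body has been read, while a streaming proxy forwards each chunk as soon as it arrives.

```python
import time

def backend_chunks(n=3, delay=0.05):
    """Hypothetical slow backend that produces its response in pieces."""
    for i in range(n):
        time.sleep(delay)
        yield f"chunk-{i}".encode()

def proxy_buffered(source):
    """Buffering proxy: reads the entire body, then emits it at once."""
    body = b"".join(source)
    yield body

def proxy_streaming(source):
    """Streaming proxy: forwards each chunk as soon as it is received."""
    for chunk in source:
        yield chunk

def time_to_first_byte(response):
    """Measure how long the client waits before the first byte arrives."""
    start = time.monotonic()
    next(iter(response))
    return time.monotonic() - start

buffered_ttfb = time_to_first_byte(proxy_buffered(backend_chunks()))
streaming_ttfb = time_to_first_byte(proxy_streaming(backend_chunks()))

# The buffered proxy waits for the full body (~3 * delay) before the
# client sees anything; the streaming proxy delivers after ~1 * delay.
print(streaming_ttfb < buffered_ttfb)  # → True
```

    The same coupling explains the slow-backend and slow-client points below: without a buffer in the middle, each side observes the other's pace directly.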

    Slow backend services can behave differently when response buffering is disabled. With buffering enabled, the gateway can receive the full response quickly and then transmit it to the client independently. When response buffering is disabled, the client connection is tied more directly to the backend’s response rate. If the backend produces data slowly, the client will observe that delay directly and the connection will remain open for longer periods.

    Slow clients can also influence backend connection lifetimes when response buffering is disabled. With buffering enabled, the gateway can read the backend response quickly and deliver it to the client at the client’s pace. Without buffering, a slow client may cause the backend connection to remain open longer because the gateway forwards the response stream at the rate the client can receive it.

    Another operational consideration involves retries and failure handling. When a full request or response is buffered, the gateway may have more flexibility to retry certain backend failures or to return consistent error responses. When streaming is used, once a portion of the response has already been sent to the client, retrying the backend request or altering the response becomes much more limited.
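    The narrowing of retry options can be sketched in Python (a hypothetical forwarding loop, not the gateway's actual retry logic): a retry is only safe while nothing has reached the client, because once bytes are on the wire the stream can be neither restarted nor replaced with a clean error page.

```python
def forward_with_retry(backend_call, send_to_client, max_attempts=2):
    """Retry a backend request only while nothing has reached the client.

    backend_call() returns an iterable of response chunks and may raise
    ConnectionError at any point; send_to_client(chunk) writes to the client.
    Returns True if a complete response was delivered.
    """
    for _ in range(max_attempts):
        bytes_sent = 0
        try:
            for chunk in backend_call():
                send_to_client(chunk)
                bytes_sent += len(chunk)
            return True  # full response delivered
        except ConnectionError:
            if bytes_sent > 0:
                # Part of the response is already on the wire, so a
                # transparent retry or substitute error page is no
                # longer possible.
                return False
            # Nothing sent yet: safe to try the backend again.
    return False  # attempts exhausted before any bytes were sent

# Demo: a backend that fails before sending anything, then succeeds.
attempts = []
def flaky_backend():
    attempts.append(1)
    if len(attempts) == 1:
        raise ConnectionError("reset before first byte")
    yield b"hello"

delivered = []
print(forward_with_retry(flaky_backend, delivered.append))  # → True
```

    With full buffering, the gateway effectively stays in the "nothing sent yet" branch for the whole backend exchange, which is why buffered modes give it more room to retry or return a consistent error.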

    Resource usage patterns also change. Disabling response buffering can reduce memory pressure on the gateway because full responses are not stored before transmission. At the same time, connections may remain open longer, which can increase concurrent connection counts and affect throughput characteristics.

    For environments where Application Gateway fronts heterogeneous backends such as AKS services, Service Fabric applications, and Azure App Service workloads, the practical risks associated with disabling buffering mainly relate to response streaming behavior, tighter coupling between backend response speed and client delivery, and reduced flexibility in error handling once a response stream has begun.


    If the above response helps answer your question, remember to "Accept Answer" so that others in the community facing similar issues can easily find the solution. Your contribution is highly appreciated.

    hth

    Marcin

