Hello @siddharth bansal ,
I understand that you are seeing inconsistent performance when using Application Gateway with the WAF v2 tier: the application responds faster when you bypass the Application Gateway and slower when traffic goes through it.
WAF is expected to add some latency regardless of whether it is in Prevention or Detection mode, because the WAF inspects the traffic in both modes.
In Detection mode, the WAF doesn't block any request, but the traffic is still inspected by the WAF and is logged.
Refer: https://learn.microsoft.com/en-us/azure/web-application-firewall/ag/ag-overview#waf-modes
As long as the Application gateway has the WAF SKU enabled, disabling rules will not help improve performance.
I would request you to check your Application gateway metrics once, as those metrics can be used to determine whether the observed slowdown is due to the client network, Application Gateway performance, the backend network and backend server TCP stack saturation, backend application performance, or large file size.
Refer: https://learn.microsoft.com/en-us/azure/application-gateway/application-gateway-metrics
It's important to scale your Application Gateway according to your traffic, with a bit of a buffer, so that you're prepared for traffic surges or spikes and can minimize the impact they may have on your QoS.
Capacity Unit is the measure of capacity utilization for an Application Gateway across multiple parameters.
A single Capacity Unit consists of the following parameters:
- 2500 Persistent connections
- 2.22-Mbps throughput
- 1 Compute Unit
If any one of these parameters is exceeded, then one or more additional capacity units are necessary, even if the other two parameters don’t exceed this single capacity unit’s limits.
Each Application gateway instance guarantees a minimum of 10 capacity units in terms of processing capability.
Refer: https://learn.microsoft.com/en-us/azure/application-gateway/understanding-pricing#v2-skus
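To make the capacity unit arithmetic above concrete, here is a minimal sketch in Python. It assumes that the number of capacity units is driven by whichever of the three parameters needs the most, which is how I read the rule above. The function names are hypothetical, and the instance estimate simply divides by the guaranteed minimum of 10 capacity units per instance, so treat it as a rough, conservative figure rather than what autoscaling will actually do.

```python
import math

# Per-capacity-unit limits quoted in this answer.
CONNECTIONS_PER_CU = 2500        # persistent connections
THROUGHPUT_MBPS_PER_CU = 2.22    # Mbps
COMPUTE_UNITS_PER_CU = 1         # compute units
MIN_CU_PER_INSTANCE = 10         # guaranteed minimum per instance


def estimate_capacity_units(persistent_connections, throughput_mbps, compute_units):
    """Estimate capacity units (hypothetical helper).

    Assumption: the parameter that needs the most capacity units
    decides the total, even if the other two are well within limits.
    """
    return max(
        math.ceil(persistent_connections / CONNECTIONS_PER_CU),
        math.ceil(throughput_mbps / THROUGHPUT_MBPS_PER_CU),
        math.ceil(compute_units / COMPUTE_UNITS_PER_CU),
    )


def estimate_instances(capacity_units):
    """Rough instance count, assuming only the guaranteed 10 CU per instance."""
    return max(1, math.ceil(capacity_units / MIN_CU_PER_INSTANCE))


if __name__ == "__main__":
    # Example: few connections, but throughput dominates.
    cu = estimate_capacity_units(persistent_connections=1000,
                                 throughput_mbps=50,
                                 compute_units=3)
    print(cu)                      # 23
    print(estimate_instances(cu))  # 3
```

You can take the input values from the Current connections, Throughput, and Current compute units metrics for the time window in which you saw the slowdown.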
Also, take a look at the documents below, which share guidelines to help you set up your Application Gateway to handle any high traffic volume that may occur:
My suggestion is to check all the metrics listed here and validate the numbers using the examples below:
- If there’s a spike in the Backend first byte response time trend but the Backend connect time trend is stable, then the latency between the Application Gateway and the backend and the time taken to establish the connection are stable, and the spike is caused by an increase in the response time of the backend application.
- If the spike in Backend first byte response time comes with a corresponding spike in Backend connect time, then either the network between the Application Gateway and the backend server or the backend server TCP stack is saturated.
- If you notice a spike in Backend last byte response time but the Backend first byte response time is stable, then the spike is caused by a larger file being requested.
- Similarly, if the Application gateway total time has a spike but the Backend last byte response time is stable, then it can be a sign of either a performance bottleneck at the Application Gateway or a bottleneck in the network between the client and the Application Gateway.
- Additionally, if the client RTT also has a corresponding spike, then the degradation is caused by the network between the client and the Application Gateway.
Check all the metrics for a given time when you observed latency in your application and compare the data to find where the issue is.
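The comparison rules above can be written down as a small decision helper. This is only an illustrative sketch in Python: the `Spikes` class and the `diagnose` function are hypothetical names, and deciding whether a metric "spiked" in your time window is left to you (for example, by eyeballing the metric charts in the portal).

```python
from dataclasses import dataclass


@dataclass
class Spikes:
    """Whether each metric spiked in the time window being investigated."""
    client_rtt: bool = False
    total_time: bool = False          # Application gateway total time
    backend_last_byte: bool = False   # Backend last byte response time
    backend_first_byte: bool = False  # Backend first byte response time
    backend_connect: bool = False     # Backend connect time


def diagnose(s):
    """Return the likely causes suggested by the rules in this answer."""
    findings = []
    if s.backend_first_byte:
        if s.backend_connect:
            findings.append("network between Application Gateway and backend, "
                            "or backend TCP stack saturation")
        else:
            findings.append("backend application response time")
    if s.backend_last_byte and not s.backend_first_byte:
        findings.append("larger file being requested")
    if s.total_time and not s.backend_last_byte:
        if s.client_rtt:
            findings.append("network between client and Application Gateway")
        else:
            findings.append("Application Gateway bottleneck, "
                            "or network between client and Application Gateway")
    return findings


if __name__ == "__main__":
    # First byte time spiked while connect time stayed flat.
    print(diagnose(Spikes(backend_first_byte=True)))
    # ['backend application response time']
```

An empty list means none of the rules matched, in which case the slowdown is probably not visible in these five metrics for that window.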
Kindly let us know if the above helps or you need further assistance on this issue.
Please "Accept the answer" if the information helped you. This will help us and others in the community as well.