Application Gateway: intermittent 502 Bad Gateway errors

Ricardo Gabriel Nima Montalban 0 Reputation points
2023-08-09T19:54:00.8333333+00:00

Good day

We have an Azure Application Gateway (Standard v2), used as a load balancer, in front of 2 VMs that run a Java Spring Boot application as the backend pool. We use end-to-end TLS. When we run load tests with JMeter, between 3% and 16% of requests consistently fail with 502 Bad Gateway errors.

With the same backend behind a different architecture (external LB --> 2 Nginx VMs --> internal LB --> 2 VMs in the backend pool), the same load test produces zero errors in JMeter.

Both setups use the same custom probe, which reports a successful status.

Do you have any idea what we should review, or any other suggestions?

Best regards

Azure Application Gateway

1 answer

  1. ChaitanyaNaykodi-MSFT 27,476 Reputation points Microsoft Employee Moderator
    2023-08-10T01:28:24.5033333+00:00

    @Ricardo Gabriel Nima Montalban

    Welcome to the Microsoft Q&A forum.

    Based on my understanding of your question, you are getting intermittent 502 Bad Gateway errors when running load tests with JMeter against an Application Gateway Standard v2 (used as a load balancer) in front of 2 VMs that run a Java Spring Boot application as the backend pool. However, with the same backend behind a different architecture (external LB --> 2 Nginx VMs --> internal LB --> 2 VMs in the backend pool), you get zero errors in JMeter. You use the same custom probe, with a successful status, in both architectures.

    Regarding your test with Application Gateway as the load balancer: the Bad Gateway errors might be caused by a request time-out. When a user request is received, the application gateway applies the configured rules to the request and routes it to a backend pool instance. It then waits for a configurable interval for a response from that instance. By default, this interval is 20 seconds. In Application Gateway v2, if the gateway doesn't receive a response from the backend application within this interval, the request is retried against a second backend pool member. If the second request also fails, the user receives a 502 error.

    The solution in such cases is to increase the request time-out. Application Gateway lets you configure this value through the BackendHttpSetting, and you can run the PowerShell command below to increase it. According to the limits documentation, the maximum request time-out to a private backend is 24 hours. Try this suggestion and see whether it resolves the issue. If it doesn't, you can enable diagnostic logging for your application gateway to dig deeper.

    New-AzApplicationGatewayBackendHttpSetting -Name 'Setting01' -Port 80 -Protocol Http -CookieBasedAffinity Enabled -RequestTimeout 60
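
    The command above only builds a new setting object. To change the time-out on a gateway that already exists, something along the lines of the sketch below might work. The resource names ('MyAppGateway', 'MyResourceGroup', 'Setting01') are placeholders rather than values from the question, 60 seconds is just the example figure used above, and the sketch assumes that the object returned by Get-AzApplicationGatewayBackendHttpSetting is the same one held inside $gw, so that editing it also updates the gateway object.

    ```powershell
    # Placeholder names (assumptions); replace with your own resources.
    $gw = Get-AzApplicationGateway -Name 'MyAppGateway' -ResourceGroupName 'MyResourceGroup'

    # Fetch the existing backend HTTP setting so that its port, protocol,
    # and other properties stay as they are.
    $setting = Get-AzApplicationGatewayBackendHttpSetting -ApplicationGateway $gw -Name 'Setting01'

    # Raise the request time-out (in seconds) above the 20-second default.
    $setting.RequestTimeout = 60

    # Send the modified gateway configuration back to Azure.
    Set-AzApplicationGateway -ApplicationGateway $gw
    ```

    Editing the existing setting rather than creating a new one should leave an end-to-end TLS configuration untouched. After the update completes, rerunning the same JMeter test would show whether the 502 rate changes.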
    
    

    Regarding your test using Azure Load Balancer: the absence of errors may be because Azure Load Balancer is designed for high performance and ultra-low latency. Additionally, unlike Application Gateway, Azure Load Balancer doesn't close or originate flows, and its idle timeout is 4 minutes by default.

    Additional reference:

    https://learn.microsoft.com/en-us/azure/application-gateway/application-gateway-troubleshooting-502#request-time-out

    https://learn.microsoft.com/en-us/azure/architecture/guide/technology-choices/load-balancing-overview

    Hope this answers your query. Please let me know if you have any additional questions. Thank you!


    Please "Accept the answer" if the information helped you. This will help us and others in the community as well.

