We are intermittently getting "upstream prematurely closed connection" errors in Kong and nginx servers deployed in Azure

anil kumar 1 Reputation point
2021-07-08T12:12:10.437+00:00

We have a setup where requests flow as follows:
Azure VM (running Kong Kubernetes pod) -> Azure application load balancer -> Azure VM (running nginx Kubernetes pod) -> Azure VM (running uWSGI applications)

We are intermittently seeing "upstream prematurely closed connection" errors in our Kong and nginx logs, and we suspect that either the Azure application load balancer or the network between the Azure VMs is causing them.

In our uWSGI applications, we see the error below:
uwsgi_response_write_body_do() TIMEOUT !!! OSError: write error

This means that, for some reason, connections are being terminated before the complete response has been transferred from our uWSGI applications back through nginx to Kong.

We enabled logging on the Azure application load balancer, but we do not see any log entries for the connections that are being closed prematurely. We also do not see any pattern in when they are closed: some connections are closed within 2 seconds, others after 20 seconds. This is what makes us believe that there is a network issue between these VMs or an issue with the application load balancer itself.

We have a similar setup in AWS and do not see any issues there, but in AWS Kong talks directly to nginx (there is no load balancer in between).

Can you please help us troubleshoot this in Azure? Are there any logs we can go through to find out what is causing this?

Azure Application Gateway
An Azure service that provides a platform-managed, scalable, and highly available application delivery controller as a service.

1 answer

  1. TravisCragg-MSFT 5,681 Reputation points Microsoft Employee
    2021-07-09T21:41:09.357+00:00

    Application Gateway does have a default TCP timeout, but it is much longer (>120 seconds) than what you are encountering.

    Have you looked at this solution from the nginx end about disabling buffers?
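
    The linked solution is not reproduced here, so the following is only a rough sketch of what disabling response buffering could look like on the nginx side, assuming nginx reaches the application through `uwsgi_pass`. The location path, the socket address, and the timeout values are illustrative assumptions, not settings taken from this thread.

    ```nginx
    # Hypothetical location block; adjust the path and upstream to your setup.
    location / {
        include uwsgi_params;
        uwsgi_pass 127.0.0.1:3031;    # assumed uWSGI socket address

        # Pass the response to the client as it arrives instead of
        # buffering it in nginx first.
        uwsgi_buffering off;

        # Illustrative values only: allow more time between successive
        # reads from and writes to the uWSGI server (nginx default is 60s).
        uwsgi_read_timeout 120s;
        uwsgi_send_timeout 120s;
    }
    ```

    If nginx instead forwards over HTTP with `proxy_pass`, the corresponding directive would be `proxy_buffering off;`. Whether either change removes the error depends on where the connection is actually being closed, so it is worth testing on a single pod before rolling it out.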