TCP connections stuck in “CLOSE_WAIT” state with an IIS application using Kerberos authentication on an AWS server (Windows Server 2016 Standard)

Séverin L'hommelet 6 Reputation points
2021-03-30T13:32:09.547+00:00

We are currently facing an issue when opening and closing sessions with an IIS application on an AWS server.

Setup:

  • IIS application using a COM+ component
  • AWS server m5.2xlarge (AMI = ami-0ba02848e01cae475)
  • Kerberos Windows authentication
  • Windows Server 2016 Standard
  • HTTPS connection
  • Ports used by the website: 1029 and 1036
  • Network Load Balancer (NLB) in front of 2 AWS EC2 servers
  • McAfee Antivirus
  • Maximum number of simultaneous users: 400

Issue: After the application has been in use for some hours (3 to 4 hours), some users see the web page load indefinitely when they try to reach the first login page through the application's HTTPS URL. The number of affected users increases over time.

Observations:

  • Some TCP connections from the client (random port above 50000) to the server (port 1029 or 1036) remain stuck in the “CLOSE_WAIT” state when the session is being closed.
  • Recycling the application pool closes the open TCP connections and fixes the problem for the users.
  • The number of TCP connections stuck in “CLOSE_WAIT” increases with the number of affected users.
  • The number of stuck TCP connections can reach 300 on one server. At that point, all users are affected.
  • RAM and CPU usage on the server remain stable while the issue occurs.
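
To track the “CLOSE_WAIT” count described above over time, something like the following could be used. This is only a minimal sketch, assuming Python is available on the server and that `netstat -ano -p tcp` prints its usual columns (protocol, local address, foreign address, state, PID). The names `WATCHED_PORTS`, `count_close_wait` and `sample_live` are hypothetical and not part of our setup.

```python
import subprocess
from collections import Counter

# Ports taken from the setup described above.
WATCHED_PORTS = {1029, 1036}


def count_close_wait(netstat_output, ports=WATCHED_PORTS):
    """Count CLOSE_WAIT rows per local port in `netstat -ano -p tcp` output."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed row shape: TCP  local_addr:port  remote_addr:port  STATE  PID
        if len(fields) < 5 or fields[0].upper() != "TCP":
            continue
        if fields[3].upper() != "CLOSE_WAIT":
            continue
        # rsplit keeps working for bracketed IPv6 literals such as [::1]:1029
        port_text = fields[1].rsplit(":", 1)[-1]
        if port_text.isdigit() and int(port_text) in ports:
            counts[int(port_text)] += 1
    return counts


def sample_live():
    """Run netstat on the local server and count CLOSE_WAIT rows (hypothetical helper)."""
    result = subprocess.run(
        ["netstat", "-ano", "-p", "tcp"],
        capture_output=True, text=True, check=True,
    )
    return count_close_wait(result.stdout)
```

Calling `sample_live()` at regular intervals (for example from a scheduled task) and logging the result would show whether the count on ports 1029 and 1036 grows steadily until the application pool is recycled.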

Note that this behaviour has never been observed on any other architecture (including AWS) where the same application is deployed.

Explored and discarded hypotheses:

  • Load balancer influence: we ran a test for a day without the load balancer (users accessed the application directly). The issue still occurred.
  • The parameter “Regular Time Interval (Minutes)”: set to “0”. Note that the server is restarted each night.
  • (HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters): not present in regedit.
  • Port 1029: control test for a day using only port 1036.
  • Number of simultaneous TCP connections allowed: default value (49152).
  • McAfee Antivirus: control test for a day with the antivirus disabled.

Thank you,

Internet Information Services

5 answers

  1. Rolling stones 1 Reputation point
    2021-06-18T13:20:48.157+00:00

    Hi,

    We are experiencing the same issue you describe. Any ideas?

  2. Séverin L'Hommelet 1 Reputation point
    2021-06-18T13:24:40.437+00:00

    Hi,

    Still not solved on our side.
    Are you on AWS? Are you using the ASP compatibility mode?


  3. Rolling stones 1 Reputation point
    2021-06-18T13:32:52.05+00:00

    Our customers are losing patience: 2 or 3 times a day IIS hangs and they have to restart it.

  4. Séverin L'Hommelet 1 Reputation point
    2021-06-18T13:47:04.277+00:00

    What is your authentication mode?

  5. Rolling stones 1 Reputation point
    2021-06-18T14:03:46.5+00:00

    We are using Kerberos.
