Hi All,
Recently we have been experiencing a BSOD in tcpip.sys on Windows Server 2016. It affects a number of servers in different data centers around the world, all with the same stack trace. The crash frequency ranges from once a week to 5-6 times per day per server.
Most affected systems are VMware or Hyper-V guests running Windows Server 2016. All crashes are in tcpip.sys with identical stack traces:
nt!KeBugCheckEx
nt!KiBugCheckDispatch+0x69
nt!KiPageFault+0x428
tcpip!TcpDequeueTcbSend+0x6e5
tcpip!TcpTcbFastDatagram+0x2ca
tcpip!TcpTcbReceive+0x247
tcpip!TcpMatchReceive+0x1e4
tcpip!TcpPreValidatedReceive+0x363
tcpip!IppLoopbackIndicatePackets+0xa7
tcpip!IppLoopbackTransmit+0xd4
tcpip!IppLoopbackTransmitWorker+0x2e
nt!IopProcessWorkItem+0x80
nt!ExpWorkerThread+0x69f
nt!PspSystemThreadStartup+0x18a
nt!KiStartSystemThread+0x16
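In case it helps anyone comparing their own dumps against ours, here is a minimal Python sketch of how the call stacks can be checked for a match. It is only an illustration, not part of our analysis: the helper names (`stack_signature`, `first_frame_in`) are made up for this post, and it assumes each frame is a plain `module!Function+0xOffset` line as printed above. Offsets are dropped on purpose so that traces from slightly different tcpip.sys builds still compare equal.

```python
import re
from typing import Optional, Tuple

# Matches frames such as "tcpip!TcpTcbReceive+0x247" or "nt!KeBugCheckEx".
FRAME_RE = re.compile(
    r"^\s*([A-Za-z0-9_]+)!([A-Za-z0-9_:<>~]+)(?:\+0x[0-9a-fA-F]+)?\s*$"
)

def stack_signature(trace: str) -> Tuple[str, ...]:
    """Return the frames of a trace as module!Function, without offsets."""
    frames = []
    for line in trace.splitlines():
        match = FRAME_RE.match(line)
        if match:  # lines that are not frames are skipped
            frames.append(f"{match.group(1)}!{match.group(2)}")
    return tuple(frames)

def first_frame_in(signature: Tuple[str, ...], module: str) -> Optional[str]:
    """Return the topmost frame belonging to the given module, if any."""
    prefix = module + "!"
    for frame in signature:
        if frame.startswith(prefix):
            return frame
    return None

# Top of the stack from this post.
SAMPLE = """\
nt!KeBugCheckEx
nt!KiBugCheckDispatch+0x69
nt!KiPageFault+0x428
tcpip!TcpDequeueTcbSend+0x6e5
tcpip!TcpTcbFastDatagram+0x2ca
"""

if __name__ == "__main__":
    signature = stack_signature(SAMPLE)
    print(first_frame_in(signature, "tcpip"))
```

Two dumps can then be compared with `stack_signature(a) == stack_signature(b)`. Keeping the offsets in the signature would be a stricter check, but it would only match between identical driver builds.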
We managed to isolate and reliably reproduce the issue on a clean Windows Server 2016 installation in VMware Workstation running Axxon Next 4.5.2.
Axxon Next is purely userspace software that does not install any kernel drivers (although it makes heavy use of networking, including the loopback interface), so it should not be able to crash the OS. We have a large installation base of Axxon Next 4.5.2 around the world on a variety of Windows versions, but the issue seems to reproduce mostly on Windows Server 2016.
We have researched a number of similar reports of BSODs in tcpip.sys on the web and tried all the suggested solutions, but nothing has helped so far.
Any support or help investigating the issue is much appreciated. We are ready to provide remote access or share a VM snapshot in which the issue reproduces reliably.
A memory crash dump and our initial analysis are available in WindowsServer2016_clean_VMware.7z:
https://itvgroup-my.sharepoint.com/:f:/g/personal/oleg_malashenko_ru_axxonsoft_com/EklqcqRieStOl5NUxF-cNRsBEeMJc_78rRSO-3mL_j3pLQ?e=LfKJul