I'm encountering some server crashes and hoping to get some insights from the community.
Problem:
- Four servers have experienced crashes.
- The crashes seem related to the fastfat kernel module, typically used for FAT file systems.
- However, our servers use the NTFS file system.
Details:
- SRV04: Crashed twice with bug check "FAT_FILE_SYSTEM (23)". Since every other volume is NTFS, the fastfat module is presumably servicing the ESP (EFI System Partition).
- SRV02 and SRV03: Crashed with error "DPC_WATCHDOG_VIOLATION (133)" and further investigation revealed calls to the fastfat module.
- SRV01: Hit a different bug check, "RESOURCE_NOT_OWNED (e3)", which means a thread tried to release a resource it did not own.
Troubleshooting done:
- The servers have been patched, but the crashes persist.
Questions:
- Has anyone faced similar crashes with the fastfat module?
- Could a mounted ESP be causing the fastfat module to become active even though all other volumes are NTFS?
- Are there any recommendations for further troubleshooting or potential solutions?
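One way to approach the second question is to list every volume that a FAT driver would service on each server. Below is a minimal Python sketch, assuming the volume list has been exported with something like `Get-Volume | Select-Object DriveLetter, FileSystemLabel, FileSystem, Size | ConvertTo-Json`. The helper name `fat_volumes`, the set of file system names, and the sample data are illustrative assumptions, not output from the affected servers.

```python
import json

# File system names assumed to be serviced by fastfat.sys (exFAT uses a separate driver).
FAT_NAMES = {"FAT", "FAT12", "FAT16", "FAT32"}

def fat_volumes(json_text):
    """Return the volumes from the exported JSON whose FileSystem is a FAT variant."""
    data = json.loads(json_text)
    # ConvertTo-Json emits a single object, not a list, when only one volume exists.
    if isinstance(data, dict):
        data = [data]
    return [v for v in data if (v.get("FileSystem") or "").upper() in FAT_NAMES]

# Made-up sample for illustration only.
sample = '''[
  {"DriveLetter": "C", "FileSystemLabel": "OS", "FileSystem": "NTFS", "Size": 107374182400},
  {"DriveLetter": null, "FileSystemLabel": "SYSTEM", "FileSystem": "FAT32", "Size": 104857600}
]'''

for vol in fat_volumes(sample):
    print(vol["FileSystemLabel"], vol["FileSystem"])
```

If the only FAT volume reported is the ESP, that would support the theory that fastfat is active because of it. Any other FAT volume (for example a small SAN LUN or removable media) would be worth a closer look.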
Additional Information:
- OS: Windows Server 2022 Datacenter
- System Model: Lenovo ThinkSystem SR630 V2
- SAN: Storwize V3700
Details of BSOD
SRV01 experienced a Blue Screen of Death (BSOD) that appears to have been triggered inside a specific Windows kernel function, nt!ExpReleaseResourceSharedForThreadLite. The function was releasing a resource, and the bug check fired because the releasing thread was not an owner of that resource.
RESOURCE_NOT_OWNED (e3)
A thread tried to release a resource it did not own.
Arguments:
Arg1: ffffb387895e6bf8, Address of resource
Arg2: ffffb38704a41040, Address of thread
Arg3: 0000000000000000, Address of owner table if there is one
Arg4: 0000000000000002
28: kd> u nt!ExpReleaseResourceSharedForThreadLite+22552b
nt!ExpReleaseResourceSharedForThreadLite+0x22552b:
fffff803`668627eb cc int 3
fffff803`668627ec 488b9424b8000000 mov rdx,qword ptr [rsp+0B8h]
fffff803`668627f4 498bcf mov rcx,r15
fffff803`668627f7 e860e90f00 call nt!KiReleaseQueuedSpinLockInstrumented (fffff803`6696115c)
fffff803`668627fc 90 nop
fffff803`668627fd e98eadddff jmp nt!ExpReleaseResourceSharedForThreadLite+0x2d0 (fffff803`6663d590)
fffff803`66862802 80792001 cmp byte ptr [rcx+20h],1
fffff803`66862806 0f879dadddff ja nt!ExpReleaseResourceSharedForThreadLite+0x2e9 (fffff803`6663d5a9)
SRV02 and SRV03 experienced a Blue Screen of Death (BSOD) with bug check DPC_WATCHDOG_VIOLATION (133). This bug check indicates that the DPC watchdog fired, either because it detected a single long-running deferred procedure call (DPC) or because the system spent a prolonged time at an interrupt request level (IRQL) of DISPATCH_LEVEL or above.
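The first two bug check parameters tell these two cases apart. In the nt!KeBugCheckEx frame of the STACK_TEXT for this crash they are 0x1 and 0x1e00. Here is a minimal Python sketch of the decoding, based on my reading of Microsoft's documentation for bug check 0x133; the function name `describe_dpc_watchdog` is hypothetical.

```python
def describe_dpc_watchdog(arg1, arg2):
    """Summarise parameters 1 and 2 of DPC_WATCHDOG_VIOLATION (0x133)."""
    if arg1 == 0:
        # Single DPC/ISR case: parameter 2 is the DPC time count in ticks.
        return f"single DPC or ISR exceeded its time allotment (time count {arg2:#x} ticks)"
    if arg1 == 1:
        # Cumulative case: parameter 2 is the watchdog period in ticks.
        return f"cumulative time at DISPATCH_LEVEL or above exceeded the watchdog period ({arg2:#x} ticks)"
    return f"unrecognised parameter 1 value {arg1:#x}"

# Values taken from the KeBugCheckEx frame of the SRV02/SRV03 stack.
print(describe_dpc_watchdog(0x1, 0x1E00))
```

With parameter 1 equal to 1, this is the cumulative case, so the thread that was interrupted (here spinning in nt!KxWaitForLockChainValid) is not necessarily the only contributor to the time spent at high IRQL.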
TRAP_FRAME: ffffa60c46891c90 -- (.trap 0xffffa60c46891c90)
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed or incorrect.
rax=0000000000000000 rbx=0000000000000000 rcx=f534581101050000
rdx=00000000f8b4d000 rsi=0000000000000000 rdi=0000000000000000
rip=fffff8017d82e7b3 rsp=ffffa60c46891e20 rbp=0000000000000000
r8=0000000000000000 r9=ffffa60c46892030 r10=0000000000000000
r11=ffffa60c46891f88 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0 nv up ei pl nz na po nc
nt!KxWaitForLockChainValid+0x23:
fffff801`7d82e7b3 488b07 mov rax,qword ptr [rdi] ds:00000000`00000000=????????????????
Resetting default scope
19: kd> .trap 0xffffa60c46891c90
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed or incorrect.
rax=0000000000000000 rbx=0000000000000000 rcx=f534581101050000
rdx=00000000f8b4d000 rsi=0000000000000000 rdi=0000000000000000
rip=fffff8017d82e7b3 rsp=ffffa60c46891e20 rbp=0000000000000000
r8=0000000000000000 r9=ffffa60c46892030 r10=0000000000000000
r11=ffffa60c46891f88 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0 nv up ei pl nz na po nc
nt!KxWaitForLockChainValid+0x23:
fffff801`7d82e7b3 488b07 mov rax,qword ptr [rdi] ds:00000000`00000000=????????????????
STACK_TEXT:
ffff9581`dc7acda8 fffff801`7d886c01 : 00000000`00000133 00000000`00000001 00000000`00001e00 fffff801`7e30f328 : nt!KeBugCheckEx
ffff9581`dc7acdb0 fffff801`7d884ab4 : 000ca51c`df7f9699 00000000`000012b8 ffffab8e`5ce50000 fffff801`7d9aca02 : nt!KeAccumulateTicks+0x541
ffff9581`dc7ace20 fffff801`7d88471a : 00000000`0e1c09f4 ffff9581`dc7492b8 00000000`00000000 fffff801`7d9258ef : nt!KiUpdateRunTime+0x64
ffff9581`dc7aceb0 fffff801`7d8845a4 : ffffab8e`5b57ecc0 00000000`00000000 ffffab8e`5b57ecc0 00000000`00000000 : nt!KeClockInterruptNotify+0x10a
ffff9581`dc7acf40 fffff801`7d852ce0 : 00000000`00000000 ffff60b5`846bcfd5 00000000`00000000 00000000`00010032 : nt!HalpTimerClockIpiRoutine+0x14
ffff9581`dc7acf70 fffff801`7da222ea : ffffa60c`46891d10 ffffab8e`5b57ecc0 00000000`00000000 00000000`00000000 : nt!KiCallInterruptServiceRoutine+0xa0
ffff9581`dc7acfb0 fffff801`7da22b97 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiInterruptSubDispatchNoLockNoEtw+0xfa
ffffa60c`46891c90 fffff801`7d82e7b3 : ffffa60c`46891ef9 00000000`00010008 ffffa60c`46891ed0 fffff801`7dc87385 : nt!KiInterruptDispatchNoLockNoEtw+0x37
ffffa60c`46891e20 fffff801`7d83cf65 : 00000000`f8b4d3a3 ffffb389`cdbdbc08 ffffa60c`46891fb0 00000000`00000000 : nt!KxWaitForLockChainValid+0x23
ffffa60c`46891e50 fffff801`7d83d206 : 00000000`00000000 ffffb389`a2508040 00000000`00000000 00000000`00000000 : nt!ExpReleaseResourceExclusiveForThreadLite+0x535
ffffa60c`46891f30 fffff801`9a63b096 : ffffb389`bebc1b90 ffffb389`cdbdba10 00000000`00000000 00000000`00000000 : nt!ExReleaseResourceLite+0x146
ffffa60c`46891f90 fffff801`9a63a680 : ffffb389`cdbdba10 00000000`00000000 00000000`00000000 fffff801`7d848b00 : fastfat!FatCommonClose+0x466
ffffa60c`468920a0 fffff801`7d849025 : 00000000`00000000 ffffb389`a171fbe0 00000000`00000001 00000000`00000000 : fastfat!FatFsdClose+0x1b0
ffffa60c`46892140 fffff801`7dcbcc5f : ffffb389`d7d04440 ffffb389`bebc1b90 ffffb389`d7d04440 ffffb389`d7d04440 : nt!IofCallDriver+0x55
ffffa60c`46892180 fffff801`7dca7740 : ffffab8e`5b5f76c0 ffffb389`dddf3110 ffffb389`d7d04410 00000000`00000000 : nt!IopDeleteFile+0x14f
ffffa60c`46892200 fffff801`7d8360a7 : 00000000`00000000 00000000`00000000 ffffa60c`46892300 ffffb389`d7d04440 : nt!ObpRemoveObjectRoutine+0x80
ffffa60c`46892260 fffff801`7d960362 : 00000000`00000000 ffffb389`dddf3110 ffffb389`dddf3110 fffff801`00000000 : nt!ObfDereferenceObjectWithTag+0xc7
ffffa60c`468922a0 fffff801`7d8d7b41 : ffffb389`8f60a2b0 ffffb389`a2508040 ffff9581`dc44d380 fffff801`00000000 : nt!CcGetDeviceGuidAsync+0xb2
ffffa60c`46892320 fffff801`7d957925 : ffffb389`a2508040 00000000`00000001 ffffb389`a2508040 00000000`00000080 : nt!ExpWorkerThread+0x161
ffffa60c`46892530 fffff801`7da25198 : ffff9581`dc840180 ffffb389`a2508040 fffff801`7d9578d0 00000000`00000000 : nt!PspSystemThreadStartup+0x55
ffffa60c`46892580 00000000`00000000 : ffffa60c`46893000 ffffa60c`4688c000 00000000`00000000 00000000`00000000 : nt!KiStartSystemThread+0x28
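To compare several dumps quickly, the symbol at the end of each stack line can be extracted and the first frame outside the kernel reported. The following is a minimal Python sketch, assuming each line ends in `module!function+0xoffset` as in the stack above; the helper names `parse_frames` and `first_non_kernel_module` are hypothetical, and the sample lines are the tails of six frames from this stack.

```python
import re

# Matches a trailing "module!function" with an optional "+0xoffset".
FRAME_RE = re.compile(
    r"(?P<module>\w+)!(?P<function>\w+)(?:\+0x(?P<offset>[0-9a-fA-F]+))?\s*$"
)

def parse_frames(lines):
    """Return (module, function) for every line that ends in a symbol."""
    frames = []
    for line in lines:
        match = FRAME_RE.search(line)
        if match:
            frames.append((match.group("module"), match.group("function")))
    return frames

def first_non_kernel_module(frames, kernel="nt"):
    """Walk from the top of the stack and return the first frame not in `kernel`."""
    for module, function in frames:
        if module != kernel:
            return module, function
    return None

# Tails of six frames from the SRV02/SRV03 stack, top of stack first.
stack = [
    "00000000 : nt!KxWaitForLockChainValid+0x23",
    "00000000 : nt!ExpReleaseResourceExclusiveForThreadLite+0x535",
    "00000000 : nt!ExReleaseResourceLite+0x146",
    "7d848b00 : fastfat!FatCommonClose+0x466",
    "00000000 : fastfat!FatFsdClose+0x1b0",
    "d7d04440 : nt!IofCallDriver+0x55",
]

print(first_non_kernel_module(parse_frames(stack)))
```

For this stack the first non-kernel frame is fastfat!FatCommonClose, which is the caller that was releasing the resource when the watchdog fired.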
SRV04 experienced a Blue Screen of Death (BSOD) with bug check FAT_FILE_SYSTEM (23).
FAT_FILE_SYSTEM (23)
*If you see FatExceptionFilter on the stack then the 2nd and 3rd*
*parameters are the exception record and context record. Do a .cxr*
*on the 3rd parameter and then kb to obtain a more informative stack*
*trace.*
Arguments:
Arg1: 00000000001c0345
Arg2: 0000000000000000
Arg3: 0000000000000000
Arg4: 0000000000000000
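Per Microsoft's documentation for bug check 0x23, the high 16 bits of Arg1 identify the fastfat source file by an internal identifier and the low 16 bits give the source line where the bug check was raised. Here is a minimal Python sketch of that split; the function name `decode_fs_bugcheck_arg1` is hypothetical, and mapping the file identifier back to a file name is left out because it depends on the driver source.

```python
def decode_fs_bugcheck_arg1(arg1):
    """Split parameter 1 into (source file identifier, source line number)."""
    file_id = (arg1 >> 16) & 0xFFFF  # high 16 bits
    line = arg1 & 0xFFFF             # low 16 bits
    return file_id, line

# Arg1 from the SRV04 dump.
file_id, line = decode_fs_bugcheck_arg1(0x001C0345)
print(f"file id {file_id:#x}, line {line}")  # file id 0x1c, line 837
```

Arg2 and Arg3 are zero here, so there is no exception or context record to feed to .cxr. The file identifier and line number are the main lead for this crash.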
The stack trace below shows the sequence of function calls that led up to the crash.
STACK_TEXT:
ffffbe0a`8381a758 fffff805`5040a919 : 00000000`00000023 00000000`001c0345 00000000`00000000 00000000`00000000 : nt!KeBugCheckEx
ffffbe0a`8381a760 fffff805`503c5d46 : ffff9c8e`bbbf9a10 ffff9c8e`bbbf9a10 00000000`c0000101 00000000`00000000 : fastfat!FatDeleteVcb+0x241
ffffbe0a`8381a7a0 fffff805`50402e30 : 00000000`00000000 ffffbe0a`8381ab00 ffffbe0a`8381ab69 00000000`00000000 : fastfat!FatCheckForDismount+0xea
ffffbe0a`8381a7e0 fffff805`50402187 : 00000000`00000000 ffff9c8e`ab134800 ffff9482`00000001 00000000`00000000 : fastfat!FatMountVolume+0xc74
ffffbe0a`8381aa50 fffff805`504020d2 : ffff9482`521994d0 ffff9482`719bbb01 ffff9482`719bbb01 ffff9c8e`7a5bba01 : fastfat!FatCommonFileSystemControl+0x57
ffffbe0a`8381aa80 fffff805`35041185 : 00000000`00000000 ffff9482`521994d0 ffff9482`52199401 ffff9482`719bbbe0 : fastfat!FatFsdFileSystemControl+0xb2
ffffbe0a`8381aac0 fffff805`30c504c4 : ffff9c8e`793bc010 ffff9482`00000000 00000000`00000000 00000000`00000000 : nt!IofCallDriver+0x55
ffffbe0a`8381ab00 fffff805`30c489ed : ffff9482`313bcd40 ffff9482`521994d0 ffff9482`227dc820 ffff9c8e`beb40a70 : FLTMGR!FltpFsControlMountVolume+0x1f0
ffffbe0a`8381abd0 fffff805`35041185 : ffffbe0a`8381ad31 ffff9482`313bcd40 ffffbe0a`8381ad31 fffff805`35a51060 : FLTMGR!FltpFsControl+0x11d
ffffbe0a`8381ac30 fffff805`354fe387 : ffffbe0a`8381ad31 ffff9482`2345f050 ffff9482`313bcd40 00000000`00000000 : nt!IofCallDriver+0x55
ffffbe0a`8381ac70 fffff805`35040f45 : 00000000`00000000 ffff9c8e`904e6640 00000000`00000000 00000000`00000000 : nt!IopMountVolume+0x3af
ffffbe0a`8381ad90 fffff805`3547be12 : ffff9482`e9c37080 00000000`00000000 ffffbe0a`8381b0b0 00000000`00000f25 : nt!IopCheckVpbMounted+0x205
ffffbe0a`8381adf0 fffff805`35481a85 : ffff9482`2339e060 fffff805`3547b8e0 00000000`00000000 ffff9481`ddbfd7a0 : nt!IopParseDevice+0x532
ffffbe0a`8381afb0 fffff805`35480f21 : ffffa986`97408bf0 ffffbe0a`8381b1e0 00000000`00000040 ffff9481`ddbfdd20 : nt!ObpLookupObjectName+0x625
ffffbe0a`8381b150 fffff805`3551ca3f : ffff9481`00000000 00000000`00000001 ffff9c8e`d01e5af0 000000c5`7ecfef58 : nt!ObOpenObjectByNameEx+0x1f1
ffffbe0a`8381b280 fffff805`3551c619 : 000000c5`7ecfef18 000000c5`7ecfeaf8 000000c5`7ecfef58 000000c5`7ecfef20 : nt!IopCreateFile+0x40f
ffffbe0a`8381b320 fffff805`35231085 : 00000000`00000001 000000c5`7ecff220 00000000`00000000 000000c5`7ecff3e8 : nt!NtCreateFile+0x79
ffffbe0a`8381b3b0 00007ffe`a973ff14 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiSystemServiceCopyEnd+0x25
000000c5`7ecfee98 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0x00007ffe`a973ff14