Hello to anyone reading.
I've been having crashing issues with my computer for months now, and I've tried everything I can find. I don't know what to do anymore. To describe the crash: both of my monitors cut to black and my GPU fans spin up to max speed. I have to either wait upwards of 5 minutes for the PC to power cycle on its own, or manually flip the PSU switch, because all other buttons become unresponsive.
This happens randomly and unpredictably, but ONLY when I'm playing games. After the first crash, it happens over and over again within minutes of trying to play the game again after restarting the computer.
My friend and I are pretty certain at this point that my 3080 is the root of the problem. I've read a lot of articles and forum posts from people having similar problems, and almost every single one of them is tied to a 3080. I've also managed to reproduce the crash in my little brother's computer after I put my 3080 in there, even though he has different specs, so I don't think it's a hardware conflict. Every solution I've tried has either not worked at all or only fixed the problem temporarily. I recently sent my 3080 back to the manufacturer for the SECOND TIME, and that fixed the problem for a solid month and a half, but it recently started happening again for some unknown reason. One of the employees working on my card told me about something called "dirty power" that the 30 series cards are especially sensitive to. I don't think that's the problem, because if it were, I'd expect the crashes to have continued as soon as I got my card back from RMA, and they didn't. I haven't replaced any parts or done anything else that would trigger the issue again, and the only new things I've installed are games and driver updates.
I recently noticed that almost every time the computer crashes, there are a lot of errors in the Event Viewer that reference a file called "nvlddmkm.sys", which I believe is an NVIDIA file related to my graphics card. There is also almost always a "Bug Check" event, which someone told me is a BSOD that I'm just not seeing because my monitors are black. I figured out how to open the MEMORY.DMP file it creates, and I'm going to post it at the bottom of this post in the hope that someone can give me advice that permanently solves the problem before I just replace the GPU. I'm honestly so fed up with the issue at this point, after trying over and over again to solve it to no avail, that I'm ready to just switch to AMD or something, but if I can avoid spending hundreds of dollars to solve this problem permanently, I would really like that. Any help is greatly appreciated.
My specs:
Gigabyte RTX 3080
Intel i7 10700K
32GB RAM
Corsair RM850e 850W power supply
Gigabyte B560 motherboard
And I'm on Windows 10
I have tried:
Rolling back drivers
Installing new drivers
Updating Windows
Ensuring power cords are all seated properly
Replacing the PSU
Testing new RAM
RMA'ing my graphics card TWICE
Resetting and reinstalling Windows
Updating BIOS
Going into the System32 folder and giving the nvlddmkm.sys file full access permissions
Probably a few other random things I forgot about
Full BSOD dump file here:
Microsoft (R) Windows Debugger Version 10.0.22621.2428 AMD64
Copyright (c) Microsoft Corporation. All rights reserved.
Loading Dump File [C:\Windows\MEMORY.DMP]
Kernel Bitmap Dump File: Kernel address space is available, User address space may not be available.
Symbol search path is: srv*
Executable search path is:
Windows 10 Kernel Version 19041 MP (16 procs) Free x64
Product: WinNt, suite: TerminalServer SingleUserTS Personal
Edition build lab: 19041.1.amd64fre.vb_release.191206-1406
Machine Name:
Kernel base = 0xfffff8036ee00000 PsLoadedModuleList = 0xfffff8036fa2a790
Debug session time: Wed Feb 7 00:28:22.489 2024 (UTC - 6:00)
System Uptime: 0 days 1:20:13.141
Loading Kernel Symbols
...............................................................
.....Page 8905c8 not present in the dump file. Type ".hh dbgerr004" for details
...........................................................
................................................................
...............
Loading User Symbols
PEB is paged out (Peb.Ldr = 000000d2`0ac6a018). Type ".hh dbgerr001" for details
Loading unloaded module list
.........
For analysis of this file, run !analyze -v
0: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
DPC_WATCHDOG_VIOLATION (133)
The DPC watchdog detected a prolonged run time at an IRQL of DISPATCH_LEVEL
or above.
Arguments:
Arg1: 0000000000000001, The system cumulatively spent an extended period of time at
DISPATCH_LEVEL or above. The offending component can usually be
identified with a stack trace.
Arg2: 0000000000001e00, The watchdog period.
Arg3: fffff8036fafb320, cast to nt!DPC_WATCHDOG_GLOBAL_TRIAGE_BLOCK, which contains
additional information regarding the cumulative timeout
Arg4: 0000000000000000
Debugging Details:
Unable to load image \SystemRoot\System32\DriverStore\FileRepository\nv_dispi.inf_amd64_2fe7c165c5dd3267\nvlddmkm.sys, Win32 error 0n2
Page 893fd3 not present in the dump file. Type ".hh dbgerr004" for details
*************************************************************************
*** ***
*** ***
*** Either you specified an unqualified symbol, or your debugger ***
*** doesn't have full symbol information. Unqualified symbol ***
*** resolution is turned off by default. Please either specify a ***
*** fully qualified symbol module!symbolname, or enable resolution ***
*** of unqualified symbols by typing ".symopt- 100". Note that ***
*** enabling unqualified symbol resolution with network symbol ***
*** server shares in the symbol path may cause the debugger to ***
*** appear to hang for long periods of time when an incorrect ***
*** symbol name is typed or the network symbol server is down. ***
*** ***
*** For some commands to work properly, your symbol path ***
*** must point to .pdb files that have full type information. ***
*** ***
*** Certain .pdb files (such as the public OS symbols) do not ***
*** contain the required information. Contact the group that ***
*** provided you with these symbols if you need this command to ***
*** work. ***
*** ***
*** Type referenced: TickPeriods ***
*** ***
*************************************************************************
KEY_VALUES_STRING: 1
Key : Analysis.CPU.mSec
Value: 1889
Key : Analysis.DebugAnalysisManager
Value: Create
Key : Analysis.Elapsed.mSec
Value: 3474
Key : Analysis.Init.CPU.mSec
Value: 2702
Key : Analysis.Init.Elapsed.mSec
Value: 33096
Key : Analysis.Memory.CommitPeak.Mb
Value: 98
Key : WER.OS.Branch
Value: vb_release
Key : WER.OS.Timestamp
Value: 2019-12-06T14:06:00Z
Key : WER.OS.Version
Value: 10.0.19041.1
FILE_IN_CAB: MEMORY.DMP
BUGCHECK_CODE: 133
BUGCHECK_P1: 1
BUGCHECK_P2: 1e00
BUGCHECK_P3: fffff8036fafb320
BUGCHECK_P4: 0
DPC_TIMEOUT_TYPE: DPC_QUEUE_EXECUTION_TIMEOUT_EXCEEDED
TRAP_FRAME: fffff80372c7d5b0 -- (.trap 0xfffff80372c7d5b0)
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed or incorrect.
rax=0000000000000002 rbx=0000000000000000 rcx=ffffb482bf7b240c
rdx=0000000000000002 rsi=0000000000000000 rdi=0000000000000000
rip=fffff8036f042f28 rsp=fffff80372c7d748 rbp=fffff80372c7d8b0
r8=0000000000000018 r9=0000000000000005 r10=fffff80372c7d9c0
r11=000000000000000c r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0 nv up ei ng nz ac po cy
nt!KzRaiseIrql+0x8:
fffff803`6f042f28 450f22c3 mov cr8,r11
Resetting default scope
BLACKBOXBSD: 1 (!blackboxbsd)
BLACKBOXNTFS: 1 (!blackboxntfs)
BLACKBOXPNP: 1 (!blackboxpnp)
BLACKBOXWINLOGON: 1
PROCESS_NAME: RainbowSix.exe
STACK_TEXT:
fffff80372c84e18 fffff8036f237996 : 0000000000000133 0000000000000001 0000000000001e00 fffff8036fafb320 : nt!KeBugCheckEx
fffff80372c84e20 fffff8036f0539e3 : 000010a25362ae0e fffff8036a9b0180 0000000000000000 fffff8036a9b0180 : nt!KeAccumulateTicks+0x1e1756
fffff80372c84e80 fffff8036f0534ca : fffff8036faf3980 fffff80372c7d630 fffff80372137900 0000000000005101 : nt!KeClockInterruptNotify+0x453
fffff80372c84f30 fffff8036f100825 : fffff8036faf3980 0000000000000000 0000000000000000 ffffe7e00b650e78 : nt!HalpTimerClockIpiRoutine+0x1a
fffff80372c84f60 fffff8036f1ff6da : fffff80372c7d630 fffff8036faf3980 000010a25362809a 0000000000000000 : nt!KiCallInterruptServiceRoutine+0xa5
fffff80372c84fb0 fffff8036f1ffee7 : 0000000000000000 fffff8036f1ffef4 ffffe7e00b6a9518 fffff8036f052aaa : nt!KiInterruptSubDispatchNoLockNoEtw+0xfa
fffff80372c7d5b0 fffff8036f042f28 : fffff8039d973459 0000000000000010 fffff80372c7d8b0 fffff80372c7dec0 : nt!KiInterruptDispatchNoLockNoEtw+0x37
fffff80372c7d748 fffff8039d973459 : 0000000000000010 fffff80372c7d8b0 fffff80372c7dec0 0000000000000048 : nt!KzRaiseIrql+0x8
fffff80372c7d750 fffff8039d9754c8 : fffff80372c7d9c0 fffff80372c7dec0 ffffb482c6d67000 ffffbab62b535b05 : nvlddmkm+0xd3459
fffff80372c7d780 fffff8039d95e1d5 : ffffb482c6d67000 0000000000000000 0000000000000000 0000000000000001 : nvlddmkm+0xd54c8
fffff80372c7d7b0 fffff8036f0c166e : fffff8036a9b3240 fffff80372c7dcb0 fffff80372c7dec0 fffff8036a9b0180 : nvlddmkm+0xbe1d5
fffff80372c7dbb0 fffff8036f0c0954 : fffff8036a9b0180 0000000000000000 0000000000000002 0000000000000004 : nt!KiExecuteAllDpcs+0x30e
fffff80372c7dd20 fffff8036f205d75 : 0000000000000000 fffff8036a9b0180 ffff9d006b4a5640 0000027ce4f5be30 : nt!KiRetireDpcList+0x1f4
fffff80372c7dfb0 fffff8036f205b60 : fffffb823456fa80 fffff8036f1251fa 0000027ce4f8dce0 00007ffe3b0107d0 : nt!KxRetireDpcList+0x5
fffffb823456f9c0 fffff8036f2052d5 : 0000027ce4f5be30 fffff8036f1ff7a1 0000000000000001 ffffb48200000000 : nt!KiDispatchInterruptContinue
fffffb823456f9f0 fffff8036f1ff7a1 : 0000000000000001 ffffb48200000000 fffffb8200000000 ffffb48200000000 : nt!KiDpcInterruptBypass+0x25
fffffb823456fa00 00007ffe3b293696 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 : nt!KiInterruptDispatch+0xb1
000000d20eb9fbd0 0000000000000000 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 : 0x00007ffe`3b293696
SYMBOL_NAME: nvlddmkm+d3459
MODULE_NAME: nvlddmkm
IMAGE_NAME: nvlddmkm.sys
STACK_COMMAND: .cxr; .ecxr ; kb
BUCKET_ID_FUNC_OFFSET: d3459
FAILURE_BUCKET_ID: 0x133_ISR_nvlddmkm!unknown_function
OS_VERSION: 10.0.19041.1
BUILDLAB_STR: vb_release
OSPLATFORM_TYPE: x64
OSNAME: Windows 10
FAILURE_ID_HASH: {f97493a5-ea2b-23ca-a808-8602773c2a86}
Followup: MachineOwner
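A rough way to read the STACK_TEXT above is to look for the first frame that is not in the Windows kernel (nt). Here's a minimal Python sketch of that idea. The function names are made up for illustration, and the sample lines are copied from the dump. It only shows which module was running when the watchdog fired, not whether the driver, the card, or something else is actually at fault.

```python
# A few frames copied verbatim from the STACK_TEXT section of the dump.
STACK_LINES = [
    "fffff80372c84e18 fffff8036f237996 : 0000000000000133 0000000000000001 0000000000001e00 fffff8036fafb320 : nt!KeBugCheckEx",
    "fffff80372c7d748 fffff8039d973459 : 0000000000000010 fffff80372c7d8b0 fffff80372c7dec0 0000000000000048 : nt!KzRaiseIrql+0x8",
    "fffff80372c7d750 fffff8039d9754c8 : fffff80372c7d9c0 fffff80372c7dec0 ffffb482c6d67000 ffffbab62b535b05 : nvlddmkm+0xd3459",
    "fffff80372c7dbb0 fffff8036f0c0954 : fffff8036a9b0180 0000000000000000 0000000000000002 0000000000000004 : nt!KiExecuteAllDpcs+0x30e",
]


def frame_symbol(line):
    """Return the last column of a stack line, e.g. 'nvlddmkm+0xd3459'."""
    return line.rsplit(" : ", 1)[-1].strip()


def module_of(symbol):
    """Strip the function name and offset, leaving only the module name."""
    return symbol.split("!", 1)[0].split("+", 1)[0]


def first_non_kernel_module(lines, kernel="nt"):
    """Walk frames from the top and return the first module that is not the kernel."""
    for line in lines:
        module = module_of(frame_symbol(line))
        if module != kernel:
            return module
    return None  # every frame was in the kernel


if __name__ == "__main__":
    print(first_non_kernel_module(STACK_LINES))
```

On the frames above this reports nvlddmkm, which matches the MODULE_NAME and IMAGE_NAME lines that !analyze printed.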