Hello there,

Do you get any Event ID, so we can narrow things down?

In the BIOS, under the Advanced tab, there is an option for Precision Boost Overdrive. Try disabling it.

A WHEA_UNCORRECTABLE_ERROR BSOD (blue screen) that names AuthenticAMD.sys is typically triggered by an outdated BIOS or AMD graphics driver, by Fast Startup being enabled, or by a hardware failure.

The WHEA_UNCORRECTABLE_ERROR bug check has a value of 0x00000124. It indicates that a fatal hardware error has occurred, and it uses the error data provided by the Windows Hardware Error Architecture (WHEA).

Hope this resolves your query!

If the reply is helpful, please upvote and accept it as an answer.
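To collect the Event IDs asked about above, one option is to query the System log for WHEA events. The sketch below only wraps the built-in `wevtutil` tool. It assumes the events come from the `Microsoft-Windows-WHEA-Logger` provider, and the helper names `build_query` and `recent_whea_events` are made up for illustration.

```python
import subprocess
import sys

# Assumption: hardware error events are written to the System log
# by this provider.
PROVIDER = "Microsoft-Windows-WHEA-Logger"


def build_query(max_events=10):
    """Build a wevtutil command that lists the newest WHEA-Logger events."""
    xpath = f"*[System[Provider[@Name='{PROVIDER}']]]"
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",        # XPath filter on the provider name
        f"/c:{max_events}",   # maximum number of events to return
        "/rd:true",           # newest events first
        "/f:text",            # human-readable output
    ]


def recent_whea_events(max_events=10):
    """Run the query and return its text output (Windows only)."""
    completed = subprocess.run(
        build_query(max_events), capture_output=True, text=True, check=False
    )
    return completed.stdout or completed.stderr


if __name__ == "__main__":
    if sys.platform == "win32":
        print(recent_whea_events())
    else:
        # wevtutil exists only on Windows, so just show the command.
        print("Would run:", " ".join(build_query()))
```

Run it from an elevated prompt on the affected server. The Event ID and error source in each entry are the details worth posting back here.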
WHEA_UNCORRECTABLE_ERROR - AuthenticAMD.sys
Joel-mic
I'm trying to get a "render farm" going using NVIDIA GPUs and Octane Render. On Windows Server 2019, I get BSODs and reboots while running render benchmarks (OctaneBench). It usually makes it a few seconds into the test, and further if I power-limit the GPUs, but the results are inconsistent. With only 1 or 2 of the 4 GPUs enabled (and sometimes 3), the benchmark can complete, but with all 4 enabled the machine crashes.

Hardware:
AMD EPYC 7232P processor
ASRock ROMED8-2T/BCM motherboard
4x RTX 3090 Founders Edition GPUs

I don't have enough knowledge to make sense of the minidump file, but here's the text of one:
Microsoft (R) Windows Debugger Version 10.0.25200.1003 AMD64
Copyright (c) Microsoft Corporation. All rights reserved.
Loading Dump File [D:\Polymath Dropbox\Joel Gautraud\Adobe After Effects Auto-Save\040523-15343-01.dmp]
Mini Kernel Dump File: Only registers and stack trace are available
Symbol search path is: srv*
Executable search path is:
Windows 10 Kernel Version 17763 MP (16 procs) Free x64
Product: Server, suite: TerminalServer SingleUserTS
Machine Name:
Kernel base = 0xfffff801`4e60f000 PsLoadedModuleList = 0xfffff801`4ea274d0
Debug session time: Wed Apr 5 12:56:40.820 2023 (UTC - 4:00)
System Uptime: 0 days 0:00:04.829
Loading Kernel Symbols
..............................................................
Loading User Symbols
Mini Kernel Dump does not contain unloaded driver list
For analysis of this file, run !analyze -v
nt!WheapCreateLiveTriageDump+0x7b:
fffff801`4eee9d77 48895c2438 mov qword ptr [rsp+38h],rbx ss:0018:ffffdf8e`98acb5d8=ffff920700000a60
12: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************
WHEA_UNCORRECTABLE_ERROR (124)
A fatal hardware error has occurred. Parameter 1 identifies the type of error
source that reported the error. Parameter 2 holds the address of the
nt!_WHEA_ERROR_RECORD structure that describes the error condition. Try !errrec Address of the nt!_WHEA_ERROR_RECORD structure to get more details.
Arguments:
Arg1: 0000000000000007, BOOT Error
Arg2: ffff92076e513d68, Address of the nt!_WHEA_ERROR_RECORD structure.
Arg3: 0000000000000000
Arg4: 0000000000000000
Debugging Details:
------------------
KEY_VALUES_STRING: 1
Key : Analysis.CPU.mSec
Value: 2108
Key : Analysis.DebugAnalysisManager
Value: Create
Key : Analysis.Elapsed.mSec
Value: 3250
Key : Analysis.IO.Other.Mb
Value: 10
Key : Analysis.IO.Read.Mb
Value: 0
Key : Analysis.IO.Write.Mb
Value: 17
Key : Analysis.Init.CPU.mSec
Value: 1358
Key : Analysis.Init.Elapsed.mSec
Value: 41515
Key : Analysis.Memory.CommitPeak.Mb
Value: 77
Key : Bugcheck.Code.DumpHeader
Value: 0x124
Key : Bugcheck.Code.Register
Value: 0x98acb5e0
FILE_IN_CAB: 040523-15343-01.dmp
BUGCHECK_CODE: 124
BUGCHECK_P1: 7
BUGCHECK_P2: ffff92076e513d68
BUGCHECK_P3: 0
BUGCHECK_P4: 0
CUSTOMER_CRASH_COUNT: 1
PROCESS_NAME: System
STACK_TEXT:
ffffdf8e`98acb5a0 fffff801`4eb8c089 : ffff9207`6ad23040 ffff9207`6e513d40 fffff801`4ea13580 00000000`00000000 : nt!WheapCreateLiveTriageDump+0x7b
ffffdf8e`98acbad0 fffff801`4e92d168 : ffff9207`6e513d40 fffff801`4e725711 00000000`00000000 00000000`00000000 : nt!WheapCreateTriageDumpFromPreviousSession+0x2d
ffffdf8e`98acbb00 fffff801`4e92de7b : fffff801`4ea13520 fffff801`4ea13580 fffff801`4ea175e0 ffff9207`67afa960 : nt!WheapProcessWorkQueueItem+0x48
ffffdf8e`98acbb40 fffff801`4e6c01ba : ffff9207`67cc0920 ffff9207`6ad23040 ffff9207`67cc0900 ffff9207`00000000 : nt!WheapWorkQueueWorkerRoutine+0x2b
ffffdf8e`98acbb70 fffff801`4e741ed5 : ffff9207`6ad23040 ffff9207`67bf5300 ffff9207`6ad23040 00000000`00000000 : nt!ExpWorkerThread+0x16a
ffffdf8e`98acbc10 fffff801`4e7d051c : fffff801`4d0f1180 ffff9207`6ad23040 fffff801`4e741e80 00000000`00000000 : nt!PspSystemThreadStartup+0x55
ffffdf8e`98acbc60 00000000`00000000 : ffffdf8e`98acc000 ffffdf8e`98ac6000 00000000`00000000 00000000`00000000 : nt!KiStartSystemThread+0x1c
MODULE_NAME: AuthenticAMD
IMAGE_NAME: AuthenticAMD.sys
STACK_COMMAND: .cxr; .ecxr ; kb
FAILURE_BUCKET_ID: 0x124_7_AuthenticAMD_PROCESSOR__UNKNOWN_IMAGE_AuthenticAMD.sys
OSPLATFORM_TYPE: x64
OSNAME: Windows 10
FAILURE_ID_HASH: {9a3989b5-afe5-d9f8-5fed-f06a563b7314}
Followup: MachineOwner
---------
12: kd> lmvm AuthenticAMD
Browse full module list
start end module name
Mini Kernel Dump does not contain unloaded driver list
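For anyone comparing several of these dumps, here is a minimal Python sketch that pulls the bug check code, the four arguments, and the failure bucket out of pasted `!analyze -v` text. The `ANALYSIS` string is an excerpt copied from the output above. The `parse_bugcheck` helper is hypothetical, not part of WinDbg, and the regular expressions assume the layout shown in this log.

```python
import re

# Excerpt copied from the !analyze -v output above.
ANALYSIS = """\
WHEA_UNCORRECTABLE_ERROR (124)
Arguments:
Arg1: 0000000000000007, BOOT Error
Arg2: ffff92076e513d68, Address of the nt!_WHEA_ERROR_RECORD structure.
Arg3: 0000000000000000
Arg4: 0000000000000000
FAILURE_BUCKET_ID: 0x124_7_AuthenticAMD_PROCESSOR__UNKNOWN_IMAGE_AuthenticAMD.sys
"""


def parse_bugcheck(text):
    """Return the bug check name, code, arguments and failure bucket."""
    result = {"name": None, "code": None, "args": {}, "bucket": None}

    # Header line such as "WHEA_UNCORRECTABLE_ERROR (124)"; the code is hex.
    header = re.search(r"^(\w+) \(([0-9A-Fa-f]+)\)\s*$", text, re.MULTILINE)
    if header:
        result["name"] = header.group(1)
        result["code"] = int(header.group(2), 16)

    # Lines such as "Arg1: 0000000000000007, BOOT Error".
    arg_pattern = r"^Arg([1-4]):\s*([0-9A-Fa-f]+)(?:,\s*(.*))?$"
    for m in re.finditer(arg_pattern, text, re.MULTILINE):
        value = int(m.group(2), 16)
        note = (m.group(3) or "").strip()
        result["args"][int(m.group(1))] = (value, note)

    bucket = re.search(r"^FAILURE_BUCKET_ID:\s*(\S+)", text, re.MULTILINE)
    if bucket:
        result["bucket"] = bucket.group(1)
    return result


info = parse_bugcheck(ANALYSIS)
record_address = info["args"][2][0]
print(f"{info['name']} (0x{info['code']:X})")
print(f"Arg1 = {info['args'][1][0]} ({info['args'][1][1]})")
# The analysis text itself suggests !errrec on the Arg2 address.
print(f"Next debugger command: !errrec {record_address:x}")
```

The last line just rebuilds the `!errrec` command that the analysis text recommends, using the Arg2 address from this dump.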