BSOD on startup every day - Trying to identify specific causation

YELDUR 1 Reputation point
2021-10-15T13:38:02.73+00:00

Hi all,

For the past week or so I've been getting a BSOD whenever I first power on the computer each day; once I reach the Windows splash screen, I have no further issues, even when restarting.

From reviewing the Event Logs I can see one entry stating the following:

"The system has rebooted without cleanly shutting down first. This error could be caused if the system stopped responding, crashed, or lost power unexpectedly."
followed closely by:
"The driver \Driver\WudfRd failed to load for the device PCI\VEN_5853&DEV_1003\1&1a590e2c&0&03."

As far as causation goes, this is the only thing throwing flags so far. I've run Windows Memory Diagnostic with no issues found, run system file checks with no corruption found, and checked every category in Device Manager to make sure nothing there is reporting errors. As far as I can tell, these issues began this week.

I know that this week I plugged in a new keyboard that differs from my old one, and I needed to download some more drivers for it. However, I went from a Roccat Aimo 120 to a Roccat Aimo 100, and the only real difference is that the 100 doesn't come with a wrist rest. Besides that, it doesn't appear any different specification-wise, so I'm unclear on whether this is the cause. I also changed my power plan on the rig from Balanced to Performance, though I don't expect this to be the cause.

Originally I believed perhaps that drivers were the issue, however, now I'm not so sure.

To cut a long story short, I ran a bugcheck analysis using the Windows Debug tools which threw me the following:

12: kd> !analyze -v
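*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************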

MEMORY_MANAGEMENT (1a)
# Any other values for parameter 1 must be individually examined.
Arguments:
Arg1: 0000000000041792, A corrupt PTE has been detected. Parameter 2 contains the address of
the PTE. Parameters 3/4 contain the low/high parts of the PTE.
Arg2: ffff83816716da08
Arg3: 0000800000000000
Arg4: 0000000000000000

Debugging Details:
------------------


KEY_VALUES_STRING: 1

Key : Analysis.CPU.mSec
Value: 3249

Key : Analysis.DebugAnalysisManager
Value: Create

Key : Analysis.Elapsed.mSec
Value: 10478

Key : Analysis.Init.CPU.mSec
Value: 1249

Key : Analysis.Init.Elapsed.mSec
Value: 65592

Key : Analysis.Memory.CommitPeak.Mb
Value: 73

Key : MemoryManagement.PFN
Value: 800000000

Key : WER.OS.Branch
Value: vb_release

Key : WER.OS.Timestamp
Value: 2019-12-06T14:06:00Z

Key : WER.OS.Version
Value: 10.0.19041.1


BUGCHECK_CODE: 1a

BUGCHECK_P1: 41792

BUGCHECK_P2: ffff83816716da08

BUGCHECK_P3: 800000000000

BUGCHECK_P4: 0

MEMORY_CORRUPTOR: ONE_BIT

BLACKBOXNTFS: 1 (!blackboxntfs)


CUSTOMER_CRASH_COUNT: 1

PROCESS_NAME: autochk.exe

STACK_TEXT:
ffff988d4679f388 fffff8054624423a : 000000000000001a 0000000000041792 ffff83816716da08 0000800000000000 : nt!KeBugCheckEx
ffff988d4679f390 fffff80546242a6f : ffff8688b7883700 0000000000000000 ffff868800000002 0000000000000000 : nt!MiDeleteVa+0x153a
ffff988d4679f490 fffff80546212c10 : 0000000000000001 ffff988d00000000 ffff8688b7883550 ffff8688b7910080 : nt!MiDeletePagablePteRange+0x48f
ffff988d4679f7a0 fffff80546252277 : 000000002ce2db4f 0000000000000000 ffff868800000000 fffff80500000000 : nt!MiDeleteVad+0x360
ffff988d4679f8b0 fffff805465f908c : ffff988d00000000 0000000000000000 ffff988d4679fa10 000002ce2db30000 : nt!MiFreeVadRange+0xa3
ffff988d4679f910 fffff805465f8b65 : 00007ff70784b980 000002ce44f49e50 ffff988d4679fad8 0000000000000000 : nt!MmFreeVirtualMemory+0x4ec
ffff988d4679fa60 fffff80546408bb8 : ffff8688b7910080 ffff868800000001 0000000000000000 ffff868800000000 : nt!NtFreeVirtualMemory+0x95
ffff988d4679fac0 00007ffa4676d134 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 : nt!KiSystemServiceCopyEnd+0x28
000000e2f757a4b8 0000000000000000 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 : 0x00007ffa`4676d134


MODULE_NAME: hardware

IMAGE_NAME: memory_corruption

STACK_COMMAND: .thread ; .cxr ; kb

FAILURE_BUCKET_ID: MEMORY_CORRUPTION_ONE_BIT

OS_VERSION: 10.0.19041.1

BUILDLAB_STR: vb_release

OSPLATFORM_TYPE: x64

OSNAME: Windows 10

FAILURE_ID_HASH: {e3faf315-c3d0-81db-819a-6c43d23c63a7}

Followup: MachineOwner

I work in tech, but I am by no means a master, and to be frank, I don't know what I'm reading here. I can gather that it is telling me that there's something wrong with memory, in that it's seeing corruption, but other than that I'm honestly not too sure.
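
One way to make sense of the MEMORY_CORRUPTOR: ONE_BIT line is to look at the PTE value reported in Arg3 (0000800000000000). The sketch below only shows that this value has exactly one bit set. It does not reproduce the debugger's own heuristic, which is not shown in the output above, and the helper name `set_bits` is made up for illustration.

```python
def set_bits(value):
    """Return the positions (0 = least significant) of every bit set in value."""
    return [pos for pos in range(value.bit_length()) if (value >> pos) & 1]


# Arg3 from the bugcheck output above: the low part of the PTE.
pte_low = int("0000800000000000", 16)

bits = set_bits(pte_low)
print(f"PTE value : {pte_low:#018x}")
print(f"bits set  : {bits}")
print(f"single bit: {len(bits) == 1}")
```

A PTE that is all zeros apart from a single bit is consistent with the "one bit" label, but on its own it does not say whether RAM, the motherboard, or a driver flipped that bit.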

Here's the event log entry that prompted me to look into these issues:

Event ID 1001

The computer has rebooted from a bugcheck. The bugcheck was: 0x0000001a (0x0000000000041792, 0xffff83816716da08, 0x0000800000000000, 0x0000000000000000). A dump was saved in: C:\WINDOWS\MEMORY.DMP. Report Id: 15812135-3f48-42c4-b474-5b9fd5a5cf7e.
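
The bugcheck code and its four parameters in this event match BUGCHECK_CODE and BUGCHECK_P1 to P4 in the analysis above. As a small sketch, the message can be split into numbers with a regular expression. The pattern is based only on the wording of this one event, so treat the format as an assumption, and `parse_bugcheck` is a hypothetical helper name.

```python
import re

# Event ID 1001 text copied from the post above.
EVENT_TEXT = (
    "The computer has rebooted from a bugcheck. The bugcheck was: 0x0000001a "
    "(0x0000000000041792, 0xffff83816716da08, 0x0000800000000000, "
    "0x0000000000000000). A dump was saved in: C:\\WINDOWS\\MEMORY.DMP."
)

# One hex code, then a parenthesised, comma-separated list of hex parameters.
PATTERN = re.compile(
    r"The bugcheck was: (0x[0-9a-fA-F]+) \(((?:0x[0-9a-fA-F]+(?:, )?)+)\)"
)


def parse_bugcheck(text):
    """Return (code, [parameters]) as integers, or None if the text does not match."""
    match = PATTERN.search(text)
    if match is None:
        return None
    code = int(match.group(1), 16)
    params = [int(part, 16) for part in match.group(2).split(", ")]
    return code, params


code, params = parse_bugcheck(EVENT_TEXT)
print(f"bugcheck code: {code:#x}")
for index, value in enumerate(params, start=1):
    print(f"parameter {index}: {value:#x}")
```

Collecting these values from each crash makes it easier to see whether every BSOD is the same bugcheck or whether the codes vary from day to day.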

If there's any more information required, please don't hesitate to ask and I will do my best to gather it for you.

Windows 10

71 answers

Sort by: Newest
  1. Docs 15,146 Reputation points
    2021-10-25T02:32:17.403+00:00

    Okay.

    When possible:

    a) flash the BIOS
    b) clean ports and fans > monitor temperatures
    c) check the owner's manual to see whether there are optimal DIMM slots when using only one RAM module
    d) test one RAM module at a time in the same DIMM slot
    e) if one module works without crashes, then test every DIMM slot with that same module
    f) the above may be able to rule in / rule out malfunctioning RAM and/or motherboard
    g) if there are no further misbehaving drivers then the focus is on hardware
    (any additional debugging is just to make sure there are no further misbehaving drivers)
    h) make free backup images (once you're familiar with the product you can decide whether to pay for the software enhancements in the different versions)

    Temperatures can be monitored with Speccy, HW Monitor, or SpeedFan

    Unexpected shutdowns should show up with a :( window.
    Some shutdowns can be missed, so also use Reliability Monitor:
    https://www.howtogeek.com/166911/reliability-monitor-is-the-best-windows-troubleshooting-tool-you-arent-using/

    Please remember to vote and to mark the replies as answers if they help.

    On the bottom of each post there is:

    Propose as answer = answered the question

    On the left side of each post: Vote = a helpful post

    1 person found this answer helpful.

  2. Docs 15,146 Reputation points
    2021-10-25T01:57:05.917+00:00

    My free time over the next few weeks is unpredictable.

    The week of Nov 1 will be much busier than this coming week.

    The following week is unknown at this moment, but there are many tasks to finish before the upcoming holiday weeks.

    So I can plan to troubleshoot at any time, but the plan may need to be adjusted.

    Use this link to make free or paid backup images:

    https://www.tenforums.com/tutorials/61026-backup-restore-macrium-reflect.html

    1 person found this answer helpful.

  3. YELDUR 1 Reputation point
    2021-10-25T01:24:26.7+00:00

    Understood Docs,

    I appreciate the feedback.

    What I'm going to try to do is make use of my work's buyback system, whereby I can purchase holiday time in order to get more time off. I won't be able to use it this week because it would be too short notice, but hypothetically (you don't have to answer this if you don't know), would you be available from the 8th of November onwards? My aim would be to buy some holiday from work, book it for the 8th or later, and then dedicate all my time to finishing this off.

    Unfortunately I won't be available this week, as I'm on the night shift, meaning that I'll be working when you are available and free when you aren't.

    What I'll do in the meantime is reset the BIOS to factory settings and clean the dust build-up out of my rig since, as you suggested, both of these need to be addressed regardless.

    Thanks for bearing with me on this, I know that this must be a nuisance. And I know that the thank yous might be getting tiring but I hate causing people irritation so I'm sorry again for any issues caused by my lack of availability at times.

    For the time being, I'll continue to post V2/Memory logs when crashes occur and will aim to perform the steps you need me to do next over next weekend, as I'll again be available there.

    With regard to switching the RAM modules, that one will be a bit tougher, but I'm just going to have to bite the bullet and accept going without some of the RAM for a few weeks. The best time to do that will most likely be when I book some time off work to triage this further, so that it doesn't impact my work.

  4. Docs 15,146 Reputation points
    2021-10-25T01:13:54.037+00:00

    All overclocks should be returned to stock when troubleshooting.

    Overclocks can be a common cause of unexpected shutdowns and restarts.

    High temperatures are another cause of unexpected shutdowns and restarts.

    So both of these should be addressed before the troubleshooting of drivers and hardware.

    I'll have more time than I had expected for some days this coming week.

    The following week I'll be much busier.

    If you can post V2 and memory dumps then I'll be able to debug them when there is time.

    Most of the troubleshooting that you cannot do yourself (debugging dump files) will be completed soon.

    For the hardware you must rule in or rule out RAM and motherboard.

    RAM modules can be tested one at a time in the same DIMM slot.

    If only one causes unexpected shutdowns and restarts then you've found the culprit.

    If both, when tested one at a time, cause unexpected shutdowns and restarts then you still need to continue testing RAM and motherboard.
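
The reasoning above can be sketched as a small decision helper. This is only an illustration: the function name and the module labels A and B are hypothetical, and the "no crashes" branch is an assumption that the thread does not cover.

```python
def interpret_single_slot_tests(crashed_by_module):
    """Summarise single-module test results.

    crashed_by_module maps a module label to True if the system crashed
    while only that module was installed in the chosen DIMM slot.
    """
    crashing = sorted(m for m, crashed in crashed_by_module.items() if crashed)
    stable = sorted(m for m, crashed in crashed_by_module.items() if not crashed)

    if crashing and stable:
        # Only some modules crash on their own: those are the suspects.
        return "suspect module(s): " + ", ".join(crashing)
    if crashing:
        # Every module crashes on its own: the result does not single one out.
        return "inconclusive: continue testing RAM and motherboard"
    # Assumption (not from the thread): no crashes means nothing is implicated yet.
    return "no crashes observed: keep monitoring"


# Hypothetical results for two modules labelled A and B.
print(interpret_single_slot_tests({"A": True, "B": False}))
print(interpret_single_slot_tests({"A": True, "B": True}))
```

Writing down the result of each single-module test in this form keeps the outcome unambiguous when the testing is spread over several weeks.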

    There are multiple websites that are related to overclocking.

    Once the troubleshooting has completed you can open a thread in one of them for BIOS settings.

    But you'll want at least two weeks (the more the better) of documented computer stability before overclocking.

    1 person found this answer helpful.
  5. YELDUR 1 Reputation point
    2021-10-25T00:59:44.887+00:00

    With regard to resetting the BIOS, what are the ramifications? Will I need to reflash the BIOS again, or will this simply revert the DOCP changes to their defaults? (I've made no other changes to the BIOS besides setting up DOCP.)

    I'm going to need the system resources for my work. I understand this might be an annoyance, but is it possible to postpone this until a time when you and I are both more available? I know that you're not very available over the next two weeks (at least), and I'm not going to be very available this coming week either. As this isn't affecting the operation of my system outside of when I first boot the device up in the morning, I'm happy to wait until then, if this is something you are open to.

    If you're not willing to do this, I understand completely and will try to continue on my own from here, but I would greatly appreciate some advice on what you look for inside the DMP logs to determine whether drivers are the issue.

    The bottom line of what needs to be done is essentially:

    1) Reset the BIOS (presumably I do this from the BIOS screen)
    2) If I continue to see BSODs once this is done, the RAM is the next thing to look at as a possible cause
    3) Then remove one RAM module at a time and test the system in that configuration for 1 week at a time
    4) If I receive BSODs during general operation that week, swap the other RAM module into the same DIMM slot

    Once again, thank you very much for all your time on this, regardless of where we go from here. I wouldn't have gotten this far without your support, and I'm incredibly glad to have had you.
