Sorry for the lengthy post, but I have done extensive troubleshooting on this and I'm at a loss. My Windows 10 computer is driving me crazy. It seems to lose some form of internet connectivity every day, approximately 24 hours after rebooting. Then, because I can’t connect completely to the internet and can’t find a troubleshooting solution, I have to reboot again, which fixes it for approximately 24 hours. Rinse, wash, repeat…
To provide some background, I have a CS degree and have been a software developer for 20+ years. I have tons of experience configuring desktops, servers, virtual environments, networks, etc. While I would be considered an expert in many areas, I know there is far more that I don’t know than I do know. But that’s just to say that I am by no means a novice with the issues I’m having.
My machine:
- Windows 10 Pro (was upgraded from previous versions of Windows)
- Home network (logging on locally via domain credentials from when this was in a corporate environment but no longer connected to domain). But this has been working fine for 3+ years
- Windows Update is at the latest updates (but I turned off the stupid auto-update and reboot to avoid shutdowns when critical apps are open)
- 2 x Xeon E5-2630 2.6GHz
- 64 GB RAM
- 2 drives (primary is SSD, secondary is SATA)
My environment
- Sonicwall TZ210 router to 100Mb FIOS connection
- Machine connected directly to HP ProCurve 1810G switch
- No wireless involved with this machine
- I have 4 static IP addresses on my machine to run various internal web development sites through IIS, etc.
- My static IPs are not in the DHCP range that other devices on my network use
- I use 8.8.8.8 for DNS (and have tried 8.8.4.4 and some ISP DNS also)
- I have several VPNs that I may connect to in order to work throughout the day.
- I use static routes to redirect certain traffic through the VPN
- I do not use any other Proxies (other than VPN traffic) to connect to the internet
This began about 2-3 weeks ago when I installed a big group of Windows Updates, which included the cumulative updates, etc. It was also around the time I switched from McAfee VirusScan Enterprise to AntiVirus Plus for renewal purposes. I probably did some other things around that time too, but lots of things were changing.
At that point, I would be working along just fine, and then all of a sudden, all browsers would lose connectivity: “This site can’t be displayed” in Chrome, and the equivalent in Firefox, IE, and Edge. However, some apps like Slack and Skype work for the most part but occasionally (and then eventually) stop working altogether. The AAC+ music streaming, on the other hand, seems to run indefinitely. But nothing I change will get the browsers to load any webpages (which makes it much harder to write software). I don't lose connection to VPNs, etc.
There are no indications from McAfee or Malwarebytes when the connection stops. But I have no problem with any connectivity the entire day until it just freaks out.
While the failure is happening
- I can ping any public IP by IP address and generally by domain name
- I can do nslookup using 8.8.8.8
- I can ping my Sonicwall router but cannot connect to it via the browser
- None of the other machines on my network have ANY problems while my machine is unable to connect
- None of my browsers will load web pages (including local websites hosted on my machine via IIS)
- If I change my static IP addresses to DHCP, I can restore lost Slack (and I think Skype) connectivity, but websites are still down. Restoring the static IPs takes away Slack again.
- Turning off VPNs and/or static routes does not restore connectivity
- I can log in to my Sonicwall router from another machine on the same network and can see the ARC lease for my static IP. Killing the ARC connection in the router results in it reappearing immediately.
- Nothing is pegging Processor or RAM at the time
- Memory usage is only about 40% or so
- I don't believe email comes through unless I change static/dhcp address status
- Sometimes when this first starts, I can pull up certain websites but the CSS / included content isn't loaded. That's how I usually know it's about to hard fail the connection.
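
For anyone who wants to reproduce the checks above in one pass, here is a minimal Python sketch that tests name resolution, a raw TCP connect, and a full HTTP request separately. Ping only exercises ICMP, so it cannot show whether TCP connections are failing. The host `example.com`, the local port 80 (assumed to be where IIS listens), and all function names are illustrative assumptions, not part of my actual setup.

```python
import socket
import urllib.error
import urllib.request


def check_dns(host):
    """Resolve host to IPv4 addresses; return (ok, addresses or error text)."""
    try:
        infos = socket.getaddrinfo(host, 80, socket.AF_INET, socket.SOCK_STREAM)
        return True, sorted({info[4][0] for info in infos})
    except OSError as exc:
        return False, str(exc)


def check_tcp(host, port, timeout=5.0):
    """Open and immediately close a TCP connection; ping does not test this."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except OSError as exc:
        return False, str(exc)


def check_http(url, timeout=5.0):
    """Fetch a URL; any HTTP status counts as proof the round trip worked."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return True, f"HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, just not with a 2xx status.
        return True, f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        return False, str(exc)


def run_checks(host="example.com", local_port=80):
    """Run every layer against one remote host and the local web server."""
    return {
        "dns": check_dns(host),
        "tcp_remote": check_tcp(host, 443),
        "tcp_local": check_tcp("127.0.0.1", local_port),
        "http_remote": check_http(f"https://{host}/"),
    }
```

Calling `run_checks()` once while everything works and once during a failure, then comparing the two result dictionaries, should show which layer stops responding. If `dns` succeeds while `tcp_local` fails, the problem would be on this machine rather than on the router or the ISP.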
I was originally thinking DNS, virus, malware, etc. But it’s super strange that I can connect with some protocols and not with others. I’m guessing that some app / program is deciding I’ve had enough internet for the day and is trying to cut me off for my own good. But this is a free country and it’s my right to earn a living as needed… :)
Related Software I normally run
- McAfee AntiVirus Plus (which takes over for Windows Firewall)
- Malwarebytes
What I have tried
- I know about flushing DNS, etc and have tried that
- I have tried resetting Winsock, etc. (EDIT: includes running the following as Administrator in a command prompt)
- netsh winsock reset catalog
- netsh int ip reset reset.log
- I have tried flushing the sockets and DNS cache in Chrome via chrome://net-internals (but the problem affects all browsers)
- I have confirmed that IE does not have any Proxies set up that are affecting general internet traffic
- I uninstalled McAfee completely for an entire day and it still happened
- I reinstalled McAfee and uninstalled Malwarebytes (still uninstalled)
- I turned off all onboard firewalls (McAfee / Windows)
- I switched Network cards (I have tried an onboard card, a 2nd PCIx card, and then purchased a 3rd USB Gigabit ethernet adapter). All cards were the only active card while testing (any others that were installed like the onboard were disabled)
- I have tried updating drivers for each of the 3 network cards including removing them from the machine in Device Manager and reinstalling
- I changed network ports on my ProCurve switch (and also hard rebooted my ProCurve switch)
- I have rebooted my Sonicwall router (but other devices have no problems)
- I did notice, after a few days of daily freezes, that my C drive was full due to a 30+ GB hiberfil.sys from a previous failure and my regular 64 GB pagefile.sys. So I removed the hiberfil.sys file (using the command line to disable hibernation in Windows) and then relocated the pagefile to the D drive instead. Now I have 70+ GB free space on C and 270+ GB on the D drive.
- sfc /scannow (no issues)
- DISM /Online /Cleanup-Image /RestoreHealth (nothing found)
- Resetting DHCP in services (but again I’m using static IP)
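
None of the resets above show what the TCP stack looks like at the moment the browsers fail. As one more data point, a small Python sketch could tally connection states from `netstat -ano` output so that a snapshot taken right after a reboot can be compared with one taken during a failure. This is only a sketch under assumptions: the sample rows are invented for illustration, and `count_tcp_states` and `snapshot` are hypothetical helper names.

```python
import subprocess
from collections import Counter

# Invented rows in the layout that Windows `netstat -ano` prints.
SAMPLE = """
  Proto  Local Address          Foreign Address        State           PID
  TCP    0.0.0.0:80             0.0.0.0:0              LISTENING       4
  TCP    192.168.1.50:49712     93.184.216.34:443      ESTABLISHED     8120
  TCP    192.168.1.50:49713     93.184.216.34:443      TIME_WAIT       0
  TCP    192.168.1.50:49714     93.184.216.34:443      TIME_WAIT       0
  UDP    0.0.0.0:123            *:*                                    1032
"""


def count_tcp_states(netstat_output):
    """Count TCP rows per state (LISTENING, ESTABLISHED, TIME_WAIT, ...)."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # TCP rows have the form: TCP  local  remote  STATE  PID
        if len(fields) >= 4 and fields[0].upper() == "TCP":
            states[fields[3]] += 1
    return states


def snapshot():
    """Run netstat on this machine and return the per-state counts."""
    result = subprocess.run(
        ["netstat", "-ano"], capture_output=True, text=True, check=True
    )
    return count_tcp_states(result.stdout)
```

A count that keeps climbing between reboots would at least narrow down where to look. Equal counts in both snapshots would rule this direction out.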
After migrating my pagefile to D, I lost my Start menu. I had to do this and other stuff to restore them:
Get-AppXPackage -AllUsers | Foreach {Add-AppxPackage -DisableDevelopmentMode -Register "$($_.InstallLocation)\AppXManifest.xml"}
However, I don’t have a complete set of panels (whatever the goofy cards are called on the right-hand side of the Start menu that pops up). So the Start menu isn’t completely back, but my programs are there now so I can get to them.
Scans I have tried
- Full McAfee AntiVirus Plus scan
- Full Malwarebytes
- Emsisoft Emergency Kit
- TrendMicro HouseCall
- Windows Defender
Other than some non-intrusive PUPs, nothing is ever found.
I usually try an hour or two of more troubleshooting and Googling every time it goes down, but I can’t seem to figure this out yet. It’s super annoying because I don’t like to reboot regularly, as I have 5 monitors and routinely have 130+ Chrome tabs along with 10-20 other apps / windows running, and many background processes (web servers, etc). I haven’t had any problems with this until a few weeks ago.
I’m sure there is more that I can offer, but people will probably ask questions and I’m worried I might lose connectivity again before I get to post this.
Feel free to ask questions or offer advice but please take the time to read all the stuff I tried to save us both some time! And I don’t mind trying some other things again to confirm various settings / statuses. :)