We have a Windows Server 2019 DHCP server cluster that fails to serve DHCP leases nearly every day.
A service restart resolves the problem temporarily. Both DHCP servers run on domain controllers.
vssadmin list writers shows 'Dhcp Jet Writer' in an error state until the service is restarted.
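To catch the moment the writer fails without checking by hand, the vssadmin output could be polled with something like the sketch below. It is only a sketch under assumptions: the function names are made up for illustration, it assumes the English field labels ("Writer name", "State", "Last error") in the vssadmin output, and it has to run from an elevated prompt on the DHCP server itself.

```python
import subprocess


def parse_writers(text):
    """Parse `vssadmin list writers` output into {writer name: {field: value}}.

    Assumes English output where each writer starts with a "Writer name:" line
    followed by indented "Key: value" lines.
    """
    writers = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Writer name:"):
            # The name is printed in single quotes, e.g. 'Dhcp Jet Writer'.
            name = line.split(":", 1)[1].strip().strip("'")
            current = writers.setdefault(name, {})
        elif current is not None and ":" in line:
            key, value = line.split(":", 1)
            current[key.strip()] = value.strip()
    return writers


def dhcp_writer_status():
    """Run vssadmin and return the fields of 'Dhcp Jet Writer' (None if absent)."""
    result = subprocess.run(
        ["vssadmin", "list", "writers"],
        capture_output=True, text=True, check=True,
    )
    return parse_writers(result.stdout).get("Dhcp Jet Writer")
```

Calling dhcp_writer_status() from a scheduled task and logging the "State" and "Last error" fields with a timestamp would show when the writer first goes into error, which can then be compared against the event log entries.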
The event logs show many error messages such as:
- "svchost (2876,D,0) The logfile sequence in "C:\Windows\system32\dhcp" has been halted due to a fatal error. No further updates are possible for the databases that use this logfile sequence. Please correct the problem and restart or restore from backup."
- "svchost (2876,D,0) An attempt to open the file "C:\Windows\system32\dhcp\j50tmp.log" for read / write access failed with system error 32 (0x00000020): "The process cannot access the file because it is being used by another process. ". The open file operation will fail with error -1032 (0xfffffbf8)."
- "The DHCP service encountered the following error when backing up the database".
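For anyone who wants to pull the relevant details out of these messages in bulk, here is a minimal Python sketch that extracts the file path and both error codes from the "attempt to open the file" message quoted above. The function names and the regular expression are illustrative only and are matched against that one message format. As far as I can tell, system error 32 is ERROR_SHARING_VIOLATION and ESE error -1032 is JET_errFileAccessDenied, which both say that some other process holds the file open.

```python
import re

# Matches the ESE "attempt to open the file ... failed" message text quoted above.
OPEN_FAILED = re.compile(
    r'open the file "(?P<path>[^"]+)".*?'
    r'system error (?P<sys>\d+) \((?P<sys_hex>0x[0-9a-fA-F]+)\).*?'
    r'error (?P<ese>-\d+) \((?P<ese_hex>0x[0-9a-fA-F]+)\)',
    re.DOTALL,
)

# Names for the two codes that appear in the message (not a complete table).
KNOWN_CODES = {
    32: "ERROR_SHARING_VIOLATION",
    -1032: "JET_errFileAccessDenied",
}


def to_signed32(value):
    """Interpret an unsigned 32-bit value as a signed 32-bit integer."""
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value


def parse_open_failure(message):
    """Return path and error codes from an open-failure message, or None."""
    m = OPEN_FAILED.search(message)
    if m is None:
        return None
    return {
        "path": m.group("path"),
        "system_error": int(m.group("sys")),
        "ese_error": int(m.group("ese")),
        # The hex form is the same code as an unsigned 32-bit value.
        "ese_error_from_hex": to_signed32(int(m.group("ese_hex"), 16)),
    }


SAMPLE = (
    r'svchost (2876,D,0) An attempt to open the file '
    r'"C:\Windows\system32\dhcp\j50tmp.log" for read / write access failed '
    r'with system error 32 (0x00000020): "The process cannot access the file '
    r'because it is being used by another process. ". The open file operation '
    r'will fail with error -1032 (0xfffffbf8).'
)

info = parse_open_failure(SAMPLE)
print(info["path"], KNOWN_CODES.get(info["system_error"]), KNOWN_CODES.get(info["ese_error"]))
```

The point of extracting the path is that the sharing violation is on j50tmp.log inside the DHCP folder, so whatever else opens that file around the time of the error is the thing to look for.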
The event log IDs are:
System: 1010, 1016
Application: 104, 215, 413, 490, 492
The indexing service is not running. The antivirus has a folder exclusion configured for "C:\Windows\system32\dhcp".
Uninstalling and reinstalling the DHCP role resolved the problem for only a couple of days; then it recurred.
Previously we had a Windows Server 2012 R2 DHCP cluster with exactly the same problem. We switched to Server 2019, but the problems recurred shortly after switching.
The problems occurred some time after we did some server hardening following the CIS Benchmarks.
We have a second Active Directory domain which had the same problem on one of its two DHCP servers. After some time this led to the whole cluster not offering leases, although the other member did not show any errors. There we set up a dedicated Windows Server 2019 machine as DHCP server and built the cluster with the existing, non-failing DC that has the DHCP role. This has been working fine for a couple of weeks.
Does anybody know what causes these problems?
Of course we could install new servers and hope that the problems don't recur. And then install new servers again when the problem recurs ...