Deadlock caused by ImageLoadNotifyRoutine
We've run into an issue where Sysmon essentially causes a system deadlock.
Consider the following scenario involving Sysmon and an AV product:
- Scan.exe wants to load an image.
- Sysmon's driver is notified via its registered notify routine.
- The driver allocates a work item and queues it for delayed processing.
- The driver waits for the worker thread to signal completion via an event. The loader lock is being held the entire time.
- A worker thread picks up the queued item and processes it. Part of the processing is hashing the image, which requires a handle to be opened.
- Opening a file handle causes various filter drivers to be called.
- One of the filter drivers happens to belong to the AV product and notifies the userland process Scan.exe to scan the file.
- The filter driver waits for userland to conclude scanning.
- But the userland process is locked up because its loader lock is being held.
- Deadlock.
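The wait cycle above can be reproduced in user mode with plain threads. The following is only a minimal sketch, not Sysmon's or the AV product's actual code: every name in it (`simulate_blocking_notify`, `scan_userland`, `worker_thread`) is made up for illustration, a `threading.Lock` stands in for the loader lock, and the scanner gives up after a timeout so the simulation terminates instead of hanging like the real system.

```python
import queue
import threading


def simulate_blocking_notify(scan_timeout: float = 0.3) -> bool:
    """Return True if the simulated scan completed, False if it was stuck."""
    loader_lock = threading.Lock()   # stands in for Scan.exe's loader lock
    work_items = queue.Queue()       # stands in for the system work queue
    scan_requests = queue.Queue()    # AV filter driver -> Scan.exe userland
    scan_completed = []

    def scan_userland():
        reply = scan_requests.get()
        # Scanning needs the loader lock, which the image load still holds.
        got_lock = loader_lock.acquire(timeout=scan_timeout)
        if got_lock:
            loader_lock.release()
        scan_completed.append(got_lock)
        # In the real deadlock no reply is ever sent; here the timeout
        # lets the simulation unwind.
        reply.set()

    def worker_thread():
        done = work_items.get()
        # Hashing opens the file, so the AV filter driver asks userland
        # to scan it and waits for the verdict.
        reply = threading.Event()
        scan_requests.put(reply)
        reply.wait()
        done.set()

    threads = [
        threading.Thread(target=scan_userland),
        threading.Thread(target=worker_thread),
    ]
    for t in threads:
        t.start()

    with loader_lock:                # Scan.exe is loading an image
        done = threading.Event()
        work_items.put(done)         # notify routine queues the work item
        done.wait()                  # ...and waits for it: the problematic step

    for t in threads:
        t.join()
    return scan_completed[0]


if __name__ == "__main__":
    print("scan completed:", simulate_blocking_notify())
```

The scan can never obtain the lock while the notify routine is waiting, so the function reports that the scan did not complete. Without the artificial timeout, all three parties would wait on each other forever.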
I'd like to point out that the driver is violating Microsoft's own best practices laid out here: https://learn.microsoft.com/en-us/windows-hardware/drivers/kernel/windows-kernel-mode-process-and-thread-manager#best
Particularly:
> If you use System Worker Threads do not wait on the work to complete. Doing so defeats the purpose of queuing the work to be completed asynchronously.
The driver does exactly this in the fourth step of the list above, where it waits for the worker thread to signal completion.
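To illustrate the difference the guidance makes, here is a small user-mode sketch that contrasts waiting on queued work with queuing it and returning. It is an analogy under stated assumptions, not driver code: `hash_image`, `notify_and_wait`, `notify_and_return`, and the file name `a.dll` are hypothetical, a thread pool stands in for system worker threads, and the scan triggered by hashing is collapsed into a single attempt to take the loader lock.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor

loader_lock = threading.Lock()  # stands in for the loading process's loader lock


def hash_image(path: str, timeout: float) -> bool:
    """Hypothetical hashing step; it succeeds only once the loader lock is free."""
    acquired = loader_lock.acquire(timeout=timeout)
    if acquired:
        loader_lock.release()
    return acquired


def notify_and_wait(pool: ThreadPoolExecutor, path: str) -> bool:
    # Anti-pattern: queue the work, then block until it finishes.
    return pool.submit(hash_image, path, 0.2).result()


def notify_and_return(pool: ThreadPoolExecutor, path: str) -> Future:
    # Queue the work and return immediately; the result is consumed later.
    return pool.submit(hash_image, path, 5.0)


def run_demo() -> tuple:
    with ThreadPoolExecutor(max_workers=1) as pool:
        with loader_lock:                      # image load in progress
            blocked = notify_and_wait(pool, "a.dll")
        with loader_lock:                      # a second image load
            pending = notify_and_return(pool, "a.dll")
        # The lock is released here, so the queued work can now proceed.
        completed = pending.result()
    return blocked, completed


if __name__ == "__main__":
    print(run_demo())
```

In the waiting variant the hashing step times out because the lock is held for as long as the caller waits. In the non-waiting variant the caller returns, the lock is released, and the queued work completes on its own.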
PS: The driver appears to support a flag that disables image hashing. It would be nice if that flag were configurable instead of being hardcoded to 1. That would let us keep receiving ImageLoad events; at the moment, the only mitigation is to disable ImageLoad monitoring entirely.