Using the checked version of NDIS.SYS
I assert that this is a good way to find bugs
Installing the checked version of the operating system is an effective technique to quickly find bugs in your network driver. If you’re not familiar with checked builds (and even if you are), you should read the excellent documentation here. Seriously, read it; I won’t repeat it here.
What do you get with the checked build of NDIS?
The main difference is that NDIS’s implementation has (as of Windows 8.1) approximately 2200 extra asserts. While some of these asserts verify that NDIS’s internal bookkeeping is consistent, many of them verify that your driver uses NDIS’s APIs correctly. For example, NDIS asserts that the current IRQL is correct when each MiniportXxx callback returns, to help catch the class of bug where your miniport driver leaks an IRQL or a spinlock.
Prior to Windows 7, using the checked build of NDIS was also the only way to see NDIS’s debug traces. But as of Windows 7, these traces are available from WPP, so there’s no longer a need to use the checked build solely for tracing.
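To make the IRQL example concrete, here is a toy user-mode model of that kind of check. It is only a sketch: `ToyCpu`, `ToySpinLock`, and `call_with_irql_check` are made-up names, and comparing the IRQL on return against the IRQL on entry is my assumption about what "correct" means here, not a description of NDIS's actual code.

```python
# Toy user-mode model of an "IRQL is unchanged on return" assertion.
# Not NDIS code: every name and the exact check are assumptions for illustration.

PASSIVE_LEVEL = 0
DISPATCH_LEVEL = 2


class ToyCpu:
    """Tracks a simulated current IRQL."""

    def __init__(self):
        self.irql = PASSIVE_LEVEL


class ToySpinLock:
    """Acquiring raises the simulated IRQL to DISPATCH_LEVEL; releasing restores it."""

    def __init__(self, cpu):
        self.cpu = cpu
        self.saved_irql = None

    def acquire(self):
        self.saved_irql = self.cpu.irql
        self.cpu.irql = DISPATCH_LEVEL

    def release(self):
        self.cpu.irql = self.saved_irql
        self.saved_irql = None


def call_with_irql_check(cpu, callback):
    """Invoke a callback and assert the IRQL on return matches the IRQL on entry."""
    irql_on_entry = cpu.irql
    callback()
    assert cpu.irql == irql_on_entry, (
        f"callback returned at IRQL {cpu.irql}, expected {irql_on_entry}"
    )


def good_callback(lock):
    """Acquires and releases the lock, so the IRQL is restored before returning."""
    lock.acquire()
    # ... touch shared state here ...
    lock.release()


def leaky_callback(lock):
    """Bug: returns while still holding the lock, leaving the IRQL raised."""
    lock.acquire()
    return
```

Wrapping `good_callback` in `call_with_irql_check` passes silently, while wrapping `leaky_callback` trips the assertion as soon as the callback returns, which is much closer to the bug than whatever crash the leaked lock would eventually cause.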
What’s the downside of using a checked build?
There are two downsides to using checked builds: performance and false positives.
Checked builds are noticeably slower. But they aren’t as bad as you might think. We still compile checked builds with most compiler optimizations enabled, so the only slowdowns are a few extra verifications here and there. Still, the operating system has zillions of assertions, so those do add up. You definitely don’t want to use a checked build for any performance-related work. But they’re just fine for initial work on a new feature or functional testing.
False positives are also a problem. Sometimes you’ll see assertions that fail for reasons that don’t seem to be related to your driver. When you see an unfamiliar assertion, you’ll first want to spend a moment convincing yourself that the assertion failure is really caused by your driver. For example, if there’s an assertion in win32k.sys about an invalid HRGN, that’s probably not caused by any network driver. Prior to Windows 8, the operating system was kind of “noisy”; a nontrivial percentage of its assertions would fire for benign reasons. We worked hard to clean that up in Windows 8, so the asserts have a better signal-to-noise ratio. (Like many Windows engineers, I used a checked build of the OS as my primary workstation for some time during Windows 8 development. That was fun.)
If you discover an assertion in NDIS.SYS that you believe is a false positive, please let me know here and I’ll try to clean that up. (Unfortunately I’m not knowledgeable about non-networking drivers, so I can’t promise I can help you with any random assertion that you come across.)
How can you get the checked build of NDIS?
MSDN has the story on how to download a copy of the checked build. From there, you have two options:
- Install the complete operating system to get maximum verifications across the OS; or
- Selectively replace a few drivers.
Since MSDN already explains the first option, I won’t repeat those instructions here. Let’s talk about the second option: how to selectively replace a few drivers.
First, identify the drivers that you want to replace. Here’s a table of drivers you can consider replacing:
Driver | When to replace |
---|---|
The kernel & HAL | Always |
NDIS.SYS | Always |
TCPIP.SYS | Miniport, LWF, and WFP callout drivers |
NETIO.SYS (Windows Vista and later) | Whenever TCPIP.SYS is replaced |
FWPKCLNT.SYS | WFP callout drivers |
(your bus driver, e.g., PCI.SYS) | Miniport drivers |
NWIFI.SYS | Native 802.11 drivers |
VWIFIBUS.SYS, VWIFIFLT.SYS, VWIFIMP.SYS | Native 802.11 drivers that implement MAC virtualization (WFD or SoftAP) |
NDISUIO.SYS | WWAN drivers |
WMBCLASS.SYS | WWAN drivers that implement the class driver model |
VSWITCH.SYS | Hyper-V extensible switch extension |
Keep in mind — these are just guidelines. You are not required to test with any particular set of drivers, and you might want to fine-tune the list depending on what subsystem you’re targeting. If you are unsure about which binaries to replace, remember you can always just install the entire checked OS, which gives you the maximum checked build coverage.
Now that you know which drivers to replace, you can extract them from the checked build media. If you obtained installable media, you can mount the included INSTALL.WIM with DISM.EXE to get at the individual drivers, or you can just install the OS into a throw-away VM to get convenient access to its drivers.
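If you go the DISM route, the mount and unmount steps can be sketched roughly as follows. This only builds the command lines; the WIM path, image index, and mount directory are placeholders I made up, and the function names are hypothetical. The switches shown are the Windows 8-era spellings; older versions of DISM use `/Mount-Wim` with `/WimFile:` instead.

```python
# Sketch: build DISM command lines for mounting a checked-build INSTALL.WIM read-only.
# The paths and image index below are placeholders, not values from any real media.


def dism_mount_command(wim_path, index, mount_dir):
    """Return the argument list that mounts one image of a WIM read-only."""
    return [
        "dism.exe",
        "/Mount-Image",
        f"/ImageFile:{wim_path}",
        f"/Index:{index}",
        f"/MountDir:{mount_dir}",
        "/ReadOnly",
    ]


def dism_unmount_command(mount_dir):
    """Return the argument list that unmounts the image without saving changes."""
    return ["dism.exe", "/Unmount-Image", f"/MountDir:{mount_dir}", "/Discard"]


if __name__ == "__main__":
    # Placeholder values; adjust to wherever your media and scratch directory live.
    print(" ".join(dism_mount_command(r"D:\sources\install.wim", 1, r"C:\mount")))
    print(" ".join(dism_unmount_command(r"C:\mount")))
```

Run the resulting commands from an elevated prompt (or hand the lists to `subprocess.run`), then copy the drivers you need out of the mount directory before unmounting.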
Finally, you'll need to actually replace these drivers on your target OS. Don’t do this on a production OS machine; we can’t officially support this. The easiest way to replace binaries is to hook up a kernel debugger and use the .kdfiles feature. For example, here’s the mapfile that I use to replace NDIS.SYS on a test machine:
map
\Windows\system32\DRIVERS\NDIS.SYS
c:\path\to\ndis.sys
Note that the name of the driver will depend on how the driver is loaded. Use CTRL+D or CTRL+ALT+D in the debugger and reboot the target machine to see the official name of each driver.
Note that the process for replacing the kernel & HAL is special.
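If you end up replacing more than one ordinary driver, it can be handy to generate the mapfile instead of typing it by hand. Here is a minimal sketch; `build_kdfiles_map` is a hypothetical helper, and separating entries with a blank line is my assumption about what the debugger tolerates. It does not attempt to handle the kernel & HAL.

```python
# Sketch: generate a .kdfiles mapfile from (name on target, path on host) pairs.
# build_kdfiles_map is a hypothetical helper, not part of any debugger tooling.


def build_kdfiles_map(replacements):
    """Return mapfile text with one three-line 'map' entry per replacement.

    replacements: iterable of (target_name, host_path) pairs, where target_name
    is the driver name as the target machine loads it and host_path is the
    replacement binary on the debugger host.
    """
    entries = []
    for target_name, host_path in replacements:
        entries.append("\n".join(["map", target_name, host_path]))
    # Assumption: a blank line between entries is acceptable to the debugger.
    return "\n\n".join(entries) + "\n"


if __name__ == "__main__":
    # Same single entry as the mapfile shown above.
    text = build_kdfiles_map(
        [(r"\Windows\system32\DRIVERS\NDIS.SYS", r"c:\path\to\ndis.sys")]
    )
    print(text, end="")
```

Save the output to a file on the debugger host and point `.kdfiles` at it before rebooting the target.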
Oh, and sorry for the awful pun in the subtitle.