Hi @Erdong zhang ,
From the behavior you described, I would not treat this as a simple BAR offset or cache-type issue. Kernel-mode MMIO works on the same BAR and offset, while the user-mode mapping returns 0xFFFFFFFF only on the Thunderbolt / external PCIe setup. That tells me the issue is specific to the user-mode mapping path rather than the BAR resource itself.
Is directly mapping PCIe BAR PFNs into user mode via MmMapLockedPagesSpecifyCache(UserMode) officially supported/reliable on modern Windows systems?
MmMapLockedPagesSpecifyCache(..., UserMode, ...) can create a user-mode mapping, but the documentation notes that for the AccessMode parameter, "Almost all drivers should use KernelMode." Microsoft's register access guidance also points toward keeping register access inside the driver rather than exposing mapped register space directly to user mode. I could not find Microsoft documentation presenting direct user-mode BAR mapping as a recommended general-access model for device registers, so even though it works on a desktop PCIe slot, I would not rely on it as a portable design.
Are there known platform restrictions for user-mode MMIO on Thunderbolt, external PCIe bridges, or laptops with Kernel DMA Protection enabled?
Thunderbolt introduces a different PCIe topology from a native motherboard PCIe slot: tunneled PCIe, additional bridges, hotplug security, and potentially different IOMMU handling. Windows does treat externally exposed PCIe hierarchies differently, which could explain why the user-mode mapping behaves differently on this topology. You also mentioned that you already disabled Kernel DMA Protection and the problem persisted. That lowers confidence that Kernel DMA Protection alone is the cause, but other firmware-level or IOMMU behavior may still be in effect even when the Windows-level setting is off.
Has anyone seen kernel MMIO working, but user-mode MMIO returning 0xffffffff?
0xFFFFFFFF from an MMIO read is commonly associated with the transaction not completing successfully; for example, the read may not reach the device, or it may not be forwarded correctly through the bridges in between. Since kernel-mode reads still work at the same offset, a completely invalid BAR or nonfunctional device is less likely, and the result points to the two access paths not being equivalent on this platform.
Worth noting: your kernel path uses READ_REGISTER_ULONG(...), which includes memory barrier semantics. Your user-mode path uses a volatile pointer dereference, which is not semantically identical. That said, if this were purely a barrier issue, I would expect stale or intermittent values, not a hard constant 0xFFFFFFFF, so the more likely explanation is still that the transaction is not getting through on the user-mode path.
Is there a recommended alternative approach besides direct user-mode BAR mapping?
Yes. Keep the BAR mapped in kernel mode and expose only the required operations through IOCTLs. The application sends an IOCTL with the register offset and operation type; the driver validates the offset, performs READ_REGISTER_ULONG / WRITE_REGISTER_ULONG, and returns the result. Microsoft's driver security best practices guidance treats unconstrained device memory access from user mode as unsafe and says the driver should validate and constrain the address and size.
I would suggest:
- Keep MMIO access in the kernel driver.
- Expose only the needed register operations through IOCTLs.
- Validate register offset and size in the driver.
- Avoid exposing the whole BAR directly to user mode.
This should be more reliable across different PCIe topologies and avoids depending on platform- and topology-specific behavior of direct user-mode BAR mappings.
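To make the validation step concrete, here is a minimal sketch in plain C of what the driver-side check could look like. Everything in it is hypothetical and only for illustration: the reg_request_t layout, the REG_OP_* codes, and the function names are not from any Microsoft header, and the register access is done on an ordinary pointer so the logic can be compiled and tested outside a driver. In a real driver the request would arrive in the IOCTL buffer, and the two marked lines would use READ_REGISTER_ULONG / WRITE_REGISTER_ULONG on the kernel-mode BAR mapping instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical operation codes carried in the IOCTL request. */
enum { REG_OP_READ = 0, REG_OP_WRITE = 1 };

/* Hypothetical request layout (not a Windows-defined structure). */
typedef struct {
    uint32_t op;      /* REG_OP_READ or REG_OP_WRITE */
    uint32_t offset;  /* byte offset into the BAR */
    uint32_t value;   /* in: value to write, out: value read */
} reg_request_t;

/* Returns 1 if an aligned 32-bit access at `offset` lies fully inside
 * a BAR of `bar_len` bytes, 0 otherwise. The subtraction form avoids
 * overflow that `offset + 4 <= bar_len` could hit. */
static int reg_offset_is_valid(size_t bar_len, uint32_t offset)
{
    if (offset % sizeof(uint32_t) != 0)
        return 0;                       /* reject unaligned access */
    if (bar_len < sizeof(uint32_t))
        return 0;                       /* BAR too small for any access */
    return offset <= bar_len - sizeof(uint32_t);
}

/* Validates and performs one request. Returns 0 on success and -1 if
 * the request is rejected. `bar` stands in for the kernel-mode mapping. */
static int reg_handle_request(volatile uint32_t *bar, size_t bar_len,
                              reg_request_t *req)
{
    assert(bar != NULL && req != NULL);

    if (!reg_offset_is_valid(bar_len, req->offset))
        return -1;

    volatile uint32_t *reg = bar + req->offset / sizeof(uint32_t);

    switch (req->op) {
    case REG_OP_READ:
        req->value = *reg;   /* real driver: READ_REGISTER_ULONG */
        return 0;
    case REG_OP_WRITE:
        *reg = req->value;   /* real driver: WRITE_REGISTER_ULONG */
        return 0;
    default:
        return -1;           /* unknown operation type */
    }
}
```

With this shape, the application never holds a pointer into the BAR. It only passes an offset and an operation, and the driver rejects anything unaligned, out of range, or of an unknown type before touching the device. If only a few registers need to be reachable, the check could be tightened further to an explicit allow-list of offsets.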
I hope this explanation answers your question. If you have found it helpful so far, I would greatly appreciate it if you could follow the instructions here so that others with the same problem can benefit as well. Thank you.