Thanks for the detailed response, Andy. I now understand the context of your answer.
However, I still don't understand how VMM limits the IOPS used by the client, as my observed figures differ from the configured value by orders of magnitude.
I'm assuming it's actually implemented as some Hyper-V function under the covers. My storage subsystem is:
iSCSI LUN <-> Failover cluster CSV <-> VHD <-> VM
I have figures for observed IOPS at the iSCSI LUN level, but of course those are several layers below the actual VHD. I'm assuming, though I have no documentation to confirm it, that VMM/Hyper-V monitors IOPS at the VM <-> VHD layer, which would be using 4K blocks. The LUN has a 512-byte sector size, the NTFS volume underlying the CSV uses 64K clusters, and the CSV also has its own caching. So what would I monitor at the hypervisor level to see the actual IOPS presented by the VM?
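
To show why I expect the figures to differ between layers, here is a rough back-of-envelope sketch in Python. It is my own arithmetic, not anything taken from VMM or Hyper-V documentation. It converts an IOPS figure at one I/O size into the IOPS that would move the same number of bytes per second at another size. The 1000 IOPS starting value and the function name are hypothetical, and I'm treating the sector and cluster sizes as if they were the actual I/O sizes, which real traffic won't match once requests are coalesced, split or cached.

```python
def equivalent_iops(iops: float, io_size_bytes: int, target_io_size_bytes: int) -> float:
    """Return the IOPS at target_io_size_bytes that moves the same
    bytes per second as `iops` operations of io_size_bytes each."""
    if io_size_bytes <= 0 or target_io_size_bytes <= 0:
        raise ValueError("I/O sizes must be positive")
    throughput_bytes_per_sec = iops * io_size_bytes
    return throughput_bytes_per_sec / target_io_size_bytes


KIB = 1024

# Hypothetical figure as seen at the VM <-> VHD layer, assuming 4K I/Os.
vm_iops = 1000
vm_io_size = 4 * KIB

# Sizes from my setup, treated here as I/O sizes purely for illustration.
layers = {
    "VM <-> VHD (4K, assumed)": 4 * KIB,
    "NTFS cluster under the CSV (64K)": 64 * KIB,
    "LUN sector (512 bytes)": 512,
}

for name, size in layers.items():
    print(f"{name}: {equivalent_iops(vm_iops, vm_io_size, size):g} IOPS")
```

Even with this simplistic model, the same workload could read as anything from tens to thousands of IOPS depending on the layer where it is counted, which is why I need to know where the limit is actually measured.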