Virtual Interrupt Controller
The hypervisor virtualizes interrupt delivery to virtual processors. This is done through the use of a synthetic interrupt controller (SynIC) which is an extension of a virtualized local APIC; that is, each virtual processor has a local APIC instance with the SynIC extensions. These extensions provide a simple inter-partition communication mechanism which is described in the following chapter. Interrupts delivered to a partition fall into two categories: external and internal. External interrupts originate from other partitions or devices, and internal interrupts originate from within the partition itself.
External interrupts are generated in the following situations:
- A physical hardware device generates a hardware interrupt.
- A parent partition asserts a virtual interrupt (typically in the process of emulating a hardware device).
- The hypervisor delivers a message (for example, due to an intercept) to a partition.
- Another partition posts a message.
- Another partition signals an event.
Internal interrupts are generated in the following situations:
- A virtual processor accesses the APIC interrupt command register (ICR).
- A synthetic timer expires.
Local APIC
The SynIC is a superset of a local APIC. The interface to this APIC is given by a set of 32-bit memory mapped registers. This local APIC (including the behavior of the memory mapped registers) is generally compatible with the local APIC on P4/Xeon systems as described in Intel’s and AMD's documentation.
The hypervisor’s local APIC virtualization may deviate from physical APIC operation in the following minor ways:
- On physical systems, the IA32_APIC_BASE MSR can be different for each processor in the system. The hypervisor may require that this MSR contains the same value for all virtual processors within a partition. As such, this MSR may be treated as a partition-wide value. If a virtual processor modifies this register, the value may effectively propagate to all virtual processors within the partition.
- The IA32_APIC_BASE MSR defines a “global enable” bit for enabling or disabling the APIC. The virtualized APIC may always be enabled. If so, this bit will always be set to 1.
- The hypervisor’s local APIC may not be able to generate virtual SMIs (system management interrupts).
- If multiple virtual processors within a partition are assigned identical APIC IDs, behavior of targeted interrupt delivery is boundedly undefined. That is, the hypervisor is free to deliver the interrupt to just one virtual processor, all virtual processors with the specified APIC ID, or no virtual processors. This situation is considered a guest programming error.
- Some of the memory mapped APIC registers may be accessed by way of virtual MSRs.
- The hypervisor may not allow a guest to modify its APIC IDs.
The remaining parts of this section describe only those aspects of SynIC functionality that are extensions of the local APIC.
Local APIC MSR Accesses
The hypervisor provides accelerated MSR access to high usage memory mapped APIC registers. These are the TPR, EOI, and the ICR registers. The ICR low and ICR high registers are combined into one MSR. For performance reasons, the guest operating system should follow the hypervisor recommendation for the usage of the APIC MSRs.
MSR address | Register Name | Description |
---|---|---|
0x40000070 | HV_X64_MSR_EOI | Accesses the APIC EOI |
0x40000071 | HV_X64_MSR_ICR | Accesses the APIC ICR-high and ICR-low |
0x40000072 | HV_X64_MSR_TPR | Access the APIC TPR |
HV_X64_MSR_EOI
Bits | Description | Attributes |
---|---|---|
63:32 | RsvdZ (reserved, should be zero) | Write |
31:0 | EOI value | Write |
HV_X64_MSR_ICR
Bits | Description | Attributes |
---|---|---|
63:32 | ICR high value | Read / Write |
31:0 | ICR low value | Read / Write |
HV_X64_MSR_TPR
Bits | Description | Attributes |
---|---|---|
63:8 | RsvdZ (reserved, should be zero) | Read / Write |
7:0 | TPR value | Read / Write |
This MSR is intended to accelerate access to the TPR in 32-bit mode guest partitions. 64-bit mode guest partitions should set the TPR by way of CR8.
Synthetic Cluster IPI
A hypervisor supports hypercalls which allow to send virtual fixed interrupts to an arbitrary set of virtual processors.
Hypercall | Description |
---|---|
HvCallSendSyntheticClusterIpi | Sends a virtual fixed interrupt to the specified virtual processor set. |
HvCallSendSyntheticClusterIpiEx | Similar to HvCallSendSyntheticClusterIpi, takes a sparse VP set as input. |
EOI Assist
One field in the Virtual Processor Assist Page is the EOI Assist field. The EOI Assist field resides at offset 0 of the overlay page and is 32-bits in size. The format of the EOI assist field is as follows:
Bits | Description | Attributes |
---|---|---|
31:1 | RsvdZ | Read / Write |
0 | No EOI Required | Read / Write |
The guest OS performs an EOI by atomically writing zero to the EOI Assist field of the virtual VP assist page and checking whether the “No EOI required” field was previously zero. If it was, the OS must write to the HV_X64_APIC_EOI MSR thereby triggering an intercept into the hypervisor. The following code is recommended to perform an EOI:
lea rcx, [VirtualApicAssistVa]
btr [rcx], 0
jc NoEoiRequired
mov ecx, HV_X64_APIC_EOI
wrmsr
NoEoiRequired:
The hypervisor sets the “No EOI required” bit when it injects a virtual interrupt if the following conditions are satisfied:
- The virtual interrupt is edge-triggered, and
- There are no lower priority interrupts pending
If, at a later time, a lower priority interrupt is requested, the hypervisor clears the “No EOI required” such that a subsequent EOI causes an intercept.
In case of nested interrupts, the EOI intercept is avoided only for the highest priority interrupt. This is necessary since no count is maintained for the number of EOIs performed by the OS. Therefore only the first EOI can be avoided and since the first EOI clears the “No EOI Required” bit, the next EOI generates an intercept. However nested interrupts are rare, so this is not a problem in the common case.
Note that devices and/or the I/O APIC (physical or synthetic) need not be notified of an EOI for an edge-triggered interrupt – the hypervisor intercepts such EOIs only to update the virtual APIC state. In some cases, the virtual APIC state can be lazily updated – in such cases, the “NoEoiRequired” bit is set by the hypervisor indicating to the guest that an EOI intercept is not necessary. At a later instant, the hypervisor can derive the state of the local APIC depending on the current value of the “NoEoiRequired” bit.
Enabling and disabling this enlightenment can be done at any time independently of the interrupt activity and the APIC state at that moment. While the enlightenment is enabled, conventional EOIs can still be performed irrespective of the “No EOI required” value but they will not realize the performance benefit of the enlightenment.