Re: [PATCH v3 05/28] KVM: x86: Don't inhibit APICv/AVIC if xAPIC ID mismatch is due to 32-bit ID

From: Alejandro Jimenez
Date: Wed Sep 28 2022 - 16:50:58 EST




On 9/28/2022 12:51 PM, Sean Christopherson wrote:
On Wed, Sep 28, 2022, Maxim Levitsky wrote:
On Tue, 2022-09-27 at 23:15 -0400, Alejandro Jimenez wrote:

On 9/20/2022 7:31 PM, Sean Christopherson wrote:
Truncate the vcpu_id, a.k.a. x2APIC ID, to an 8-bit value when comparing
it against the xAPIC ID to avoid false positives (sort of) on systems
with >255 CPUs, i.e. with IDs that don't fit into a u8. The intent of
APIC_ID_MODIFIED is to inhibit APICv/AVIC when the xAPIC is changed from
its original value,

The mismatch isn't technically a false positive, as architecturally the
xAPIC IDs do end up being aliased in this scenario, and neither APICv
nor AVIC correctly handles IPI virtualization when there is aliasing.
However, KVM already deliberately does not honor the aliasing behavior
that results when an x2APIC ID gets truncated to an xAPIC ID. I.e. the
resulting APICv/AVIC behavior is aligned with KVM's existing behavior
when KVM's x2APIC hotplug hack is effectively enabled.

If/when KVM provides a way to disable the hotplug hack, APICv/AVIC can
piggyback whatever logic disables the optimized APIC map (which is what
provides the hotplug hack), i.e. so that KVM's optimized map and APIC
virtualization yield the same behavior.

For now, fix the immediate problem of APIC virtualization being disabled
for large VMs, which is a much more pressing issue than ensuring KVM
honors architectural behavior for APIC ID aliasing.
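
To illustrate the truncated comparison described above, here is a minimal
standalone sketch. It is not the KVM code: xapic_id_matches() is a
hypothetical helper name, and it assumes the xAPIC ID has already been
extracted as an 8-bit value.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper for illustration only, not a KVM function.
 *
 * An xAPIC ID is 8 bits wide, while the vcpu_id (a.k.a. x2APIC ID) is
 * 32 bits wide.  Comparing the two directly reports a mismatch for any
 * vcpu_id above 255 even though nothing modified the ID.  Truncating the
 * vcpu_id to a u8 first compares only the bits the xAPIC ID can hold.
 */
static bool xapic_id_matches(uint8_t xapic_id, uint32_t vcpu_id)
{
	return xapic_id == (uint8_t)vcpu_id;
}
```

With this sketch, vcpu_id 300 (0x12c) matches xAPIC ID 44 (0x2c), so a VM
with more than 255 vCPUs would not be flagged just because the ID was
truncated, while a genuinely changed ID (e.g. 45) still mismatches.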

I built a host kernel with this entire series on top of mainline
v6.0-rc6, and booting a guest with AVIC enabled works as expected on the
initial boot. The issue is that during the first reboot AVIC is
inhibited due to APICV_INHIBIT_REASON_APIC_ID_MODIFIED, and I see
constant inhibition events due to APICV_INHIBIT_REASON_IRQWIN as seen in


APICV_INHIBIT_REASON_IRQWIN is OK, because that happens about every time
the good old PIT timer fires which happens on reboot.

APICV_INHIBIT_REASON_APIC_ID_MODIFIED should not happen as you noted,
this needs investigation.

Ya, I'll take a look.

It happens regardless of vCPU count (tested with 2, 32, 255, 380, and
512 vCPUs). This state persists for all subsequent reboots, until the VM
is terminated. For vCPU counts < 256, when x2apic is disabled the
problem does not occur, and AVIC continues to work properly after reboots.

Bit of a shot in the dark, but does the below fix the issue?
The patch below fixes the problems for all the scenarios I have tested so far.

Thank you,
Alejandro

There are two issues with calling kvm_lapic_xapic_id_updated() from
kvm_apic_state_fixup():

1. The xAPIC ID should only be refreshed on "set".

2. The refresh needs to be noted after memcpy(vcpu->arch.apic->regs, s->regs, sizeof(*s));

and a third bug in the helper itself, as changes to the ID should be ignored if
the APIC is hardware disabled since the ID is reset to the vcpu_id when the APIC
is hardware enabled (architectural behavior).

---
arch/x86/kvm/lapic.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 804d529d9bfb..b8b2faf5abc7 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2159,6 +2159,9 @@ static void kvm_lapic_xapic_id_updated(struct kvm_lapic *apic)
 {
 	struct kvm *kvm = apic->vcpu->kvm;
 
+	if (!kvm_apic_hw_enabled(apic))
+		return;
+
 	if (KVM_BUG_ON(apic_x2apic_mode(apic), kvm))
 		return;
 
@@ -2875,8 +2878,6 @@ static int kvm_apic_state_fixup(struct kvm_vcpu *vcpu,
 			icr = __kvm_lapic_get_reg64(s->regs, APIC_ICR);
 			__kvm_lapic_set_reg(s->regs, APIC_ICR2, icr >> 32);
 		}
-	} else {
-		kvm_lapic_xapic_id_updated(vcpu->arch.apic);
 	}
 
 	return 0;
@@ -2912,6 +2913,9 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
 	}
 	memcpy(vcpu->arch.apic->regs, s->regs, sizeof(*s));
 
+	if (!apic_x2apic_mode(vcpu->arch.apic))
+		kvm_lapic_xapic_id_updated(vcpu->arch.apic);
+
 	atomic_set_release(&apic->vcpu->kvm->arch.apic_map_dirty, DIRTY);
 	kvm_recalculate_apic_map(vcpu->kvm);
 	kvm_apic_set_version(vcpu);

base-commit: 0b502152c0b8523f399bdb53096e2d620c5795b5