Re: [PATCH 1/1] KVM: x86: Allow emulation of EOI writes with AVIC enabled

From: Sean Christopherson
Date: Tue Sep 06 2022 - 11:55:43 EST


On Sat, Sep 03, 2022, Alejandro Jimenez wrote:
> According to section 15.29.9.2 - AVIC Access to un-accelerated vAPIC register
> of the AMD APM [1]:
>
> "A guest access to an APIC register that is not accelerated by AVIC results in
> a #VMEXIT with the exit code of AVIC_NOACCEL. This fault is also generated if
> an EOI is attempted when the highest priority in-service interrupt is set for
> level-triggered mode."
>
> This is also stated in Table 15-22 - Guest vAPIC Register Access Behavior,
> confirming that AVIC hardware traps on EOI writes for level triggered
> interrupts, and leading to the following call stack:
>
> avic_unaccelerated_access_interception()
> -> avic_unaccel_trap_write()
> -> kvm_apic_write_nodecode()
> -> kvm_lapic_msr_read()
> -> kvm_lapic_reg_read()
>
> In kvm_lapic_reg_read(), the APIC_EOI offset (0xb0) is not allowed as valid, so
> the error returned triggers the assertion introduced by 'commit 70c8327c11c6
> ("KVM: x86: Bug the VM if an accelerated x2APIC trap occurs on a "bad" reg")'
> and kills the VM.
>
> Add APIC_EOI offset to the valid mask in kvm_lapic_reg_read() to allow the
> emulation of EOI behavior for level triggered interrupts.
>
> [1] https://www.amd.com/system/files/TechDocs/24593.pdf
>
> Fixes: 0105d1a52640 ("KVM: x2apic interface to lapic")
> Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@xxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
> ---
>
> I am unsure as to the proper commit to use for the Fixes: tag. Technically the
> issue was introduced by the initial SVM AVIC commits in 2016, since they failed
> to add the EOI offset to the valid mask.
>
> To be safe, I used the commit that introduces the valid mask, but that is
> somewhat misleading since at the time AVIC was not available, and I believe that
> Intel posted interrupts implementation does not require access to EOI offset in
> this code.
>
> Please correct Fixes: tag if necessary.
> ---
> arch/x86/kvm/lapic.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 9dda989a1cf0..61041fecfa89 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -1452,6 +1452,7 @@ static int kvm_lapic_reg_read(struct kvm_lapic *apic, u32 offset, int len,
> APIC_REG_MASK(APIC_LVR) |
> APIC_REG_MASK(APIC_TASKPRI) |
> APIC_REG_MASK(APIC_PROCPRI) |
> + APIC_REG_MASK(APIC_EOI) |

EOI is write-only for x2APIC, reads are supposed to #GP. So unfortunately, simply
adding it to the set of allowed registers won't work.

EOI is also write-only for xAPIC on Intel, but apparently AMD allows reads.

I'm leaning towards this as a fix. The only reason KVM uses kvm_lapic_msr_read()
is to play nice with the 64-bit ICR in x2APIC. I don't love that the x2APIC details
bleed further into kvm_apic_write_nodecode(), but odds are good that if there's ever
another 64-bit register, it'll need to be special cased here anyways, same as ICR.

--
From: Sean Christopherson <seanjc@xxxxxxxxxx>
Date: Tue, 6 Sep 2022 07:52:41 -0700
Subject: [PATCH] KVM: x86: Blindly get current x2APIC reg value on "nodecode
write" traps

When emulating a x2APIC write in response to an APICv/AVIC trap, get the
the written value from the vAPIC page without checking that reads are
allowed for the target register. AVIC can generate trap-like VM-Exits on
writes to EOI, and so KVM needs to get the written value from the backing
page without running afoul of EOI's write-only behavior.

Alternatively, EOI could be special cased to always write '0', e.g. so
that the sanity check could be preserved, but x2APIC on AMD is actually
supposed to disallow non-zero writes (not emulated by KVM), and the
sanity check was a byproduct of how the KVM code was written, i.e. wasn't
added to guard against anything in particular.

Fixes: 70c8327c11c6 ("KVM: x86: Bug the VM if an accelerated x2APIC trap occurs on a "bad" reg")
Fixes: 1bd9dfec9fd4 ("KVM: x86: Do not block APIC write for non ICR registers")
Reported-by: Alejandro Jimenez <alejandro.j.jimenez@xxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
---
arch/x86/kvm/lapic.c | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 4cebbdd3431b..76a19bf1eb55 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2349,23 +2349,18 @@ void kvm_apic_write_nodecode(struct kvm_vcpu *vcpu, u32 offset)
struct kvm_lapic *apic = vcpu->arch.apic;
u64 val;

- if (apic_x2apic_mode(apic)) {
- if (KVM_BUG_ON(kvm_lapic_msr_read(apic, offset, &val), vcpu->kvm))
- return;
- } else {
- val = kvm_lapic_get_reg(apic, offset);
- }
-
/*
* ICR is a single 64-bit register when x2APIC is enabled. For legacy
* xAPIC, ICR writes need to go down the common (slightly slower) path
* to get the upper half from ICR2.
*/
if (apic_x2apic_mode(apic) && offset == APIC_ICR) {
+ val = kvm_lapic_get_reg64(apic, APIC_ICR);
kvm_apic_send_ipi(apic, (u32)val, (u32)(val >> 32));
trace_kvm_apic_write(APIC_ICR, val);
} else {
/* TODO: optimize to just emulate side effect w/o one more write */
+ val = kvm_lapic_get_reg(apic, offset);
kvm_lapic_reg_write(apic, offset, (u32)val);
}
}

base-commit: 476d5fb78ea6438941559af4814a2795849cb8f0
--