Re: Nested AVIC design (was:Re: [RFC PATCH v3 04/19] KVM: x86: mmu: allow to enable write tracking externally)

From: Sean Christopherson
Date: Wed Nov 09 2022 - 19:47:25 EST


Sorry for the super slow reply, I don't have a good excuse other than I needed to
take a break from AVIC code...

On Mon, Oct 03, 2022, Maxim Levitsky wrote:
> On Thu, 2022-09-29 at 22:38 +0000, Sean Christopherson wrote:
> > On Mon, Aug 08, 2022, Maxim Levitsky wrote:
> > > Hi Sean, Paolo, and everyone else who wants to review my nested AVIC work.
> >
> > Before we dive deep into design details, I think we should first decide whether
> > or not nested AVIC is worth pursuing/supporting.
> >
> > - Rome has a ucode/silicon bug with no known workaround and no anticipated fix[*];
> > AMD's recommended "workaround" is to disable AVIC.
> > - AVIC is not available in Milan, which may or may not be related to the
> > aforementioned bug.
> > - AVIC is making a comeback on Zen4, but Zen4 comes with x2AVIC.
> > - x2APIC is likely going to become ubiquitous, e.g. Intel is effectively
> > requiring x2APIC to fudge around xAPIC bugs.
> > - It's actually quite realistic to effectively force the guest to use x2APIC,
> > at least if it's a Linux guest. E.g. turn x2APIC on in BIOS, which is often
> > (always?) controlled by the host, and Linux will use x2APIC.
> >
> > In other words, given that AVIC is well on its way to becoming a "legacy" feature,
> > IMO there needs to be a fairly strong use case to justify taking on this much code
> > and complexity. ~1500 lines of code to support a feature that has historically
> > been buggy _without_ nested support is going to require a non-trivial amount of
> > effort to review, stabilize, and maintain.
> >
> > [*] 1235 "Guest With AVIC (Advanced Virtual Interrupt Controller) Enabled May Fail
> > to Process IPI (Inter-Processor Interrupt) Until Guest Is Re-Scheduled" in
> > https://www.amd.com/system/files/TechDocs/56323-PUB_1.00.pdf
> >
>
> I am afraid that you mixed things up:
>
> Your mistake is that x2avic is just a minor addition to AVIC. It is still for
> all practical purposes the same feature.

...

> Physid tables, apic backing pages, doorbell emulation,
> everything is pretty much unchanged.

Ya, it finally clicked for me that KVM would need to shadow the physical ID
tables irrespective of x2APIC.

I'm still very hesitant to support full virtualization of nested (x2)AVIC. The
complexity and amount of code is daunting, and nSVM has lower hanging fruit that
we should pick before going after full nested (x2)AVIC, e.g. SVM's TLB flushing
needs a serious overhaul. And if we go through the pain for SVM, I think we'd
probably want to come up with a solution that can be at least partially shared
with VMX's IPI virtualization.

As an intermediate step, can we expose (x2)AVIC to L2 without any shadowing?
E.g. run all L2s with a single dummy physical ID table and emulate IPIs in KVM?
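
To make the idea concrete, here is a rough userspace toy model, not KVM code.
It assumes that an IPI whose target entry in the physical ID table is invalid or
not marked running traps to the hypervisor instead of being delivered by
hardware, so a table with no running entries forces every L2 IPI through
software emulation. The struct and function names (dummy_pid_table, toy_vcpu,
send_ipi, ...) are hypothetical, and the two bit positions are my assumption of
the entry layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout of a physical ID table entry: only these two bits matter here. */
#define PID_ENTRY_IS_RUNNING (1ULL << 62)
#define PID_ENTRY_VALID      (1ULL << 63)
#define NR_PID_ENTRIES       256
#define NR_VECTORS           256

/* Hypothetical: one table shared by every L2, regardless of what L1 set up. */
struct dummy_pid_table {
	uint64_t entry[NR_PID_ENTRIES];
};

/* Hypothetical: software view of a target vCPU's pending interrupts (IRR). */
struct toy_vcpu {
	uint32_t irr[NR_VECTORS / 32];
};

/* No entry is valid or running, so hardware can never deliver directly. */
static void dummy_pid_table_init(struct dummy_pid_table *t)
{
	memset(t, 0, sizeof(*t));
}

/* Models the hardware decision: deliver directly only if valid and running. */
static bool ipi_needs_emulation(const struct dummy_pid_table *t, uint8_t dest)
{
	uint64_t e = t->entry[dest];

	return !(e & PID_ENTRY_VALID) || !(e & PID_ENTRY_IS_RUNNING);
}

/* Software delivery: mark the vector pending in the target's IRR. */
static void emulate_ipi(struct toy_vcpu *target, uint8_t vector)
{
	assert(target != NULL);
	target->irr[vector / 32] |= 1u << (vector % 32);
}

/* Returns true if the IPI was emulated in software, false if left to hardware. */
static bool send_ipi(const struct dummy_pid_table *t, struct toy_vcpu *vcpus,
		     uint8_t dest, uint8_t vector)
{
	if (!ipi_needs_emulation(t, dest))
		return false;

	emulate_ipi(&vcpus[dest], vector);
	return true;
}
```

With the dummy table, send_ipi() takes the emulation path for every
destination, which is the point: no shadowing of L1's tables is needed, at the
cost of an exit per IPI.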

If that works, that seems like a logical first step even if we want to eventually
support nested IPI virtualization.