Re: [PATCH v8 003/103] KVM: Refactor CPU compatibility check on module initialization

From: Sean Christopherson
Date: Mon Aug 15 2022 - 22:22:30 EST


On Fri, Aug 12, 2022, Huang, Kai wrote:
> On Thu, 2022-08-11 at 17:39 +0000, Sean Christopherson wrote:
> > I've been poking at the "hardware enable" code this week for other reasons, and
> > have come to the conclusion that the current implementation is a mess.
>
> Thanks for the lengthy reply :)
>
> First of all, to clarify, I guess by "current implementation" you mean the
> current upstream KVM code, but not this particular patch? :)

Yeah, upstream code.

> > Of course, that path is broken for other reasons too, e.g. needs to prevent CPUs
> > from going on/off-line when KVM is enabling hardware.
> > https://lore.kernel.org/all/20220216031528.92558-7-chao.gao@xxxxxxxxx
>
> If I read correctly, the problem described in above link seems only to be true
> after we move CPUHP_AP_KVM_STARTING from STARTING section to ONLINE section, but
> this hasn't been done yet in the current upstream KVM. Currently,
> CPUHP_AP_KVM_STARTING is still in STARTING section so it is guaranteed it has
> been executed before start_secondary sets itself to online cpu mask.

The lurking issue is that for_each_online_cpu() can against hotplug, i.e. every
instance of for_each_online_cpu() in KVM is buggy (at least on the x86 side, I
can't tell at a glance whether or not arm pKVM's usage is safe).

https://lore.kernel.org/all/87bl20aa72.ffs@tglx

> Btw I saw v4 of Chao's patchset was sent Feb this year. It seems that series
> indeed improved CPU compatibility check and hotplug handling. Any reason that
> series wasn't merged?

AFAIK it was just a lack of reviews/acks for the non-KVM patches.

> Also agreed that kvm_lock should be used. But I am not sure whether
> cpus_read_lock() is needed (whether CPU hotplug should be prevented). In
> current KVM, we don't do CPU compatibility check for hotplug CPU anyway, so when
> KVM does CPU compatibility check using for_each_online_cpu(), if CPU hotplug
> (hot-removal) happens, the worst case is we lose compatibility check on that
> CPU.
>
> Or perhaps I am missing something?

On a hot-add of an incompatible CPU, KVM would potentially skip the compatibility
check and try to enable hardware on an incompatible/broken CPU.

Another possible bug is the checking of hv_get_vp_assist_page(); hot-adding a
CPU that failed to allocate the VP assist page while vmx_init() is checking online
CPUs could result in a NULL pointer deref due to KVM not rejecting the CPU as it
should.

> > mutex_lock(&kvm_lock);
> > cpus_read_lock();
> > r = kvm_arch_add_vm(kvm, kvm_usage_count);
>
> Holding cpus_read_lock() here implies CPU hotplug cannot happen during
> kvm_arch_add_vm(). This needs a justification/comment to explain why.  

Oh, for sure, the above was only intended to be a rough sketch, definitely not a
formal patch.

> Also, assuming we have a justification, since (based on your description above)
> arch _may_ choose to enable hardware within it, but it is not a _must_. So
> maybe remove cpus_read_lock() here and let kvm_arch_add_vm() to decide whether
> to use it?

My thought is that it makes sense to provide a CPU hotplug friendly arch hook since
(a) all architectures except s390 (and maybe PPC) need such a hook (or more likely,
multiple hooks) and (b) cpus_read_lock() will almost never be contended.

In other words, address the problem that's easy to solve (but also easy to forget)
in generic code, but let architectures deal with the hardware enabling problem,
which is a mess to solve in generic code (unless I'm missing something).

All that said, I haven't tried compiling yet, let alone actually running anything,
so it's entirely possible my idea won't pan out.