RE: [PATCH 1/2] x86/hyperv: Fix hyperv_pcpu_input_arg handling when CPUs go online/offline

From: Michael Kelley (LINUX)
Date: Mon May 22 2023 - 10:13:58 EST


From: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx> Sent: Monday, May 22, 2023 1:56 AM
>
> Michael Kelley <mikelley@xxxxxxxxxxxxx> writes:
>

[snip]

> > diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
> > index 0f1001d..696004a 100644
> > --- a/include/linux/cpuhotplug.h
> > +++ b/include/linux/cpuhotplug.h
> > @@ -201,6 +201,7 @@ enum cpuhp_state {
> > /* Online section invoked on the hotplugged CPU from the hotplug thread */
> > CPUHP_AP_ONLINE_IDLE,
> > CPUHP_AP_KVM_ONLINE,
> > + CPUHP_AP_HYPERV_ONLINE,
>
> (Cc: KVM)
>
> I would very much prefer we swap the order with KVM here: hv_cpu_init()
> allocates and sets vCPU's VP assist page which is used by KVM on
> CPUHP_AP_KVM_ONLINE:
>
> kvm_online_cpu() -> __hardware_enable_nolock() ->
> kvm_arch_hardware_enable() -> vmx_hardware_enable():
>
> /*
> * This can happen if we hot-added a CPU but failed to allocate
> * VP assist page for it.
> */
> if (kvm_is_using_evmcs() && !hv_get_vp_assist_page(cpu))
> return -EFAULT;
>
> With the change, this is likely broken.
>
> FWIF, KVM also needs current vCPU's VP index (also set by hv_cpu_init())
> through __kvm_x86_vendor_init() -> set_hv_tscchange_cb() call chain but
> this happens upon KVM module load so CPU hotplug ordering should not
> matter as this always happens on a CPU which is already online.
>
> Generally, as 'KVM on Hyper-V' is a supported scenario, doing Hyper-V
> init before KVM's (and vice versa on teardown) makes sense.
>
> > CPUHP_AP_SCHED_WAIT_EMPTY,
> > CPUHP_AP_SMPBOOT_THREADS,
> > CPUHP_AP_X86_VDSO_VMA_ONLINE,

I have no objection to putting CPUHP_AP_HYPERV_ONLINE first. I did
not give any consideration to a possible dependency between the two. :-(
But note that in current code, hv_cpu_init() is running on the
CPUHP_AP_ONLINE_DYN state, which is also after KVM. So this patch
doesn't change the order w.r.t. KVM and the VP assist page. Are things
already broken for KVM, or is something else happening that makes it
work anyway?

Michael