Re: [PATCH RFC 02/11] KVM: x86: hyper-v: Move Hyper-V partition assist page out of Hyper-V emulation context

From: Vitaly Kuznetsov
Date: Mon Oct 16 2023 - 08:46:26 EST


Maxim Levitsky <mlevitsk@xxxxxxxxxx> writes:

> On Tue, 2023-10-10 at 18:02 +0200, Vitaly Kuznetsov wrote:
>> Hyper-V partition assist page is used when KVM runs on top of Hyper-V and
>> is not used for Windows/Hyper-V guests on KVM, this means that 'hv_pa_pg'
>> placement in 'struct kvm_hv' is unfortunate. As a preparation to making
>> Hyper-V emulation optional, move 'hv_pa_pg' to 'struct kvm_arch' and put it
>> under CONFIG_HYPERV.
>
> It took me a while to realize that this partition assist page is indeed something that L0,
> running above KVM consumes.
> (what a confusing name Microsoft picked...)
>
> As far as I know currently the partition assist page has only
> one shared memory variable which allows L1 to be notified of direct TLB flushes that L0 does for L2,
> but since KVM doesn't need it, it
> never touches this variable/page,
> but specs still demand that L1 does allocate that page.
>

Yes,

KVM doesn't ask L0 (Hyper-V) to deliver synthetic vmexits, but the page
still needs to be allocated. I'm not sure whether this is done just to
follow the spec ("The partition assist page is a page-size aligned
page-size region of memory that the L1 hypervisor must allocate and zero
before direct flush hypercalls can be used.") or whether anyone has ever
tried writing '0' to the corresponding field to see what happens with
various Hyper-V versions. Even if that happens to work today, there's no
guarantee for the future.

>
> If you agree, it would be great to add a large comment to the code,
> explaining the above,

There's this in vmx.c:

/*
* Synthetic VM-Exit is not enabled in current code and so All
* evmcs in singe VM shares same assist page.
*/

but this can certainly be extended. Moreover, it seems that
hv_enable_l2_tlb_flush() should move to vmx_onhyperv.c to make the fact
that it's for KVM-on-Hyper-V 'more obvious'.

> and the fact that the partition assist page
> is something L1 exposes to L0.
>
> I don't know though where to put the comment
> because hv_enable_l2_tlb_flush is duplicated between SVM and VMX.
>
> It might be a good idea to have a helper function to allocate the partition assist page,
> which will both reduce the code duplication slightly and allow us to
> put this comment there.

OK.
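
Something along these lines, perhaps. This is only an untested userspace
sketch of the allocate-once pattern, not the actual patch: the helper
names hv_get_partition_assist_page() and hv_free_partition_assist_page()
are hypothetical, the struct definitions are minimal stand-ins for the
kernel ones, and aligned_alloc() + memset() stands in for
kzalloc(PAGE_SIZE, GFP_KERNEL).

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE 4096UL

/* Userspace stand-ins for the kernel types; layouts are illustrative only. */
struct hv_partition_assist_pg { unsigned char data[PAGE_SIZE]; };
struct kvm_arch { struct hv_partition_assist_pg *hv_pa_pg; };
struct kvm { struct kvm_arch arch; };
struct kvm_vcpu { struct kvm *kvm; };

/*
 * Hypothetical common helper for the SVM and VMX paths.
 *
 * The partition assist page is something Hyper-V (L0) requires from KVM
 * (L1) before direct TLB flush for L2 guests can be used. KVM does not
 * consume the page itself, but the spec demands that it is allocated
 * and zeroed. One page is shared by all vCPUs of a VM.
 *
 * Returns the page, or NULL if the allocation failed.
 */
static struct hv_partition_assist_pg *
hv_get_partition_assist_page(struct kvm_vcpu *vcpu)
{
	struct hv_partition_assist_pg **p_hv_pa_pg =
		&vcpu->kvm->arch.hv_pa_pg;

	if (!*p_hv_pa_pg) {
		/* Stand-in for kzalloc(): page-aligned and zero-filled. */
		void *page = aligned_alloc(PAGE_SIZE, PAGE_SIZE);

		if (page)
			memset(page, 0, PAGE_SIZE);
		*p_hv_pa_pg = page;
	}

	return *p_hv_pa_pg;
}

/* Hypothetical counterpart for the VM teardown path. */
static void hv_free_partition_assist_page(struct kvm *kvm)
{
	free(kvm->arch.hv_pa_pg);	/* kfree() in the kernel */
	kvm->arch.hv_pa_pg = NULL;
}
```

Both svm_hv_enable_l2_tlb_flush() and hv_enable_l2_tlb_flush() could
then call the helper and bail out with an error if it returns NULL, and
the explanatory comment would live in exactly one place.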

>
>
> Best regards,
> Maxim Levitsky
>
>>
>> No functional change intended.
>>
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
>> ---
>> arch/x86/include/asm/kvm_host.h | 2 +-
>> arch/x86/kvm/svm/svm_onhyperv.c | 2 +-
>> arch/x86/kvm/vmx/vmx.c | 2 +-
>> arch/x86/kvm/x86.c | 4 +++-
>> 4 files changed, 6 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index e5d4b8a44630..711dc880a9f0 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -1115,7 +1115,6 @@ struct kvm_hv {
>> */
>> unsigned int synic_auto_eoi_used;
>>
>> - struct hv_partition_assist_pg *hv_pa_pg;
>> struct kvm_hv_syndbg hv_syndbg;
>> };
>>
>> @@ -1436,6 +1435,7 @@ struct kvm_arch {
>> #if IS_ENABLED(CONFIG_HYPERV)
>> hpa_t hv_root_tdp;
>> spinlock_t hv_root_tdp_lock;
>> + struct hv_partition_assist_pg *hv_pa_pg;
>> #endif
>> /*
>> * VM-scope maximum vCPU ID. Used to determine the size of structures
>> diff --git a/arch/x86/kvm/svm/svm_onhyperv.c b/arch/x86/kvm/svm/svm_onhyperv.c
>> index 7af8422d3382..d19666f9b9ac 100644
>> --- a/arch/x86/kvm/svm/svm_onhyperv.c
>> +++ b/arch/x86/kvm/svm/svm_onhyperv.c
>> @@ -19,7 +19,7 @@ int svm_hv_enable_l2_tlb_flush(struct kvm_vcpu *vcpu)
>> {
>> struct hv_vmcb_enlightenments *hve;
>> struct hv_partition_assist_pg **p_hv_pa_pg =
>> - &to_kvm_hv(vcpu->kvm)->hv_pa_pg;
>> + &vcpu->kvm->arch.hv_pa_pg;
>>
>> if (!*p_hv_pa_pg)
>> *p_hv_pa_pg = kzalloc(PAGE_SIZE, GFP_KERNEL);
>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>> index 72e3943f3693..b7dc7acf14be 100644
>> --- a/arch/x86/kvm/vmx/vmx.c
>> +++ b/arch/x86/kvm/vmx/vmx.c
>> @@ -524,7 +524,7 @@ static int hv_enable_l2_tlb_flush(struct kvm_vcpu *vcpu)
>> {
>> struct hv_enlightened_vmcs *evmcs;
>> struct hv_partition_assist_pg **p_hv_pa_pg =
>> - &to_kvm_hv(vcpu->kvm)->hv_pa_pg;
>> + &vcpu->kvm->arch.hv_pa_pg;
>> /*
>> * Synthetic VM-Exit is not enabled in current code and so All
>> * evmcs in singe VM shares same assist page.
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 9f18b06bbda6..e273ce8e0b3f 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -12291,7 +12291,9 @@ void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu)
>>
>> void kvm_arch_free_vm(struct kvm *kvm)
>> {
>> - kfree(to_kvm_hv(kvm)->hv_pa_pg);
>> +#if IS_ENABLED(CONFIG_HYPERV)
>> + kfree(kvm->arch.hv_pa_pg);
>> +#endif
>> __kvm_arch_free_vm(kvm);
>> }
>>
>
>
>
>

--
Vitaly