Re: [PATCH 03/10] KVM: SVM: Drop pointless masking of kernel page pa's with "AVIC's" HPA mask

From: Maxim Levitsky
Date: Thu Oct 05 2023 - 12:03:23 EST


У вт, 2023-08-15 у 14:35 -0700, Sean Christopherson пише:
> Drop AVIC_HPA_MASK and all its users, the mask is just the 4KiB-aligned
> maximum theoretical physical address for x86-64 CPUs.

Typo: you mean 'current max theoretical physical address' because it might increase
the future again (Intel already did increase it, couple of times).


> All usage in KVM
> masks the result of page_to_phys(), which on x86-64 is guaranteed to be
> 4KiB aligned and a legal physical address; if either of those requirements
> doesn't hold true, KVM has far bigger problems.

I do think that a build time assert that max physical address is as expected
by AVIC, is needed.

Consider this: intel/amd releases yet another extension of the max physical address,
the kernel gets updated, but for CPUs which lack this extension the max physical
address will still be 52 bit and should still be respected.

My CPU for example has 48 bits in max physical address, while the kernel
thinks that max valid physical address is 52 bit.

I understand that this is a theoretical problem but at least a build time
assert that kernel max physical address matches avic's should be done.

Especially if we consider the fact that this patch also fixes a purely theoretical
problem as well.

I do agree that masking output of page_to_phys is pointless and even dangerous,
but asserting that we are not trying to use address which is
outside of AVIC's capabilities is a good thing to have, instead of relying blindly
on the kernel to do the right thing.


>
> The unnecessarily masking in avic_init_vmcb() also incorrectly assumes
> that SME's C-bit resides between bits 51:11; that holds true for current
> CPUs, but isn't required by AMD's architecture:
>
> In some implementations, the bit used may be a physical address bit
>
> Key word being "may".
>
> Opportunistically use the GENMASK_ULL() version for
> AVIC_PHYSICAL_ID_ENTRY_BACKING_PAGE_MASK, which is far more readable
> than a set of repeating Fs. Keep the macro even though it's unused, and
> will likely never be used, as it helps visualize the layout of an entry.




>
> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> ---
> arch/x86/include/asm/svm.h | 6 +-----
> arch/x86/kvm/svm/avic.c | 11 +++++------
> 2 files changed, 6 insertions(+), 11 deletions(-)
>
> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
> index 609c9b596399..df644ca3febe 100644
> --- a/arch/x86/include/asm/svm.h
> +++ b/arch/x86/include/asm/svm.h
> @@ -250,7 +250,7 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
> #define AVIC_LOGICAL_ID_ENTRY_VALID_MASK (1 << 31)
>
> #define AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK GENMASK_ULL(11, 0)
> -#define AVIC_PHYSICAL_ID_ENTRY_BACKING_PAGE_MASK (0xFFFFFFFFFFULL << 12)
> +#define AVIC_PHYSICAL_ID_ENTRY_BACKING_PAGE_MASK GENMASK_ULL(51, 12)
> #define AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK (1ULL << 62)
> #define AVIC_PHYSICAL_ID_ENTRY_VALID_MASK (1ULL << 63)
> #define AVIC_PHYSICAL_ID_TABLE_SIZE_MASK (0xFFULL)
> @@ -284,10 +284,6 @@ enum avic_ipi_failure_cause {
> static_assert((AVIC_MAX_PHYSICAL_ID & AVIC_PHYSICAL_MAX_INDEX_MASK) == AVIC_MAX_PHYSICAL_ID);
> static_assert((X2AVIC_MAX_PHYSICAL_ID & AVIC_PHYSICAL_MAX_INDEX_MASK) == X2AVIC_MAX_PHYSICAL_ID);
>
> -#define AVIC_HPA_MASK ~((0xFFFULL << 52) | 0xFFF)
> -static_assert(AVIC_PHYSICAL_ID_ENTRY_BACKING_PAGE_MASK == AVIC_HPA_MASK);
> -static_assert(AVIC_PHYSICAL_ID_ENTRY_BACKING_PAGE_MASK == GENMASK_ULL(51, 12));
> -
> #define SVM_SEV_FEAT_DEBUG_SWAP BIT(5)
>
> struct vmcb_seg {
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index 442c58ef8158..b8313f2d88fa 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -248,9 +248,9 @@ void avic_init_vmcb(struct vcpu_svm *svm, struct vmcb *vmcb)
> phys_addr_t lpa = __sme_set(page_to_phys(kvm_svm->avic_logical_id_table_page));
> phys_addr_t ppa = __sme_set(page_to_phys(kvm_svm->avic_physical_id_table_page));
>
> - vmcb->control.avic_backing_page = bpa & AVIC_HPA_MASK;
> - vmcb->control.avic_logical_id = lpa & AVIC_HPA_MASK;
> - vmcb->control.avic_physical_id = ppa & AVIC_HPA_MASK;
> + vmcb->control.avic_backing_page = bpa;
> + vmcb->control.avic_logical_id = lpa;
> + vmcb->control.avic_physical_id = ppa;
> vmcb->control.avic_vapic_bar = APIC_DEFAULT_PHYS_BASE;
>
> if (kvm_apicv_activated(svm->vcpu.kvm))
> @@ -308,7 +308,7 @@ static int avic_init_backing_page(struct kvm_vcpu *vcpu)
> if (!entry)
> return -EINVAL;
>
> - new_entry = __sme_set(page_to_phys(svm->avic_backing_page) & AVIC_HPA_MASK) |
> + new_entry = __sme_set(page_to_phys(svm->avic_backing_page)) |
> AVIC_PHYSICAL_ID_ENTRY_VALID_MASK;
> WRITE_ONCE(*entry, new_entry);
>
> @@ -917,8 +917,7 @@ int avic_pi_update_irte(struct kvm *kvm, unsigned int host_irq,
> struct amd_iommu_pi_data pi;
>
> /* Try to enable guest_mode in IRTE */
> - pi.base = __sme_set(page_to_phys(svm->avic_backing_page) &
> - AVIC_HPA_MASK);
> + pi.base = __sme_set(page_to_phys(svm->avic_backing_page));
> pi.ga_tag = AVIC_GATAG(to_kvm_svm(kvm)->avic_vm_id,
> svm->vcpu.vcpu_id);
> pi.is_guest_mode = true;