Re: [PATCH v9 3/6] KVM: x86: Virtualize CR3.LAM_{U48,U57}

From: Sean Christopherson
Date: Wed Jun 28 2023 - 13:40:24 EST


On Wed, Jun 28, 2023, Binbin Wu wrote:
>
>
> On 6/28/2023 7:40 AM, Sean Christopherson wrote:
> > I think I'd prefer to drop this field and avoid bikeshedding the name entirely. The
> > only reason to effectively cache "X86_CR3_LAM_U48 | X86_CR3_LAM_U57" is because
> > guest_cpuid_has() is slow, and I'd rather solve that problem with the "governed
> > feature" framework.
> Thanks for the suggestion.
>
> Is the below patch the lastest patch of "governed feature" framework
> support?
> https://lore.kernel.org/kvm/20230217231022.816138-2-seanjc@xxxxxxxxxx/

Yes, I haven't refreshed it since the original posting.

> Do you have plan to apply it to kvm-x86 repo?

I'm leaning more and more towards pushing it through sooner than later as this
isn't the first time in recent memory that a patch/series has done somewhat odd
things to workaround guest_cpuid_has() being slow. I was hoping to get feedback
before applying, but that's not looking likely at this point.

> > More below.
> >
> > > unsigned long cr4;
> > > unsigned long cr4_guest_owned_bits;
> > > unsigned long cr4_guest_rsvd_bits;
> > > diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
> > > index b1658c0de847..ef8e1b912d7d 100644
> > > --- a/arch/x86/kvm/cpuid.h
> > > +++ b/arch/x86/kvm/cpuid.h
> > > @@ -42,6 +42,11 @@ static inline int cpuid_maxphyaddr(struct kvm_vcpu *vcpu)
> > > return vcpu->arch.maxphyaddr;
> > > }
> > > +static inline bool kvm_vcpu_is_legal_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
> > Heh, I think it makes sense to wrap this one. I'll probably tell you differently
> > tomorrow, but today, let's wrap.
> >
> > > +{
> > > + return !((cr3 & vcpu->arch.reserved_gpa_bits) & ~vcpu->arch.cr3_ctrl_bits);
> > Don't open code something for which there is a perfect helper, i.e. use
> > kvm_vcpu_is_legal_gpa().
> I didn't use the helper because after masking the control bits (LAM bits),
> CR3 is not  a GPA conceptally, i.e, it contains PCID or PWT/PCD in lower
> bits.
> But maybe this is my overthinking?Will use the helper instead.

You're not overthinking it, I had the exact same reaction. However, the above
also directly looks at arch.reserved_gpa_bits, i.e. treats CR3 like a GPA for
all intents and purposes, so it's not any better than using kvm_vcpu_is_legal_gpa().
And I couldn't bring myself to suggest adding a "reserved CR3 bits" mask because
CR3 *does* contain a GPA, i.e. we'd still have to check kvm_vcpu_is_legal_gpa(),
and realistically the "reserved CR3 bits" will never be a superset of "illegal
GPA bits".

> > If we go the governed feature route, this becomes:
> >
> > static inline bool kvm_vcpu_is_legal_cr3(struct kvm_vcpu *vcpu,
> > unsigned long cr3)
> > {
> > if (guest_can_use(vcpu, X86_FEATURE_LAM))
> > cr3 &= ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57);
> >
> > return kvm_vcpu_is_legal_gpa(cr3);
> > }
> >
> > > +}
> > > +
> > > static inline bool kvm_vcpu_is_legal_gpa(struct kvm_vcpu *vcpu, gpa_t gpa)
> > > {
> > > return !(gpa & vcpu->arch.reserved_gpa_bits);
> > > diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
> > > index 92d5a1924fc1..81d8a433dae1 100644
> > > --- a/arch/x86/kvm/mmu.h
> > > +++ b/arch/x86/kvm/mmu.h
> > > @@ -144,6 +144,11 @@ static inline unsigned long kvm_get_active_pcid(struct kvm_vcpu *vcpu)
> > > return kvm_get_pcid(vcpu, kvm_read_cr3(vcpu));
> > > }
> > > +static inline u64 kvm_get_active_cr3_ctrl_bits(struct kvm_vcpu *vcpu)
> > And then this becomes:
> >
> > static inline u64 kvm_get_active_cr3_lam_bits(struct kvm_vcpu *vcpu)
> > {
> > if (!guest_can_use(vcpu, X86_FEATURE_LAM))
> > return 0;
> >
> > return kvm_read_cr3(vcpu) & (X86_CR3_LAM_U48 | X86_CR3_LAM_U57);
> > }
> >
> > > +{
> > > + return kvm_read_cr3(vcpu) & vcpu->arch.cr3_ctrl_bits;
> > > +}
> > > +
> > > static inline void kvm_mmu_load_pgd(struct kvm_vcpu *vcpu)
> > > {
> > > u64 root_hpa = vcpu->arch.mmu->root.hpa;
> > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > > index c8961f45e3b1..deea9a9f0c75 100644
> > > --- a/arch/x86/kvm/mmu/mmu.c
> > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > @@ -3812,7 +3812,13 @@ static int mmu_alloc_shadow_roots(struct kvm_vcpu *vcpu)
> > > hpa_t root;
> > > root_pgd = kvm_mmu_get_guest_pgd(vcpu, mmu);
> > > - root_gfn = root_pgd >> PAGE_SHIFT;
> > > + /*
> > > + * Guest PGD can be CR3 or EPTP (for nested EPT case). CR3 may contain
> > > + * additional control bits (e.g. LAM control bits). To be generic,
> > > + * unconditionally strip non-address bits when computing the GFN since
> > > + * the guest PGD has already been checked for validity.
> > > + */
> > Drop this comment, the code is self-explanatory, and the comment is incomplete,
> > e.g. it can also be nCR3.
> I was trying to use CR3 for both nested/non-nested cases. Sorry for the
> confusion.
> Anyway, will drop the comment.

FWIW, EPTP also has non-address bits. But the real reason I don't think this
warrants a comment is that "pgd" is a specifically not an address, i.e. it's
fully expected and intuitive that retrieving the gfn from a pgd would need to
mask off bits.