Re: [PATCH 08/12] KVM: MMU: do not consult levels when freeing roots

From: Sean Christopherson
Date: Thu Feb 10 2022 - 19:54:39 EST


On Fri, Feb 11, 2022, Sean Christopherson wrote:
> On Wed, Feb 09, 2022, Paolo Bonzini wrote:
> > Right now, PGD caching requires a complicated dance of first computing
> > the MMU role and passing it to __kvm_mmu_new_pgd, and then separately calling
>
> Nit, adding () after function names helps readers easily recognize when you're
> talking about a specific function, e.g. as opposed to a concept or whatever.
>
> > kvm_init_mmu.
> >
> > Part of this is due to kvm_mmu_free_roots using mmu->root_level and
> > mmu->shadow_root_level to distinguish whether the page table uses a single
> > root or 4 PAE roots. Because kvm_init_mmu can overwrite mmu->root_level,
> > kvm_mmu_free_roots must be called before kvm_init_mmu.
> >
> > However, even after kvm_init_mmu there is a way to detect whether the page table
> > has a single root or four, because the pae_root does not have an associated
> > struct kvm_mmu_page.
>
> Suggest a reword on the final paragraph, because there's a discrepancy with the
> code (which handles 0, 1, or 4 "roots", versus just "single or four").
>
> However, even after kvm_init_mmu() there is a way to detect whether the
> page table may hold PAE roots, as root.hpa isn't backed by a shadow page when
> it points at PAE roots.
>
> > Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> > ---
> > arch/x86/kvm/mmu/mmu.c | 10 ++++++----
> > 1 file changed, 6 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 3c3f597ea00d..95d0fa0bb876 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -3219,12 +3219,15 @@ void kvm_mmu_free_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
> > struct kvm *kvm = vcpu->kvm;
> > int i;
> > LIST_HEAD(invalid_list);
> > - bool free_active_root = roots_to_free & KVM_MMU_ROOT_CURRENT;
> > + bool free_active_root;
> >
> > BUILD_BUG_ON(KVM_MMU_NUM_PREV_ROOTS >= BITS_PER_LONG);
> >
> > /* Before acquiring the MMU lock, see if we need to do any real work. */
> > - if (!(free_active_root && VALID_PAGE(mmu->root.hpa))) {
> > + free_active_root = (roots_to_free & KVM_MMU_ROOT_CURRENT)
> > + && VALID_PAGE(mmu->root.hpa);
>
> free_active_root = (roots_to_free & KVM_MMU_ROOT_CURRENT) &&
> VALID_PAGE(mmu->root.hpa);
>
> Isn't this a separate bug fix? E.g. call kvm_mmu_unload() without a valid current
> root, but with valid previous roots? In which case we'd try to free garbage, no?
>
> > +
> > + if (!free_active_root) {
> > for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++)
> > if ((roots_to_free & KVM_MMU_ROOT_PREVIOUS(i)) &&
> > VALID_PAGE(mmu->prev_roots[i].hpa))
> > @@ -3242,8 +3245,7 @@ void kvm_mmu_free_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
> > &invalid_list);
> >
> > if (free_active_root) {
> > - if (mmu->shadow_root_level >= PT64_ROOT_4LEVEL &&
> > - (mmu->root_level >= PT64_ROOT_4LEVEL || mmu->direct_map)) {
> > + if (to_shadow_page(mmu->root.hpa)) {
> > mmu_free_root_page(kvm, &mmu->root.hpa, &invalid_list);
> > } else if (mmu->pae_root) {

Gah, this is technically wrong. It shouldn't truly matter, but it's wrong. root.hpa
will not be backed by a shadow page if the root is pml4_root or pml5_root, in which
case freeing the PAE roots is wrong. The PAE roots should obviously be invalid already,
but it's a little confusing because KVM wanders down a path that may not be relevant
to the current mode.

For clarity, I think it's worth doing:

} else if (mmu->root.hpa == __pa(mmu->pae_root)) {
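
To spell out the cases I have in mind, here is a rough userspace sketch. It is
not kernel code: struct fake_mmu, root_has_sp, pae_root_pa and classify_root()
are all made-up stand-ins for mmu->root.hpa, to_shadow_page() and
__pa(mmu->pae_root), and it only models the decision as described in this mail,
not the actual freeing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
typedef uint64_t hpa_t;
#define INVALID_PAGE ((hpa_t)-1)

struct fake_mmu {
	hpa_t root_hpa;     /* models mmu->root.hpa */
	hpa_t pae_root_pa;  /* models __pa(mmu->pae_root) */
	bool root_has_sp;   /* models to_shadow_page(mmu->root.hpa) != NULL */
};

enum root_kind {
	ROOT_NONE,     /* no valid current root, nothing to free */
	ROOT_SINGLE,   /* root is backed by a shadow page, free it directly */
	ROOT_PAE,      /* root.hpa points at pae_root, walk its 4 entries */
	ROOT_SPECIAL,  /* no shadow page and not pae_root (pml4_root/pml5_root) */
};

/* Classify the current root using the checks discussed above. */
static enum root_kind classify_root(const struct fake_mmu *mmu)
{
	assert(mmu != NULL);

	if (mmu->root_hpa == INVALID_PAGE)
		return ROOT_NONE;
	if (mmu->root_has_sp)
		return ROOT_SINGLE;
	if (mmu->root_hpa == mmu->pae_root_pa)
		return ROOT_PAE;
	return ROOT_SPECIAL;
}
```

With the existing "else if (mmu->pae_root)" check, the ROOT_SPECIAL case would
take the same path as ROOT_PAE; comparing against __pa(mmu->pae_root) keeps them
distinct.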


> > for (i = 0; i < 4; ++i) {
> > --
> > 2.31.1
> >
> >