Re: [PATCH 08/12] KVM: MMU: do not consult levels when freeing roots

From: Sean Christopherson
Date: Thu Feb 10 2022 - 20:35:49 EST


On Fri, Feb 11, 2022, Paolo Bonzini wrote:
> On 2/11/22 01:54, Sean Christopherson wrote:
> > > 	free_active_root = (roots_to_free & KVM_MMU_ROOT_CURRENT) &&
> > > 			   VALID_PAGE(mmu->root.hpa);
> > >
> > > Isn't this a separate bug fix? E.g. call kvm_mmu_unload() without a valid current
> > > root, but with valid previous roots? In which case we'd try to free garbage, no?
>
> mmu_free_root_page checks VALID_PAGE(*root_hpa). If that's what you meant,
> then it wouldn't be a preexisting bug (and I think it'd be a fairly common
> case).

Ahh, yep.

> > > > +
> > > > +	if (!free_active_root) {
> > > >  		for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++)
> > > >  			if ((roots_to_free & KVM_MMU_ROOT_PREVIOUS(i)) &&
> > > >  			    VALID_PAGE(mmu->prev_roots[i].hpa))
> > > > @@ -3242,8 +3245,7 @@ void kvm_mmu_free_roots(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
> > > >  					   &invalid_list);
> > > >  	if (free_active_root) {
> > > > -		if (mmu->shadow_root_level >= PT64_ROOT_4LEVEL &&
> > > > -		    (mmu->root_level >= PT64_ROOT_4LEVEL || mmu->direct_map)) {
> > > > +		if (to_shadow_page(mmu->root.hpa)) {
> > > >  			mmu_free_root_page(kvm, &mmu->root.hpa, &invalid_list);
> > > >  		} else if (mmu->pae_root) {
> >
> > Gah, this is technically wrong. It shouldn't truly matter, but it's wrong. root.hpa
> > will not be backed by a shadow page if the root is pml4_root or pml5_root, in which
> > case freeing the PAE root is wrong. They should obviously be invalid already, but
> > it's a little confusing because KVM wanders down a path that may not be relevant
> > to the current mode.
>
> pml4_root and pml5_root are dummy, and the first "real" level of page tables
> is stored in pae_root for that case too, so I think that should DTRT.

Ugh, completely forgot that detail. You're correct. Probably worth a comment?
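
Something along these lines, maybe. This is only a standalone userspace model
of the two cases, not the actual KVM code: struct model_mmu, the root_has_sp
flag (standing in for to_shadow_page(mmu->root.hpa) returning non-NULL) and
the model_* helpers are all made up for illustration, and locking and the
invalid_list handling are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for illustration, not the kernel definitions. */
typedef uint64_t hpa_t;
#define INVALID_PAGE	((hpa_t)-1)
#define VALID_PAGE(x)	((x) != INVALID_PAGE)
#define NUM_PAE_ROOTS	4

struct model_mmu {
	hpa_t root_hpa;		/* models mmu->root.hpa */
	bool root_has_sp;	/* models to_shadow_page(root.hpa) != NULL */
	hpa_t *pae_root;	/* models mmu->pae_root, may be NULL */
};

/* Like mmu_free_root_page(): ignores an invalid root. Returns 1 if freed. */
int model_free_root_page(hpa_t *root_hpa)
{
	if (!VALID_PAGE(*root_hpa))
		return 0;

	*root_hpa = INVALID_PAGE;
	return 1;
}

/* Frees the active root(s) and returns how many root pages were freed. */
int model_free_active_root(struct model_mmu *mmu)
{
	int i, freed = 0;

	assert(mmu != NULL);

	if (!VALID_PAGE(mmu->root_hpa))
		return 0;

	if (mmu->root_has_sp) {
		/* A single root with a backing shadow page. */
		freed += model_free_root_page(&mmu->root_hpa);
	} else if (mmu->pae_root) {
		/*
		 * No backing shadow page.  Even if root_hpa points at a
		 * dummy pml4_root/pml5_root, the first "real" level of page
		 * tables is stored in pae_root, so freeing those is correct.
		 */
		for (i = 0; i < NUM_PAE_ROOTS; i++)
			freed += model_free_root_page(&mmu->pae_root[i]);
	}

	mmu->root_hpa = INVALID_PAGE;
	mmu->root_has_sp = false;
	return freed;
}
```

In this model, a root with a backing shadow page frees exactly one page, and
anything else frees whichever of the four pae_root entries are still valid.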

> That's why I also disliked the shadow_root_level/root_level/direct check:
> even though there are half a dozen cases involved, they all boil down to
> either 4 pae_roots or a single root with a backing kvm_mmu_page.
>
> It's even more obscure to check shadow_root_level/root_level/direct in
> fast_pgd_switch, where it's pretty obvious that you cannot cache 4 pae_roots
> in a single (hpa, pgd) pair...

Heh, apparently not obvious enough for me :-)