Re: [PATCH 07/23] KVM: MMU: remove kvm_mmu_calc_root_page_role

From: Sean Christopherson
Date: Wed Feb 09 2022 - 20:56:30 EST


On Fri, Feb 04, 2022, Paolo Bonzini wrote:
> Since the guest PGD is now loaded after the MMU has been set up
> completely, the desired role for a cache hit is simply the current
> mmu_role. There is no need to compute it again, so __kvm_mmu_new_pgd
> can be folded in kvm_mmu_new_pgd.
>
> As an aside, the !tdp_enabled case in the function was dead code,
> and that also gets mopped up as a side effect.
>
> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> ---

...

> @@ -4871,7 +4864,7 @@ void kvm_init_shadow_npt_mmu(struct kvm_vcpu *vcpu, unsigned long cr0,
>
> shadow_mmu_init_context(vcpu, context, &regs, new_role);
> reset_shadow_zero_bits_mask(vcpu, context, is_efer_nx(context));
> - __kvm_mmu_new_pgd(vcpu, nested_cr3, new_role.base);
> + kvm_mmu_new_pgd(vcpu, nested_cr3);

I'm not a fan of this change. Conceptually it makes perfect sense, but I really,
really don't like the hidden dependency between shadow_mmu_init_context() and
kvm_mmu_new_pgd().

It's less bad once patch 01 is jettisoned, but I still don't love. Yes, it's
silly to call __kvm_mmu_new_pgd() before the role is set, but it is robust. And
either way we end up with an incoherent state, e.g. with this change, the role of
the shadow page pointed at by mmu->root_hpa doesn't match the role of the mmu itself.

Furthermore, the next patch to take advantage of this change is subtly broken/wrong.
It drops kvm_mmu_calc_root_page_role() from kvm_mmu_new_pgd() under the guise that
kvm_mmu_new_pgd() is called with an up-to-date mmu_role, but that is incorrect for
both nested VM-Enter with TDP disabled in the host and nested VM-Exit (regardless of
TDP enabling).

Both nested_vmx_load_cr3() and nested_svm_load_cr3() load a new CR3 before updating
the mmu. Why, I have no idea. Probably a historical wart

The nested mess is likely easily solved, I don't see any obvious issue with swapping
the order. But I still don't love the subtlety. I do like shaving cycles, just
not the subtlety...

If we do rework things to have kvm_mmu_new_pgd() pull the role from the mmu, then
we should first add a long overdue audit/warn that KVM never runs with a mmu_role
that isn't consistent with respect to its root SP's role. E.g. finally get rid of
mmu_audit.c, add a proper CONFIG_KVM_AUDIT_MMU (which I'll happily turn on for all
my kernels), and start adding WARNs and other assertions.