Re: [PATCH v5 7/7] KVM: x86/mmu: explicitly check nx_hugepage in disallowed_hugepage_adjust()

From: Yan Zhao
Date: Sun Oct 09 2022 - 22:28:45 EST


On Fri, Sep 30, 2022 at 11:48:54PM +0000, Sean Christopherson wrote:
> From: Mingwei Zhang <mizhang@xxxxxxxxxx>
>
> Explicitly check if a NX huge page is disallowed when determining if a
> page fault needs to be forced to use a smaller sized page. KVM currently
> assumes that the NX huge page mitigation is the only scenario where KVM
> will force a shadow page instead of a huge page, and so unnecessarily
> keeps an existing shadow page instead of replacing it with a huge page.
>
> Any scenario that causes KVM to zap leaf SPTEs may result in having a SP
> that can be made huge without violating the NX huge page mitigation.
> E.g. prior to commit 5ba7c4c6d1c7 ("KVM: x86/MMU: Zap non-leaf SPTEs when
> disabling dirty logging"), KVM would keep shadow pages after disabling
> dirty logging due to a live migration being canceled, resulting in
> degraded performance due to running with 4kb pages instead of huge pages.
>
> Although the dirty logging case is "fixed", that fix is coincidental,
> i.e. is an implementation detail, and there are other scenarios where KVM
> will zap leaf SPTEs. E.g. zapping leaf SPTEs in response to a host page
> migration (mmu_notifier invalidation) to create a huge page would yield a
> similar result; KVM would see the shadow-present non-leaf SPTE and assume
> a huge page is disallowed.
>
> Fixes: b8e8c8303ff2 ("kvm: mmu: ITLB_MULTIHIT mitigation")
> Reviewed-by: Ben Gardon <bgardon@xxxxxxxxxx>
> Reviewed-by: David Matlack <dmatlack@xxxxxxxxxx>
> Signed-off-by: Mingwei Zhang <mizhang@xxxxxxxxxx>
> [sean: use spte_to_child_sp(), massage changelog]
> Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> ---
> arch/x86/kvm/mmu/mmu.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 54005b7f1499..688beed3a41e 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -3110,6 +3110,11 @@ void disallowed_hugepage_adjust(struct kvm_page_fault *fault, u64 spte, int cur_
> cur_level == fault->goal_level &&
> is_shadow_present_pte(spte) &&
> !is_large_pte(spte)) {
> + u64 page_mask;
> +
> + if (!spte_to_child_sp(spte)->nx_huge_page_disallowed)
> + return;
Merge this "if" with the upper level "if" ?

Thanks
Yan
> +
> /*
> * A small SPTE exists for this pfn, but FNAME(fetch)
> * and __direct_map would like to create a large PTE
> @@ -3117,8 +3122,8 @@ void disallowed_hugepage_adjust(struct kvm_page_fault *fault, u64 spte, int cur_
> * patching back for them into pfn the next 9 bits of
> * the address.
> */
> - u64 page_mask = KVM_PAGES_PER_HPAGE(cur_level) -
> - KVM_PAGES_PER_HPAGE(cur_level - 1);
> + page_mask = KVM_PAGES_PER_HPAGE(cur_level) -
> + KVM_PAGES_PER_HPAGE(cur_level - 1);
> fault->pfn |= fault->gfn & page_mask;
> fault->goal_level--;
> }
> --
> 2.38.0.rc1.362.ged0d419d3c-goog
>