Re: [PATCH v3 1/2] KVM: x86/mmu: Ensure TDP MMU roots are freed after yield

From: Sean Christopherson
Date: Thu Jan 07 2021 - 12:42:04 EST


In the future, please document the changes in each revision, e.g. in a cover
letter or in the ignored part of the diff.

On Wed, Jan 06, 2021, Ben Gardon wrote:
> Many TDP MMU functions which need to perform some action on all TDP MMU
> roots hold a reference on that root so that they can safely drop the MMU
> lock in order to yield to other threads. However, when releasing the
> reference on the root, there is a bug: the root will not be freed even
> if its reference count (root_count) is reduced to 0.
>
> To simplify acquiring and releasing references on TDP MMU root pages, and
> to ensure that these roots are properly freed, move the get/put operations
> into another TDP MMU root iterator macro.
>
> Moving the get/put operations into an iterator macro also helps
> simplify control flow when a root does need to be freed. Note that using
> the list_for_each_entry_safe macro would not have been appropriate in
> this situation because it could keep a pointer to the next root across
> an MMU lock release + reacquire, during which time that root could be
> freed.
>
> Reported-by: Maciej S. Szmigiero <maciej.szmigiero@xxxxxxxxxx>
> Suggested-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> Fixes: faaf05b00aec ("kvm: x86/mmu: Support zapping SPTEs in the TDP MMU")
> Fixes: 063afacd8730 ("kvm: x86/mmu: Support invalidate range MMU notifier for TDP MMU")
> Fixes: a6a0b05da9f3 ("kvm: x86/mmu: Support dirty logging for the TDP MMU")
> Fixes: 14881998566d ("kvm: x86/mmu: Support disabling dirty logging for the tdp MMU")
> Signed-off-by: Ben Gardon <bgardon@xxxxxxxxxx>

Reviewed-by: Sean Christopherson <seanjc@xxxxxxxxxx>

> ---
> arch/x86/kvm/mmu/tdp_mmu.c | 104 +++++++++++++++++--------------------
> 1 file changed, 48 insertions(+), 56 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index 75db27fda8f3..d4191ed193cd 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -44,7 +44,48 @@ void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm)
> WARN_ON(!list_empty(&kvm->arch.tdp_mmu_roots));
> }
>
> -#define for_each_tdp_mmu_root(_kvm, _root) \
> +static void tdp_mmu_put_root(struct kvm *kvm, struct kvm_mmu_page *root)
> +{
> + if (kvm_mmu_put_root(kvm, root))
> + kvm_tdp_mmu_free_root(kvm, root);
> +}
> +
> +static inline bool tdp_mmu_next_root_valid(struct kvm *kvm,
> + struct kvm_mmu_page *root)
> +{
> + lockdep_assert_held(&kvm->mmu_lock);
> +
> + if (list_entry_is_head(root, &kvm->arch.tdp_mmu_roots, link))
> + return false;
> +
> + kvm_mmu_get_root(kvm, root);
> + return true;
> +
> +}
> +
> +static inline struct kvm_mmu_page *tdp_mmu_next_root(struct kvm *kvm,
> + struct kvm_mmu_page *root)
> +{
> + struct kvm_mmu_page *next_root;
> +
> + next_root = list_next_entry(root, link);
> + tdp_mmu_put_root(kvm, root);
> + return next_root;
> +}
> +
> +/*
> + * Note: this iterator gets and puts references to the roots it iterates over.

Maybe refer to it as "the yield_safe() variant" instead of "this" so that the
comment makes sense with minimal context?

> + * This makes it safe to release the MMU lock and yield within the loop, but
> + * if exiting the loop early, the caller must drop the reference to the most
> + * recent root. (Unless keeping a live reference is desirable.)
> + */

Rather than encourage manually dropping the reference, what adding about a scary
warning about not exiting the loop early? At this point, it seems unlikely that
we'll end up with a legitimate use case for exiting yield_safe() early. And if
we do, I think it'd be better to provide a macro to do the bookeeping instead of
open coding it in the caller. And maybe throw a blurb into the changelog about
that so future developers understand that that scary warning isn't set in stone?

/*
* The yield_safe() variant of the TDP root iterator gets and puts references to
* the roots it iterates over. This makes it safe to release the MMU lock and
* yield within the loop, but the caller MUST NOT exit the loop early.
*/