Re: [PATCH v3 06/12] KVM: x86: don't disable APICv memslot when inhibited

From: Maxim Levitsky
Date: Mon Aug 09 2021 - 14:51:48 EST


On Tue, 2021-08-03 at 10:44 +0200, Paolo Bonzini wrote:
> Reviewing this patch and the next one together.
>
> On 02/08/21 20:33, Maxim Levitsky wrote:
> > +static int avic_alloc_access_page(struct kvm *kvm)
> > {
> > void __user *ret;
> > int r = 0;
> >
> > mutex_lock(&kvm->slots_lock);
> > +
> > + if (kvm->arch.apic_access_memslot_enabled)
> > goto out;
>
> This variable is overloaded between "is access enabled" and "is the
> memslot allocated". I think you should check
> kvm->arch.apicv_inhibit_reasons instead in kvm_faultin_pfn.
>
>
> > + if (!activate)
> > + kvm_zap_gfn_range(kvm, gpa_to_gfn(APIC_DEFAULT_PHYS_BASE),
> > + gpa_to_gfn(APIC_DEFAULT_PHYS_BASE + PAGE_SIZE));
> > +
>
> Off by one, the last argument of kvm_zap_gfn_range is inclusive:

Is it actually inclusive?

There are three callers of this function.
Two of them (kvm_post_set_cr0 and one case in update_mtrr) pass 0, ~0ULL, which is indeed inclusive,
but for variable MTRRs I see this code in var_mtrr_range:

*end = (*start | ~mask) + 1;

and *end is then passed to kvm_zap_gfn_range.
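
To make the arithmetic concrete, here is a minimal user-space sketch of that computation. It is not the kernel code: the helper name and the way start is derived from base and mask are simplifications assumed for illustration only.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the arithmetic in var_mtrr_range.
 * Assumes 'mask' has all high bits set and the low (size) bits clear.
 */
static void sketch_var_mtrr_range(uint64_t base, uint64_t mask,
				  uint64_t *start, uint64_t *end)
{
	*start = base & mask;
	/*
	 * (*start | ~mask) is the last address covered by the range,
	 * so adding 1 yields the first address past it: an exclusive end.
	 */
	*end = (*start | ~mask) + 1;
}
```

For a 1 MiB range at 0x100000 this gives start = 0x100000 and end = 0x200000, i.e. the end is one past the last covered address.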


Another thing I noticed is that the calls to kvm_inc_notifier_count/kvm_dec_notifier_count
that I added to kvm_zap_gfn_range do seem to take non-inclusive ends, so sadly
I will need to fix them if kvm_zap_gfn_range does take an inclusive end.
This depends on mmu_notifier_ops, which is not well documented.

However, mmu_notifier_retry_hva at least does assume a non-inclusive range, since it checks:


hva >= kvm->mmu_notifier_range_start &&
hva < kvm->mmu_notifier_range_end
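
As a minimal sketch of what that check implies (a user-space stand-in with a made-up helper name, not the kernel function), the comparison treats the range as half-open, so an address equal to the end is outside it:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: same comparison as above on a half-open [start, end). */
static bool sketch_hva_in_range(uint64_t hva, uint64_t range_start,
				uint64_t range_end)
{
	return hva >= range_start && hva < range_end;
}
```

With start == end the range is empty under this convention, so no hva can ever match it.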


Also, looking at the algorithm in kvm_zap_gfn_range:
suppose that gfn_start == gfn_end and we have a memslot with one page at gfn_start.

Then:


start = max(gfn_start, memslot->base_gfn); // start = memslot->base_gfn
end = min(gfn_end, memslot->base_gfn + memslot->npages); // end = memslot->base_gfn

if (start >= end)
	continue;

In this case it seems that it will do nothing. So I suspect that kvm_zap_gfn_range
actually needs a non-inclusive range, but because it wasn't used much,
it didn't cause trouble.
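
The single-page case can be checked with a small user-space sketch of the clamping logic. The struct and helper names are invented for illustration, and only the max/min/compare steps quoted above are modelled, not the real kvm_zap_gfn_range:

```c
#include <stdint.h>

/* Hypothetical, reduced memslot: only the fields the clamping needs. */
struct sketch_memslot {
	uint64_t base_gfn;
	uint64_t npages;
};

/*
 * Returns how many pages of 'slot' would be processed for the
 * requested [gfn_start, gfn_end) after clamping, or 0 if the
 * slot would be skipped by the 'start >= end' check.
 */
static uint64_t sketch_pages_zapped(const struct sketch_memslot *slot,
				    uint64_t gfn_start, uint64_t gfn_end)
{
	uint64_t slot_end = slot->base_gfn + slot->npages;
	uint64_t start = gfn_start > slot->base_gfn ? gfn_start : slot->base_gfn;
	uint64_t end = gfn_end < slot_end ? gfn_end : slot_end;

	if (start >= end)
		return 0;	/* corresponds to the 'continue' above */

	return end - start;
}
```

With a one-page slot at gfn 0xfee00, passing (0xfee00, 0xfee00) yields 0 pages, while (0xfee00, 0xfee01) yields 1, which matches reading the end argument as exclusive.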


Another thing I found in kvm_zap_gfn_range:

kvm_flush_remote_tlbs_with_address(kvm, gfn_start, gfn_end);

But kvm_flush_remote_tlbs_with_address expects (struct kvm *kvm, u64 start_gfn, u64 pages),
so the third argument should be a page count, not an end gfn.

For some reason kvm_flush_remote_tlbs_with_address is also called twice with the same parameters.

Could you help with that? Am I missing something?

Thanks in advance,
Best regards,
Maxim Levitsky




> Also, checking "activate" is a bit ugly when we have "new" available as
> well. Yes, they are the same if !!old != !!new, but we care about the
> global state, not the single bit.
>
> Putting everything together, this could become something like
>
> 	trace_kvm_apicv_update_request(activate, bit);
> 	if (!!old != !!new) {
> 		/*
> 		 * Kick all CPUs out of guest mode. When
> 		 * kvm_vcpu_update_apicv succeeds in taking
> 		 * apicv_update_lock, it will see the
> 		 * new apicv_inhibit_reasons that we set below.
> 		 */
> 		kvm_make_all_cpus_request(kvm, KVM_REQ_APICV_UPDATE);
>
> 		if (new) {
> 			unsigned long gfn = gpa_to_gfn(APIC_DEFAULT_PHYS_BASE);
> 			kvm_zap_gfn_range(kvm, gfn, gfn);
> 		}
> 	}
> 	kvm->arch.apicv_inhibit_reasons = new;
> 	mutex_unlock(&kvm->arch.apicv_update_lock);
>
> Paolo
>