Re: [PATCH 1/3] KVM: VMX: Retry APIC-access page reload if invalidation is in-progress

From: Yu Zhang
Date: Thu Jun 08 2023 - 03:00:29 EST


>
> Flushing when KVM zaps SPTEs is definitely necessary. But the flush in
> vmx_set_apic_access_page_addr() *should* be redundant.
>
> > Could we try to return false in kvm_unmap_gfn_range() to indicate no more
> > flush is needed, if the range to be unmapped falls within guest APIC base,
> > and leaving the TLB invalidation work to vmx_set_apic_access_page_addr()?
>
> No, because vmx_flush_tlb_current(), a.k.a. KVM_REQ_TLB_FLUSH_CURRENT, flushes
> only the current root, i.e. on the current EP4TA. kvm_unmap_gfn_range() isn't
> tied to a single vCPU and so needs to flush all roots. We could in theory more
> precisely track which roots needs to be flushed, but in practice it's highly
> unlikely to matter as there is typically only one "main" root when TDP (EPT) is
> in use. In other words, KVM could avoid unnecessarily flushing entries for other
> roots, but it would incur non-trivial complexity, and the probability of the
> precise flushing having a measurable impact on guest performance is quite low, at
> least outside of nested scenarios.

Well, I can understand the invalidation shall be performed for both current EP4TA,
and the nested EP4TA(EPT02) when host retries to reclaim a normal page, because L1
may assign this page to L2. But for APIC base address, will L1 map this address to
L2?

Also, what if the virtualize APIC access is to be supported in L2, and the backing
page is being reclaimed in L0? I saw nested_get_vmcs12_pages() will check vmcs12
and set the APIC access address in VMCS02, but not sure if this routine will be
triggered by the mmu notifier...

B.R.
Yu

>
> But as above, flushing in vmx_set_apic_access_page_addr() shouldn't be necessary.
> If there were SPTEs, then KVM would already have zapped and flushed. If there
> weren't SPTEs, then it should have been impossible for the guest to have valid
> TLB entries. KVM needs to flush when VIRTUALIZE_APIC_ACCESSES is toggled on, as
> the CPU could have non-vAPIC TLB entries, but that's handled by vmx_set_virtual_apic_mode().
>
> I'll send a follow-up patch to drop the flush from vmx_set_apic_access_page_addr(),
> I don't *think* I'm missing an edge case...