Re: [PATCH v8 05/15] KVM: pfncache: remove KVM_GUEST_USES_PFN usage

From: Sean Christopherson
Date: Mon Nov 27 2023 - 18:36:16 EST


On Tue, Nov 21, 2023, David Woodhouse wrote:
> On Tue, 2023-11-21 at 18:02 +0000, Paul Durrant wrote:
> > From: Paul Durrant <pdurrant@xxxxxxxxxx>
> >
> > As noted in [1] the KVM_GUEST_USES_PFN usage flag is never set by any
> > callers of kvm_gpc_init(), which also makes the 'vcpu' argument redundant.
> > Moreover, all existing callers specify KVM_HOST_USES_PFN so the usage
> > check in hva_to_pfn_retry() and hence the 'usage' argument to
> > kvm_gpc_init() are also redundant.
> > Remove the pfn_cache_usage enumeration and remove the redundant arguments,
> > fields of struct gfn_to_hva_cache, and all the related code.
> >
> > [1] https://lore.kernel.org/all/ZQiR8IpqOZrOpzHC@xxxxxxxxxx/
> >
> > Signed-off-by: Paul Durrant <pdurrant@xxxxxxxxxx>
>
> I think it's https://lore.kernel.org/all/ZBEEQtmtNPaEqU1i@xxxxxxxxxx/

Yeah, that's the more important link.

> which is the key reference. I'm not sure I'm 100% on board, but I never
> got round to replying to Sean's email because it was one of those "put
> up or shut up situations" and I didn't have the bandwidth to actually
> write the code to prove my point.
>
> I think it *is* important to support non-pinned pages. There's a reason
> we even made the vapic page migratable. We want to support memory
> hotplug, we want to cope with machine checks telling us to move certain
> pages (which I suppose is memory hotplug). See commit 38b9917350cb
> ("kvm: vmx: Implement set_apic_access_page_addr") for example.

The vAPIC page is slightly different in that it effectively never opened a window
for page migration, i.e. once a vCPU was created that page was stuck. For nested
virtualization pages, the probability of being able to migrate a page at any given
time might be relatively low, but it's extremely unlikely for a page to be pinned
for the entire lifetime of a (L1) VM.

> I agree that in the first round of the nVMX code there were bugs. And
> sure, of *course* it isn't sufficient to wire up the invalidation
> without either a KVM_REQ_SOMETHIMG to put it back, or just a *check* on
> the corresponding gpc on the way back into the guest. We'd have worked
> that out.

Maybe. I spent most of a day, maybe longer, hacking at the nVMX code and was
unable to get line of sight to an end result that I felt would be worth pursuing.

I'm definitely not saying it's impossible, and I'm not dead set against
re-introducing KVM_GUEST_USES_PFN or similar, but a complete solution crosses the
threshold where it's unreasonable to ask/expect someone to pick up the work in
order to get their code/series merged.

Which is effectively what you said below, I just wanted to explain why I'm pushing
to remove KVM_GUEST_USES_PFN, and to say that if you or someone else were to write
the code it wouldn't be an automatic nak.

> And yes, the gpc has had bugs as we implemented it, but the point was
> that we got to something which *is* working, and forms a usable
> building block.
>
> So I'm not really sold on the idea of ditching KVM_GUEST_USES_PFN. I
> think we could get it working, and I think it's worth it. But my
> opinion is worth very little unless I express it in 'diff -up' form
> instead of prose, and reverting this particular patch is the least of
> my barriers to doing so, so reluctantly...