Re: [PATCH v2 1/4] KVM: x86: move the event handling of KVM_REQ_GET_VMCS12_PAGES into a common function

From: Sean Christopherson
Date: Wed Sep 07 2022 - 11:48:32 EST


On Wed, Sep 07, 2022, Yuan Yao wrote:
> On Tue, Sep 06, 2022 at 09:26:33PM -0700, Mingwei Zhang wrote:
> > > > @@ -10700,6 +10706,12 @@ static int vcpu_run(struct kvm_vcpu *vcpu)
> > > > if (kvm_cpu_has_pending_timer(vcpu))
> > > > kvm_inject_pending_timer_irqs(vcpu);
> > > >
> > > > + if (vcpu->arch.nested_get_pages_pending) {
> > > > + r = kvm_get_nested_state_pages(vcpu);
> > > > + if (r <= 0)
> > > > + break;
> > > > + }
> > > > +
> > >
> > > Will this leads to skip the get_nested_state_pages for L2 first time
> > > vmentry in every L2 running iteration ? Because with above changes
> > > KVM_REQ_GET_NESTED_STATE_PAGES is not set in
> > > nested_vmx_enter_non_root_mode() and
> > > vcpu->arch.nested_get_pages_pending is not checked in
> > > vcpu_enter_guest().
> > >
> > Good catch. I think the diff won't work when vcpu is runnable.

It works, but it's inefficient if the request comes from KVM_SET_NESTED_STATE.
The pending KVM_REQ_UNBLOCK that comes with the flag will prevent actually running
the guest. Specifically, this chunk of code will detect the pending request and
bail out of vcpu_enter_guest().

if (kvm_vcpu_exit_request(vcpu)) {
vcpu->mode = OUTSIDE_GUEST_MODE;
smp_wmb();
local_irq_enable();
preempt_enable();
kvm_vcpu_srcu_read_lock(vcpu);
r = 1;
goto cancel_injection;
}

But the inefficiency is a non-issue since "true" emulation of VM-Enter will flow
through this path (the VMRESUME/VMLAUNCH/VMRUN exit handler runs at the end of
vcpu_enter_guest().

> > It only tries to catch the vcpu block case. Even for the vcpu block case,
> > the check of KVM_REQ_UNBLOCK is way too late. Ah, kvm_vcpu_check_block() is
> > called by kvm_vcpu_block() which is called by vcpu_block(). The warning is
> > triggered at the very beginning of vcpu_block(), i.e., within
> > kvm_arch_vcpu_runnable(). So, please ignore the trace in my previous email.
> >
> > In addition, my minor push back for that is
> > vcpu->arch.nested_get_pages_pending seems to be another
> > KVM_REQ_GET_NESTED_STATE_PAGES.
>
> Yeah, but in concept level it's not a REQ mask lives in the
> vcpu->requests which can be cached by e.g. kvm_request_pending().
> It's necessary to check vcpu->arch.nested_get_pages_pending in
> vcpu_enter_guest() if Sean's idea is to replace
> KVM_REQ_GET_NESTED_STATE_PAGES with nested_get_pages_pending.

Yes, they key is that it's not a request. Requests have implicit properties:
e.g. as above, effectively prevent running the vCPU until the request goes away,
they can be pended from other vCPUs, etc... And the property that is most relevant
to this bug: except for special cases, requests only need to be serviced before
running vCPU.

And the number of requests is limited due to them being stored in a bitmap. x86
still has plenty of room due to kvm_vcpu.requests being a u64, but it's still
preferable to avoid using a request unless absolutely necessary.

For this case, since using a request isn't strictly needed and using a request
would require special casing that request, my strong preference is to not use a
request.

So yes, my idea is to "just" replace the request with a flag, but there are subtly
quite a few impliciations in not using a request.