Re: [PATCH v3 3/9] KVM: x86: Defer tick-based accounting 'til after IRQ handling

From: Frederic Weisbecker
Date: Wed Apr 21 2021 - 08:19:49 EST


On Tue, Apr 20, 2021 at 11:26:34PM +0000, Sean Christopherson wrote:
> On Wed, Apr 21, 2021, Frederic Weisbecker wrote:
> > On Thu, Apr 15, 2021 at 03:21:00PM -0700, Sean Christopherson wrote:
> > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > index 16fb39503296..e4d475df1d4a 100644
> > > --- a/arch/x86/kvm/x86.c
> > > +++ b/arch/x86/kvm/x86.c
> > > @@ -9230,6 +9230,14 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> > > local_irq_disable();
> > > kvm_after_interrupt(vcpu);
> > >
> > > + /*
> > > + * When using tick-based accounting, wait until after servicing IRQs to
> > > + * account guest time so that any ticks that occurred while running the
> > > + * guest are properly accounted to the guest.
> > > + */
> > > + if (!vtime_accounting_enabled_this_cpu())
> > > + vtime_account_guest_exit();
> >
> > Can we rather have instead:
> >
> > static inline void tick_account_guest_exit(void)
> > {
> > if (!vtime_accounting_enabled_this_cpu())
> > current->flags &= ~PF_VCPU;
> > }
> >
> > It duplicates a bit of code but I think this will read less confusing.
>
> Either way works for me. I used vtime_account_guest_exit() to try to keep as
> many details as possible inside vtime, e.g. in case the implemenation is tweaked
> in the future. But I agree that pretending KVM isn't already deeply intertwined
> with the details is a lie.

Ah I see, before 87fa7f3e98a131 the vtime was accounted after interrupts get
processed. So it used to work until then. I see that ARM64 waits for IRQs to
be enabled too.

PPC/book3s_hv, MIPS, s390 do it before IRQs get re-enabled (weird, how does that
work?)

And PPC/book3s_pr calls guest_exit() so I guess it has interrupts enabled.

The point is: does it matter to call vtime_account_guest_exit() before or
after interrupts? If it doesn't matter, we can simply call
vtime_account_guest_exit() once and for all once IRQs are re-enabled.

If it does matter because we don't want to account the host IRQs firing at the
end of vcpu exit, then probably we should standardize that behaviour and have
guest_exit_vtime() called before interrupts get enabled and guest_exit_tick()
called after interrupts get enabled. It's probably then beyond the scope of this
patchset but I would like to poke your opinion on that.

Thanks.