Re: Possible nohz-full/RCU issue in arm64 KVM

From: Paolo Bonzini
Date: Fri Dec 17 2021 - 12:23:40 EST


On 12/17/21 18:12, Paul E. McKenney wrote:
On Fri, Dec 17, 2021 at 06:02:23PM +0100, Paolo Bonzini wrote:
On 12/17/21 17:45, Paul E. McKenney wrote:
On Fri, Dec 17, 2021 at 05:34:04PM +0100, Paolo Bonzini wrote:
On 12/17/21 17:07, Paul E. McKenney wrote:
rcu_note_context_switch() is a point-in-time notification; it's not strictly
necessary, but it may improve performance a bit by avoiding unnecessary IPIs
from the RCU subsystem.

There's no benefit from doing it when you're back from the guest, because at
that point the CPU is just running normal kernel code.

Do scheduling-clock interrupts from guest mode have the "user" parameter
set? If so, that would keep RCU happy.

No, the thread is in supervisor mode. But after every interrupt (timer tick
or anything else), one of three things can happen:

* KVM will go around the execution loop and invoke rcu_note_context_switch()
again

* or KVM will go back to user space

Here "user space" is a user process as opposed to a guest OS?

Yes, that code runs from ioctl(KVM_RUN) and the ioctl will return to the
calling process.

Intriguing. A user process within the guest OS or a user process outside
of any guest OS, that is, within the host?

A user process on the host. The guest vCPU is nothing special: it's just a
user thread that occasionally lets the guest run by invoking the KVM_RUN
ioctl. Ideally, that thread spends most of its time inside KVM_RUN, so that
the guest runs at full steam. KVM_RUN is also where the code that Nicolas
and Mark were discussing lives.

From the kernel's point of view, however, the thread is always in kernel
mode while it runs the guest, because any interrupt is recognized while the
thread is still inside the ioctl.
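
To make the shape of such a vCPU thread concrete, here is a minimal sketch
in C of the host-side loop. The function name vcpu_thread_loop and its
return convention are illustrative assumptions, not code from this thread;
only the KVM_RUN ioctl, struct kvm_run and the KVM_EXIT_* constants come
from the KVM API. The vCPU descriptor and the mmap()ed kvm_run area are
assumed to have been set up elsewhere.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Hypothetical helper (not from the thread): body of a host user thread
 * that backs one vCPU.  vcpu_fd is assumed to come from KVM_CREATE_VCPU
 * and run to be the mmap()ed struct kvm_run for that vCPU.
 * Returns 0 when the guest halts or shuts down, -1 on an ioctl error or
 * on an exit reason this sketch does not handle.
 */
int vcpu_thread_loop(int vcpu_fd, struct kvm_run *run)
{
	assert(run != NULL);

	for (;;) {
		/*
		 * The guest runs inside this ioctl.  Host interrupts that
		 * arrive meanwhile are recognized while the thread is still
		 * in the ioctl, i.e. in kernel mode.
		 */
		if (ioctl(vcpu_fd, KVM_RUN, 0) < 0) {
			if (errno == EINTR)
				continue;  /* interrupted by a signal: re-enter */
			return -1;         /* e.g. EBADF for a bad descriptor */
		}

		/* Back in host user space: see why the guest stopped. */
		switch (run->exit_reason) {
		case KVM_EXIT_HLT:
		case KVM_EXIT_SHUTDOWN:
		case KVM_EXIT_SYSTEM_EVENT:
			return 0;
		default:
			/* A real VMM would emulate MMIO, port I/O, etc. here. */
			return -1;
		}
	}
}
```

A real VMM would also check a stop flag before re-entering after EINTR;
that is left out here to keep the loop short.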

(I'll add that, from the scheduler's point of view, there's no difference
between a CPU-bound guest and a "normal" CPU-bound process on the host:
wasting time with "for(;;)" or calculating digits of pi is the same whether
you do it in the guest or in the host. Likewise for I/O-bound guests: doing
"hlt" or "wfi" constantly in the guest looks exactly the same to the
scheduler as a process that spends its time in the poll() system call.)
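
The two host-side behaviours being compared can be sketched with an ordinary
process, no guest involved. This is only an illustration under the
assumption that process CPU time is a fair proxy for "runnable vs. blocked";
the helper names cpu_ms, cpu_used_spinning and cpu_used_blocking are made up
for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <poll.h>
#include <stddef.h>
#include <time.h>

/* CPU time consumed by this process so far, in milliseconds. */
double cpu_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts);
	return ts.tv_sec * 1000.0 + ts.tv_nsec / 1e6;
}

/*
 * CPU-bound case, like "for(;;)": the thread stays runnable and spins
 * until it has consumed at least ms milliseconds of CPU time.
 * Returns the CPU time actually consumed.
 */
double cpu_used_spinning(int ms)
{
	double start = cpu_ms();

	while (cpu_ms() - start < ms)
		;
	return cpu_ms() - start;
}

/*
 * I/O-bound case: the thread blocks in poll() for ms milliseconds of
 * wall-clock time and is not runnable meanwhile.
 * Returns the CPU time consumed, which should be close to zero.
 */
double cpu_used_blocking(int ms)
{
	double start = cpu_ms();

	poll(NULL, 0, ms);	/* no descriptors: just sleep for the timeout */
	return cpu_ms() - start;
}
```

Calling cpu_used_spinning(20) accounts at least 20 ms of CPU time to the
process, while cpu_used_blocking(50) sleeps for about 50 ms and should
account almost none.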

Paolo