Re: [bug report] GICv4.1: VM performance degradation due to not trapping vCPU WFI

From: Oliver Upton
Date: Wed Jan 17 2024 - 11:51:20 EST


On Wed, Jan 17, 2024 at 10:20:32PM +0800, sundongxu (A) wrote:
> On 2024/1/16 19:13, Marc Zyngier wrote:
> > On Tue, 16 Jan 2024 03:26:08 +0000, "sundongxu (A)" <sundongxu3@xxxxxxxxxx> wrote:
> >> We found a problem about GICv4/4.1, for example:
> >> We use QEMU to start a VM (4 vCPUs and 8G memory), VM disk was
> >> configured with virtio, and the network is configured with vhost-net,
> >> the CPU affinity of the vCPU and emulator is as follows, in VM xml:

<snip>

> >> <cputune>
> >> <vcpupin vcpu='0' cpuset='4'/>
> >> <vcpupin vcpu='1' cpuset='5'/>
> >> <vcpupin vcpu='2' cpuset='6'/>
> >> <vcpupin vcpu='3' cpuset='7'/>
> >> <emulatorpin cpuset='4,5,6,7'/>
> >> </cputune>

</snip>

> > Effectively, we apply the same principle to vSGIs as to vLPIs, and it
> > was found that this heuristic was pretty beneficial to vLPIs. I'm a
> > bit surprised that vSGIs are so different in their usage pattern.
>
> IMO, the point is hypervisor not trapping vCPU WFI, rather than
> vSGI/vLPI usage pattern.

Sure, that's what's affecting your use case, but the logic in the kernel
came about because improving virtual interrupt injection has been found
to be generally useful.

> >
> > Does it help if you move your "emulatorpin" to some other physical
> > CPUs?
>
> Yes, it does, in kernel 5.10 or 6.5rc1.

Won't your VM have a poor experience in this configuration regardless of WFx
traps? The value of vCPU pinning is to *isolate* the vCPU threads from
noise/overheads of the host and scheduler latencies. Seems to me that
VMM overhead threads are being forced to take time away from the guest.

Nevertheless, disabling WFI traps isn't going to work well for
overcommitted scenarios. The thought of tacking on more hacks in KVM has me
a bit uneasy; perhaps instead we can give userspace an interface to explicitly
enable/disable WFx traps and let it pick a suitable policy.
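
To make the idea concrete, here is a rough sketch of the kind of decision a
VMM could make if it had such a knob. Everything named below (the enum, the
struct, pick_wfx_policy()) is hypothetical and only for illustration; none of
it is an existing KVM interface, and a real policy would likely weigh more
than a simple thread count.

```c
#include <stdbool.h>

/* Hypothetical policy values; NOT an existing KVM UAPI. */
enum wfx_trap_policy {
	WFX_TRAP_ENABLE,	/* trap WFI/WFE so the host can run other threads */
	WFX_TRAP_DISABLE,	/* let the guest idle on the physical CPU */
};

/* Hypothetical description of how the VMM laid out its threads. */
struct host_layout {
	unsigned int runnable_threads;	/* vCPU + emulator/vhost threads on the cpuset */
	unsigned int physical_cpus;	/* physical CPUs in that cpuset */
};

/*
 * Assumed heuristic: if more threads than physical CPUs share the cpuset,
 * treat it as overcommitted and keep WFx traps enabled so that idle vCPUs
 * yield the CPU; otherwise leave traps disabled.
 */
enum wfx_trap_policy pick_wfx_policy(const struct host_layout *h)
{
	bool overcommitted = h->runnable_threads > h->physical_cpus;

	return overcommitted ? WFX_TRAP_ENABLE : WFX_TRAP_DISABLE;
}
```

With the reported setup (4 vCPUs plus emulator threads all pinned to CPUs
4-7), runnable_threads exceeds physical_cpus, so this sketch would keep traps
enabled; with the emulator threads moved elsewhere it would disable them.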

--
Thanks,
Oliver