Re: [RFC PATCH 0/2] Adjust CFS loadbalance to adapt QEMU CPU topology.

From: Peter Zijlstra
Date: Thu Jul 20 2023 - 04:59:03 EST


On Thu, Jul 20, 2023 at 04:34:11PM +0800, Kenan.Liu wrote:
> From: "Kenan.Liu" <Kenan.Liu@xxxxxxxxxxxxxxxxx>
>
> Multithreading workloads in VM with Qemu may encounter an unexpected
> phenomenon: one hyperthread of a physical core is busy while its sibling
> is idle. Such as:

Is this with vCPU pinning? Without that, guest topology makes no sense
whatsoever.

> The main reason is that hyperthread index is consecutive in qemu native x86 CPU
> model which is different from the physical topology.

I'm sorry, what? That doesn't make sense. SMT enumeration is all over
the place for Intel, but some actually do have (n,n+1) SMT. On AMD it's
always (n,n+1) IIRC.
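
For reference, one way to check how SMT siblings are enumerated on a given
Linux machine (host or guest) is to read the thread_siblings_list files
under sysfs. The following is only a rough sketch: the helper names
(parse_cpulist, is_consecutive, sibling_groups) are made up for
illustration, and it assumes the usual /sys/devices/system/cpu layout.

```python
from pathlib import Path


def parse_cpulist(text):
    """Expand a kernel cpulist string such as "0-1" or "0,4" into sorted CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def is_consecutive(siblings):
    """True when the sibling ids form an unbroken run, e.g. (n, n+1).

    A single-element list (no SMT) trivially counts as consecutive.
    """
    return all(b - a == 1 for a, b in zip(siblings, siblings[1:]))


def sibling_groups(sysfs_root="/sys/devices/system/cpu"):
    """Collect the distinct SMT sibling groups reported by sysfs.

    Assumes the standard layout cpuN/topology/thread_siblings_list;
    returns an empty list if no such files are found.
    """
    groups = set()
    pattern = "cpu[0-9]*/topology/thread_siblings_list"
    for path in Path(sysfs_root).glob(pattern):
        groups.add(tuple(parse_cpulist(path.read_text())))
    return sorted(groups)
```

Calling sibling_groups() and passing each group to is_consecutive() shows
whether the machine uses (n,n+1) numbering or something else, such as
(n,n+N/2).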

> As the current kernel scheduler
> implementation, hyperthread with an even ID number will be picked up in a much
> higher probability during load-balancing and load-deploying.

How so?

> This RFC targets to solve the problem by adjusting CFS loadbalance policy:
> 1. Explore CPU topology and adjust CFS loadbalance policy when we found machine
> with qemu native CPU topology.
> 2. Export a procfs to control the traverse length when select idle cpu.
>
> Kenan.Liu (2):
> sched/fair: Adjust CFS loadbalance for machine with qemu native CPU
> topology.
> sched/fair: Export a param to control the traverse len when select
> idle cpu.

NAK, qemu can either provide a fake topology to the guest using normal
x86 means (MADT/CPUID) or do some paravirt topology setup, but this is
quite insane.