RE: [patch 00/37] cpu/hotplug, x86: Reworked parallel CPU bringup

From: Michael Kelley (LINUX)
Date: Thu Apr 27 2023 - 10:48:32 EST


From: Thomas Gleixner <tglx@xxxxxxxxxxxxx> Sent: Friday, April 14, 2023 4:44 PM

[snip]

>
> Conclusion
> ----------
>
> Adding the basic parallel bringup mechanism as provided by this series
> makes a lot of sense. Improving particular issues as pointed out in the
> analysis makes sense too.
>
> But trying to solve an application specific problem fully in the kernel
> with tons of complexity, without exploring straight forward and simple
> approaches first, does not make any sense at all.
>
> Thanks,
>
> tglx
>
> ---
> Documentation/admin-guide/kernel-parameters.txt | 20
> Documentation/core-api/cpu_hotplug.rst | 13
> arch/Kconfig | 23 +
> arch/arm/Kconfig | 1
> arch/arm/include/asm/smp.h | 2
> arch/arm/kernel/smp.c | 18
> arch/arm64/Kconfig | 1
> arch/arm64/include/asm/smp.h | 2
> arch/arm64/kernel/smp.c | 14
> arch/csky/Kconfig | 1
> arch/csky/include/asm/smp.h | 2
> arch/csky/kernel/smp.c | 8
> arch/mips/Kconfig | 1
> arch/mips/cavium-octeon/smp.c | 1
> arch/mips/include/asm/smp-ops.h | 1
> arch/mips/kernel/smp-bmips.c | 1
> arch/mips/kernel/smp-cps.c | 14
> arch/mips/kernel/smp.c | 8
> arch/mips/loongson64/smp.c | 1
> arch/parisc/Kconfig | 1
> arch/parisc/kernel/process.c | 4
> arch/parisc/kernel/smp.c | 7
> arch/riscv/Kconfig | 1
> arch/riscv/include/asm/smp.h | 2
> arch/riscv/kernel/cpu-hotplug.c | 14
> arch/x86/Kconfig | 45 --
> arch/x86/include/asm/apic.h | 5
> arch/x86/include/asm/cpu.h | 5
> arch/x86/include/asm/cpumask.h | 5
> arch/x86/include/asm/processor.h | 1
> arch/x86/include/asm/realmode.h | 3
> arch/x86/include/asm/sev-common.h | 3
> arch/x86/include/asm/smp.h | 26 -
> arch/x86/include/asm/topology.h | 23 -
> arch/x86/include/asm/tsc.h | 2
> arch/x86/kernel/acpi/sleep.c | 9
> arch/x86/kernel/apic/apic.c | 22 -
> arch/x86/kernel/callthunks.c | 4
> arch/x86/kernel/cpu/amd.c | 2
> arch/x86/kernel/cpu/cacheinfo.c | 21
> arch/x86/kernel/cpu/common.c | 50 --
> arch/x86/kernel/cpu/topology.c | 3
> arch/x86/kernel/head_32.S | 14
> arch/x86/kernel/head_64.S | 121 +++++
> arch/x86/kernel/sev.c | 2
> arch/x86/kernel/smp.c | 3
> arch/x86/kernel/smpboot.c | 508 ++++++++----------------
> arch/x86/kernel/topology.c | 98 ----
> arch/x86/kernel/tsc.c | 20
> arch/x86/kernel/tsc_sync.c | 36 -
> arch/x86/power/cpu.c | 37 -
> arch/x86/realmode/init.c | 3
> arch/x86/realmode/rm/trampoline_64.S | 27 +
> arch/x86/xen/enlighten_hvm.c | 11
> arch/x86/xen/smp_hvm.c | 16
> arch/x86/xen/smp_pv.c | 56 +-
> drivers/acpi/processor_idle.c | 4
> include/linux/cpu.h | 4
> include/linux/cpuhotplug.h | 17
> kernel/cpu.c | 397 +++++++++++++++++-
> kernel/smp.c | 2
> kernel/smpboot.c | 163 -------
> 62 files changed, 953 insertions(+), 976 deletions(-)
>

I smoke-tested several Linux guest configurations running on Hyper-V,
using the "kernel/git/tglx/devel.git hotplug" tree as updated on April 26th.
I saw no functional issues, but I did encounter one cosmetic issue (details below).

Configurations tested:
* 16 vCPUs and 32 vCPUs
* 1 NUMA node and 2 NUMA nodes
* Parallel bring-up enabled and disabled via the kernel boot line (see the
  note after this list)
* "Normal" VMs and SEV-SNP VMs running with a paravisor on Hyper-V.
This config can use parallel bring-up because most of the SNP-ness is
hidden in the paravisor. I was glad to see this work properly.
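
(A note on how I toggled parallel bring-up: as I read the
kernel-parameters.txt change in this series, there's a new cpuhp.parallel=
boolean, so I booted with cpuhp.parallel=0 for the "disabled" runs and left
the default for the "enabled" runs. If I've misread the exact spelling of
the new option, the results below are unaffected.)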

There's not much difference in performance with and without parallel
bring-up on the 32 vCPU VM. Without parallel bring-up, bringing up the
secondary CPUs takes about 26 ms; with it, about 24 ms. So bring-up is
already fast in the virtual environment.

The cosmetic issue is in the dmesg log, and arises because Hyper-V
enumerates SMT CPUs differently from many other environments. In
a Hyper-V guest, the SMT threads in a core are numbered as <even, odd>
pairs: guest CPUs #0 & #1 are SMT threads in the same core, as are
#2 & #3, etc. With parallel bring-up, here's the dmesg output:

[ 0.444345] smp: Bringing up secondary CPUs ...
[ 0.445139] .... node #0, CPUs: #2 #4 #6 #8 #10 #12 #14 #16 #18 #20 #22 #24 #26 #28 #30
[ 0.454112] x86: Booting SMP configuration:
[ 0.456035] #1 #3 #5 #7 #9 #11 #13 #15 #17 #19 #21 #23 #25 #27 #29 #31
[ 0.466120] smp: Brought up 1 node, 32 CPUs
[ 0.467036] smpboot: Max logical packages: 1
[ 0.468035] smpboot: Total of 32 processors activated (153240.06 BogoMIPS)

The function announce_cpu() tests specifically for CPU #1 when deciding to
output the "Booting SMP configuration:" message. In a Hyper-V guest, CPU #1
is the second SMT thread in the first core, so it isn't started until all
the even-numbered CPUs have been started.
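
For illustration, here's a minimal user-space sketch (not the kernel's
actual announce_cpu(); the function name and structure are mine) of the
kind of change that would avoid this: latch on the first secondary CPU
that gets announced rather than testing for CPU #1 specifically. Fed the
Hyper-V parallel bring-up order (even-numbered CPUs first, then odd), it
prints the header before any CPU number:

/*
 * Hypothetical sketch only: print the "Booting SMP configuration:"
 * header the first time any secondary CPU is announced, instead of
 * keying on cpu == 1 as the kernel does today.
 */
#include <stdbool.h>
#include <stdio.h>

static void announce_cpu_sketch(int cpu)
{
        static bool header_printed;

        if (!header_printed) {
                printf("x86: Booting SMP configuration:\n");
                header_printed = true;
        }
        printf(" #%d", cpu);
}

int main(void)
{
        int cpu;

        /* Hyper-V-style parallel order: even SMT siblings, then odd */
        for (cpu = 2; cpu < 32; cpu += 2)
                announce_cpu_sketch(cpu);
        for (cpu = 1; cpu < 32; cpu += 2)
                announce_cpu_sketch(cpu);
        printf("\n");
        return 0;
}

With the existing cpu == 1 test, the same sequence prints all the
even-numbered CPUs before the header, which is what the log above shows.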

I don't know if this cosmetic issue is worth fixing, but I thought I'd point it out.

In any case,

Tested-by: Michael Kelley <mikelley@xxxxxxxxxxxxx>