Re: [PATCH v3] RISC-V: Increase range and default value of NR_CPUS

From: Palmer Dabbelt
Date: Tue Oct 04 2022 - 19:53:24 EST


On Thu, 18 Aug 2022 14:29:03 PDT (-0700), atishp@xxxxxxxxxxxxxx wrote:
On Tue, May 24, 2022 at 5:08 AM Anup Patel <anup@xxxxxxxxxxxxxx> wrote:

Hi Palmer,

On Wed, Apr 20, 2022 at 4:54 PM Anup Patel <apatel@xxxxxxxxxxxxxxxx> wrote:
>
> Currently, the range and default value of NR_CPUS are too restrictive
> for high-end RISC-V systems with a large number of HARTs. The latest
> QEMU virt machine supports up to 512 CPUs, so the current NR_CPUS is
> restrictive for QEMU as well. Other major architectures (such as
> ARM64, x86_64, MIPS, etc.) have a much higher range and default
> value of NR_CPUS.
>
> This patch increases NR_CPUS range to 2-512 and default value to
> XLEN (i.e. 32 for RV32 and 64 for RV64).
>
> Signed-off-by: Anup Patel <apatel@xxxxxxxxxxxxxxxx>

Can this patch be considered for 5.19?

Thanks,
Anup

> ---
> Changes since v2:
> - Rebased on Linux-5.18-rc3
> - Use a different range when SBI v0.1 is enabled
> Changes since v1:
> - Updated NR_CPUS range to 2-512 which reflects maximum number of
> CPUs supported by QEMU virt machine.
> ---
> arch/riscv/Kconfig | 9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index 00fd9c548f26..1823f281069f 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -275,10 +275,13 @@ config SMP
> 	  If you don't know what to do here, say N.
>
>  config NR_CPUS
> -	int "Maximum number of CPUs (2-32)"
> -	range 2 32
> +	int "Maximum number of CPUs (2-512)"
> 	depends on SMP
> -	default "8"
> +	range 2 512 if !SBI_V01
> +	range 2 32 if SBI_V01 && 32BIT
> +	range 2 64 if SBI_V01 && 64BIT
> +	default "32" if 32BIT
> +	default "64" if 64BIT
>
>  config HOTPLUG_CPU
> 	bool "Support for hot-pluggable CPUs"
> --
> 2.25.1
>


Ping?
It would be useful to include this patch sooner rather than later to enable
high HART count testing by default.

Ya, I think that's reasonable: the higher CPU counts have found a bunch of issues, but they're not really Linux bugs and stopping folks from running them is just going to stop those bugs from being fixed. It seems like these higher default NR_CPUS values are stable on smaller systems, so that shouldn't hurt anything.

I'm still getting a bunch of issues when trying to run the larger CPU count systems in QEMU, but I think it's OK just assuming those are long-tail issues that would manifest anyway and are just more likely with the higher core counts.

So I've got this on for-next, under the rationale that the new default CPU counts are safe.
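
For anyone skimming the thread, the range and default selection in the quoted Kconfig hunk can be sketched roughly as below. This is only an illustrative model in Python, not anything in the tree: the function name nr_cpus_limits and its return shape are made up for this example, and Kconfig itself is what actually evaluates these conditions.

```python
# Illustrative model of the NR_CPUS range/default logic from the quoted
# Kconfig hunk. The function name and return shape are hypothetical;
# the real evaluation is done by Kconfig, not by this code.

def nr_cpus_limits(xlen: int, sbi_v01: bool) -> dict:
    """Return the allowed NR_CPUS range and default value.

    xlen    -- 32 for RV32 (32BIT) or 64 for RV64 (64BIT)
    sbi_v01 -- True when the SBI_V01 option from the patch is enabled
    """
    if xlen not in (32, 64):
        raise ValueError("xlen must be 32 or 64")
    # With SBI_V01 enabled the patch caps the range at XLEN (32 or 64);
    # otherwise the upper bound is 512.
    upper = xlen if sbi_v01 else 512
    # The default is XLEN in both cases: 32 for RV32, 64 for RV64.
    return {"min": 2, "max": upper, "default": xlen}


if __name__ == "__main__":
    for xlen in (32, 64):
        for sbi_v01 in (False, True):
            print(xlen, sbi_v01, nr_cpus_limits(xlen, sbi_v01))
```

Note that the default stays at XLEN regardless of SBI_V01, so it never falls outside the narrower range.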