Re: [PATCH v3] ARM64: Dynamically allocate cpumasks and increase supported CPUs to 512

From: Christoph Lameter (Ampere)
Date: Mon Mar 11 2024 - 17:07:23 EST


On Mon, 11 Mar 2024, Catalin Marinas wrote:

> > This patch landed in today's linux-next as commit 0499a78369ad ("ARM64:
> > Dynamically allocate cpumasks and increase supported CPUs to 512").
> > Unfortunately it triggers the following warning during boot on most of
> > my ARM64-based test boards. Here is an example from Odroid-N2 board:
>
> I spent a big part of this afternoon going through the code paths but
> there's nothing obvious that triggered this problem. My suspicion is
> some memory corruption, algorithmically I can't see anything that could
> go wrong with CPUMASK_OFFSTACK. Unfortunately I could not reproduce it
> yet to be able to add some debug info.
>
> So I decided to revert this patch. If we get to the bottom of it during
> the merging window, I can still revive it. Otherwise we'll add it to
> linux-next post -rc1.

I also looked through the opp source and I cannot find anything that
even uses the functionality changed by the OFFSTACK option.

This could be an issue in the ARM64 arch code itself, where there may be
an assumption elsewhere that a cpumask can always store up to NR_CPUS
cpus and not only nr_cpu_ids as OFFSTACK does.
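
To make that suspicion concrete, here is a minimal user-space sketch in C
of the size mismatch I have in mind. It is only a model, not kernel code:
every model_* name and the MODEL_NR_CPUS value are made up for
illustration, and it simply assumes that a full mask holds NR_CPUS bits
while an OFFSTACK mask is allocated for nr_cpu_ids bits, both rounded up
to whole longs.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical compile-time maximum, matching the limit in the patch title. */
#define MODEL_NR_CPUS 512u

/* Number of bits in one unsigned long on the build machine. */
#define MODEL_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Bytes needed for a bitmap of nbits bits, rounded up to whole longs. */
static size_t model_bitmap_bytes(unsigned int nbits)
{
    assert(nbits > 0);
    size_t longs = (nbits + MODEL_BITS_PER_LONG - 1) / MODEL_BITS_PER_LONG;
    return longs * sizeof(unsigned long);
}

/* Size of a mask that always has room for MODEL_NR_CPUS bits. */
static size_t model_static_mask_bytes(void)
{
    return model_bitmap_bytes(MODEL_NR_CPUS);
}

/* Size of a mask allocated only for the CPUs that can exist (nr_cpu_ids). */
static size_t model_offstack_mask_bytes(unsigned int nr_cpu_ids)
{
    return model_bitmap_bytes(nr_cpu_ids);
}

/*
 * Bytes that code touching a full NR_CPUS-sized mask would read or write
 * past the end of the smaller, nr_cpu_ids-sized allocation.
 */
static size_t model_overrun_bytes(unsigned int nr_cpu_ids)
{
    assert(nr_cpu_ids <= MODEL_NR_CPUS);
    return model_static_mask_bytes() - model_offstack_mask_bytes(nr_cpu_ids);
}
```

With a hypothetical nr_cpu_ids of 6 on a 64-bit build, the full mask is 64
bytes and the smaller allocation is 8, so model_overrun_bytes(6) comes to
56 bytes. Any code that copied or cleared the full size into such an
allocation would scribble over neighbouring memory, which would fit the
memory corruption theory.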

How can I exercise the opp driver in order to recreate the problem?

I assume the opp driver is ARM specific? x86 defaults to OFFSTACK, so if
there is an issue with OFFSTACK in opp then it should fail with the
default kernel configuration on that platform.