Re: PROBLEM: Only one CPU active on Ultra 60 since ~4.8 (regression)

From: Sam Ravnborg
Date: Fri Mar 29 2024 - 05:46:20 EST


Hi Nick,

On Thu, Mar 28, 2024 at 05:08:50PM -0400, Nick Bowler wrote:
> On 2024-03-28 16:09, Linus Torvalds wrote:
> > On Thu, 28 Mar 2024 at 12:36, Linux regression tracking (Thorsten
> > Leemhuis) <regressions@xxxxxxxxxxxxx> wrote:
> >>
> >> [CCing Linus, in case I say something to his disliking]
> >>
> >> On 22.03.24 05:57, Nick Bowler wrote:
> >>>
> >>> Just a friendly reminder that this issue still happens on Linux 6.8 and
> >>> reverting commit 9b2f753ec237 as indicated below is still sufficient to
> >>> resolve the problem.
> >>
> >> FWIW, that commit 9b2f753ec23710 ("sparc64: Fix cpu_possible_mask if
> >> nr_cpus is set") is from v4.8. Reverting it after all that time might
> >> easily lead to even bigger trouble.
> >
> > I'm definitely not reverting a patch from almost a decade ago as a regression.
> >
> > If it took that long to find, it can't be that critical of a regression.
>
> FWIW I'm not the first person to notice this problem. Searching the sparclinux
> archive for "ultra 60" turns up this very similar report[1] from two years
> prior to mine, which also went nowhere (sadly, this reporter did not perform a
> bisection to find the problematic commit -- perhaps because nobody asked).
>
> [1] https://lore.kernel.org/sparclinux/20201009161924.c8f031c079dd852941307870@xxxxxx/

I took a look at this and may have a fix. Could you try the following
patch? It builds, but I have not tested it.

Sam