Re: [patch 0/3] lib/percpu_counter, cpu/hotplug: Cure the cpu_dying_mask woes

From: Dennis Zhou
Date: Sat Dec 30 2023 - 17:39:32 EST


Hello,

On Fri, Apr 14, 2023 at 06:30:42PM +0200, Thomas Gleixner wrote:
> Hi!
>
> The cpu_dying_mask is not only undocumented but also to some extent a
> misnomer. Its purpose is to capture the last direction of a cpu_up() or
> cpu_down() operation taking eventual rollback operations into account.
>
> cpu_dying_mask is not really useful for general consumption. The
> cpu_dying_mask bits are sticky even after cpu_up() or cpu_down() completes.
>
> A recent fix to plug a race in the per CPU counter code picked
> cpu_dying_mask to cure it. Unfortunately this does not work as the author
> probably expected and the behaviour of cpu_dying_mask is not easy to change
> without breaking the only other and initial user, the scheduler.
>
> This series addresses this by:
>
>  1) Reworking the per CPU counter hotplug mechanism so the race is fully
>     plugged without using cpu_dying_mask
>
>  2) Replacing the cpu_dying_mask logic with hotplug core internal state
>     which is exposed to the scheduler with a properly documented
>     function.
>
> The series is also available from git:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git smp/dying_mask
>
> Thanks
>
> tglx
> ---
>  include/linux/cpuhotplug.h |  2 -
>  include/linux/cpumask.h    | 21 ----------------
>  kernel/cpu.c               | 45 +++++++++++++++++++++++++++++------
>  kernel/sched/core.c        |  4 +--
>  kernel/smpboot.h           |  2 +
>  lib/percpu_counter.c       | 57 +++++++++++++++++++--------------------------
>  6 files changed, 67 insertions(+), 64 deletions(-)

This has been on my mind, but regrettably it's been a busy year for me.

I know the merge window is around the corner, but I rebased this series
onto percpu#for-6.8 [1]. I had to massage percpu_counter slightly
because of some changes there, but otherwise the series is largely
intact. I still need to do a more thorough pass and re-send it, but I
think it remains correct to merge. I can then pull it, give it a few
days to soak in for-next, and send it to Linus either in a follow-up PR
or in the 2nd week of the merge window.

Thomas, how does this sound to you?

[1] https://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu.git/log/?h=percpu-hotplug

Thanks,
Dennis