Re: [PATCH 1/7] cpumask: fix checking valid cpu range

From: Yury Norov
Date: Fri Sep 30 2022 - 22:05:09 EST


On Fri, Sep 30, 2022 at 06:04:08PM +0100, Valentin Schneider wrote:
[...]

> > next_cpu is a valid CPU number for everything except cpumask_next().
> > The warning is valid. If we are at the very last CPU, why would we look
> > for the next one?
> >
>
> Consider:
>
> nr_cpu_ids=4
>
> A)
> cpumask: 0.1.1.0
> CPU      0 1 2 3
> n            ^
> result: nr_cpu_ids
>
> B)
> cpumask: 0.0.1.1
> CPU      0 1 2 3
> n              ^
> result: nr_cpu_ids + WARN
>
> Both scenarios are identical from a user perspective: a valid CPU number
> was passed in (either from smp_processor_id() or from a previous call to
> cpumask_next*()), but there are no more bits set in the cpumask. There are no
> more CPUs to search for in either scenario, but only one produces a WARN.

It seems I have to repeat it for the 3rd time.

cpumask_next() takes a shifted CPU index: the search starts at n + 1,
not at n. That's why cpumask_check() must shift the index in the other
direction to keep the checking logic consistent.

This is a bad design, and all users of cpumask_next() must be aware of
this pitfall.
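
To make the shifted index concrete, here is a minimal userspace sketch of
scenarios A and B quoted above. It is not kernel code: every model_* name is
hypothetical, the mask is a single word, a counter stands in for the WARN,
and it assumes the range check is applied to n + 1 (the first bit actually
searched), which is how the patch under discussion is read here.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical userspace model; the model_* names are not kernel API. */
#define MODEL_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

static unsigned int model_nr_cpu_ids = 4;
static unsigned int model_warn_count;	/* stands in for the WARN */

/* Models cpumask_check(): count a warning if cpu is not a valid bit index. */
static unsigned int model_cpumask_check(unsigned int cpu)
{
	if (cpu >= model_nr_cpu_ids)
		model_warn_count++;
	return cpu;
}

/* Single-word model of find_next_bit(): first set bit >= start, else size. */
static unsigned int model_find_next_bit(unsigned long word, unsigned int size,
					unsigned int start)
{
	assert(size <= MODEL_BITS_PER_LONG);
	for (unsigned int bit = start; bit < size; bit++)
		if (word & (1UL << bit))
			return bit;
	return size;
}

/* Models cpumask_next(): both the check and the search use n + 1. */
static unsigned int model_cpumask_next(int n, unsigned long mask)
{
	model_cpumask_check((unsigned int)(n + 1));
	return model_find_next_bit(mask, model_nr_cpu_ids,
				   (unsigned int)(n + 1));
}
```

With nr_cpu_ids = 4, scenario A is model_cpumask_next(2, 0x6): bit 3 is
checked, found clear, and 4 is returned without a warning. Scenario B is
model_cpumask_next(3, 0xC): the first bit to search would be 4, which is
out of range, so the warning counter is bumped before 4 is returned.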

[...]

> > Maybe we should consider nr_cpu_ids as a special valid index for
> > cpumask_check(), a sign of the end of an array. This would help to
> > silence many warnings, like this one. For now I'm leaning towards seeing
> > it as more of a hack than a meaningful change.
> >
>
> I agree, we definitely want to warn for e.g.
>
> cpumask_set_cpu(nr_cpu_ids, ...);
>
> Could we instead make cpumask_next*() immediately return nr_cpu_ids when
> passed n=nr_cpu_ids-1?

This is what FIND_NEXT_BIT() does. If you're suggesting silencing the
warning, why do we need it at all?
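
For readers without the source at hand, the behaviour referred to is the
early return for an out-of-range start. The following is a simplified,
hypothetical stand-in (sketch_find_next_bit is not the kernel macro, and it
walks bit by bit instead of word by word), shown only to illustrate that a
start at or past size already yields size with no extra handling:

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_BITS_PER_LONG ((unsigned long)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Hypothetical, simplified stand-in for find_next_bit(): returns the index
 * of the first set bit at or after start, or size if there is none.
 */
static unsigned long sketch_find_next_bit(const unsigned long *addr,
					  unsigned long size,
					  unsigned long start)
{
	assert(addr);

	/* Out-of-range start: nothing left to search, report "not found". */
	if (start >= size)
		return size;

	for (unsigned long bit = start; bit < size; bit++) {
		unsigned long word = addr[bit / SKETCH_BITS_PER_LONG];

		if (word & (1UL << (bit % SKETCH_BITS_PER_LONG)))
			return bit;
	}
	return size;
}
```

Called with size = 4 and start = 4 (that is, n = nr_cpu_ids - 1 plus one),
the sketch returns 4 straight away, so the return value is the same with or
without a special case in the caller; only the warning differs.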

> Also, what about cpumask_next_wrap()? That uses cpumask_next() under the
> hood and is bound to warn when wrapping after n=nr_cpu_ids-1, I think.

I'm working on a fix for it. Hopefully I will merge it in the next window.

Thanks,
Yury