Re: [syzbot] WARNING in c_start

From: Yury Norov
Date: Sat Oct 15 2022 - 21:13:13 EST


On Sun, Oct 16, 2022 at 09:24:57AM +0900, Tetsuo Handa wrote:
> On 2022/10/16 5:44, Yury Norov wrote:
> > Add people from other threads discussing this.
> >
> > On Sat, Oct 15, 2022 at 01:53:19PM +0200, Borislav Petkov wrote:
> >> On Sat, Oct 15, 2022 at 08:39:19PM +0900, Tetsuo Handa wrote:
> >>> That's an invalid command line. The correct syntax is:
> >>>
> >>> #syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> >>
> >> The fix is not in Linus' tree yet.
> >>
> >>> Andrew Jones proposed a fix for x86 and riscv architectures [2]. But
> >>> other architectures have the same problem. And fixing all callers will
> >>> not be in time for this merge window.
> >>
> >> Why won't there be time? That's why the -rcs are for.
> >>
> >> Also, that thing fires only when CONFIG_DEBUG_PER_CPU_MAPS is enabled.
> >>
> >> So no, we will take Andrew's fixes for all arches in time for 6.1.
> >
> > Summarizing things:
> >
> > 1. cpumask_check() was introduced to make sure that the cpu number
> > passed into cpumask API belongs to a valid range. But the check is
> > broken for a very long time. And because of that there are a lot of
> > places where cpumask API is used wrongly.
> >
> > 2. Underlying bitmap functions handle that correctly - when user
> > passes out-of-range CPU index, the nr_cpu_ids is returned, and this is
> > what expected by client code. So if DEBUG_PER_CPU_MAPS config is off,
> > everything is working smoothly.
> >
> > 3. I fixed all warnings that I was aware at the time of submitting the
> > patch. 2 follow-up series are on review: "[PATCH v2 0/4] net: drop
> > netif_attrmask_next*()" and "[PATCH 0/9] lib/cpumask: simplify
> > cpumask_next_wrap()". Also, Andrew Jones, Alexander Gordeev and Guo Ren
> > proposed fixes for c_start() in arch code.
> >
> > 4. The code paths mentioned above are all known to me that violate
> > cpumask_check() rules. (Did I miss something?)
> >
> > With all that, I agree with Borislav. Unfortunately, syzcall didn't CC
> > me about this problem with c_start(). But I don't like the idea to revert
> > cpumask_check() fix. This way we'll never clean that mess.
> >
> > If for some reason those warnings are unacceptable for -rcs (and like
> > Boris, I don't understand why), than instead of reverting commits, I'd
> > suggest moving cpumask sanity check from DEBUG_PER_CPU_MAPS under a new
> > config, say CONFIG_CPUMASK_DEBUG, which will be inactive until people will
> > fix their code. I can send a patch shortly, if we'll decide going this way.
> >
> > How people would even realize that they're doing something wrong if
> > they will not get warned about it?
>
> I'm asking you not to use BUG_ON()/WARN_ON() etc. which breaks syzkaller.

It's not me who added WARN_ON() in the cpumask_check(). You're asking
a wrong person.

What for do we have WARN_ON(), if we can't use it?

> Just printing messages (without "BUG:"/"WARNING:" string which also breaks
> syzkaller) like below diff is sufficient for people to realize that they're
> doing something wrong.
>
> Again, please do revert "cpumask: fix checking valid cpu range" immediately.

The revert is already in Jakub's batch for -rc2, AFAIK.

Thanks,
Yury