RE: [syzbot] WARNING in kthread_bind_mask

From: Zhang, Qiang1
Date: Mon Feb 21 2022 - 20:51:19 EST



On Sun, Feb 20, 2022 at 10:27:23AM -0800, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: c5d9ae265b10 Merge tag 'for-linus' of git://git.kernel.org..
> git tree: upstream
> console output:
> https://syzkaller.appspot.com/x/log.txt?x=11daf74a700000
> kernel config:
> https://syzkaller.appspot.com/x/.config?x=da674567f7b6043d
> dashboard link: https://syzkaller.appspot.com/bug?extid=087b7effddeec0697c66
> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+087b7effddeec0697c66@xxxxxxxxxxxxxxxxxxxxxxxxx
>
> BTRFS info (device loop3): disk space caching is enabled
> BTRFS info (device loop3): has skinny extents
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 10327 at kernel/kthread.c:525 __kthread_bind_mask kernel/kthread.c:525 [inline]
>
> 520 static void __kthread_bind_mask(struct task_struct *p, const struct cpumask *mask, unsigned int state)
> 521 {
> 522 	unsigned long flags;
> 523
> 524 	if (!wait_task_inactive(p, state)) {
> 525 		WARN_ON(1);
> 526 		return;
> 527 	}
>

Maybe we can add some debugging output that shows the state of the task when this warning triggers:

diff --git a/kernel/kthread.c b/kernel/kthread.c
index 38c6dd822da8..e707e86ee64b 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -29,7 +29,7 @@
 #include <linux/numa.h>
 #include <linux/sched/isolation.h>
 #include <trace/events/sched.h>
-
+#include <linux/sched/debug.h>

 static DEFINE_SPINLOCK(kthread_create_lock);
 static LIST_HEAD(kthread_create_list);
@@ -521,8 +521,8 @@ static void __kthread_bind_mask(struct task_struct *p, const struct cpumask *mas
 {
 	unsigned long flags;

-	if (!wait_task_inactive(p, state)) {
-		WARN_ON(1);
+	if (WARN_ON(!wait_task_inactive(p, state))) {
+		sched_show_task(p);
 		return;
 	}

Thanks,
Zqiang

> That seems to be some internal task state inconsistency.