RE: [Syzkaller & bisect] There is "task hung in synchronize_rcu bisect" in v6.3-rc2 kernel

From: Zhang, Qiang1
Date: Thu Apr 13 2023 - 07:51:56 EST


> Hi Huafei and kernel experts,
>
> This is a gentle reminder.
> This issue can still be reproduced on the v6.3-rc6 kernel.
> It can be reproduced on Alder Lake, Raptor Lake, and other x86 platforms.
> After reverting commit 0e792b89e6800c ("ftrace: Fix use-after-free for
> dynamic ftrace_ops") on top of v6.3-rc6, the issue was gone.
>
> New syzkaller reproduced code, repro.report, bisect_info.log and detailed logs
> are in link:
> https://github.com/xupengfe/syzkaller_logs/tree/main/230412_031722_synchronize_rcu
>
>I just tested against v6.3-rc6 and again all I get is this io_uring-related
>issue:
>

I also could not reproduce this synchronize_rcu_tasks() hang, but from this link:
https://github.com/xupengfe/syzkaller_logs/blob/main/230412_031722_synchronize_rcu/02bf43c7b7f7a19aa59a75f5244f0a3408bace1a_dmesg.log

I found an RCU Tasks stall:

INFO: rcu_tasks detected stalls on tasks:
[ 1243.720952] 000000008aaceb95: .. nvcsw: 1/1 holdout: 1 idle_cpu: -1/0
[ 1243.721915] task:repro state:R running task stack:0 pid:592 ppid:585 flags:0x00004004
[ 1243.723486] Call Trace:
[ 1243.723996] <TASK>
[ 1243.725204] __schedule+0x385/0xb00
[ 1243.726212] ? __cond_resched+0x25/0x30
[ 1243.727108] preempt_schedule_common+0x11/0x90
[ 1243.727964] __cond_resched+0x25/0x30
[ 1243.729076] mousedev_write+0xd2/0x3d0
[ 1243.730168] vfs_write+0x1b6/0x780
[ 1243.730873] ? mousedev_fasync+0x40/0x40
[ 1243.731864] ? vfs_write+0x9/0x780
[ 1243.732874] ? write_comp_data+0x2f/0x90
[ 1243.733926] ksys_write+0x153/0x170
[ 1243.734929] __x64_sys_write+0x27/0x30
[ 1243.735718] do_syscall_64+0x3b/0x90
[ 1243.736760] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 1243.737586] RIP: 0033:0x7f374119759d

Thanks
Zqiang


>
>[ 448.290752] INFO: task kworker/u4:0:9 blocked for more than 294 seconds.
>[ 448.293868] Not tainted 6.3.0-rc6-kvm #1
>[ 448.296019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>[ 448.299592] task:kworker/u4:0 state:D stack:0 pid:9 ppid:2 flags:0x00004000
>[ 448.303397] Workqueue: events_unbound io_ring_exit_work
>[ 448.305884] Call Trace:
>[ 448.307147] <TASK>
>[ 448.308166] __schedule+0x422/0xc90
>[ 448.309824] ? wait_for_completion+0x77/0x170
>[ 448.311870] schedule+0x63/0xd0
>[ 448.313346] schedule_timeout+0x2fe/0x4c0
>[ 448.315255] ? __this_cpu_preempt_check+0x1c/0x30
>[ 448.317360] ? _raw_spin_unlock_irq+0x27/0x60
>[ 448.319400] ? lockdep_hardirqs_on+0x88/0x120
>[ 448.321395] ? wait_for_completion+0x77/0x170
>[ 448.323462] wait_for_completion+0x9e/0x170
>[ 448.325356] io_ring_exit_work+0x2b0/0x810
>[ 448.327300] ? __pfx_io_tctx_exit_cb+0x10/0x10
>[ 448.329345] ? _raw_spin_unlock_irq+0x27/0x60
>[ 448.331397] process_one_work+0x34e/0x720
>[ 448.333212] ? __pfx_io_ring_exit_work+0x10/0x10
>[ 448.335377] ? process_one_work+0x34e/0x720
>[ 448.337295] worker_thread+0x4e/0x530
>[ 448.339079] ? __pfx_worker_thread+0x10/0x10
>[ 448.341008] kthread+0x128/0x160
>[ 448.342513] ? __pfx_kthread+0x10/0x10
>[ 448.344305] ret_from_fork+0x2c/0x50
>[ 448.346016] </TASK>
>[ 448.347176]
>[ 448.347176] Showing all locks held in the system:
>[ 448.349887] 2 locks held by kworker/u4:0/9:
>[ 448.351829] #0: ffff88807eb6dd38 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x2b4/0x720
>[ 448.356362] #1: ffffc9000005fe68 ((work_completion)(&ctx->exit_work)){+.+.}-{0:0}, at: process_one_work+0x2b4/0x720
>[ 448.361052] 1 lock held by rcu_tasks_kthre/11:
>[ 448.363142] #0: ffffffff83963450 (rcu_tasks.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x31/0x440
>[ 448.367396] 1 lock held by rcu_tasks_rude_/12:
>[ 448.369387] #0: ffffffff839631d0 (rcu_tasks_rude.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x31/0x440
>[ 448.373808] 1 lock held by rcu_tasks_trace/13:
>[ 448.375852] #0: ffffffff83962f10 (rcu_tasks_trace.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x31/0x440
>[ 448.380434] 1 lock held by khungtaskd/30:
>[ 448.382219] #0: ffffffff83963e60 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x17/0x1d0
>[ 448.386291]
>[ 448.387159] =============================================
>[ 448.387159]