Re: [linux] [5.19.0] task hung for indefinite time with call traces when rebooted with Kexec

From: Tasmiya Nalatwad
Date: Thu Aug 18 2022 - 05:44:20 EST


Greetings,


Below is the location in the source code that generates the call traces I am seeing:

File : kernel/hung_task.c

	/*
	 * Ok, the task did not get scheduled for more than 2 minutes,
	 * complain:
	 */
	if (sysctl_hung_task_warnings) {
		if (sysctl_hung_task_warnings > 0)
			sysctl_hung_task_warnings--;
		pr_err("INFO: task %s:%d blocked for more than %ld seconds.\n",
		       t->comm, t->pid, (jiffies - t->last_switch_time) / HZ);
		pr_err(" %s %s %.*s\n",
			print_tainted(), init_utsname()->release,
			(int)strcspn(init_utsname()->version, " "),
			init_utsname()->version);
		pr_err("\"echo 0 > /proc/sys/kernel/hung_task_timeout_secs\""
			" disables this message.\n");
		sched_show_task(t);
		hung_task_show_lock = true;

		if (sysctl_hung_task_all_cpu_backtrace)
			hung_task_show_all_bt = true;
	}

	touch_nmi_watchdog();
}
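
To make the warning logic in the excerpt above easier to follow, here is a minimal userspace sketch of it. It is not kernel code: SKETCH_HZ, struct hung_task_sketch, blocked_secs() and should_report() are hypothetical names used only for illustration, and the timeout comparison is an assumption about how the surrounding function decides to reach this block (the excerpt itself does not show it).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's HZ (ticks per second). */
#define SKETCH_HZ 100UL

/* Hypothetical userspace model of the two sysctl values involved. */
struct hung_task_sketch {
	long warnings;              /* models sysctl_hung_task_warnings */
	unsigned long timeout_secs; /* models hung_task_timeout_secs */
};

/* Same arithmetic as (jiffies - t->last_switch_time) / HZ in the excerpt. */
unsigned long blocked_secs(unsigned long now, unsigned long last_switch)
{
	return (now - last_switch) / SKETCH_HZ;
}

/*
 * Returns true when a report would be printed.
 * Any nonzero warning budget allows a report, but only a positive
 * budget is decremented, so a negative value never runs out.
 */
bool should_report(struct hung_task_sketch *s,
		   unsigned long now, unsigned long last_switch)
{
	/* A timeout of 0 disables the message, as the pr_err() text says. */
	if (s->timeout_secs == 0)
		return false;

	/* Assumption: the task must have been blocked for the full timeout. */
	if (blocked_secs(now, last_switch) < s->timeout_secs)
		return false;

	if (!s->warnings)
		return false;

	if (s->warnings > 0)
		s->warnings--;

	return true;
}
```

With warnings set to 2 and a 120 second timeout, the sketch reports twice and then stays silent; with warnings set to -1 it keeps reporting.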


On 8/6/22 15:30, Tetsuo Handa wrote:
On 2022/08/05 15:54, Tasmiya Nalatwad wrote:
Greetings,

[linux] [5.19.0] task hung for indefinite time with call traces when rebooted with Kexec. A restart is required to recover the machine.

kexec is waiting for workqueues ("kworker/3:1" and "kworker/3:0") to complete.
If this problem happens only when rebooting with kexec, something in kexec path
might be preventing these workqueues from completing.

Anyway, please repost with locations in source code like syzbot report does.

[ 1104.673153] task:kworker/3:1 state:D stack: 0 pid: 221 ppid: 2 flags:0x00000800
[ 1104.673160] Workqueue: fc_wq_0 fc_rport_final_delete [scsi_transport_fc]
[ 1104.673170] Call Trace:
[ 1104.673173] [c0000000060eb860] [0000000000000004] 0x4 (unreliable)
[ 1104.673178] [c0000000060eba50] [c00000000001e378] __switch_to+0x288/0x4a0
[ 1104.673185] [c0000000060ebab0] [c000000000d23e84] __schedule+0x2c4/0x8c0
[ 1104.673190] [c0000000060ebb80] [c000000000d244e8] schedule+0x68/0x130
[ 1104.673196] [c0000000060ebbb0] [c0000000008d4574] scsi_remove_target+0x314/0x390

[ 1104.673233] task:kworker/3:0 state:D stack: 0 pid:227332 ppid: 2 flags:0x00000880
[ 1104.673237] Workqueue: fc_wq_0 fc_rport_final_delete [scsi_transport_fc]
[ 1104.673243] Call Trace:
[ 1104.673244] [c0000000726bb860] [c0000000001b9cb4] enqueue_entity+0x184/0x4f0 (unreliable)
[ 1104.673250] [c0000000726bba50] [c00000000001e378] __switch_to+0x288/0x4a0
[ 1104.673254] [c0000000726bbab0] [c000000000d23e84] __schedule+0x2c4/0x8c0
[ 1104.673258] [c0000000726bbb80] [c000000000d244e8] schedule+0x68/0x130
[ 1104.673262] [c0000000726bbbb0] [c0000000008d4574] scsi_remove_target+0x314/0x390
[ 1104.673295] task:kexec state:D stack: 0 pid:228289 ppid: 1 flags:0x00040080
[ 1104.673299] Call Trace:
[ 1104.673301] [c000000069147510] [c00000000001e378] __switch_to+0x288/0x4a0
[ 1104.673305] [c000000069147570] [c000000000d23e84] __schedule+0x2c4/0x8c0
[ 1104.673309] [c000000069147640] [c000000000d244e8] schedule+0x68/0x130
[ 1104.673313] [c000000069147670] [c000000000d2e028] schedule_timeout+0x348/0x3f0
[ 1104.673317] [c000000069147750] [c000000000d2554c] wait_for_completion+0xcc/0x2b0
[ 1104.673321] [c0000000691477d0] [c00000000017cbe8] flush_workqueue+0x158/0x520
[ 1104.673325] [c000000069147870] [c00000000017d068] drain_workqueue+0xb8/0x240
[ 1104.673329] [c000000069147930] [c0000000001825e0] destroy_workqueue+0x60/0x420
[ 1104.673333] [c0000000691479c0] [c0080000009291e4] fc_remove_host+0x21c/0x280 [scsi_transport_fc]
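
Source locations for frames like the ones above are usually resolved from the "symbol+0xoffset/0xsize" token on each line, for example by passing it to a tool such as scripts/faddr2line in the kernel tree together with a vmlinux or module built with debug info. As a rough sketch of the first step only, the helper below pulls that token out of one trace line. The function name extract_symbol_offset() is hypothetical, and the parsing assumes the line layout seen in this report.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the "symbol+0xoff/0xsize" token from one trace line into out.
 * Returns 0 on success, -1 if the line has no such token or out is
 * too small to hold it.
 */
int extract_symbol_offset(const char *line, char *out, size_t outlen)
{
	const char *plus = strstr(line, "+0x");
	const char *start, *end;
	size_t len;

	if (!plus)
		return -1;

	/* Walk back from '+' to the start of the symbol name. */
	start = plus;
	while (start > line && start[-1] != ' ' && start[-1] != ']')
		start--;
	if (start == plus)
		return -1; /* "+0x" with no symbol name in front of it */

	/* The token ends at the next space, newline, or end of string. */
	end = plus + strcspn(plus, " \n");
	len = (size_t)(end - start);
	if (len >= outlen)
		return -1;

	memcpy(out, start, len);
	out[len] = '\0';
	return 0;
}
```

For the scsi_remove_target frame this yields "scsi_remove_target+0x314/0x390"; lines without a symbol, such as "Call Trace:" or the bare "0x4 (unreliable)" frame, are rejected.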



--
Regards,
Tasmiya Nalatwad
IBM Linux Technology Center