XFS freeze (xfsaild blocked) with 4.2.5 (also 4.3)

From: David Madore
Date: Sun Nov 08 2015 - 06:49:27 EST


Compiled a 4.2.5 kernel a few days ago. This morning, my machine was
essentially unresponsive (couldn't log on) so I used
alt-sysrq-{s,t,s,u,b} to reboot it, and after reboot it appears that
the first suspicious message concerns xfsaild blocked for more than
300 seconds. The disks and filesystems are good as far as I know, and
I never had any problem with my previous (4.1.4) kernel, so I guess
this is a regression in 4.2. (I also had a similar issue with 4.3,
which is probably due to the same cause and also XFS-related: <URL:
https://lkml.org/lkml/2015/11/4/104 >.)

Please contact me for further information, or just to make me feel
less lonely and helpless when facing this kind of bug. :-(

Here are the first few and last lines of log, with links to the full
logs and full config below:

Nov 8 08:08:14 vega kernel: [313800.046048] INFO: task xfsaild/md117:14100 blocked for more than 300 seconds.
Nov 8 08:08:14 vega kernel: [313800.046054] Not tainted 4.2.5-vega #1
Nov 8 08:08:14 vega kernel: [313800.046057] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 8 08:08:14 vega kernel: [313800.046060] xfsaild/md117 D ffff88023fc14a00 0 14100 2 0x00000000
Nov 8 08:08:14 vega kernel: [313800.046066] ffff8800bbb07d18 0000000000000046 ffffffff817244c0 ffff8802345126c0
Nov 8 08:08:14 vega kernel: [313800.046071] ffff8800bbb07d08 ffffffff810a35ba ffff8800bbb07d58 ffff8800bbb08000
Nov 8 08:08:14 vega kernel: [313800.046075] ffff880231235528 0000000000000000 ffff880231235400 ffff880234460e00
Nov 8 08:08:14 vega kernel: [313800.046080] Call Trace:
Nov 8 08:08:14 vega kernel: [313800.046090] [<ffffffff810a35ba>] ? try_to_del_timer_sync+0x4a/0x60
Nov 8 08:08:14 vega kernel: [313800.046095] [<ffffffff81531692>] schedule+0x32/0x80
Nov 8 08:08:14 vega kernel: [313800.046123] [<ffffffffa0815724>] _xfs_log_force+0x154/0x240 [xfs]
Nov 8 08:08:14 vega kernel: [313800.046128] [<ffffffff8106f3c0>] ? wake_up_q+0x70/0x70
Nov 8 08:08:14 vega kernel: [313800.046143] [<ffffffffa0815835>] xfs_log_force+0x25/0x80 [xfs]
Nov 8 08:08:14 vega kernel: [313800.046159] [<ffffffffa081f301>] ? xfs_trans_ail_cursor_done+0x11/0x30 [xfs]
Nov 8 08:08:14 vega kernel: [313800.046173] [<ffffffffa081f4d8>] xfsaild+0x138/0x560 [xfs]
Nov 8 08:08:14 vega kernel: [313800.046188] [<ffffffffa081f3a0>] ? xfs_trans_ail_cursor_first+0x80/0x80 [xfs]
Nov 8 08:08:14 vega kernel: [313800.046203] [<ffffffffa081f3a0>] ? xfs_trans_ail_cursor_first+0x80/0x80 [xfs]
Nov 8 08:08:14 vega kernel: [313800.046207] [<ffffffff81065d54>] kthread+0xc4/0xe0
Nov 8 08:08:14 vega kernel: [313800.046211] [<ffffffff81065c90>] ? kthread_create_on_node+0x180/0x180
Nov 8 08:08:14 vega kernel: [313800.046215] [<ffffffff8153495f>] ret_from_fork+0x3f/0x70
Nov 8 08:08:14 vega kernel: [313800.046219] [<ffffffff81065c90>] ? kthread_create_on_node+0x180/0x180
<snip>
Nov 8 11:18:01 vega kernel: [325187.487839] Showing busy workqueues and worker pools:
Nov 8 11:18:01 vega kernel: [325187.487839] workqueue xfs-log/md117: flags=0x14
Nov 8 11:18:01 vega kernel: [325187.487839] pwq 3: cpus=1 node=0 flags=0x0 nice=-20 active=2/256
Nov 8 11:18:01 vega kernel: [325187.487839] pending: xfs_buf_ioend_work [xfs], xfs_log_worker [xfs]
Nov 8 11:18:01 vega kernel: [325187.487839] workqueue xfs-log/md118: flags=0x14
Nov 8 11:18:01 vega kernel: [325187.487839] pwq 3: cpus=1 node=0 flags=0x0 nice=-20 active=1/256
Nov 8 11:18:01 vega kernel: [325187.487839] pending: xfs_log_worker [xfs]
Nov 8 11:18:01 vega kernel: [325187.487839] workqueue xfs-log/md115: flags=0x14
Nov 8 11:18:01 vega kernel: [325187.487839] pwq 3: cpus=1 node=0 flags=0x0 nice=-20 active=1/256
Nov 8 11:18:01 vega kernel: [325187.487839] pending: xfs_log_worker [xfs]
Nov 8 11:18:01 vega kernel: [325187.487839] workqueue xfs-log/md120: flags=0x14
Nov 8 11:18:01 vega kernel: [325187.487839] pwq 3: cpus=1 node=0 flags=0x0 nice=-20 active=1/256
Nov 8 11:18:01 vega kernel: [325187.487839] pending: xfs_log_worker [xfs]
Nov 8 11:18:01 vega kernel: [325187.487839] workqueue xfs-log/md113: flags=0x14
Nov 8 11:18:01 vega kernel: [325187.487839] pwq 3: cpus=1 node=0 flags=0x0 nice=-20 active=2/256
Nov 8 11:18:01 vega kernel: [325187.487839] in-flight: 31546:xfs_log_worker [xfs]
Nov 8 11:18:01 vega kernel: [325187.487839] pending: xfs_buf_ioend_work [xfs]
Nov 8 11:18:01 vega kernel: [325187.487839] workqueue xfs-log/md110: flags=0x14
Nov 8 11:18:01 vega kernel: [325187.487839] pwq 3: cpus=1 node=0 flags=0x0 nice=-20 active=1/256
Nov 8 11:18:01 vega kernel: [325187.487839] pending: xfs_log_worker [xfs]
Nov 8 11:18:01 vega kernel: [325187.487839] pool 0: cpus=0 node=0 flags=0x0 nice=0 workers=2 manager: 31448 idle: 31569
Nov 8 11:18:01 vega kernel: [325187.487839] pool 2: cpus=1 node=0 flags=0x0 nice=0 workers=2 manager: 31616 idle: 31510
Nov 8 11:18:01 vega kernel: [325187.487839] pool 3: cpus=1 node=0 flags=0x0 nice=-20 workers=2 manager: 31499
Nov 8 11:18:01 vega kernel: [325187.487839] pool 4: cpus=2 node=0 flags=0x0 nice=0 workers=2 manager: 31503 idle: 31571
Nov 8 11:18:01 vega kernel: [325187.487839] pool 8: cpus=0-3 flags=0x4 nice=0 workers=2 manager: 14279 idle: 31425



Kernel config:
http://www.madore.org/~david/.tmp/config.20151108

Syslog reporting hung tasks and (at 11:18) output of alt-sysrq-t:
http://www.madore.org/~david/.tmp/syslog.20151108

Initial dmesg:
http://www.madore.org/~david/.tmp/dmesg.20151108

--
David A. Madore
( http://www.madore.org/~david/ )
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/