Re: [PATCH v2] Revert "virtio-blk: support completion batching for the IRQ path"

From: Michael S. Tsirkin
Date: Thu Jun 22 2023 - 02:15:27 EST


On Tue, Jun 20, 2023 at 10:54:19PM +0000, Edward Liaw wrote:
> On Fri, Jun 09, 2023 at 03:27:24AM -0400, Michael S. Tsirkin wrote:
> > This reverts commit 07b679f70d73483930e8d3c293942416d9cd5c13.
> This commit was also breaking kernel tests on a virtual Android device
> (cuttlefish). We were seeing hangups like:
>
> [ 2889.910733] INFO: task kworker/u8:2:6312 blocked for more than 720 seconds.
> [ 2889.910967] Tainted: G OE 6.2.0-mainline-g5c05cafa8df7-ab9969617 #1
> [ 2889.911143] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 2889.911389] task:kworker/u8:2 state:D stack:12160 pid:6312 ppid:2 flags:0x00004000
> [ 2889.911567] Workqueue: writeback wb_workfn (flush-254:57)
> [ 2889.911771] Call Trace:
> [ 2889.911831] <TASK>
> [ 2889.911893] __schedule+0x55f/0x880
> [ 2889.912021] schedule+0x6a/0xc0
> [ 2889.912110] schedule_timeout+0x58/0x1a0
> [ 2889.912200] wait_for_common+0xf7/0x1b0
> [ 2889.912289] wait_for_completion+0x1c/0x40
> [ 2889.912377] f2fs_issue_checkpoint+0x14c/0x210
> [ 2889.912504] f2fs_sync_fs+0x4c/0xb0
> [ 2889.912597] f2fs_balance_fs_bg+0x2f6/0x340
> [ 2889.912736] ? can_migrate_task+0x39/0x2b0
> [ 2889.912872] f2fs_write_node_pages+0x77/0x240
> [ 2889.912989] do_writepages+0xde/0x240
> [ 2889.913094] __writeback_single_inode+0x3f/0x360
> [ 2889.913231] writeback_sb_inodes+0x320/0x5f0
> [ 2889.913349] ? move_expired_inodes+0x58/0x210
> [ 2889.913470] __writeback_inodes_wb+0x97/0x100
> [ 2889.913587] wb_writeback+0x17e/0x390
> [ 2889.913682] wb_workfn+0x35f/0x500
> [ 2889.913774] process_one_work+0x1d9/0x3b0
> [ 2889.913870] worker_thread+0x251/0x410
> [ 2889.913960] kthread+0xeb/0x110
> [ 2889.914052] ? __cfi_worker_thread+0x10/0x10
> [ 2889.914168] ? __cfi_kthread+0x10/0x10
> [ 2889.914257] ret_from_fork+0x29/0x50
> [ 2889.914364] </TASK>
> [ 2889.914565] INFO: task mkdir09:6425 blocked for more than 720 seconds.
> [ 2889.916065] Tainted: G OE 6.2.0-mainline-g5c05cafa8df7-ab9969617 #1
> [ 2889.916255] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 2889.916450] task:mkdir09 state:D stack:13016 pid:6425 ppid:6423 flags:0x00004000
> [ 2889.916656] Call Trace:
> [ 2889.916900] <TASK>
> [ 2889.917004] __schedule+0x55f/0x880
> [ 2889.917129] schedule+0x6a/0xc0
> [ 2889.917273] schedule_timeout+0x58/0x1a0
> [ 2889.917425] wait_for_common+0xf7/0x1b0
> [ 2889.917535] wait_for_completion+0x1c/0x40
> [ 2889.917670] f2fs_issue_checkpoint+0x14c/0x210
> [ 2889.917844] f2fs_sync_fs+0x4c/0xb0
> [ 2889.917969] f2fs_do_sync_file+0x3a8/0x8c0
> [ 2889.918090] ? mt_find+0xa0/0x1a0
> [ 2889.918216] f2fs_sync_file+0x2f/0x60
> [ 2889.918310] vfs_fsync_range+0x74/0x90
> [ 2889.918567] __se_sys_msync+0x176/0x270
> [ 2889.918730] __x64_sys_msync+0x1c/0x40
> [ 2889.918873] do_syscall_64+0x53/0xb0
> [ 2889.918996] entry_SYSCALL_64_after_hwframe+0x72/0xdc
> [ 2889.919178] RIP: 0033:0x7540b08bcf47
> [ 2889.919297] RSP: 002b:00007fff5fcbeea8 EFLAGS: 00000206 ORIG_RAX: 000000000000001a
> [ 2889.919496] RAX: ffffffffffffffda RBX: 0000000000001000 RCX: 00007540b08bcf47
> [ 2889.919828] RDX: 0000000000000004 RSI: 0000000000001000 RDI: 00007540b4ae7000
> [ 2889.920227] RBP: 0000000000000000 R08: 721e0000010b0016 R09: 0000000000000003
> [ 2889.920435] R10: 0000000000000100 R11: 0000000000000206 R12: 00005d2f95fd2f08
> [ 2889.920793] R13: 00005d2f95fbc310 R14: 00007540b08e1bb8 R15: 00005d2f95fbc310
> [ 2889.921072] </TASK>
>
> in random tests in the LTP (Linux Test Project) test suite.
>
> > ---
> >
> > Since v1:
> > fix build error
> >
> > Still completely untested as I'm traveling.
> > Martin, Suwan, could you please test and report?
> > Suwan, if you have a better revert in mind, pls post it and
> > I will be happy to drop this.
> >
> > Thanks!
> >
> This revert appears to have resolved the test failures for me.
>
> Tested-by: edliaw@xxxxxxxxxx

Oh interesting, can you share how to reproduce the failures?
Then maybe Suwan Kim can take a look at fixing up the patch.

--
MST