Re: [dm-devel] Processes hung in "D" state in ext4, mm, md and dmcrypt

From: Andrew Morton
Date: Wed Jul 26 2023 - 15:30:54 EST


On Wed, 26 Jul 2023 23:29:51 +0800 Ming Lei <tom.leiming@xxxxxxxxx> wrote:

> On Wed, Jul 26, 2023 at 6:02 PM David Howells <dhowells@xxxxxxxxxx> wrote:
> >
> > Hi,
> >
> > With 6.5-rc2 (6.5.0-0.rc2.20230721gitf7e3a1bafdea.20.fc39.x86_64), I'm seeing
> > a bunch of processes getting stuck in the D state on my desktop after a few
> > hours of reading email and compiling stuff. It's happened every day this week
> > so far and I managed to grab stack traces of the stuck processes this morning
> > (see attached).
> >
> > There are two blockdevs involved below, /dev/md2 and /dev/md3. md3 is a raid1
> > array with two partitions with an ext4 partition on it. md2 is similar but
> > it's dm-crypted and ext4 is on top of that.
> >
> ...
>
> > ===117547===
> > PID TTY STAT TIME COMMAND
> > 117547 ? D 5:12 [kworker/u16:8+flush-9:3]
> > [<0>] blk_mq_get_tag+0x11e/0x2b0
> > [<0>] __blk_mq_alloc_requests+0x1bc/0x350
> > [<0>] blk_mq_submit_bio+0x2c7/0x680
> > [<0>] __submit_bio+0x8b/0x170
> > [<0>] submit_bio_noacct_nocheck+0x159/0x370
> > [<0>] __block_write_full_folio+0x1e1/0x400
> > [<0>] writepage_cb+0x1a/0x70
> > [<0>] write_cache_pages+0x144/0x3b0
> > [<0>] do_writepages+0x164/0x1e0
> > [<0>] __writeback_single_inode+0x3d/0x360
> > [<0>] writeback_sb_inodes+0x1ed/0x4b0
> > [<0>] __writeback_inodes_wb+0x4c/0xf0
> > [<0>] wb_writeback+0x298/0x310
> > [<0>] wb_workfn+0x35b/0x510
> > [<0>] process_one_work+0x1de/0x3f0
> > [<0>] worker_thread+0x51/0x390
> > [<0>] kthread+0xe5/0x120
> > [<0>] ret_from_fork+0x31/0x50
> > [<0>] ret_from_fork_asm+0x1b/0x30
>
> BTW, -rc3 fixes one similar issue on the above code path, so please try -rc3.
>
> 106397376c03 sbitmap: fix batching wakeup

That patch really needs a Fixes: tag, please, and consideration for a
-stable backport.

Looking at what has changed recently in sbitmap, it seems unlikely that
106397376c03 fixes an issue that just appeared in 6.5-rcX. But maybe
the issue you have identified has recently become easier to hit; we'll
see.