Re: 6.5.0rc5 fs hang - ext4? raid?

From: Theodore Ts'o
Date: Tue Aug 15 2023 - 08:52:46 EST


On Mon, Aug 14, 2023 at 09:02:53PM +0000, Dr. David Alan Gilbert wrote:
> dg 29594 29592 0 18:40 pts/0 00:00:00 /usr/bin/ar --plugin /usr/libexec/gcc/x86_64-redhat-linux/13/liblto_plugin.so -csrDT src/intel/perf/libintel_perf.a src/intel/perf/libintel_perf.a.p/meson-generated_.._intel_perf_metrics.c.o src/intel/perf/libintel_perf.a.p/intel_perf.c.o src/intel/perf/libintel_perf.a.p/intel_perf_query.c.o src/intel/perf/libintel_perf.a.p/intel_perf_mdapi.c.o
>
> [root@dalek dg]# cat /proc/29594/stack
> [<0>] md_super_wait+0xa2/0xe0
> [<0>] md_bitmap_unplug+0xd2/0x120
> [<0>] flush_bio_list+0xf3/0x100 [raid1]
> [<0>] raid1_unplug+0x3b/0xb0 [raid1]
> [<0>] __blk_flush_plug+0xd7/0x150
> [<0>] blk_finish_plug+0x29/0x40
> [<0>] ext4_do_writepages+0x401/0xc90
> [<0>] ext4_writepages+0xad/0x180

If you wait a few seconds and try grabbing cat /proc/29594/stack
again, does the stack trace stay consistent with the above?
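
For repeated sampling, something along these lines might help. It is
only a rough sketch: the helper name sample_stack and its defaults are
made up for illustration, and reading /proc/<pid>/stack needs root.

```shell
# Print a file (e.g. /proc/<pid>/stack) several times with a pause in
# between, so successive kernel stack traces can be compared by eye.
# Usage: sample_stack FILE [COUNT] [INTERVAL_SECONDS]
sample_stack() {
    file=$1
    count=${2:-5}      # number of samples (assumed default)
    interval=${3:-2}   # seconds between samples (assumed default)
    i=1
    while [ "$i" -le "$count" ]; do
        echo "--- sample $i ---"
        cat "$file"
        # Do not sleep after the final sample.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
}
```

For example, "sample_stack /proc/29594/stack 5 2" as root. If every
sample is still sitting in md_super_wait, the task is most likely stuck
there rather than making slow progress.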

Also, if you have iostat installed (usually part of the sysstat
package), does "iostat 1" show any I/O activity on the md device?
What about the underlying block devices used by the md device? If the
md device is attached to HDDs where you can see the activity light,
can you see (or hear) any disk activity?

This sure seems like either the I/O driver isn't processing requests,
or some kind of hang in the md layer....

- Ted