Re: 6.5.0rc5 fs hang - ext4? raid?

From: Dr. David Alan Gilbert
Date: Tue Aug 15 2023 - 08:56:32 EST


* Theodore Ts'o (tytso@xxxxxxx) wrote:
> On Mon, Aug 14, 2023 at 09:02:53PM +0000, Dr. David Alan Gilbert wrote:
> > dg 29594 29592 0 18:40 pts/0 00:00:00 /usr/bin/ar --plugin /usr/libexec/gcc/x86_64-redhat-linux/13/liblto_plugin.so -csrDT src/intel/perf/libintel_perf.a src/intel/perf/libintel_perf.a.p/meson-generated_.._intel_perf_metrics.c.o src/intel/perf/libintel_perf.a.p/intel_perf.c.o src/intel/perf/libintel_perf.a.p/intel_perf_query.c.o src/intel/perf/libintel_perf.a.p/intel_perf_mdapi.c.o
> >
> > [root@dalek dg]# cat /proc/29594/stack
> > [<0>] md_super_wait+0xa2/0xe0
> > [<0>] md_bitmap_unplug+0xd2/0x120
> > [<0>] flush_bio_list+0xf3/0x100 [raid1]
> > [<0>] raid1_unplug+0x3b/0xb0 [raid1]
> > [<0>] __blk_flush_plug+0xd7/0x150
> > [<0>] blk_finish_plug+0x29/0x40
> > [<0>] ext4_do_writepages+0x401/0xc90
> > [<0>] ext4_writepages+0xad/0x180
>
> If you wait a few seconds and try grabbing cat /proc/29594/stack
> again, does the stack trace stay consistent with the above?

I'll get back to that and retry it.
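
For the retry, something like the following would take several samples instead of one. It is only a rough sketch: the function name, the sample count and the interval are made up for illustration, it needs root to read /proc/<pid>/stack, and 29594 is just the stuck ar process from the listing above.

```shell
#!/bin/sh
# Sketch: dump a task's kernel stack a few times so the traces can be compared.
# Assumptions: run as root; count and interval defaults are arbitrary.
sample_stack() {
    pid=$1
    count=${2:-5}       # number of samples (arbitrary default)
    interval=${3:-2}    # seconds between samples (arbitrary default)
    i=1
    while [ "$i" -le "$count" ]; do
        echo "--- sample $i ($(date +%T)) ---"
        # The task may have exited, or we may lack permission to read its stack.
        cat "/proc/$pid/stack" 2>/dev/null || echo "(cannot read /proc/$pid/stack)"
        i=$((i + 1))
        if [ "$i" -le "$count" ]; then
            sleep "$interval"
        fi
    done
}

# Example: sample_stack 29594 5 2
```

If every sample shows the same md_super_wait frame on top, the task is parked there rather than making slow progress.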

> Also, if you have iostat installed (usually part of the sysstat
> package), does "iostat 1" show any I/O activity on the md device?
> What about the underlying block devices used by the md device? If the
> md device is attached to HDDs where you can see the activity light,
> can you see (or hear) any disk activity?

It's spinning rust, and I hear them go quiet when the hang happens.
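
If iostat isn't to hand, the same counters can be read from /proc/diskstats. A minimal sketch, assuming the array is md0 on sda and sdb (placeholder names, not taken from this machine); the helper name and the DISKSTATS override are also made up for illustration:

```shell
#!/bin/sh
# Sketch: print completed reads/writes and in-flight I/O for the named devices.
# /proc/diskstats columns used: $3 device name, $4 reads completed,
# $8 writes completed, $12 I/Os currently in progress.
# DISKSTATS is a hypothetical override for reading a saved copy of the file.
diskstats_snapshot() {
    stats_file=${DISKSTATS:-/proc/diskstats}
    for dev in "$@"; do
        awk -v d="$dev" \
            '$3 == d { printf "%s reads=%s writes=%s inflight=%s\n", $3, $4, $8, $12 }' \
            "$stats_file"
    done
}

# Example (device names are placeholders):
#   diskstats_snapshot md0 sda sdb; sleep 1; diskstats_snapshot md0 sda sdb
```

Two snapshots a second apart with unchanged read/write counts but a non-zero in-flight count on a member disk would point at requests stuck below md; zero in-flight everywhere would point more at md itself.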

Dave

> This sure seems like either the I/O driver isn't processing requests,
> or some kind of hang in the md layer....
>
> - Ted
--
-----Open up your eyes, open up your mind, open up your code -------
/ Dr. David Alan Gilbert | Running GNU/Linux | Happy \
\ dave @ treblig.org | | In Hex /
\ _________________________|_____ http://www.treblig.org |_______/