Re: 6.6.8 stable: crash in folio_mark_dirty

From: Genes Lists
Date: Sat Dec 30 2023 - 14:16:51 EST

Next message: Kent Overstreet: "Re: Stack corruption in bch2_nocow_write"
Previous message: Lukas Wunner: "Re: [PATCH v3 09/10] thermal: Add PCIe cooling driver"
In reply to: Matthew Wilcox: "Re: 6.6.8 stable: crash in folio_mark_dirty"
Next in thread: Hillf Danton: "Re: 6.6.8 stable: crash in folio_mark_dirty"

On Sat, 2023-12-30 at 18:02 +0000, Matthew Wilcox wrote:
>
> Thanks for the report. Apologies, I'm on holiday until the middle of
> the week so this will be extremely terse.
>

Enjoy 🙂

> > Dec 30 07:00:36 s6 kernel: CPU: 0 PID: 521524 Comm: rsync Not tainted
>
> So rsync is exiting. Do you happen to know what rsync is doing?

There are 2 rsyncs I can think of:

(a) rsync from another server (s8) pushing files over the local
network to this machine (s6). rsync writes to the raid drives on s6.

s8 says the rsync completed successfully at 3:04 am (about 4 hours
prior to this error at 7:00 am).

(b) There is also a script running inotify which uses rsync to keep
the spare root drive sync'ed. The system had an update of a few
packages at 5:48 am, and that would have caused an rsync from root on
nvme to spare on sdg. Most likely this is the one that triggered
around 7 am.

This one runs:

/usr/bin/rsync --open-noatime --no-specials --delete --atimes -axHAX --times <src> <dst>

> It looks like rsync has a page from the block device mmaped? I'll
> have to investigate this properly when I'm back. If you haven't heard
> from me in a week, please ping me.

Thank you.

>
> (I don't think I caused this, but I think I stand a fighting chance
> of tracking down what the problem is, just not right now).

This may or may not be related, but this same machine crashed about 3
weeks ago during an rsync like (a) above (i.e. s8 pushing files to the
raid6 disks on s6); it was on the 6.6.4 kernel then. In that case the
error was in the md code.

https://lore.kernel.org/lkml/e2d47b6c-3420-4785-8e04-e5f217d09a46@xxxxxxxxxxxxx/T/

Thank you again,

gene
