Re: Linux regressions report for mainline [2023-04-16]

From: David Sterba
Date: Thu Apr 20 2023 - 15:02:56 EST


On Tue, Apr 18, 2023 at 12:11:51PM -0700, Linus Torvalds wrote:
> On Tue, Apr 18, 2023 at 11:20 AM David Sterba <dsterba@xxxxxxx> wrote:
> >
> > There's also in-memory cache of already trimmed ranges since last mount
> > so even running discard repeatedly (either fstrim or as mount option)
> > will not do extra IO. We try hard not to provoke the firmware bugs.

[...]

> Again, that's libata - odd crazy hardware. But it's exactly the odd
> crazy hardware that worries me. When the failure mode isn't "it's
> slow", but "it ATE MY WHOLE DISK", that's a scary scary problem.
>
> Hmm?
>
> I dunno. Maybe you have reason to believe that all of these cases have
> been fixed, or that some of these were caused by kernel bugs because
> we did things wrong, and those have been fixed.
>
> But the failure modes just makes me worry. From your email, it *seems*
> like you think that the failures were primarily performance-related.

No, the main concern is whether discard works without destroying data;
performance is more of an optimization. I too worry about buggy
hardware, and we have a page just about that at
https://btrfs.readthedocs.io/en/latest/Hardware.html .

I've taken notes from your reply and will enhance the page, or the page
about discard in particular. The info about device quirks/horkage could
be linked too, or, as I've thought about, a static page could be
generated from the per-bus tables so it's all in one place.

I'll send a pull request with fixes for the regression.