Re: [RFC 00/23] Enable block size > page size in XFS

From: Dave Chinner
Date: Fri Sep 22 2023 - 01:03:48 EST


On Thu, Sep 21, 2023 at 12:18:13AM -0700, Luis Chamberlain wrote:
> On Thu, Sep 21, 2023 at 04:03:56PM +1000, Dave Chinner wrote:
> > On Wed, Sep 20, 2023 at 09:57:56PM -0700, Luis Chamberlain wrote:
> > > On Wed, Sep 20, 2023 at 08:00:12PM -0700, Luis Chamberlain wrote:
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux.git/log/?h=large-block-linus
> > > >
> > > > I haven't tested yet the second branch I pushed though but it applied without any changes
> > > > so it should be good (usual famous last words).
> > >
> > > I have run some preliminary tests on that branch as well above using fsx
> > > with larger LBA formats running them all on the *same* system at the
> > > same time. Kernel is happy.
>
> <-- snip -->
>
> > So I just pulled this, built it and run generic/091 as the very
> > first test on this:
> >
> > # ./run_check.sh --mkfs-opts "-m rmapbt=1 -b size=64k" --run-opts "-s xfs_64k generic/091"
>
> The cover letter for this patch series acknowledged failures in fstests.

But this is a new update, which you said fixed various issues, and
you posted this in direct response to the bug report I gave you.

> For kdevops now, we borrow the same last linux-next baseline:
>
> git grep "generic/091" workflows/fstests/expunges/6.6.0-rc2-large-block-linus-nobdev
> workflows/fstests/expunges/6.6.0-rc2-large-block-linus-nobdev/xfs/unassigned/xfs_reflink_1024.txt:generic/091 # possible regression
> workflows/fstests/expunges/6.6.0-rc2-large-block-linus-nobdev/xfs/unassigned/xfs_reflink_16k.txt:generic/091
> workflows/fstests/expunges/6.6.0-rc2-large-block-linus-nobdev/xfs/unassigned/xfs_reflink_32k.txt:generic/091
> workflows/fstests/expunges/6.6.0-rc2-large-block-linus-nobdev/xfs/unassigned/xfs_reflink_64k_4ks.txt:generic/091
>
> So well, we already know this fails.

*cough*

-You- know it already fails.

And you are expecting people who try the code to somehow know
that you've explicitly ignored this fsx failure, especially after
all your words to tell us how much fsx testing it has passed?

And that's kinda my point - you're effusing about how much fsx
testing this has passed, yet it istill fails after just a handful of
ops in generic/091. The dissonance could break windows...

----

Fundamentally, when it comes to data integrity, it important to
exercise as much of the operational application space as quickly as
possible as it is that breadth of variation in operations that
flushes out more bugs and helps stabilises the code faster.

Why do you think we talk about the massive test matrix most
filesytsems have and how long it takes to iterate so much? It's
because iterating that complex test matrix is how we find all the
whacky, weird bugs in the code.

Concentrating on a single test configuration and running it over and
over again won't find bugs in code it doesn't exercise no matter how
long it is run for. Running such a setup in an automated environment
doesn't mean you get better code coverage, it just means you cover
the same narrow set of corner cases faster and more times. If it
works once, it should work a million times. Iterating it a billion
more times doesn't tell us anything additional, either.

Put simply: performing deep, homogenous testing on code that has known
data corruption bugs outside the narrow scope of the test case is
not telling us anything useful about the overall state of the code.
Indeed, turning off failing tests that are critical to validating the
correct operation of the code you are modifying is bad practice.

For code changes like this, all fsx testing in fstests should pass
before you post anything for review - even for an RFC. There is no
point reviewing code that doesn't work properly, nor wasting
people's time by encouraging them to test it when it's clear to you
that it's going to fail in various important ways.

Hence I think your testing is focussing on the wrong things and I
suspect that you've misunderstood the statements of "we'll need
billions of fsx ops to test this code" that various people have made
really meant. You've elevated running billions of fsx ops to your
primary "it works" gating condition, at the expense of making sure
all the other parts of the filesystem still work correctly.

The reality is that the returns from fsx diminish as the number of
ops go up. Once you've run the first hundred million fsx ops for a
given operations set, the chance that the next 100M ops will find a
new problem is -greatly- reduced. The vast majority of problems will
be found in the first 10M ops that are run in any given fsx
operation, and few bugs are found beyond the 100M mark. Yes, we
occasionally find one up in the billions, but that's rare and most
definitely not somethign to focus on when still developing RFC level
code.

Different fsx configurations change the operation set that is run -
mixing DIO reads with buffered writes, turning mmap on and off,
using AIO or io_uring rather than synchronous IO, etc. These all
exercise different code paths and corner cases and have vastly
different code interactions, and that is what we need to cover when
developing new code.

IOWs, we need coverage of the *entire operation space*, not just the
same narrow set of operations run billions of time. A wide focus
requires billions of ops to cover because it requires lots of
different application configurations to be run. In constrast, there
are only three fs configurations that matter: bs < PS, bs == PS and
bs > PS.

For example, 16kB, 32kB and 64kB filesystem configs exercise exactly
the same code paths in exactly the same way (e.g. both have non-zero
miniumum folio orders but only differ by what that order is). Hence
running the same test application configs on these different
filessytem configurations does actually not improve code coverage of
the testing at all. Testing all of them only increases the resources
required to the test a change, it does not improve the quality of
coverage of the testing being performed at all....

Hence I'd strongly suggest that, for the next posting of these
cahnge, you focus on making fstests pass without turning off any
failing tests, and that fsx is run with a wide variety of
configurations (e.g. modify all the fstests cases to run for a
configurable number of ops (e.g. via SOAK_DURATION)). We just don't
care at this point about finding that 1 in 10^15 ops bug because
it's code in development; what we actually care about is that
-everything- works correctly for the vast majority of use cases....

-Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx