Re: [GIT PULL] bcachefs updates for 6.8

From: Kent Overstreet
Date: Thu Jan 11 2024 - 12:39:34 EST


On Thu, Jan 11, 2024 at 03:35:40PM +0000, Mark Brown wrote:
> On Wed, Jan 10, 2024 at 07:58:20PM -0500, Kent Overstreet wrote:
> > On Wed, Jan 10, 2024 at 04:39:22PM -0800, Kees Cook wrote:
>
> > > With no central CI, the best we've got is everyone running the same
> > > "minimum set" of checks. I'm most familiar with netdev's CI which has
> > > such things (and checkpatch.pl is included). For example see:
> > > https://patchwork.kernel.org/project/netdevbpf/patch/20240110110451.5473-3-ptikhomirov@xxxxxxxxxxxxx/
>
> > Yeah, we badly need a central/common CI. I've been making noises that my
> > own thing could be a good basis for that - e.g. it shouldn't be much
> > work to use it for running our tests in tools/testing/selftests. Sadly no
> > time for that myself, but happy to talk about it if someone does start
> > leading/coordinating that effort.
>
> IME actually running the tests isn't usually *so* much the issue;
> someone making a new test runner and/or output format does mean a bit
> of work integrating it into infrastructure, but that's usually more
> annoying than a blocker.

No, the proliferation of test runners, test output formats, CI systems,
etc. really is an issue: it means we can't have one common driver that
anyone can run from the command line. Instead there's a bunch of
disparate systems with patchwork integration, where all the feedback is
nag emails that arrive after you've finished what you were working on
and moved on to the next thing - no way to get immediate feedback.

And it's because building something shiny and new is the fun part; no
one wants to do the grungy integration work.
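
To be concrete about the output format half of that: kselftest already
has KTAP as a nominal common format, and that's the sort of thing a
shared driver could emit and a shared dashboard could parse. A sketch
from memory - the test names here are made up for illustration:

  KTAP version 1
  1..2
  ok 1 bcachefs.single_device.fsstress
  # stress_ng: timed out after 7200s
  not ok 2 bcachefs.single_device.stress_ng

If every runner spoke something like that, the aggregation problem
described below gets a lot more tractable.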

> Issues tend to be more around arranging to
> drive the relevant test systems, figuring out which tests to run where
> (including things like figuring out capacity on test devices, or how
> long you're prepared to wait in interactive usage) and getting the
> environment on the target devices into a state where the tests can run.
> Plus any stability issues with the tests themselves of course, and
> there's a bunch of costs somewhere along the line.
>
> I suspect we're more likely to get traction with aggregating test
> results and trying to do UI/reporting on top of that than with the
> running-things bit; that really would be very good to have. I've copied
> in Nikolai, whose work on kcidb is the main thing I'm aware of there,
> though at the minute operational issues mean it's a bit write-only.
>
> > example tests, example output:
> > https://evilpiepirate.org/git/ktest.git/tree/tests/bcachefs/single_device.ktest
> > https://evilpiepirate.org/~testdashboard/ci?branch=bcachefs-testing
>
> For example, looking at the sample test there, it looks like it needs,
> among other things: mkfs.btrfs, bcachefs, stress-ng, xfs_io, fio, mdadm,
> rsync

Getting all that set up by the end user is one command:

  ktest/root_image create

and running a test is one more command:

  build-test-kernel run ~/ktest/tests/bcachefs/single_device.ktest
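
For a sense of what a test looks like: a .ktest file is just a shell
script the harness sources - it declares what it needs from the harness
and defines test_* functions. An illustrative sketch only; the directive
names below are from memory and may not match the real
single_device.ktest exactly:

  #!/usr/bin/env bash

  # Illustrative sketch, not verbatim ktest:
  require-kernel-config BCACHEFS_FS
  config-scratch-devs 8G

  test_mount_umount()
  {
      bcachefs format /dev/sdb
      mount -t bcachefs /dev/sdb /mnt
      umount /mnt
  }

The runner then builds the kernel, boots a VM with the scratch devices
attached, and runs each test_* function.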

> and a reasonably performant disk with 40G of space available.
> None of that is especially unreasonable for a filesystems test, but
> it's all things that we need to get onto the system where we want to
> run the test, and there's a lot of systems where the storage
> requirements would be unsustainable for one reason or another. It also
> appears to take about 33000s (roughly nine hours) to run on whatever
> system you use, which is distinctly non-trivial.

Getting sufficient coverage in filesystem land does take some amount of
resources, but it's not so bad: I'm leasing 80-core ARM64 machines from
Hetzner for $250/month and running 10 test VMs per machine - about $25
per VM per month - so it's really not that expensive. Other subsystems
would probably be fine with fewer resources.