Re: [GIT PULL] bcachefs

From: Kent Overstreet
Date: Fri Jul 07 2023 - 09:53:06 EST


On Fri, Jul 07, 2023 at 03:13:06PM +0200, Jan Kara wrote:
> On Thu 06-07-23 18:43:14, Kent Overstreet wrote:
> > On Thu, Jul 06, 2023 at 02:19:14PM -0700, Darrick J. Wong wrote:
> > > /me shrugs, been on vacation and in hospitals for the last month or so.
> > >
> > > > bcachefs doesn't use sget() for mutual exclusion because a) sget()
> > > > is insane, what we really want is the _block device_ to be opened
> > > > exclusively (which we do), and we have our own block device opening
> > > > path - which we need to, as we're a multi device filesystem.
> > >
> > > ...and isn't jan kara already messing with this anyway?
> >
> > The blkdev_get_handle() patchset? I like that, but I don't think that's
> > related - if there's something more related to sget() I haven't seen it
> > yet
>
> There's a series on top of that that also modifies how sget() works [1].
> Christian wants that bit to be merged separately from the bdev handle stuff
> and Christoph chimed in with some other related cleanups so he'll now take
> care of that change.
>
> Anyhow we should have sget() that does not exclusively claim the bdev
> unless it needs to create a new superblock soon.

Thanks for the link

sget() felt a bit odd in bcachefs because we have our own bch2_fs_open()
path that, completely separately from the VFS opens a list of block
devices and returns a fully constructed filesystem handle.

We need this because it's also used in userspace, where we don't have
the VFS and it wouldn't make much sense to lift sget(), for e.g. fsck
and other tools.

IOW, we really do need to own the whole codepath that opens the actual
block devices; our block device open path does things like parse the
opts struct to decide whether to open the block device in write mode or
exclusive mode...

So the way around this in bcachefs is we call sget() twice - first in
"find an existing sb but don't create one" mode, then if that fails we
call bch2_fs_open() and call sget() again in "create a super_block and
attach it to this bch_fs" - a bit awkward but it works.

Not sure if this has come up in other filesystems, but here's the
relevant bcachefs code:
https://evilpiepirate.org/git/bcachefs.git/tree/fs/bcachefs/fs.c#n1756