Re: [LSF TOPIC] statx extensions for subvol/snapshot filesystems & more

From: Miklos Szeredi
Date: Thu Feb 22 2024 - 04:14:48 EST


On Wed, 21 Feb 2024 at 22:08, Josef Bacik <josef@xxxxxxxxxxxxxx> wrote:
>
> On Wed, Feb 21, 2024 at 04:06:34PM +0100, Miklos Szeredi wrote:
> > On Wed, 21 Feb 2024 at 01:51, Kent Overstreet <kent.overstreet@xxxxxxxxx> wrote:
> > >
> > > Recently we had a pretty long discussion on statx extensions, which
> > > eventually got a bit offtopic but nevertheless hashed out all the major
> > > issues.
> > >
> > > To summarize:
> > > - guaranteeing inode number uniqueness is becoming increasingly
> > > infeasible, we need a bit to tell userspace "inode number is not
> > > unique, use filehandle instead"
> >
> > This is a tough one. POSIX says "The st_ino and st_dev fields taken
> > together uniquely identify the file within the system."
> >
>
> Which is what btrfs has done forever, and we've gotten yelled at forever for
> doing it. We have a compromise and a way forward, but it's not a widely held
> view that changing st_dev to give uniqueness is an acceptable solution. It may
> have been for overlayfs because you guys are already doing something special,
> but it's not an option that is afforded the rest of us.

Overlayfs tries hard not to use st_dev to give uniqueness and instead
partitions the 64-bit st_ino space within the same st_dev. There are
various fallback cases, some of which involve switching st_dev and
some of which use non-persistent st_ino.
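
To illustrate what partitioning the st_ino space can look like, here is
a minimal sketch. It is not the overlayfs code; partition_ino(), its
parameters and the choice of reserving the top bits are all assumptions
made for the example. The idea is to fold a small per-filesystem index
into the high bits of the inode number, and to report failure when the
underlying inode number already uses those bits, which is where a
fallback would have to kick in.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only (not taken from overlayfs).
 *
 * Reserve the top `bits` bits of a 64-bit inode number for a small
 * per-filesystem index, so that inodes from several underlying
 * filesystems can share one st_dev without colliding.
 *
 * Returns false if the partitioning cannot be done, i.e. the underlying
 * inode number already uses the reserved bits or the index does not
 * fit. The caller then needs a fallback.
 */
bool partition_ino(uint64_t ino, uint64_t fs_index, unsigned int bits,
                   uint64_t *out)
{
    unsigned int shift;

    if (bits == 0 || bits >= 64)
        return false;

    shift = 64 - bits;

    /* Underlying filesystem already uses the reserved high bits. */
    if (ino >> shift)
        return false;

    /* Index does not fit into the reserved bits. */
    if (fs_index >> bits)
        return false;

    *out = ino | (fs_index << shift);
    return true;
}
```

With 8 reserved bits, inode 42 on filesystem index 1 maps to
(1 << 56) | 42, while an inode number that already has any of the top
8 bits set is rejected.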

What overlayfs does may or may not be applicable to btrfs/bcachefs,
but that's not my point. My point is that adding a flag to statx does
not solve anything. You can't just say that from now on btrfs
doesn't have to use unique st_ino/st_dev because we've just indicated
that in statx and everything is fine. That will trigger the
no-regressions rule and then it's game over. At least I would expect
that to happen.

What we can do instead is introduce a new API that is better, and
thankfully we already have one in the form of file handles. The
problem I see is that you think you can then get away with reverting
st_dev to be uniform across subvolumes. But you can't. I see
two options:

a) do some hacks, like overlayfs does

b) introduce a new "st_dev_v2" that will do the right thing and
applications can move over.
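
For reference, the file handle API mentioned above is already usable
from userspace today. The following is a rough sketch of comparing two
paths by file handle with name_to_handle_at(2) instead of by
st_dev/st_ino. The function name same_file_by_handle() is made up for
the example, and using the mount ID as a stand-in for "same filesystem"
is a simplifying assumption: the same file reached through two
different mounts will compare as different here.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Returns 1 if both paths yield the same file handle on the same mount,
 * 0 if they differ, and -1 on error (for example when the filesystem
 * does not support file handles). Symlinks are not followed because
 * AT_SYMLINK_FOLLOW is not passed.
 */
int same_file_by_handle(const char *a, const char *b)
{
    struct file_handle *ha, *hb;
    int mnt_a, mnt_b;
    int ret = -1;

    ha = malloc(sizeof(*ha) + MAX_HANDLE_SZ);
    hb = malloc(sizeof(*hb) + MAX_HANDLE_SZ);
    if (!ha || !hb)
        goto out;

    /* Tell the kernel how much room f_handle[] has. */
    ha->handle_bytes = MAX_HANDLE_SZ;
    hb->handle_bytes = MAX_HANDLE_SZ;

    if (name_to_handle_at(AT_FDCWD, a, ha, &mnt_a, 0) < 0 ||
        name_to_handle_at(AT_FDCWD, b, hb, &mnt_b, 0) < 0)
        goto out;

    /* Handles are opaque: compare type, length and raw bytes. */
    ret = mnt_a == mnt_b &&
          ha->handle_type == hb->handle_type &&
          ha->handle_bytes == hb->handle_bytes &&
          memcmp(ha->f_handle, hb->f_handle, ha->handle_bytes) == 0;
out:
    free(ha);
    free(hb);
    return ret;
}
```

A caller has to treat -1 as "unknown" and decide for itself whether to
fall back to comparing st_dev/st_ino in that case.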

Thanks,
Miklos