Re: silent semantic changes with reiser4

From: Jamie Lokier
Date: Wed Aug 25 2004 - 20:08:36 EST

Next message: Chris Wright: "Re: Using fs views to isolate untrusted processes: I need an assistant architect in the USA for Phase I of a DARPA funded linux kernel project"
Previous message: David S. Miller: "Re: TG3 doesn't work in kernel 2.4.27 (David S. Miller)"
In reply to: viro: "Re: silent semantic changes with reiser4"
Next in thread: viro: "Re: silent semantic changes with reiser4"

viro@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx wrote:
> > Is this a problem if we treat entering a file-as-directory as crossing
> > a mount point (i.e. like auto-mounting)?
>
> Yes - mountpoints can't be e.g. unlinked. Moreover, having directory
> mounted on non-directory is also an interesting situation.

Ok, so can we make it so mountpoints can be unlinked? :)

The mount would continue to exist, but with no name, until its last
user disappears.

> > Simply doing a path walk would lock the file and then cross the mount
> > point to a directory.
>
> *Ugh*
>
> What would happen if you open that directory or chdir there? If it's
> "underlying file stays locked" - we are in even more obvious deadlocks.

I think the underlying file does not stay locked, and once you've
entered it as a directory, it can be unlinked.

If you have the directory open or chdir into it, then it _may_ have
the effect of keeping the file's storage allocated when you unlink it
-- just like when a file is unlinked while opened. As that is not a
user-visible property, it's a filesystem-specific implementation
detail as to whether it keeps the file's data in existence while the
metadata directories and/or files are open.
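
For reference, the ordinary unlink-while-open behaviour that this is
compared to can be seen from userspace. This is only a minimal sketch,
assuming a POSIX system (some other platforms refuse to unlink an open
file); the helper name read_after_unlink is made up for illustration
and has nothing to do with reiser4 itself.

```python
import os
import tempfile


def read_after_unlink(payload: bytes) -> bytes:
    """Write payload to a temp file, unlink it while it is still open,
    then read the data back through the surviving descriptor.

    Hypothetical helper for illustration; assumes POSIX unlink semantics.
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.unlink(path)               # the name is gone from the directory
        os.lseek(fd, 0, os.SEEK_SET)  # the open descriptor still reaches the data
        return os.read(fd, len(payload))
    finally:
        os.close(fd)                  # storage can be released only after this
```

The data stays readable until the close, which is the same "kept
allocated while something holds it" effect described above for an open
or chdir'd metadata directory.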

> > A way to ensure that preserves the lock order is to require that the
> > metadata is in a different filesystem to its file (i.e. not crossing a
> > bind mount to the same filesystem).
> >
> > That has the side effect of preventing hard links between metadata
> > files and non-metadata, which in my opinion is fine.
>
> We don't actually need a different fs - different vfsmount will do just fine.

> > The strict order is ensured by preventing bind mounts which create a
> > path cycle containing a file->metadata edge. One way to ensure that
> > is to prevent mounts on the metadata filesystems, but the rule doesn't
> > have to be that strict. This condition only needs to be checked in
> > the mount() syscall.
>
> You really don't want to lock mountpoint on path lookup, so I don't see
> how that would be relevant - it's a hell to clean up, for one thing
> (I've crossed ten mountpoints on the way, when do I unlock them and
> how do I prevent deadlocks from that?) Besides, different namespaces
> can have completely different mount trees, so tracking down all that
> stuff would be hell in its own right.

I didn't mean locking a chain of mountpoints, I meant the temporary
state where two dentries and/or inodes are locked, parent and child,
during a path walk. However I'm not very familiar with that part of
the VFS and I see that the current RCU dcache might not lock that much
during a path walk.
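
To make the quoted rule concrete (no bind mount may create a path cycle
containing a file->metadata edge), here is a rough sketch of the check
as it might be done at mount() time. It is a toy model, not VFS code:
the MountGraph class, its method names and the string node labels are
all assumptions made for illustration, and it ignores per-namespace
mount trees entirely.

```python
from collections import deque


class MountGraph:
    """Toy model: nodes are mounts, edges say "a path can go from A to B".

    Each edge is flagged as either an ordinary (bind) edge or a
    file->metadata edge. All names here are hypothetical.
    """

    def __init__(self):
        self.edges = {}  # node -> list of (target, is_meta)

    def would_create_bad_cycle(self, src, dst, is_meta=False):
        """True if adding src->dst closes a cycle that includes a
        file->metadata edge. Assumes the existing graph has no such cycle."""
        start = (dst, is_meta)
        seen = {start}
        queue = deque([start])
        while queue:
            node, crossed_meta = queue.popleft()
            if node == src and crossed_meta:
                return True  # got back to src having crossed a metadata edge
            for nxt, edge_is_meta in self.edges.get(node, []):
                state = (nxt, crossed_meta or edge_is_meta)
                if state not in seen:
                    seen.add(state)
                    queue.append(state)
        return False

    def add_edge(self, src, dst, is_meta=False):
        """Add an edge, refusing it if it would break the ordering rule."""
        if self.would_create_bad_cycle(src, dst, is_meta):
            raise ValueError("cycle through a file->metadata edge")
        self.edges.setdefault(src, []).append((dst, is_meta))
```

In this model a plain bind-mount cycle is still allowed; only a cycle
that passes through a file->metadata edge is refused, which is the
weaker form of the rule (as opposed to forbidding all mounts on the
metadata filesystems).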

> The main issue I see with all schemes in that direction (and something
> like that could be made workable) is the semantics of unlink() on
> mountpoints. *Especially* with users being able to see attributes of
> files they do not own (e.g. reiser4 mode/uid/gid stuff). Ability to
> pin down any damn file on the system and make it impossible to replace
> is not something you want to give to any user.

I agree, users shouldn't be able to pin down a file.

I think unlink() should succeed on a file while something is visiting
inside its metadata directory.

It's a filesystem quality-of-implementation feature whether that
actually releases the file's data. It's a desirable feature because
one user shouldn't be able to pin another user's quota'd data if they
don't have permission to open the file, but if it's not implemented by
a filesystem then it doesn't break anything fundamental.

It's a semantics question whether unlinking a file makes the metadata
(i.e. "uid", "mode", "content-type" etc.) disappear at the same time,
or if the metadata stays around until the last visitor leaves it. A
filesystem might be able to keep the metadata in existence even if it
deletes the file's storage on unlink(), but it would be nice for the
VFS to declare which semantic is preferred.

One of the big potential uses for file-as-directory is to go inside
archive files, ELF files, .iso files and so on in a convenient way.
In those cases, if you open one of the virtually generated "archive
content" files, then you might expect the data to continue to exist
after the underlying file is unlinked. I think that's reasonable:
being inside an archive is very similar to having it open. There is
no quota pinning problem with this, because "archive content" files
should inherit permission restrictions from the underlying file. If
you can't read the file, then you can't read its unpacked contents.

(reiser4 doesn't offer that last feature, but any of the myriad
userspace filesystem hooks could offer it if the VFS has appropriate
auto-mounting file-as-directory hooks).

-- Jamie
