Re: [rfc][possible solution] RCU vfsmounts

From: Linus Torvalds
Date: Sat Sep 28 2013 - 16:43:55 EST


On Sat, Sep 28, 2013 at 1:27 PM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> FWIW, I think I have a kinda-sorta solution for that and I'd like
> to hear your comments on that. I want to replace vfsmount_lock with seqlock
> and store additional seq number in nameidata, set to vfsmount_seq in the
> beginning and rechecked in unlazy_walk/complete_walk.

Yes, that would be lovely.

> The obvious variant would be to have unlazy_walk/complete_walk to
> grab refcount, check vfsmount_seq and mntput on mismatch. The trouble
> with that is race with what would've been the final mntput() done by
> umount(2); complete_walk() would drop that temporary reference and
> fail, all right, but... we would get a umount(2) returning without having
> actually shut the filesystem down. Said shutdown would happen in whoever
> had been doing pathname resolution that stepped into the race.

Sounds reasonable to me.
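
To make the quoted race concrete, here is a rough user-space sketch of the "grab refcount, recheck the sequence number, drop on mismatch" variant. It is only a model under stated assumptions: every name in it (mount_model, mount_seq, mnt_get, mnt_put, walk_begin, walk_complete, demo_race) is hypothetical and not a kernel API, and a single counter bump stands in for a real seqlock write section. demo_race() replays the bad interleaving in one thread so the ordering is deterministic.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model only; none of these names exist in the kernel. */
enum { BY_NOBODY, BY_UMOUNT, BY_WALKER };

struct mount_model {
    atomic_int refcount;   /* references held on the mount */
    bool shut_down;        /* set by whoever drops the last reference */
    int shut_down_by;      /* BY_UMOUNT or BY_WALKER once shut down */
};

/* Stands in for vfsmount_seq; bumped once per mount-tree change. */
static atomic_uint mount_seq;

static void mnt_get(struct mount_model *m)
{
    atomic_fetch_add(&m->refcount, 1);
}

/* Dropping the last reference is what shuts the filesystem down. */
static void mnt_put(struct mount_model *m, int who)
{
    if (atomic_fetch_sub(&m->refcount, 1) == 1) {
        m->shut_down = true;
        m->shut_down_by = who;
    }
}

/* Start of the lazy walk: remember the current sequence number. */
static unsigned walk_begin(void)
{
    return atomic_load(&mount_seq);
}

/* End of the lazy walk: take a reference, then recheck the sequence.
 * Returns true if the reference may be kept, false if the walk failed. */
static bool walk_complete(struct mount_model *m, unsigned seq)
{
    mnt_get(m);
    if (atomic_load(&mount_seq) == seq)
        return true;
    mnt_put(m, BY_WALKER);   /* temporary reference; may be the last one */
    return false;
}

/* Replays the race step by step and reports who did the shutdown. */
static int demo_race(void)
{
    struct mount_model m = { .shut_down = false, .shut_down_by = BY_NOBODY };
    atomic_init(&m.refcount, 1);          /* umount holds the last reference */
    atomic_store(&mount_seq, 0);

    unsigned seq = walk_begin();          /* walker samples the sequence */
    atomic_fetch_add(&mount_seq, 1);      /* umount detaches the mount */
    mnt_get(&m);                          /* walker takes its temporary ref */
    mnt_put(&m, BY_UMOUNT);               /* umount's "final" put: count is 1 */
    if (atomic_load(&mount_seq) != seq)   /* walker notices the mismatch... */
        mnt_put(&m, BY_WALKER);           /* ...and its put is the real last one */
    return m.shut_down_by;
}
```

In this model demo_race() ends with the walker, not umount, doing the final put, which is the "umount(2) returning without having actually shut the filesystem down" problem described above.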

Side note: I really wish there was some way to avoid having to
finalize the vfsmount entirely for some common things. For example,
"[l]stat[at]()" really doesn't need it for the common cases (network
filesystems may need to revalidate), and is a very critical operation,
and we *could* just look up the inode under RCU and never finalize the
dentry _or_ the vfsmount. However, very annoyingly, the security layer
wants the vfsmount, and we don't know if that is RCU-safe...

Linus