In what sense? readlink of /proc/PID/fd/* will provide a pathname
relative to current's root: useless for any paths not in current's
namespace.
Not readlink, but actual dereference of link will give you exactly the
vfsmount/dentry the file was opened on. If you want to bind/move
whatever on that mount, that's possible, even if it's a "detached
tree".
Removing proc_check_root essentially removes any protection that namespaces provided in the first place.
Think of it like virtual memory. You can fork off and create your own COW instance, and no one can see what you are doing unless they ptrace you or explicitly ask you. The mountfd model allows for explicit handing off of vfsmounts between namespaces without allowing arbitrary access.
Note: I didn't say we should remove proc_check_root(). I said _if_
you remove it, _then_ mounting on foreign namespace will work, which
has actually been confirmed experimentally.
So your follow_link argument doesn't hold. The proc code does the
follow_link in some clever way, that the looked up object will end up
with the same vfsmount/dentry pair as the file.
Yet even this doesn't allow userspace to define it's own policy for inter-namespace manipulation.
OK. Let's keep the kernel policy simple: just allow a process to
access it's _own_ file descriptors in proc. I.e. allow access to
/proc/self/fd/* even if it comes from a foreign namespace.
Then the policy (who passes whom the file descriptors) is entirely up
to userspace (just as with your scheme).
/proc/self/fd/FD doesn't give any extra rights to the process, it just
makes mounting from/to detached mounts, and foreign namespaces
possible without new interfaces.
Beware that due to the detached-subtrees bit, the locking became a bit ugly, requiring a global rw_lock for mntget/mntput. I still haven't figured out a better way to keep per-vfsmounts counts and per-subtree counts in sync.
Did you measure the effect on performace? Maybe it isn't so bad.