Re: [RFC] proc: don't show kernel-internal mnt_id in fdinfo (if possible)

From: Eric W. Biederman
Date: Mon Nov 07 2016 - 12:09:16 EST


Dmitry Safonov <dsafonov@xxxxxxxxxxxxx> writes:

> Some kernel special fs could be mounted by userspace, let's show
> userspace mnt_id in those cases.

You are asking the kernel to lie to userspace in the case when
the truth in inconvenient. That seems blantantly wrong.

In the case of an internal mount you may want to use the device id
of the device the filesystem is mounted on so you can tie all of
the mounts together. That would allow restore and other things
to do something useful.

That information is available with fstat on a file descriptor so I don't
think we need to export it. But if it is not available I can see
the point of a patch.

Outright ling to userpsace. No. That is just horrible.

Nacked-by: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>

>
> Otherwise, I got:
> [~]# cat /proc/11299/fdinfo/3
> pos: 0
> flags: 02000002
> mnt_id: 14
> [~]# cat /proc/11299/mountinfo | grep '^14'
> [~]# ls -l /proc/11299/fd/3
> lrwx------. 1 root root 64 Nov 7 18:30 /proc/11299/fd/3 -> /test-queue
> [~]# ls /dev/mqueue/
> test-queue
> [~]# cat /proc/11299/mountinfo | grep mqueue
> 32 18 0:14 / /dev/mqueue rw,relatime shared:17 - mqueue mqueue rw,seclabel
>
> This happens because mqueue fs is mounted twice:
> - the first is kernel-internal mnt on init:
> kernel_init=>do_one_initcall=>init_mqueue_fs=>mq_init_ns=>vfs_kern_mount
> - the second time it's systemd's mount-unit:
> entry_SYSCALL_64_fastpath=>SyS_mount=>do_mount=>vfs_kern_mount
>
> For the purpose of userspace parsing, having in-kernel mnt_id is less
> useful then mnt_id of mount point: afterwards I'm able to see fs type,
> path, etc:
> [~]# cat /proc/11152/mountinfo | grep mqueue
> 32 18 0:14 / /dev/mqueue rw,relatime shared:18 - mqueue mqueue rw,seclabel
> [~]# cat /proc/11152/fdinfo/3
> pos: 0
> flags: 02000002
> mnt_id: 32
>
> On a bad side - if we've no userspace mount, we still can't tell a thing
> about opened fd..
>
> Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
> Cc: Andrey Vagin <avagin@xxxxxxxxxx>
> Cc: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx>
> Cc: Rob Landley <rob@xxxxxxxxxxx>
> Signed-off-by: Dmitry Safonov <dsafonov@xxxxxxxxxxxxx>
> ---
> fs/proc/fd.c | 24 +++++++++++++++++++++---
> 1 file changed, 21 insertions(+), 3 deletions(-)
>
> diff --git a/fs/proc/fd.c b/fs/proc/fd.c
> index d21dafef3102..bfa8699bcd8e 100644
> --- a/fs/proc/fd.c
> +++ b/fs/proc/fd.c
> @@ -9,6 +9,7 @@
> #include <linux/file.h>
> #include <linux/seq_file.h>
> #include <linux/fs.h>
> +#include <linux/nsproxy.h>
>
> #include <linux/proc_fs.h>
>
> @@ -19,9 +20,10 @@
> static int seq_show(struct seq_file *m, void *v)
> {
> struct files_struct *files = NULL;
> - int f_flags = 0, ret = -ENOENT;
> + int f_flags = 0, ret = -ENOENT, mnt_id = 0;
> struct file *file = NULL;
> struct task_struct *task;
> + struct mount *mount;
>
> task = get_proc_task(m->private);
> if (!task)
> @@ -52,9 +54,25 @@ static int seq_show(struct seq_file *m, void *v)
> if (ret)
> return ret;
>
> + mount = real_mount(file->f_path.mnt);
> + if (mount->mnt_ns == MNT_NS_INTERNAL) {
> + struct mount *mnt;
> +
> + lock_mount_hash();
> + list_for_each_entry(mnt, &mount->mnt_instance, mnt_instance) {
> + if (current->nsproxy->mnt_ns == mnt->mnt_ns) {
> + mnt_id = mnt->mnt_id;
> + break;
> + }
> + }
> + unlock_mount_hash();
> + }
> +
> + if (mnt_id == 0)
> + mnt_id = mount->mnt_id;
> +
> seq_printf(m, "pos:\t%lli\nflags:\t0%o\nmnt_id:\t%i\n",
> - (long long)file->f_pos, f_flags,
> - real_mount(file->f_path.mnt)->mnt_id);
> + (long long)file->f_pos, f_flags, mnt_id);
>
> show_fd_locks(m, file, files);
> if (seq_has_overflowed(m))