Re: [PATCH v4 0/5] userfaultfd: add /dev/userfaultfd for fine grained access control

From: Axel Rasmussen
Date: Mon Aug 01 2022 - 18:51:39 EST


On Mon, Aug 1, 2022 at 12:53 PM Nadav Amit <namit@xxxxxxxxxx> wrote:
>
> On Aug 1, 2022, at 10:13 AM, Axel Rasmussen <axelrasmussen@xxxxxxxxxx> wrote:
>
> > I finished up some other work and got around to writing a v5 today,
> > but I ran into a problem with /proc/[pid]/userfaultfd.
> >
> > Files in /proc/[pid]/* are owned by the user/group which started the
> > process, and they don't support being chmod'ed.
> >
> > For the userfaultfd device, I think we want the following semantics:
> > - For UFFDs created via the device, we want to always allow handling
> > kernel mode faults
> > - For security, the device should be owned by root:root by default, so
> > unprivileged users don't have default access to handle kernel faults
> > - But, the system administrator should be able to chown/chmod it, to
> > grant access to handling kernel faults for this process more widely.
> >
> > It could be made to work like that but I think it would involve at least:
> >
> > - Special casing userfaultfd in proc_pid_make_inode
> > - Updating setattr/getattr for /proc/[pid] to meaningfully store and
> > then retrieve uid/gid different from the task's, again probably
> > special cased for userfaultfd since we don't want this behavior for
> > other files
> >
> > It seems to me such a change might raise eyebrows among procfs folks.
> > Before I spend the time to write this up, does this seem like
> > something that would obviously be nack'ed?
>
> [ Please avoid top-posting in the future ]

I will remember this. Gmail's default behavior is annoying. :/

>
> I have no interest in making your life harder than it should be. If you
> cannot find a suitable alternative, I will not fight against it.
>
> How about this alternative: follow the KVM usage model?
>
> IOW: You open /dev/userfaultfd, but this is not the file-descriptor that you
> use for most operations. Instead you first issue an ioctl - similarly to
> KVM_CREATE_VM - to get a file-descriptor for your specific process. You then
> use this new file-descriptor to perform your operations (read/ioctl/etc).
>
> This would make the fact that ioctls/reads from different processes refer to
> different contexts (i.e., file-descriptors) much more natural.
>
> Does it sound better?

Ah, that I think is more or less what my series already proposes, if I
understand you correctly.

The usage is:

/* This FD is only useful for creating new userfaultfds */
fd = open("/dev/userfaultfd", O_RDWR);
/* Now you get a real uffd */
uffd = ioctl(fd, USERFAULTFD_IOC_NEW);
/* No longer needed now that we have a real uffd */
close(fd);

/* Use uffd to register, COPY, CONTINUE, whatever */
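
To make the flow above concrete, here is a rough sketch of it as a C helper with error handling. The helper name uffd_from_device() and its uffd_flags parameter are made up for illustration. The fallback ioctl number is an assumption for headers that don't carry the definition yet (it is the value later mainline headers use), and passing a flags argument to the ioctl is likewise an assumption on top of the short form shown above.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/userfaultfd.h>

/*
 * Assumption: older UAPI headers do not define the ioctl proposed by
 * this series, so fall back to the value later mainline headers use.
 */
#ifndef USERFAULTFD_IOC_NEW
#define USERFAULTFD_IOC_NEW _IO(0xAA, 0x00)
#endif

/*
 * Hypothetical helper: open the device node, ask it for a real uffd,
 * then drop the device FD. Returns the uffd, or -1 with errno set.
 * uffd_flags is passed through as the ioctl argument (an assumption);
 * pass 0 to match the short form in the mail.
 */
int uffd_from_device(const char *dev_path, int uffd_flags)
{
	int fd, uffd, saved_errno;

	assert(dev_path != NULL);

	/* This FD is only useful for creating new userfaultfds. */
	fd = open(dev_path, O_RDWR);
	if (fd < 0)
		return -1;

	/* Now we get a real uffd (or -1 on failure). */
	uffd = ioctl(fd, USERFAULTFD_IOC_NEW, uffd_flags);

	/* Keep the ioctl's errno; close() may overwrite it. */
	saved_errno = errno;
	close(fd);
	errno = saved_errno;

	return uffd;
}
```

A caller would do something like uffd = uffd_from_device("/dev/userfaultfd", 0), and on -1 with EACCES or ENOENT conclude that the node is not accessible to it or does not exist on this kernel.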

One thing we could do now or in the future is extend
USERFAULTFD_IOC_NEW to take a pid as an argument, to support creating
uffds for remote processes.



And then we get the benefit of permissions for /dev nodes working very
naturally - they default to root, but can be configured by the
sysadmin via chown/chmod, or udev rules, or whatever.
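
As a rough illustration of how userspace could observe those permissions, the sketch below checks whether the caller may open a node read-write and formats its owner and mode. Both helper names are hypothetical and not part of the series; they only use access(2) and stat(2).

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if the caller may open `path`
 * read-write, 0 otherwise. Note that access(2) checks against the
 * real (not effective) UID/GID.
 */
int uffd_dev_usable(const char *path)
{
	assert(path != NULL);
	return access(path, R_OK | W_OK) == 0;
}

/*
 * Hypothetical helper: writes "uid=... gid=... mode=..." for the node
 * into buf. Returns 0 on success, -1 if stat(2) fails.
 */
int uffd_dev_describe(const char *path, char *buf, size_t len)
{
	struct stat st;

	assert(path != NULL && buf != NULL);

	if (stat(path, &st) != 0)
		return -1;

	snprintf(buf, len, "uid=%u gid=%u mode=%04o",
		 (unsigned int)st.st_uid, (unsigned int)st.st_gid,
		 (unsigned int)(st.st_mode & 07777));
	return 0;
}
```

With a root-owned default node, an unprivileged caller would see uffd_dev_usable("/dev/userfaultfd") return 0 until the sysadmin adjusts ownership or mode.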