Re: [PATCH RESEND 2/5] seccomp: Add wait_killable semantic to seccomp user notifier

From: Tycho Andersen
Date: Wed Apr 28 2021 - 10:09:31 EST


On Wed, Apr 28, 2021 at 03:20:02PM +0200, Rodrigo Campos wrote:
> On Wed, Apr 28, 2021 at 1:10 PM Rodrigo Campos <rodrigo@xxxxxxxxxx> wrote:
> >
> > On Wed, Apr 28, 2021 at 2:22 AM Tycho Andersen <tycho@tycho.pizza> wrote:
> > >
> > > On Tue, Apr 27, 2021 at 04:19:54PM -0700, Andy Lutomirski wrote:
> > > > > User notifiers should allow correct emulation. Right now, they don't,
> > > > > but there is no reason they can't.
> > >
> > > Thanks for the explanation.
> > >
> > > Consider fsmount, which has a,
> > >
> > > > 	ret = mutex_lock_interruptible(&fc->uapi_mutex);
> > > > 	if (ret < 0)
> > > > 		goto err_fsfd;
> > >
> > > > If a regular task is interrupted during that wait, it returns -EINTR
> > > > or whatever back to userspace.
> > >
> > > Suppose that we intercept fsmount. The supervisor decides the mount is
> > > OK, does the fsmount, injects the mount fd into the container, and
> > > then the tracee receives a signal. At this point, the mount fd is
> > > visible inside the container. The supervisor gets a notification about
> > > the signal and revokes the mount fd, but there was some time where it
> > > was exposed in the container, whereas with the interrupt in the native
> > > syscall there was never any exposure.
> >
> > IIUC, this is solved by my patch, patch 4 of the series. The
> > supervisor should do the addfd with the flag added in that patch
> > (SECCOMP_ADDFD_FLAG_SEND) for an atomic "addfd + send".
>
> Well, under Andy's proposal handling that is even simpler. If the
> signal is delivered after we added the fd (note that the container
> syscall does not return when the signal arrives, as it happens today,
> it just signals the notifier and continues to wait), we can just
> ignore the signal and return that (if that is the appropriate thing
> for that syscall, but I guess after adding an fd there isn't any other
> reasonable thing to do).

Yes, agreed. After thinking about this more, my example is bogus: the
kernel doesn't sleep after it installs the fd, so it would ignore any
signals too.

Even if the kernel *did* sleep after installing the fd, it would still
be correct emulation to install it and then do whatever the kernel did
during that sleep. So I withdraw my objection :)

Thanks,

Tycho
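
For readers following the thread, the argument above can be sketched as a
toy model in C. This is not kernel or supervisor code: the enum, the struct
and emulate_fsmount() are hypothetical names made up for illustration, and
the only assumption carried over from the discussion is that the
interruptible wait happens before the fd is installed and that nothing
sleeps afterwards.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model: when, relative to fd installation, a signal arrives. */
enum signal_timing {
	WHEN_NONE,           /* no signal at all */
	WHEN_BEFORE_INSTALL, /* signal during the interruptible wait */
	WHEN_AFTER_INSTALL,  /* signal after the fd is already installed */
};

struct emu_result {
	long ret;          /* what the intercepted syscall returns to the tracee */
	bool fd_installed; /* whether the fd ever became visible in the tracee */
};

/*
 * Model of an emulated fsmount-like syscall. The early return stands in
 * for mutex_lock_interruptible() failing; past that point there is no
 * further sleep, so a later signal cannot change the outcome.
 */
static struct emu_result emulate_fsmount(enum signal_timing when, int new_fd)
{
	struct emu_result r = { .ret = -EINTR, .fd_installed = false };

	/* Interrupted while waiting: fail without ever exposing the fd. */
	if (when == WHEN_BEFORE_INSTALL)
		return r;

	/* Commit point: install the fd and return it, ignoring late signals. */
	r.fd_installed = true;
	r.ret = new_fd;
	return r;
}
```

In this model, emulate_fsmount(WHEN_AFTER_INSTALL, fd) and
emulate_fsmount(WHEN_NONE, fd) give the same result, which is the point
made above: once the fd is installed, ignoring the signal matches what the
native syscall would do.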