Re: [PATCH] ipc/mqueue.c: change __do_notify() to bypass check_kill_permission()

From: Oleg Nesterov
Date: Tue Mar 24 2020 - 06:35:39 EST


On 03/23, Eric W. Biederman wrote:
>
> Oleg Nesterov <oleg@xxxxxxxxxx> writes:
>
> > And this is what we had before cc731525f26a, so this patch tries to fix
> > the regression.
>
> I was intending to suggest a new function that took a pid and did not do
> the permission checks.

Can't we do this on top of this patch? I want a trivially backportable fix
for -stable.

And who else can use this helper? And what exactly it should do? Should it
be called with rcu lock held? Should it check sig != 0? Name?

> Looking a little further I think there is a reasonbly strong argument
> that this code should be using send_sigqueue with a preallocated signal,
> just like timers do.

Hmm. How can mqueue use SIGQUEUE_PREALLOC/send_sigqueue ? The timer can
pre-allocate sigqueue because it won't be used again until dequeued by
its single target, otherwise it just increments overrun.

mqueue can't. Just suppose that the user of mq_notify() blocks this signal,
then another task does mq_notify() and then __do_notify() is called again.

> > Oh, can we discuss the possible cleanups separately? On top of this fix,
> > if possible.
>
> I was saying that from my perspective your proposed fix appears to make
> the code more of a mess, and harder to maintain.

I don't really agree, but I won't argue.

> >> Looking at the code I currently see several places where we have this
> >> kind of semantic (sending a requested signal to a process from the
> >> context of another process): do_notify_parent, pdeath_signal, f_setown,
> >> and mq_notify.
> >
> > To me they all differ, I am not sure I understand how exactly you want
> > to unify them...
>
> The common thread is they all are requested by the receiver of the
> signal (so don't need permission checks) and thus need to be canceled by
> appropriate versions of exec.

Yes. Yet I think they all differ, in particular in how they handle the
exec case. So I still do not understand a) how you are going to unify this
logic and b) why should we do this _before_ the fix.

> > I can easily misread this code, never looked into ipc/mqueue.c before.
> > But it seems that it is not possible to send a signal after exec, suid
> > or not,
> >
> > - sys_mq_open() uses O_CLOEXEC
> >
> > - mqueue_flush_file() does
> >
> > if (task_tgid(current) == info->notify_owner)
> > remove_notification(info);
>
> That is weird. It looks like we are attempt to handle file descriptor
> passing. The unix98 description of exec says all mq file descriptors
> shall be closed, but I can't find a word about file descriptor passing
> with af_unix sockets.

Not sure I understand how this connects to fd-passing...

What I tried to say is that mqueue_fd will be closed on exec. This is not
not enough, but mqueue_flush_file==mqueue_file_operations->flush called
by filp_close() will remove the notification to ensure the signal can't
be sent after that.

> > I know absolutely nothing about ipc/mqueue, and when I read this code
> > or manpage I find the semantics of mq_notify is very strange.
>
> Well at least a small comment please that says the code started
> performing the permission check and userspace code regressed.

Ah, OK, agreed, I'll add a comment.

> Perhaps with the description of the userspace code in the commit log.
> Perhaps a test case?

OK, how about

static int notified;

static void sigh(int sig)
{
notified = 1;
}

int main(void)
{
signal(SIGIO, sigh);

int fd = mq_open("/mq", O_RDWR|O_CREAT, 0666, NULL);
assert(fd >= 0);

struct sigevent se = {
.sigev_notify = SIGEV_SIGNAL,
.sigev_signo = SIGIO,
};
assert(mq_notify(fd, &se) == 0);

if (!fork()) {
setuid(1);
mq_send(fd, "",1,0);
return 0;
}

wait(NULL);
mq_unlink("/mq");
assert(notified);
return 0;
}

? Needs root.

Just in case... it passes on my machine running 4.2.8-300.fc23.x86_64, fails
with upstream kernel, the patch seems to work and looks very simple.

Oleg.