RE: [PATCH v2] signal: Adjust error codes according to restore_user_sigmask()

From: David Laight
Date: Fri May 24 2019 - 06:01:32 EST


From: Deepa Dinamani
> Sent: 23 May 2019 19:07
> Ok, since there has been quite a bit of argument here, I will
> backtrack a little bit and maybe it will help us understand what's
> happening here.
> There are many scenarios being discussed on this thread:
> a. State of code before 854a6ed56839a
> b. State after 854a6ed56839a
> c. Proposed fix as per the patchset in question.
>
> Oleg, I will discuss these first and then we can discuss the
> additional changes you suggested.
>
> Some background on why we have these syscalls that take sigmask as an
> argument. This is just for the sake of completeness of the argument.
>
> These are particularly meant for a scenario(d) such as below:
>
> 1. block the signals you don't care about.
> 2. syscall()
> 3. unblock the signals blocked in 1.
>
> The problem here is that if there is a signal that is not blocked by 1
> and such a signal is delivered between 1 and 2(since they are not
> atomic), the syscall in 2 might block forever as it never found out
> about the signal.

I think we all agree about the underlying problem these system calls solve.

> As per [a] and let's consider the case of epoll_pwait only first for simplicity.
For simplicity you ought to consider sigwaitinfo() :-)

> As I said before, ep_poll() is what checks for signal_pending() and is
> responsible for setting errno to -EINTR when there is a signal.

Ah, there in lies the problem (well one of them).
ep_poll() should only return -EINTR if its sleep (waiting for an fd to
be ready) is interrupted.
The signal handler(s) should still be called though.
If the timeout is 0 then any signal handler should be called, but the
return value is still 0 (if no fd are 'ready').

> So if a signal is received after ep_poll() and ep_poll() returns
> success, it is never noticed by the syscall during execution.
> So the question is does the userspace have to know about this signal
> or not. From scenario [d] above, I would say it should, even if all
> the fd's completed successfully.

What is scenario [d]? You've code versions a/b/c but no scenarios.

> This does not happen in [a]. So this is what I said was already broken.
>
> What [b] does is to move the signal check closer to the restoration of
> the signal. This way it is good. So, if there is a signal after
> ep_poll() returns success, it is noticed and the signal is delivered
> when the syscall exits. But, the syscall error status itself is 0.

By 0 you mean >= 0 ??

> So now [c] is adjusting the return values based on whether extra
> signals were detected after ep_poll(). This part was needed even for
> [a].

IMHO The return value should never be changed.
Much like write() can return a partial length if a signal happens.

ISTM that a user signal handler should be scheduled to be run
if the signal is pending and not masked.
On return to user all scheduled signal handlers are called
(regardless as to whether the signal is masked at that time).
This might mean getting the 'return to user' code to restore
the original signal mask saved for epoll_pwait() and pselect() etc.

If, for some perverted reason (compatibility with broken apps),
you need epoll_pwait() to return EINTR instead of 0 the you
probably need a special 'kernel internal' errno value that
is always converted by the syscall exit code to EINTR/0
(and a second one that is EINTR/EAGAIN etc).

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)