Re: [RFC] coredump: Do not interrupt dump for TIF_NOTIFY_SIGNAL

From: Eric W. Biederman
Date: Wed Jun 09 2021 - 16:48:45 EST


Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> writes:

> On Wed, Jun 9, 2021 at 1:17 PM Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
>>
>> In short the coredump code deliberately supports being interrupted by
>> SIGKILL, and depends upon prepare_signal to filter out all other
>> signals.
>
> Hmm.
>
> I have to say, that looks like the core reason for the bug: if you
> want to be interrupted by a fatal signal, you shouldn't use
> signal_pending(), you should use fatal_signal_pending().
>
> Now, the fact that we haven't cleared TIF_NOTIFY_SIGNAL for the first
> signal is clearly the immediate cause of this, but at the same time I
> really get the feeling that that coredump aborting code should always
> have used fatal_signal_pending().
>
> We do want to be able to abort core-dumps (stuck network filesystems
> is the traditional reason), but the fact that it used signal_pending()
> looks buggy.
>
> In fact, the very comment in that dump_interrupted() function seems to
> acknowledge that signal_pending() is all kinds of silly.
>
> So regardless of the fact that io_uring does seem to have messed up
> this part of signals, I think the fix is not to change
> signal_pending() to task_sigpending(), but to just do what the comment
> suggests we should do.

It looks like it would need to be:

static bool dump_interrupted(void)
{
	return fatal_signal_pending(current) || freezing(current);
}

The freezing() test is needed because the original implementation of
dump_interrupted(), commit 528f827ee0bb ("coredump: introduce
dump_interrupted()"), deliberately allows the freezer to terminate core
dumps so that system suspend stays reliable.
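
To make the difference between the two checks concrete, here is a small
userspace model. It is only a sketch and not kernel code: struct
model_task and every model_* function are hypothetical names, and each
boolean merely stands in for the kernel state named in its comment. It
assumes signal_pending() is true for either TIF_NOTIFY_SIGNAL or a
pending signal, and that fatal_signal_pending() is true only when a
pending signal is SIGKILL.

```c
#include <stdbool.h>

/*
 * Userspace model only, NOT kernel code. Every name here is hypothetical;
 * each flag stands in for the kernel state named in its comment.
 */
struct model_task {
	bool notify_signal;  /* stands in for TIF_NOTIFY_SIGNAL */
	bool sigpending;     /* stands in for TIF_SIGPENDING */
	bool sigkill_queued; /* stands in for SIGKILL in the pending set */
	bool freezing;       /* stands in for freezing(task) being true */
};

/* Models task_sigpending(): real signal state only. */
static bool model_task_sigpending(const struct model_task *t)
{
	return t->sigpending;
}

/* Models signal_pending(): also true when task_work was requested. */
static bool model_signal_pending(const struct model_task *t)
{
	return t->notify_signal || model_task_sigpending(t);
}

/* Models fatal_signal_pending(): a signal is pending and it is SIGKILL. */
static bool model_fatal_signal_pending(const struct model_task *t)
{
	return model_task_sigpending(t) && t->sigkill_queued;
}

/* The current check: anything that makes signal_pending() true aborts. */
static bool model_dump_interrupted_old(const struct model_task *t)
{
	return model_signal_pending(t);
}

/* The check sketched above: only SIGKILL or the freezer aborts the dump. */
static bool model_dump_interrupted_new(const struct model_task *t)
{
	return model_fatal_signal_pending(t) || t->freezing;
}
```

In this model a task with only notify_signal set is reported as
interrupted by the old check but not by the new one, while a task with
both sigpending and sigkill_queued set is reported by both.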

>
> But also:
>
>> With the io_uring code comes an extra test in signal_pending
>> for TIF_NOTIFY_SIGNAL (which is something about asking a task to run
>> task_work_run).
>
> Jens, is this still relevant? Maybe we can revert that whole series
> now, and make the confusing difference between signal_pending() and
> task_sigpending() go away again?
>
> Linus