Re: [PATCH 0/1] *** Fix kill(-1,s) returning 0 on 0 kills ***

From: Eric W. Biederman
Date: Fri Aug 11 2023 - 17:27:56 EST


Petr Skocik <pskocik@xxxxxxxxx> writes:

> Thanks for the detailed analysis, Eric W. Biederman.
>
> All my software really cares about is that I get some indication that a
> kill(-1,s) run from a non-root pid no longer had anything left to kill,
> which on Linux is currently being masked by a return value of 0 whereas BSDs
> nicely provide an ESRCH. -EPERM would work too (and would still be more useful
> to me than the current behavior), but I will still object to it because I'm
> convinced you're misreading POSIX here and ESRCH, not EPERM, is the error that
> should be returned here.

Thank you for saying any error return is good enough for your
application. It is definitely a bug that Linux reports success when no
signal has been delivered.

I dug into this a little bit more and, by reading the Illumos source
code, found that Illumos and its ancestor OpenSolaris can return EPERM
even when sending to all processes.

The rationale for kill says that it is sometimes desirable to hide the
existence of one process from another, so that the existence of a
process will not be an information leak. To accommodate that, POSIX
allows ESRCH instead of EPERM as an error code.

If you want, you can read it for yourself here:
https://pubs.opengroup.org/onlinepubs/9699919799/functions/kill.html


To sum up.

The function kill(2) should report success if and only if it has
delivered a signal.

The Linux version of kill(2) is buggy because it reports success when it
has not delivered a signal.

Different implementations of kill(2) do different things in this
situation and POSIX appears to allow the variation, so there is no
strong argument for any specific behavior (other than returning an
error) from a compatibility standpoint.
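
For callers that only need to know that nothing was signalled, the
variation above suggests accepting either error. A minimal user-space
sketch, assuming an unprivileged caller; the helper names
nothing_signalled() and kill_all_reached_nobody() are made up for
illustration and are not part of any existing API:

```c
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>

/*
 * Hypothetical helper: given kill()'s return value and the errno it
 * left behind, decide whether the call reached no process at all.
 * Both ESRCH and EPERM are accepted, since implementations differ.
 * A return value of 0 is treated as "something was signalled", which
 * is only trustworthy once the kernel stops reporting success for
 * zero deliveries.
 */
bool nothing_signalled(int ret, int err)
{
	return ret == -1 && (err == ESRCH || err == EPERM);
}

/* Hypothetical wrapper around kill(-1, sig) using the helper above. */
bool kill_all_reached_nobody(int sig)
{
	int ret = kill(-1, sig);

	return nothing_signalled(ret, ret == -1 ? errno : 0);
}
```

A caller could loop on kill_all_reached_nobody(SIGTERM) until it
returns true, at which point there is nothing left that it is allowed
to signal.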

From my perspective making the implementation of sending a signal to all
processes (-1) handle errors the same as sending a signal to a process
group (-pgrp) seems like the most sensible way to fix this bug in Linux.
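
The difference between the two ways of combining per-process results
can be modelled in user space. This is only a sketch, not kernel code:
the function names are hypothetical, results[i] stands for the outcome
of one delivery attempt (0 or a negative errno), and the model assumes
that the kill(-1) path ignores -EPERM from individual targets, which
would explain the 0 return when every target is refused.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Model of the kill(-1) behaviour under the assumption stated above:
 * -EPERM from a target never updates the result, so a run in which
 * every target refuses the signal still ends up returning 0.
 */
int aggregate_all_current(const int *results, size_t n)
{
	int retval = 0;

	for (size_t i = 0; i < n; i++)
		if (results[i] != -EPERM)
			retval = results[i];
	return n ? retval : -ESRCH;
}

/*
 * Model of process-group style handling: report success if at least
 * one delivery succeeded, otherwise the last error seen, or -ESRCH
 * when there were no targets at all.
 */
int aggregate_pgrp_style(const int *results, size_t n)
{
	int success = 0;
	int retval = -ESRCH;

	for (size_t i = 0; i < n; i++) {
		success |= !results[i];
		retval = results[i];
	}
	return success ? 0 : retval;
}
```

With two targets that both fail with -EPERM, the first function
returns 0 and the second returns -EPERM; with one refusal and one
successful delivery, both return 0.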

I can see an argument for hiding the existence of processes and
returning ESRCH but if/when we go down that road I would just ask that
we be consistent and update all of the signal sending functions at the
same time.

I will see about writing a commit message and posting a final patch in
just a little bit.

Eric