Re: [PATCH 0/1] *** Fix kill(-1,s) returning 0 on 0 kills ***

From: Petr Skocik
Date: Wed Aug 09 2023 - 08:27:36 EST


Hi.

Is there anything else I can do to help get this (or some other equivalent change that results in kill(-1,s) returning -ESRCH when it has nothing to kill (like it does on the BSDs),
as opposed to the current return value of 0 in that case) incorporated into mainline Linux?

It would rather help some of the user software I'm developing, and the slightly new semantics are IMO definitely reasonable (BSDs have them).

Basically, the current code:
        int retval = 0, count = 0;
        struct task_struct * p;

        for_each_process(p) {
            if (task_pid_vnr(p) > 1 &&
                    !same_thread_group(p, current)) {
                int err = group_send_sig_info(sig, info, p,
                                  PIDTYPE_MAX);
                ++count;
                if (err != -EPERM)
                    retval = err;
            }
        }
        ret = count ? retval : -ESRCH;

counts kill attempts at non-1, other-process pids  and sets hardcoded -ESRCH only if no such attempts are made, which will almost never happen

for a nonroot EUID, because there will typically be non-pid-1 processes unkillable by the nonroot EUID, but the code will still count those kill attempts, and thus not return the hardcoded -ESRCH even if ALL of those kill attemtpts return -EPERM, in which case -ESRCH would be in order too, because there were no processes that the current EUID had permission to kill (BDSs indeed return ESRCH in such a case).

(The kernel shouldn't need to concern itself with possible racy creation of new EUID-killable processes during the kill(-1,s) walk. Either the system can be known not to have running superuser code that could racily create such EUID-killable processes and then such a kill-returned -ESRCH would be useful, or it cannot be known not to have such running superuser code, in which case the -ESRCH is transient and should be droped by the user).

The current code also implicitly assumes either all non-EPERM kill attempts return -EINVAL (invalid signal) or they
all return 0 (success). This assumption should be valid because either the signal number is invalid and stays invalid, or it is valid and
the only possible error is -EPERM (this isn't sigqueue so the kill shouldn't ever fail with -ENOMEM). If the assumption were not valid,
then the current code could overshadow a previous failed attempt with a later succesful one, returning success even if there were some non-EPERM failures.

My change proposes:

        struct task_struct * p;

        ret = -ESRCH;
        for_each_process(p) {
            if (task_pid_vnr(p) > 1 &&
                    !same_thread_group(p, current)) {
                int err = group_send_sig_info(sig, info, p,
                                  PIDTYPE_MAX);
                if (err != -EPERM)
                    ret = err; /*either all 0 or all -EINVAL*/
            }
        }

i.e., start with -ESRCH (nothing to kill) and any non-EPERM kill attempts change it to the last return value
--either all 0 or all -EINVAL as per the implicit assumption of the original code.

It passes the tests put forth by Kees Cook.

More defensively, the implicit assumption of the original code could be made explicit:


        struct task_struct * p;
        int has_last_err = 0;

        ret = -ESRCH;
        for_each_process(p) {
            if (task_pid_vnr(p) > 1 &&
                    !same_thread_group(p, current)) {
                int err = group_send_sig_info(sig, info, p,
                                  PIDTYPE_MAX);
                if (err != -EPERM){
                    if (has_last_err)
                        BUG_ON(ret != err); /*either all 0 or all -EINVAL*/
                    has_last_err = 1;
                    ret = err;
                }
            }
        }

or dropped;

        struct task_struct * p;
        int has_last_err = 0;

        ret = -ESRCH;
        for_each_process(p) {
            if (task_pid_vnr(p) > 1 &&
                    !same_thread_group(p, current)) {
                int err = group_send_sig_info(sig, info, p,
                                  PIDTYPE_MAX);
                if (err != -EPERM){
                    if (has_last_err){
                        if (err >= 0)
                            continue; /*don't mask previous failure with later success*/
                    }
                    has_last_err = 1;
                    ret = err;
                }
            }
        }

Thanks again for consideration. Criticism welcome.

Regards,
Petr Skocik


On 11/22/22 18:15, Kees Cook wrote:
On Tue, Nov 22, 2022 at 05:12:40PM +0100, Petr Skocik wrote:
Hi. I've never sent a kernel patch before but this one seemed trivial,
so I thought I'd give it a shot.

My issue: kill(-1,s) on Linux doesn't return -ESCHR when it has nothing
to kill.
It looks like LTP already tests for this, and gets -ESRCH?
https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/containers/pidns/pidns10.c

Does it still pass with your change?