Re: [PATCH] Fix race condition when exec'ing setuid files

From: Kees Cook
Date: Thu Oct 06 2022 - 03:01:42 EST


On Wed, Oct 05, 2022 at 08:06:15PM -0700, Kees Cook wrote:
> Dave, this tracks back to commit a6f76f23d297 ("CRED: Make execve() take
> advantage of copy-on-write credentials") ... any ideas what's happening
> here?

Er, rather, it originates before git history, but moved under lock in
commit 0bf2f3aec547 ("CRED: Fix SUID exec regression").

Eric, Al, Hugh, does this ring a bell?

It originates from 1da177e4c3f4 ("Linux-2.6.12-rc2") in git...

static inline int unsafe_exec(struct task_struct *p)
{
int unsafe = 0;
...
if (atomic_read(&p->fs->count) > 1 ||
atomic_read(&p->files->count) > 1 ||
atomic_read(&p->sighand->count) > 1)
unsafe |= LSM_UNSAFE_SHARE;

return unsafe;
}

Current code is:

static void check_unsafe_exec(struct linux_binprm *bprm)
{
struct task_struct *p = current, *t;
unsigned n_fs;
...
t = p;
n_fs = 1;
spin_lock(&p->fs->lock);
rcu_read_lock();
while_each_thread(p, t) {
if (t->fs == p->fs)
n_fs++;
}
rcu_read_unlock();

if (p->fs->users > n_fs)
bprm->unsafe |= LSM_UNSAFE_SHARE;
else
p->fs->in_exec = 1;
spin_unlock(&p->fs->lock);
}


Which seemed to take its form from:

0bf2f3aec547 ("CRED: Fix SUID exec regression")

Quoting the rationale for the checks:
...
moved the place in which the 'safeness' of a SUID/SGID exec was performed to
before de_thread() was called. This means that LSM_UNSAFE_SHARE is now
calculated incorrectly. This flag is set if any of the usage counts for
fs_struct, files_struct and sighand_struct are greater than 1 at the time the
determination is made. All of which are true for threads created by the
pthread library.

So, instead, we count up the number of threads (CLONE_THREAD) that are sharing
our fs_struct (CLONE_FS), files_struct (CLONE_FILES) and sighand_structs
(CLONE_SIGHAND/CLONE_THREAD) with us. These will be killed by de_thread() and
so can be discounted by check_unsafe_exec().

So, I think this is verifying that when attempting a suid exec, there is
no process out there with our fs_struct, file_struct, or sighand_struct
that would survive the de_thread() and be able to muck with the suid's
shared environment:

if (atomic_read(&p->fs->count) > n_fs ||
atomic_read(&p->files->count) > n_files ||
atomic_read(&p->sighand->count) > n_sighand)

Current code has eliminated the n_files and n_sighand tests:

n_sighand was removed by commit
f1191b50ec11 ("check_unsafe_exec() doesn't care about signal handlers sharing")

n_files was removed by commit
e426b64c412a ("fix setuid sometimes doesn't")

The latter reads very much like the current bug report. :) So likely the n_fs
test is buggy too...

After de_thread(), I see the calls to unshare_sighand() and
unshare_files(), so those check out.

What's needed to make p->fs safe? Doing an unshare of it seems possible,
since it exists half as a helper, unshare_fs(), and half open-coded in
ksys_unshare (see "new_fw").

Should we wire this up after de_thread() like the other two?

-Kees

--
Kees Cook