Re: linux 5.14.3: free_user_ns causes NULL pointer dereference

From: Eric W. Biederman
Date: Mon Oct 04 2021 - 13:19:28 EST


ebiederm@xxxxxxxxxxxx (Eric W. Biederman) writes:

> Adding Rune Kleveland to the discussion as he also seems to have
> reproduced the issue.
>
> Alex and I have been staring at the code and the reports and this
> bug is hiding well. Here is what we have figured out so far.
>
> Both the warning from free_user_ns calling dec_ucount that Jordan Glover
> reported and the KASAN error that Yu Zhao has reported appear to have
> the same cause: a ucounts structure being used after it has been freed and
> reallocated as something else.
>
> I have just skimmed through the recent report from Rune Kleveland
> and it also appears to be a use after free, especially since the
> second failure in the log is slub complaining about trying to free
> the ucounts data structure.
>
> We looked through the users of put_ucounts and we don't see any obvious
> buggy users that would be freeing the data structure early.
>
> Alex has tried to reproduce this but so far is not having any luck.
> Folks, can you tell us what compiler versions you are using and share your
> kernel config with us? That might help.
>
> The little debug diff below is my guess of what is happening. If the
> folks who can reproduce this issue can try the patch below and let me
> know if the warnings fire that would be appreciated. It is still not
> enough to track down the bug but at least it will confirm my current
> hypothesis about how things look before there is a use of memory after
> it is freed.

Bah. Scratch that test patch. I just double checked myself and
cred->ucounts and cred->user_ns->ucounts should never be equal,
as the user namespace is counted in its parent user namespace.
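
To make that reasoning concrete, here is a small userspace model of the
relationship. It is only a sketch, not kernel code: the struct layouts are
heavily simplified, and every model_* helper is a hypothetical name invented
for illustration. It assumes a namespace's ucounts entry is charged against
its parent namespace, while a cred's ucounts entry belongs to the cred's own
namespace, and that the root namespace carries no charge at all.

```c
#include <stddef.h>
#include <stdlib.h>

/* Simplified userspace model. Field names mirror the kernel's, but these
 * structs and the model_* helpers are illustrative assumptions only. */
struct user_namespace;

struct ucounts {
	struct user_namespace *ns;	/* namespace this entry is accounted in */
	int count;			/* reference count */
};

struct user_namespace {
	struct user_namespace *parent;
	struct ucounts *ucounts;	/* charge held in the parent namespace */
};

struct cred {
	struct user_namespace *user_ns;
	struct ucounts *ucounts;	/* per-user counts inside user_ns */
};

/* Allocate a ucounts entry accounted in namespace ns. */
static struct ucounts *model_alloc_ucounts(struct user_namespace *ns)
{
	struct ucounts *uc = calloc(1, sizeof(*uc));

	if (!uc)
		abort();
	uc->ns = ns;
	uc->count = 1;
	return uc;
}

/* Create a namespace; a non-root namespace is charged to its parent. */
static struct user_namespace *model_create_user_ns(struct user_namespace *parent)
{
	struct user_namespace *ns = calloc(1, sizeof(*ns));

	if (!ns)
		abort();
	ns->parent = parent;
	ns->ucounts = parent ? model_alloc_ucounts(parent) : NULL;
	return ns;
}

/* Build a cred living in ns; its ucounts entry is accounted in ns itself. */
static struct cred model_cred_in(struct user_namespace *ns)
{
	struct cred c = { .user_ns = ns, .ucounts = model_alloc_ucounts(ns) };

	return c;
}

/* The first half of the debug condition from the test patch. */
static int model_debug_check_can_fire(const struct cred *c)
{
	return c->ucounts == c->user_ns->ucounts;
}
```

In this model, a cred created in a child namespace has c.ucounts->ns pointing
at the child, while child->ucounts->ns points at the parent, so
model_debug_check_can_fire() returns 0 and the WARN_ONCE in the test patch
could not trigger on a healthy system.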

That observation now tells me I have a parent user namespace that has
been corrupted.

Back to the drawing board.


> Thank you,
> Eric
>
> diff --git a/kernel/cred.c b/kernel/cred.c
> index f784e08c2fbd..e7ffaa3cf5a6 100644
> --- a/kernel/cred.c
> +++ b/kernel/cred.c
> @@ -120,6 +120,12 @@ static void put_cred_rcu(struct rcu_head *rcu)
> if (cred->group_info)
> put_group_info(cred->group_info);
> free_uid(cred->user);
> +#if 1
> + if ((cred->ucounts == cred->user_ns->ucounts) &&
> + (atomic_read(&cred->ucounts->count) == 1)) {
> + WARN_ONCE(1, "put_cred_rcu: ucount count 1\n");
> + }
> +#endif
> if (cred->ucounts)
> put_ucounts(cred->ucounts);
> put_user_ns(cred->user_ns);
> diff --git a/kernel/exit.c b/kernel/exit.c
> index 91a43e57a32e..60fd88b34c1a 100644
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -743,6 +743,13 @@ void __noreturn do_exit(long code)
> if (unlikely(!tsk->pid))
> panic("Attempted to kill the idle task!");
>
> +#if 1
> + if ((tsk->cred->ucounts == tsk->cred->user_ns->ucounts) &&
> + (atomic_read(&tsk->cred->ucounts->count) == 1)) {
> + WARN_ONCE(1, "do_exit: ucount count 1\n");
> + }
> +#endif
> +
> /*
> * If do_exit is called because this processes oopsed, it's possible
> * that get_fs() was left as KERNEL_DS, so reset it to USER_DS before