Re: [PATCH-tip v4] sched: Fix NULL user_cpus_ptr check in dup_user_cpus_ptr()

From: Will Deacon
Date: Mon Nov 28 2022 - 07:02:36 EST


On Sun, Nov 27, 2022 at 08:43:27PM -0500, Waiman Long wrote:
> On 11/24/22 21:39, Waiman Long wrote:
> > In general, a non-null user_cpus_ptr will remain set until the task dies.
> > A possible exception is that do_set_cpus_allowed() will clear a non-null
> > user_cpus_ptr. To account for this possible race condition, we need to
> > check for a NULL user_cpus_ptr under the pi_lock before duplicating the
> > user mask.
> >
> > Fixes: 851a723e45d1 ("sched: Always clear user_cpus_ptr in do_set_cpus_allowed()")
> > Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
>
> This is actually a pre-existing use-after-free bug introduced by commit
> 07ec77a1d4e8 ("sched: Allow task CPU affinity to be restricted on
> asymmetric systems"), so it needs to be fixed in the stable releases as
> well. I will resend the patch with an additional Fixes tag and an updated
> commit log.

Please can you elaborate on the use-after-free here? Looking at
07ec77a1d4e8, the mask is only freed in free_task() once the usage refcount
has dropped to zero, and I can't see how that can race with fork().

What am I missing?

Will

> > kernel/sched/core.c | 32 ++++++++++++++++++++++++++++----
> > 1 file changed, 28 insertions(+), 4 deletions(-)
> >
> > [v4] Minor comment update
> >
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 8df51b08bb38..f2b75faaf71a 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -2624,19 +2624,43 @@ void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
> >  int dup_user_cpus_ptr(struct task_struct *dst, struct task_struct *src,
> >  		      int node)
> >  {
> > +	cpumask_t *user_mask;
> >  	unsigned long flags;
> > +	/*
> > +	 * Always clear dst->user_cpus_ptr first as their user_cpus_ptr's
> > +	 * may differ by now due to racing.
> > +	 */
> > +	dst->user_cpus_ptr = NULL;
> > +
> > +	/*
> > +	 * This check is racy and losing the race is a valid situation.
> > +	 * It is not worth the extra overhead of taking the pi_lock on
> > +	 * every fork/clone.
> > +	 */
> >  	if (!src->user_cpus_ptr)
> >  		return 0;
> > -	dst->user_cpus_ptr = kmalloc_node(cpumask_size(), GFP_KERNEL, node);
> > -	if (!dst->user_cpus_ptr)
> > +	user_mask = kmalloc_node(cpumask_size(), GFP_KERNEL, node);
> > +	if (!user_mask)
> >  		return -ENOMEM;
> > -	/* Use pi_lock to protect content of user_cpus_ptr */
> > +	/*
> > +	 * Use pi_lock to protect content of user_cpus_ptr
> > +	 *
> > +	 * Though unlikely, user_cpus_ptr can be reset to NULL by a concurrent
> > +	 * do_set_cpus_allowed().
> > +	 */
> >  	raw_spin_lock_irqsave(&src->pi_lock, flags);
> > -	cpumask_copy(dst->user_cpus_ptr, src->user_cpus_ptr);
> > +	if (src->user_cpus_ptr) {
> > +		swap(dst->user_cpus_ptr, user_mask);
> > +		cpumask_copy(dst->user_cpus_ptr, src->user_cpus_ptr);
> > +	}
> >  	raw_spin_unlock_irqrestore(&src->pi_lock, flags);
> > +
> > +	if (unlikely(user_mask))
> > +		kfree(user_mask);
> > +
> >  	return 0;
> >  }
>
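
[Editor's note] The allocate-outside, re-check-under-the-lock pattern used by the quoted patch can be sketched in plain userspace C. This is a minimal sketch with hypothetical names: a pthread mutex stands in for pi_lock, a single unsigned long stands in for the cpumask, and dup_user_mask() mirrors the control flow of dup_user_cpus_ptr().

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for task_struct. */
struct task {
	pthread_mutex_t pi_lock;	/* stands in for p->pi_lock */
	unsigned long *user_mask;	/* stands in for p->user_cpus_ptr */
};

/*
 * Mirrors dup_user_cpus_ptr(): clear the destination first, take a racy
 * lock-free NULL check as a fast path, allocate outside the lock, then
 * re-check the source pointer under the lock before copying. Any buffer
 * left unused (because the source was cleared concurrently) is freed
 * after unlocking.
 */
static int dup_user_mask(struct task *dst, struct task *src)
{
	unsigned long *buf;

	dst->user_mask = NULL;		/* src and dst may differ by now */

	if (!src->user_mask)		/* racy check: losing the race is fine */
		return 0;

	buf = malloc(sizeof(*buf));	/* allocate before taking the lock */
	if (!buf)
		return -1;

	pthread_mutex_lock(&src->pi_lock);
	if (src->user_mask) {		/* re-check under the lock */
		dst->user_mask = buf;
		buf = NULL;		/* ownership transferred to dst */
		*dst->user_mask = *src->user_mask;
	}
	pthread_mutex_unlock(&src->pi_lock);

	free(buf);			/* no-op when the copy happened */
	return 0;
}
```

The key point the sketch shows is why the kernel patch swaps the freshly allocated buffer into dst only after re-validating src->user_cpus_ptr: a concurrent do_set_cpus_allowed() may clear the source pointer between the lock-free check and lock acquisition, in which case the copy is skipped and the buffer is released.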