Re: Q: select_fallback_rq() && cpuset_lock()

From: Oleg Nesterov
Date: Wed Mar 10 2010 - 12:32:02 EST


On 03/10, Peter Zijlstra wrote:
>
> On Tue, 2010-03-09 at 19:06 +0100, Oleg Nesterov wrote:
> > In particular, see http://marc.info/?l=linux-kernel&m=125261083613103
>
> /me puts it on the to-review stack.

Great, thanks. In fact, you already acked it before ;)

> > But now I have another question. Since 5da9a0fb673a0ea0a093862f95f6b89b3390c31e
> > cpuset_cpus_allowed_locked() is called without callback_mutex held by
> > try_to_wake_up().
> >
> > And, without callback_mutex held, isn't it possible to race with, say,
> > update_cpumask() which changes cpuset->cpus_allowed? Yes, update_tasks_cpumask()
> > should fix up task->cpus_allowed later. But isn't it possible (at least
> > in theory) that try_to_wake_up() gets, say, all-zeroes in task->cpus_allowed
> > after select_fallback_rq()->cpuset_cpus_allowed_locked() if we race with
> > update_cpumask()->cpumask_copy() ?
>
> Hurmm,.. good point,.. yes I think that might be possible.
> p->cpus_allowed is synchronized properly, but cs->cpus_allowed is not,
> bugger.
>
> I guess the quick fix is to really bail and always use cpu_online_mask
> in select_fallback_rq().

Yes, but this breaks cpusets: with cpu_online_mask the task can end up
on a CPU outside its cpuset.
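
To make the cs->cpus_allowed race above concrete, here is a toy user-space
model. It is only a sketch, not kernel code. It assumes a mask that spans
two 64-bit words and a writer that copies it one word at a time, which is
the concern with update_cpumask()->cpumask_copy() when the reader does not
hold callback_mutex. All toy_* names and reader_sees_empty_after() are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 128 possible CPUs, so the mask needs two 64-bit words. */
#define TOY_MASK_WORDS 2

struct toy_cpumask {
    uint64_t bits[TOY_MASK_WORDS];
};

/* True if no CPU bit is set in any word of the mask. */
static bool toy_mask_empty(const struct toy_cpumask *m)
{
    for (size_t i = 0; i < TOY_MASK_WORDS; i++)
        if (m->bits[i])
            return false;
    return true;
}

/*
 * Copy only the first nwords words of src into dst. This models a
 * word-by-word copy that a concurrent reader observes partway through.
 */
static void toy_copy_words(struct toy_cpumask *dst,
                           const struct toy_cpumask *src, size_t nwords)
{
    assert(nwords <= TOY_MASK_WORDS);
    for (size_t i = 0; i < nwords; i++)
        dst->bits[i] = src->bits[i];
}

/*
 * Returns 1 if a lockless reader sees an all-zeroes mask after the
 * writer has copied words_copied words, 0 otherwise.
 */
int reader_sees_empty_after(size_t words_copied)
{
    struct toy_cpumask cs_allowed  = { { 0x0f, 0 } };  /* old: CPUs 0-3 */
    struct toy_cpumask new_allowed = { { 0, 0x0f } };  /* new: CPUs 64-67 */
    struct toy_cpumask seen;

    toy_copy_words(&cs_allowed, &new_allowed, words_copied);
    seen = cs_allowed;  /* reader's snapshot, taken without any lock */
    return toy_mask_empty(&seen);
}
```

reader_sees_empty_after(1) returns 1. Word 0 already holds the new (zero)
value while word 1 still holds the old (zero) value, so the reader gets an
empty mask even though neither the old nor the new mask is empty. With 0 or
2 words copied the reader sees a consistent mask and the function returns 0.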

Peter, please see the attached mbox with the last discussion and patches.
Of course, the changes in sched.c need trivial fixups. I'll resend
if you think these changes make sense.

Oleg.