Re: [PATCH] cpusets: rework guarantee_online_cpus() to fixdeadlock with cpu_down()

From: Oleg Nesterov
Date: Sun Aug 02 2009 - 03:01:28 EST


On 08/02, Lai Jiangshan wrote:
>
> Oleg Nesterov wrote:
>
> >
> > - do NOT scan cs->parent cpusets. If there are no online CPUs in
> > cs->cpus_allowed, we use cpu_online_mask. This is only possible
> > when we are called by cpu_down() hooks, in that case
> > cpuset_track_online_cpus(CPU_DEAD) will fix things later.
> >
>
>
> We must scan cs->parent cpusets.
> A task is constrained by a cpuset,

Yes, the task esacpes its cpuset. With or without this patch.
Because cs->cpus_allowed has no online CPUs.

> it must be constrained
> this cpuset's parent too.

It will be constained again, after scan_for_empty_cpusets(), no?

The only difference this patch adds is that a task temporary uses
cpu_online_mask when its cs->cpuset_allowed becomes empty. Why
this is so bad? This can only happen when CPU dies and cs becomes
empty, force majeure.

Even without this patch, the task is not actually constrained by
its cs->parent. Yes, we use ->parent->cpus_allowed. But this mask
can be changed right after guarantee_online_cpus() returns. And
since this task does not belong to cs->parent cpuset,
update_tasks_cpumask() will not fix this task. Again, until
cpuset_track_online_cpus().

Could you explain what problem this patch adds?

> cpuset_lock() is not awful at all.

OK it is not awful. But surely it complicates things. And currently
it is buggy.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/