Re: [patch] sched: prevent bound kthreads from changing cpus_allowed

From: Peter Zijlstra
Date: Tue Jun 10 2008 - 13:20:16 EST


On Tue, 2008-06-10 at 21:00 +0400, Oleg Nesterov wrote:
> On 06/10, Max Krasnyansky wrote:
> >
> > Peter Zijlstra wrote:
> > >
> > > Per cpu workqueues should stay on their cpu.
> > >
> > > What you're really looking for is a more fine grained alternative to
> > > flush_workqueue().
> > Actually I had a discussion on that with Oleg Nesterov. If you remember my
> > original solution (ie centralized cpu_isolate_map) was to completely redirect
> > work onto other cpus. Then you pointed out that it's the flush_() that really
> > makes the box stuck. So I started thinking about redoing the flush. While
> > looking at the code I realized that if I only change the flush_() then queued
> > work can get stale so to speak. ie Machine does not get stuck but some work
> > submitted on the isolated cpus will sit there for a long time. Oleg pointed
> > out exact same thing. So the simplest solution that does not require any
> > surgery to the workqueue is to just move the threads to other cpus.
>
> Cough... I'd like to mention that I _personally agree with Peter, cwq->thread's
> should stay on their cpu.
>
> I just meant that from the workqueue.c pov it is (afaics) OK to move cwq->thread
> to other CPUs, in a sense that this shouldn't add races or hotplug problems, etc.
> But still this doesn't look right to me.

The advantage of creating a more flexible or fine-grained flush is that
large machines also profit from it.

A simple scheme would be creating a workqueue context that is passed
along on enqueue, and then passed to flush.

This context could:

- either track the individual worklets and employ a completion scheme
to wait for them;

- or track on which cpus the worklets are enqueued and flush only those
few cpus.

Doing this would solve your case, since nobody (except those with
business there) will enqueue anything on the isolated cpus.

And it will improve the large machine case for the same reasons - it
won't have to iterate all cpus.

Of course, things that use schedule_on_each_cpu() will still end up
doing things on your isolated cpus, but getting around those would
probably get you into some correctness trouble.
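
To make the second option concrete, here is a rough userspace model of a
context that records which cpus got work and a flush that visits only
those. It is a sketch under loud assumptions, not a patch: every name in
it (wq_context, queue_work_ctx, flush_ctx, cpu_queue, count_call) is
made up for illustration and none of them exist in workqueue.c. It has
no locking and no worker threads, so "flush" simply runs the pending
items inline.

```c
/* Userspace model only; hypothetical names, not the kernel workqueue API. */
#include <stddef.h>

#define NR_CPUS   8     /* assumed small so a single unsigned long mask fits */
#define MAX_WORK  16

typedef void (*work_fn)(void *arg);

struct work {
	work_fn fn;
	void *arg;
};

/* One pending list per cpu, standing in for a per-cpu cwq. */
struct cpu_queue {
	struct work items[MAX_WORK];
	size_t nr;
};

static struct cpu_queue queues[NR_CPUS];

/* The context that is passed along on enqueue and later given to flush. */
struct wq_context {
	unsigned long cpus;	/* bit n set: work was queued on cpu n via this context */
};

/* Queue fn on the given cpu; if ctx is non-NULL, remember that cpu in it. */
static int queue_work_ctx(struct wq_context *ctx, int cpu, work_fn fn, void *arg)
{
	struct cpu_queue *q;

	if (cpu < 0 || cpu >= NR_CPUS)
		return -1;
	q = &queues[cpu];
	if (q->nr == MAX_WORK)
		return -1;

	q->items[q->nr].fn = fn;
	q->items[q->nr].arg = arg;
	q->nr++;

	if (ctx)
		ctx->cpus |= 1UL << cpu;
	return 0;
}

/*
 * Flush only the cpus recorded in ctx and return how many were visited.
 * A visited cpu is drained completely, including items queued there
 * without this context; cpus not in the mask are never touched.
 */
static int flush_ctx(struct wq_context *ctx)
{
	int cpu, visited = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		struct cpu_queue *q = &queues[cpu];
		size_t i;

		if (!(ctx->cpus & (1UL << cpu)))
			continue;
		for (i = 0; i < q->nr; i++)
			q->items[i].fn(q->items[i].arg);
		q->nr = 0;
		visited++;
	}
	ctx->cpus = 0;
	return visited;
}

/* Example worklet: bump the int counter that arg points to. */
static void count_call(void *arg)
{
	++*(int *)arg;
}
```

With work queued through a context on cpus 1 and 3, and unrelated work
sitting on cpu 5, flush_ctx() visits two cpus and leaves cpu 5 alone,
which is the property the isolated-cpu case needs.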
