Re: [PATCH] RT: Add priority-queuing and priority-inheritance to workqueue infrastructure

From: Oleg Nesterov
Date: Mon Aug 06 2007 - 11:34:59 EST


On 08/06, Gregory Haskins wrote:
>
> On Mon, 2007-08-06 at 18:26 +0400, Oleg Nesterov wrote:
>
> > Immediately, A inserts the work on CPU 1.
>
> Well, if you didn't care about which CPU, that's true. But suppose we
> want to direct this specifically at CWQ for cpu 0.

please see below...

> > It can livelock if other higher-priority threads add works to this wq
> > repeteadly.
>
> That's really starvation, though. I agree with you that it is
> technically possible to happen if the bandwidth of the higher priority
> producer task is greater than the bandwidth of the consumer. But note
> that the RT priorities already forgo starvation avoidance, so IMO the
> queue can here as well. E.g. if the system designer runs a greedy app
> at a high RT priority, it can starve as many lower priority items as it
> wants anyway. This is no different.

OK.

> > > > Actually a niced (so that its priority is lower than cwq->thread's one)
> > > > can deadlock. Suppose it does
> > > >
> > > > lock(LOCK);
> > > > flush_workueue(wq);
> > > >
> > > > and we have a pending work_struct which does:
> > > >
> > > > void work_handler(struct work_struct *self)
> > > > {
> > > > if (!try_lock(LOCK)) {
> > > > // try again later...
> > > > queue_work(wq, self);
> > > > return;
> > > > }
> > > >
> > > > do_something();
> > > > }
> > > >
> > > > Deadlock.
> > >
> > > That code is completely broken, so I don't think it matters much. But
> > > regardless, the new API changes will address that.
> >
> > Sorry. This code is not very nice, but it is correct currently.
>
> Well, the "trylock+requeue" avoids the obvious recursive deadlock, but
> it introduces a more subtle error: the reschedule effectively bypasses
> the flush.

this is OK, flush_workqueue() should only care about work_struct's that are
already queued.

> E.g. whatever work was being flushed was allowed to escape
> out from behind the barrier. If you don't care about the flush working,
> why do it at all?

The caller of flush_workueue() doesn't necessary know we have such a work
on list. It just wants to flush its own works.

> But like I said, its moot...we will fix that condition at the API level
> (or move away from overloading the workqueues)

Oh, good :)

> > Well, if we can use smp_call_() we don't need these complications?
>
> With an RT99 or smp_call() solution, all invocations effectively preempt
> whatever is running on the target CPU. That is why I said it was the
> opposite problem. A low priority client can preempt a high-priority
> task, which is just as bad as the current situation.

Aha, now I see what another problem you are trying to solve. I had a false
impression that might_sleep() is the issue.

After reading the couple of Peter's emails, I guess I am starting to
understand another issue. RT has irq threads, and they have different
priorities. So, in that case I agree, it is natural to consider the work
which was queued from the higher-priority irq thread as "more important".
Yes?

Oleg.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/