Re: [Lse-tech] Re: NUMA scheduler 2nd approach

From: Erich Focht (efocht@ess.nec.de)
Date: Tue Jan 14 2003 - 10:13:27 EST


On Tuesday 14 January 2003 17:52, Andrew Theurer wrote:
> > I suppose I should not have been so dang lazy and cut-n-pasted
> > the line I changed. The change was (((5*4*this_load)/4) + 4)
> > which should be the same as your second choice.
> >
> > > We def need some constant to avoid low load ping pong, right?
> >
> > Yep. Without the constant, one could have 6 processes on node
> > A and 4 on node B, and node B would end up stealing. While that
> > gives a perfect balance, the cost of the off-node traffic does not
> > justify it, at least on the NUMA-Q box. It might be justified
> > for a different NUMA architecture, which is why I propose putting
> > this check in a macro that can be defined in topology.h for each
> > architecture.
>
> Yes, I was also concerned about one task on one node and none on the
> others. Without some sort of constant we will ping-pong that task across
> the nodes endlessly, since no % threshold can make any difference when
> the original load value is 0. Your +4 gets rid of the 1-task case.
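
(For illustration, a per-architecture version of that check could sit in
topology.h roughly as sketched below. The macro name is made up and the
usage comment is only one reading of the change quoted above, not the
actual patch.)

/*
 * Sketch only: a remote node is treated as busier than the local one
 * only if its (scaled) load clears the quoted threshold, so small
 * imbalances such as 6 vs. 4 tasks, or a single stray task, never
 * trigger cross-node stealing.
 */
#define ACCEPTABLE_CROSS_NODE_LOAD(this_load) \
        (((5 * 4 * (this_load)) / 4) + 4)

/* roughly: if (scaled_remote_load > ACCEPTABLE_CROSS_NODE_LOAD(this_load))
 *                  ... consider stealing from that node ...             */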

The endless ping-pong won't happen, though, because:
 - find_busiest_queue() won't detect a sufficient imbalance
   (imbalance == 0), and
 - load_balance() won't be able to steal the only task anyway, as it
   will be running (it is the rq->curr task).

So the worst thing that can happen is that load_balance() finds that
it cannot steal the only task running on the remote runqueue and
returns. But we never even get that far, because find_busiest_queue()
returns NULL. That is true even in the bad case
   node#0 : 0 tasks
   node#1 : 4 tasks (each on its own CPU)
where we would actually like to distribute the tasks equally among
the nodes. At least on our IA64 platform...
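
(To make the two guards concrete, here is a heavily simplified sketch;
the struct, the helper names and the imbalance formula are stand-ins for
the real 2.5 scheduler code, not quotes from it.)

/* Heavily simplified model of the two guards described above. */
struct task_struct;                     /* opaque here                  */

struct rq_model {
        int nr_running;                 /* load of this runqueue        */
        struct task_struct *curr;       /* currently running task       */
};

/*
 * Guard 1: if no remote queue has a real surplus over the local load,
 * the computed imbalance stays 0 and no "busiest" queue is returned.
 */
static struct rq_model *find_busiest_queue_sketch(struct rq_model **rqs,
                                                  int nr, int this_load,
                                                  int *imbalance)
{
        struct rq_model *busiest = NULL;
        int i, max_load = this_load;

        for (i = 0; i < nr; i++)
                if (rqs[i]->nr_running > max_load) {
                        max_load = rqs[i]->nr_running;
                        busiest = rqs[i];
                }

        *imbalance = (max_load - this_load) / 2;
        if (!*imbalance)        /* e.g. local 0, remote 1: (1-0)/2 == 0 */
                return NULL;
        return busiest;
}

/*
 * Guard 2: even if a busiest queue were found, a task that is
 * currently running there (rq->curr) cannot be migrated away.
 */
static int can_steal_sketch(const struct rq_model *busiest,
                            const struct task_struct *p)
{
        return p != busiest->curr;
}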

Regards,
Erich



