Re: [PATCH, 2.6.9] improved load_balance() tolerance for pinned tasks

From: Matthew Dobson
Date: Fri Oct 29 2004 - 20:02:29 EST


On Mon, 2004-10-25 at 09:02, John Hawkes wrote:
> From: "Nick Piggin" <nickpiggin@xxxxxxxxxxxx>
> > > From: "John Hawkes" <hawkes@xxxxxxxxxxxxxxxxxxx>
> > > Actually, there is another related problem that arises in
> > > active_load_balance() with a runqueue that holds hundreds of pinned
> > > processes.
> > > I'm seeing a migration_thread perpetually consuming 70% of its CPU.
> >
> > That's what I was worried about, but in your most recent
> > patch you just sent, the all_pinned path should skip over
> > the active load balance completely... basically it shouldn't
> > be running at all, and if it is then it is a bug I think?
>
> To reiterate: this is probably reproducible on smaller SMP systems, too.
> Just do a 'runon' (using sys_sched_setaffinity) of ~200 (or more) small
> compute-bound processes on a single CPU.
>
> My patch, which has load_balance() skip over the (busiest->active_balance = 1)
> trigger that starts up active_load_balance(), does seem to reduce the
> frequency of bursts of long-running activity of the migration thread, but
> those bursts of activity are still there, with migration_thread consuming
> 75-95% of its CPU for several seconds (as observed by 'top'). I have not yet
> determined what's happening. It might be an artifact of how long it takes to
> do those 'runon' startups of the compute-bound processes.

You may want to try these tests again with linux-2.6.10-rc1-mm2. It's
got 2 patches to fix some broken behavior of active_load_balance(). The
version of active_load_balance() in 2.6.9 was not considering a great
many CPUs as potential recipients of tasks due to some small logic
problems in the code.
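
In case it helps anyone rerun the test on a smaller box, here is a rough
sketch of the kind of reproduction John describes: fork a few hundred
compute-bound processes and pin each one to the same CPU. It is not the
'runon' tool itself. It uses Python's os.sched_setaffinity(), which wraps the
sched_setaffinity syscall on Linux. The helper names (spin_pinned,
spawn_pinned, wait_all) and the run time are made up for illustration.

```python
import os
import time


def spin_pinned(cpu, seconds):
    """Pin the calling process to one CPU, then burn cycles for `seconds`."""
    # pid 0 means "the calling process"; Linux only.
    os.sched_setaffinity(0, {cpu})
    deadline = time.monotonic() + seconds
    counter = 0
    while time.monotonic() < deadline:
        counter += 1  # pure CPU work, no sleeping or I/O


def spawn_pinned(count, cpu, seconds):
    """Fork `count` children that each run spin_pinned(); return their PIDs."""
    pids = []
    for _ in range(count):
        pid = os.fork()
        if pid == 0:
            # Child: exit with 0 on success, 1 if pinning or spinning failed.
            status = 1
            try:
                spin_pinned(cpu, seconds)
                status = 0
            finally:
                os._exit(status)
        pids.append(pid)
    return pids


def wait_all(pids):
    """Reap every child and return the list of raw wait statuses."""
    return [os.waitpid(pid, 0)[1] for pid in pids]
```

Something like wait_all(spawn_pinned(200, 0, 30.0)) would put 200 spinners on
CPU 0 for about 30 seconds (both numbers are arbitrary), which should be long
enough to watch migration_thread in 'top' on another terminal.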

Cheers!

-Matt
