Re: [PATCH 10/13] remove aggressive idle balancing

From: Nick Piggin
Date: Mon Mar 07 2005 - 00:36:59 EST


Siddha, Suresh B wrote:


By code inspection, I see an issue with this patch
[PATCH 10/13] remove aggressive idle balancing

Why are we removing cpu_and_siblings_are_idle check from active_load_balance?
In case of SMT, we want to give prioritization to an idle package while
doing active_load_balance(infact, active_load_balance will be kicked
mainly because there is an idle package)

Just the re-addition of cpu_and_siblings_are_idle check to active_load_balance might not be enough. We somehow need to communicate this to move_tasks, otherwise can_migrate_task will fail and we will never be able to do active_load_balance.


Active balancing should only kick in after the prescribed number
of rebalancing failures - can_migrate_task will see this, and
will allow the balancing to take place.

That said, we currently aren't doing _really_ well for SMT on
some workloads, however with this patch we are heading in the
right direction I think.

I have been mainly looking at tuning CMP Opterons recently (they
are closer to a "traditional" SMP+NUMA than SMT, when it comes
to the scheduler's point of view). However, in earlier revisions
of the patch I had been looking at SMT performance and was able
to get it much closer to perfect:

I was working on a 4 socket x440 with HT. The problem area is
usually when the load is lower than the number of logical CPUs.
So on tbench, we do say 450MB/s with 4 or more threads without
HT, and 550MB/s with 8 or more threads with HT, however we only
do 300MB/s with 4 threads.

Those aren't the exact numbers, but that's basically what they
look like. Now I was able to bring the 4 thread + HT case much
closer to the 4 thread - HT numbers, but with earlier patchsets.
When I get a chance I will do more tests on the HT system, but
the x440 is infuriating for fine tuning performance, because it
is a NUMA system, but it doesn't tell the kernel about it, so
it will randomly schedule things on "far away" CPUs, and results
vary.

PS. Another thing I would like to see tested is a 3 level domain
setup (SMT + SMP + NUMA). I don't have access to one though.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/