newidle balancing in NUMA domain?

From: Nick Piggin
Date: Mon Nov 23 2009 - 06:22:39 EST


Hi,

I wonder why it was decided to do newidle balancing in the NUMA
domain? And with newidle_idx == 0 at that.

This means that every time a CPU goes idle, every other CPU in
the system gets a remote cacheline or two hit. Not very nice
O(n^2) behaviour on the interconnect. Not to mention trashing our
NUMA locality.
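
To put rough numbers on the O(n^2) remark, here is a back-of-envelope
sketch. It is not kernel code. The function names are made up for
illustration, and the model assumes that a CPU going newly idle reads
load state belonging to every other CPU in the system exactly once.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model, not kernel code: assume a CPU entering idle with
 * newidle balancing enabled at the top-level (NUMA) domain reads load
 * state owned by every other CPU in the system, once each.
 */
static uint64_t probes_per_idle_event(uint64_t ncpus)
{
	assert(ncpus > 0);
	return ncpus - 1;
}

/* Total remote reads if each of the ncpus CPUs goes idle once. */
static uint64_t probes_per_round(uint64_t ncpus)
{
	return ncpus * probes_per_idle_event(ncpus);
}
```

Under that assumption, probes_per_round(64) is 64 * 63 = 4032 remote
reads for a single round of idle entries, and doubling the CPU count
roughly quadruples it.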

And then I see some proposal to do ratelimiting of newidle
balancing :( Seems like hack upon hack making behaviour much more
complex.

One "symptom" of bad mutex contention is that increasing the
balancing rate can help a bit to reduce idle time (because it
gets the woken thread which is holding a semaphore running ASAP
after we run out of runnable tasks in the system due to them
hitting contention on that semaphore).

I really hope this change wasn't done in order to help -rt or
something sad like sysbench on MySQL.

And btw, I'll stay out of mentioning anything about CFS development,
but it really sucks to be continually making significant changes to
domains balancing *and* per-runqueue scheduling at the same time :(
It even makes it difficult to bisect things.

Thanks,
Nick
