Re: [PATCH 1/2] Customize sched domain via cpuset

From: Paul Jackson
Date: Tue Apr 01 2008 - 07:55:45 EST


Interesting ...

So, we have two flags here. One flag "sched_wake_idle_far" that will
cause the current task to search farther for an idle CPU when it wakes
up another task that needs a CPU on which to run, and the other flag
"sched_balance_newidle_far" that will cause a soon-to-idle CPU to search
farther for a task it might pull over and run, instead of going idle.
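
To make the two cases concrete, here is a rough sketch in C of how the
two flags map onto the two events. Only the flag names come from the
proposal; the enum, the struct and the helper function are hypothetical
names made up for illustration, not code from the patch.

```c
#include <stdbool.h>

/* Hypothetical: the two events that can provoke a search. */
enum search_trigger {
    TRIGGER_TASK_WAKEUP,  /* a task wakes another task that needs a CPU */
    TRIGGER_CPU_NEWIDLE   /* a CPU is about to go idle and looks for work */
};

/* Hypothetical container for the two per-cpuset flags. */
struct cpuset_flags {
    bool sched_wake_idle_far;
    bool sched_balance_newidle_far;
};

/* Should the search triggered by event 't' look farther than usual? */
static bool search_far(const struct cpuset_flags *f, enum search_trigger t)
{
    switch (t) {
    case TRIGGER_TASK_WAKEUP:
        return f->sched_wake_idle_far;
    case TRIGGER_CPU_NEWIDLE:
        return f->sched_balance_newidle_far;
    }
    return false;
}
```

Each flag only matters for its own trigger, so whoever sets them has to
know which event provoked the search.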

I am tempted to ask if we should not elaborate this in one dimension,
and simplify it in another dimension.

First the simplification side: do we need both flags? Yes, they are
two distinct cases in the code, but perhaps practical uses will always
end up setting both flags the same way. If that's the case, then we
are just burdening the user of these flags with understanding a detail
that didn't matter to them: did a waking task or an idle CPU provoke
the search? Do you have or know of a situation where you actually
desire to enable one flag while disabling the other?

For the elaboration side: your proposal has just two levels of
distance, near and far. Perhaps, as architectures become more
elaborate and hierarchies deeper, we would want N levels of distance,
and the ability to request such load balancing for all levels up to
"n", for our choice of "n" <= N.

If we did both the above, then we might have a single per-cpuset file
that took an integer value ... this "n". If (n == 0), that might mean
no such balancing at all. If (n == 1), that might mean just the
nearest balancing, for example, to the hyperthread within the same core,
on some current Intel architectures. If (n == 2), then that might mean,
on the same architectures, that balancing could occur across cores
within the same package. If (n == 3) then that might mean, again on
that architecture, that balancing could occur across packages on the
same node board. As architectures evolve over time, the exact details
of what each value of "n" means would evolve, but a higher "n" would
always enable balancing across a wider portion of the system.
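
A minimal sketch in C of how such an "n" might be interpreted, using
the example topology above. Everything here is an assumption for
illustration: the enum, the cpu_location struct (with IDs assumed to be
unique system-wide) and both functions are hypothetical names, not
existing kernel interfaces.

```c
#include <stdbool.h>

/* Hypothetical distance levels for the example architecture. */
enum balance_distance {
    DIST_SMT_SIBLING  = 1,  /* hyperthread within the same core */
    DIST_SAME_PACKAGE = 2,  /* another core within the same package */
    DIST_SAME_NODE    = 3   /* another package on the same node board */
};

/* Hypothetical CPU position; IDs are assumed unique system-wide. */
struct cpu_location {
    int core;
    int package;
    int node;
};

/* Smallest level at which CPUs 'a' and 'b' share a domain. */
static int distance_level(const struct cpu_location *a,
                          const struct cpu_location *b)
{
    if (a->node != b->node)
        return DIST_SAME_NODE + 1;  /* farther than this sketch models */
    if (a->package != b->package)
        return DIST_SAME_NODE;
    if (a->core != b->core)
        return DIST_SAME_PACKAGE;
    return DIST_SMT_SIBLING;
}

/* n == 0 disables such balancing; larger n permits a wider search. */
static bool balance_allowed(int n, int level)
{
    return level >= 1 && level <= n;
}
```

With n == 2, for example, balance_allowed() accepts a hyperthread
sibling or another core in the same package, but rejects a CPU in a
different package.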

Please understand I am just brainstorming here. I don't know whether
the alternatives I considered above are preferable to what your patch
presents.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@xxxxxxx> 1.940.382.4214