Re: [PATCH 0/10] bulk cpu removal support

From: Ashok Raj
Date: Thu May 11 2006 - 13:40:40 EST


On Thu, May 11, 2006 at 12:19:20PM -0500, Nathan Lynch wrote:
>
> But offlining all the cpus in a node is already something that just
> works. If the user is all that concerned about not thrashing the
> tasks running on that node, they would have a workload manager that
> migrates the tasks off the node before shooting down cpus. Similar
> argument applies to interrupt affinity.
>
> I really haven't seen a compelling argument for why this is needed,
> just a bunch of handwaving so far, sorry.

Hand waving? Dont think that was intensional though.. i think we are trying
to address a real problem, if there is a reasonable alternate already
that we are not aware of, no problemo...


1. Regarding process migration, someone needs to make sure they run
something like a taskset away from all the cpus that are planned to be
removed upfront. This needs to be done on all processes on the system.

[nick, is there an easier way to do this in today's sched infrastructure or
otherwise]

2. For interrrupt migration, today when we take a cpu offline, we pick
a random online cpu today. So if you have a cpu going offline, and the
next logical cpu is also part of the same package, or node, we have
no smarts today to keep migration away from those "to be offlined" cpus.

If we have a solution to these already, or a simpler alternative, we are
open to those... and iam getting help on large system validation.. it might
not be easy right away, but its comming.

Cheers,
ashok


--
Cheers,
Ashok Raj
- Open Source Technology Center
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/