Re: [PATCH 0/10] bulk cpu removal support

From: Ashok Raj
Date: Thu May 11 2006 - 12:53:15 EST


On Wed, May 10, 2006 at 11:06:06PM -0700, Andrew Morton wrote:
Hi Andrew,

> Shaohua Li <shaohua.li@xxxxxxxxx> wrote:
> >
> > CPU hotremove will migrate tasks and redirect interrupts off dead cpu.
>
> This seems an awful lot of code for something which happens so infrequently.
>
> How big is the problem you're fixing here, and what are the
> user-observable effects of these changes?

This is useful when, say, a NUMA node is being removed. With new multi-core
CPUs coming up, considering a 2-core package with HT, we could have up to 4
logical CPUs per socket. On a NUMA node with 4 sockets, a node removal means
we do 16 single-cpu offlines. Each time, the processes and interrupts could
end up on a CPU that is itself about to be removed.
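
To illustrate the cost, here is a small userspace sketch (not kernel code)
that counts task moves for the 16-cpu case above (4 sockets x 2 cores x 2 HT
threads). Everything in it is an assumption made for illustration: the 32-cpu
machine, the uint32_t mask, the function names, and the worst-case policy of
always moving tasks to the lowest-numbered online cpu.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 32 /* hypothetical machine size for this sketch */

/* Lowest-numbered cpu set in mask, or -1 if the mask is empty. */
static int first_cpu_in(uint32_t mask)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (mask & (UINT32_C(1) << cpu))
            return cpu;
    return -1;
}

/*
 * Offline the cpus in `dying` one at a time, in ascending order. Each step
 * moves every task on the outgoing cpu to the lowest-numbered online cpu,
 * which may itself be next in line for removal. Returns total task moves.
 */
static int migrations_one_by_one(uint32_t online, uint32_t dying,
                                 int tasks_per_cpu)
{
    int load[NR_CPUS] = {0};
    int moves = 0;

    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (online & (UINT32_C(1) << cpu))
            load[cpu] = tasks_per_cpu;

    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        uint32_t bit = UINT32_C(1) << cpu;
        int target;

        if (!(dying & bit))
            continue;
        online &= ~bit;
        target = first_cpu_in(online);
        assert(target >= 0); /* at least one cpu must stay online */
        moves += load[cpu];
        load[target] += load[cpu];
        load[cpu] = 0;
    }
    return moves;
}

/*
 * Offline all cpus in `dying` at once: the survivors are known up front,
 * so each task on a dying cpu moves exactly once.
 */
static int migrations_bulk(uint32_t online, uint32_t dying, int tasks_per_cpu)
{
    int moves = 0;

    assert(first_cpu_in(online & ~dying) >= 0);
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (online & dying & (UINT32_C(1) << cpu))
            moves += tasks_per_cpu;
    return moves;
}
```

With all 32 cpus online, one task per cpu, and cpus 0-15 going away, the
one-by-one path moves 1 + 2 + ... + 16 = 136 tasks under this policy, while
the bulk path moves 16.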

The same is also useful for SMP suspend/resume, since the logical offline
path is the same there as well.

Even though the code changes seem like a lot, most of it is just preparing
functions to accept a cpumask_t instead of a single cpu as before.
We split them into smaller chunks so that the scope of change is well
understood with each patch.

The major changes are

- use stop machine to run the cpu offline functions on each cpu going offline
- prepare functions in the offline path to take a cpumask_t
- some task migration deadlock fixes for problems we ran into
during stress tests.
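
As a rough idea of what "take a cpumask_t" means for an offline function,
here is a userspace sketch (not code from the patches). The mask_t typedef
stands in for cpumask_t, and NR_IRQS, redirect_irqs() and
redirect_irqs_from_cpu() are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stdint.h>

#define NR_IRQS 4 /* hypothetical number of interrupt lines */

typedef uint32_t mask_t; /* userspace stand-in for cpumask_t */

/*
 * Remove every cpu in `dying` from each irq's affinity in a single pass.
 * If an irq would be left with no cpu at all, fall back to all surviving
 * cpus. Returns the number of irqs whose affinity changed.
 */
static int redirect_irqs(mask_t affinity[NR_IRQS], mask_t online, mask_t dying)
{
    mask_t survivors = online & ~dying;
    int changed = 0;

    assert(survivors != 0); /* at least one cpu must stay online */
    for (int irq = 0; irq < NR_IRQS; irq++) {
        mask_t next = affinity[irq] & survivors;

        if (next == 0)
            next = survivors; /* affinity broken: use any survivor */
        if (next != affinity[irq]) {
            affinity[irq] = next;
            changed++;
        }
    }
    return changed;
}

/* The single-cpu form becomes a thin wrapper around the mask form. */
static int redirect_irqs_from_cpu(mask_t affinity[NR_IRQS], mask_t online,
                                  int cpu)
{
    return redirect_irqs(affinity, online, (mask_t)1 << cpu);
}
```

Because the mask form sees the whole set of outgoing cpus, it never picks a
target that is about to go away, which the one-cpu-at-a-time form cannot
guarantee.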

I know Shaohua ran tests for more than 20 hrs with the patch, both on i386
and x86_64.

Once we get some time deltas on a bigger machine, it will help a lot.
I am also trying to check with some OEMs who have such large machines for
some data. Will keep you posted.

ashokr

--
Cheers,
Ashok Raj
- Open Source Technology Center