[rfc git pull] cpus4096 fixes, take 2

From: Ingo Molnar
Date: Mon Jul 28 2008 - 16:57:34 EST



* Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> > But I'll redo the patch again.
>
> Here's a trivial setup, that is even tested. It's _small_ too.
>
> /* cpu_bit_bitmap[0] is empty - so we can back into it */
> #define MASK_DECLARE_1(x) [x+1][0] = 1ul << (x)
> #define MASK_DECLARE_2(x) MASK_DECLARE_1(x), MASK_DECLARE_1(x+1)
> #define MASK_DECLARE_4(x) MASK_DECLARE_2(x), MASK_DECLARE_2(x+2)
> #define MASK_DECLARE_8(x) MASK_DECLARE_4(x), MASK_DECLARE_4(x+4)
>
> static const unsigned long cpu_bit_bitmap[BITS_PER_LONG+1][BITS_TO_LONGS(NR_CPUS)] = {
> 	MASK_DECLARE_8(0), MASK_DECLARE_8(8),
> 	MASK_DECLARE_8(16), MASK_DECLARE_8(24),
> #if BITS_PER_LONG > 32
> 	MASK_DECLARE_8(32), MASK_DECLARE_8(40),
> 	MASK_DECLARE_8(48), MASK_DECLARE_8(56),
> #endif
> };
>
> static inline const cpumask_t *get_cpu_mask(unsigned int nr)
> {
> 	const unsigned long *p = cpu_bit_bitmap[1 + nr % BITS_PER_LONG];
> 	p -= nr / BITS_PER_LONG;
> 	return (const cpumask_t *)p;
> }
>
> that should be all you need to do.
>
> Honesty in advertizing: my "testing" was some trivial user-space
> harness, maybe I had some bug in it. But at least it's not _horribly_
> wrong.
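
[Editor's sketch] The quoted trick can be exercised in user space along the following lines. This is a minimal sketch, not the harness Linus used. Assumptions not taken from the mail: NR_CPUS is fixed at 4096, BITS_PER_LONG is derived from ULONG_MAX, the table is flattened into a one-dimensional array so that stepping backwards from a row stays inside a single C object, and the names MASK_LONGS, TABLE_ROWS, get_cpu_mask_words(), mask_test_cpu() and count_bad_masks() are hypothetical helpers.

```c
#include <assert.h>
#include <limits.h>

#define NR_CPUS 4096	/* assumption: the MAXSMP case from the thread */

#if ULONG_MAX > 0xFFFFFFFFUL
#define BITS_PER_LONG 64
#else
#define BITS_PER_LONG 32
#endif
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)
#define MASK_LONGS BITS_TO_LONGS(NR_CPUS)	/* words per mask (hypothetical name) */
#define TABLE_ROWS (BITS_PER_LONG + 1)		/* hypothetical name */

/* Flattened table: row 0 is all zero, row x+1 has only bit x of its word 0 set. */
#define MASK_DECLARE_1(x) [((x) + 1) * MASK_LONGS] = 1UL << (x)
#define MASK_DECLARE_2(x) MASK_DECLARE_1(x), MASK_DECLARE_1((x) + 1)
#define MASK_DECLARE_4(x) MASK_DECLARE_2(x), MASK_DECLARE_2((x) + 2)
#define MASK_DECLARE_8(x) MASK_DECLARE_4(x), MASK_DECLARE_4((x) + 4)

static const unsigned long cpu_bit_bitmap[TABLE_ROWS * MASK_LONGS] = {
	MASK_DECLARE_8(0),  MASK_DECLARE_8(8),
	MASK_DECLARE_8(16), MASK_DECLARE_8(24),
#if BITS_PER_LONG > 32
	MASK_DECLARE_8(32), MASK_DECLARE_8(40),
	MASK_DECLARE_8(48), MASK_DECLARE_8(56),
#endif
};

/*
 * Pick the row whose word 0 holds the right bit, then step back
 * nr / BITS_PER_LONG words so that this word lands at the right
 * index of the returned mask. The words in front of it belong to
 * the previous row's zero tail (or to the empty row 0).
 */
static const unsigned long *get_cpu_mask_words(unsigned int nr)
{
	const unsigned long *p;

	assert(nr < NR_CPUS);
	p = &cpu_bit_bitmap[(1 + nr % BITS_PER_LONG) * MASK_LONGS];
	return p - nr / BITS_PER_LONG;
}

/* Return 1 if bit 'cpu' is set in a mask of MASK_LONGS words, else 0. */
static int mask_test_cpu(const unsigned long *mask, unsigned int cpu)
{
	return (mask[cpu / BITS_PER_LONG] >> (cpu % BITS_PER_LONG)) & 1UL;
}

/* Count CPUs whose mask is not exactly the single bit 'nr'; 0 means all good. */
static unsigned int count_bad_masks(void)
{
	unsigned int nr, cpu, bad = 0;

	for (nr = 0; nr < NR_CPUS; nr++) {
		const unsigned long *mask = get_cpu_mask_words(nr);

		for (cpu = 0; cpu < NR_CPUS; cpu++) {
			if (mask_test_cpu(mask, cpu) != (cpu == nr)) {
				bad++;
				break;
			}
		}
	}
	return bad;
}
```

Calling count_bad_masks() from a small main() should return 0 if the layout is right. The flattened array is a choice made for this sketch only; the quoted code uses a two-dimensional table and walks from one row back into the previous one.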

Amazing! Your code, once plugged into the kernel proper, booted fine on
5 different x86 test systems. It also booted fine on an allyesconfig
kernel with MAXSMP and NR_CPUS=4096, on allnoconfig, on allmodconfig
and on a good number of randconfigs.

> And yes, this has the added optimization from Viro of overlapping the
> cpumask_t's internally too, rather than making them twice the size. So
> with 4096 CPU's, this should result in 32.5kB of static const data.
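
[Editor's sketch] The 32.5kB figure can be checked with a few lines of arithmetic. The helpers below are hypothetical names, and the "one full mask per CPU" layout is included only as an assumed point of comparison, not as a description of any particular kernel version.

```c
#include <stddef.h>

/* Bytes in one cpumask for nr_cpus CPUs, given the word size in bits. */
static size_t mask_bytes(size_t nr_cpus, size_t bits_per_long)
{
	size_t longs = (nr_cpus + bits_per_long - 1) / bits_per_long;

	return longs * (bits_per_long / 8);
}

/* Overlapped table: one all-zero row plus one row per bit position. */
static size_t overlapped_table_bytes(size_t nr_cpus, size_t bits_per_long)
{
	return (bits_per_long + 1) * mask_bytes(nr_cpus, bits_per_long);
}

/* Comparison case (assumption): one complete mask stored for every CPU. */
static size_t per_cpu_table_bytes(size_t nr_cpus, size_t bits_per_long)
{
	return nr_cpus * mask_bytes(nr_cpus, bits_per_long);
}
```

With 4096 CPUs and 64-bit longs, overlapped_table_bytes() gives 65 * 512 = 33280 bytes, which is 32.5 * 1024, while per_cpu_table_bytes() gives 4096 * 512 = 2097152 bytes (2 MiB).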

What do you think about the commit below?

I've put your fix into:

# e56b3bc: cpu masks: optimize and clean up cpumask_of_cpu()

It's quoted fully below after the pull request diffstat. Have i missed
anything obvious?

What i'm unsure about are other architectures. (Small detail i just
noticed: HAVE_CPUMASK_OF_CPU_MAP is now orphaned in arch/x86/Kconfig.
i will get rid of that in a separate change, i don't want to upset the
test results.)

( And ... because v1 of your code was so frustratingly and
mind-blowingly stable in testing (breaking a long track record of v1
patches in this area of the kernel), and because the perfect patch
does not exist by definition, i thought i'd mention that after a long
search i found and fixed a serious showstopper bug in your code: you
used "1ul" in your macros, instead of the more proper "1UL" style. The
ratio between the use of 1ul versus 1UL is 1:30 in the tree, so your
choice of integer literal type-suffix capitalization was deemed
un-Linuxish, and was fixed up for good. )

Ingo

---------------->
Linus,

Please pull the latest cpus4096 git tree from:

git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip.git cpus4096

Thanks,

Ingo

------------------>
Ingo Molnar (1):
      cpumask: export cpumask_of_cpu_map

Linus Torvalds (1):
      cpu masks: optimize and clean up cpumask_of_cpu()

Mike Travis (3):
      cpumask: make cpumask_of_cpu_map generic
      cpumask: put cpumask_of_cpu_map in the initdata section
      cpumask: change cpumask_of_cpu_ptr to use new cpumask_of_cpu


arch/x86/kernel/acpi/cstate.c | 3 +-
arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c | 10 +---
arch/x86/kernel/cpu/cpufreq/powernow-k8.c | 15 ++----
arch/x86/kernel/cpu/cpufreq/speedstep-centrino.c | 12 ++---
arch/x86/kernel/cpu/cpufreq/speedstep-ich.c | 3 +-
arch/x86/kernel/cpu/intel_cacheinfo.c | 3 +-
arch/x86/kernel/ldt.c | 6 +--
arch/x86/kernel/microcode.c | 17 ++----
arch/x86/kernel/reboot.c | 11 +---
arch/x86/kernel/setup_percpu.c | 21 -------
drivers/acpi/processor_throttling.c | 11 +---
drivers/firmware/dcdbas.c | 3 +-
drivers/misc/sgi-xp/xpc_main.c | 3 +-
include/linux/cpumask.h | 63 ++++++++-------------
kernel/cpu.c | 25 +++++++++
kernel/stop_machine.c | 3 +-
kernel/time/tick-common.c | 8 +--
kernel/trace/trace_sysprof.c | 4 +-
lib/smp_processor_id.c | 5 +--
net/sunrpc/svc.c | 3 +-
20 files changed, 86 insertions(+), 143 deletions(-)

---->

# e56b3bc: cpu masks: optimize and clean up cpumask_of_cpu()