Re: [x86-tip] strange nr_cpus= boot regression

From: Thomas Gleixner
Date: Mon Sep 26 2016 - 15:38:20 EST



On Mon, 26 Sep 2016, Thomas Gleixner wrote:
> Can you please provide your .config and the dmesg of a bad and a good run?

Don't bother. I found it.

It's a merge artifact. So git bisect pointing at the merge commit is
entirely correct.

Mainline moves

	num_processors++;

to a different place in the function. See commit c291b0151585.

Now the nodeid patch set in x86/apic does not have this commit, and so
f7c28833c2520 removes num_processors++ from the original location, i.e.
the one it had before c291b0151585.

Now merging both branches does not conflict because both remove it from the
original location. But each adds it at a different new location, so the
result ends up with both instances of num_processors++ in place.

Which of course makes each invocation increment twice and therefore cuts the
number of cpus in half.
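To illustrate the halving, here is a minimal stand-alone sketch. It is not the kernel code: register_cpu(), count_accepted() and the increments_per_call parameter are made up for the illustration, and the limit check against nr_cpu_ids is an assumed simplification of what __generic_processor_info() does.

```c
/*
 * Simplified model of the double increment. NOT the kernel code:
 * all names except num_processors and nr_cpu_ids are hypothetical.
 */
static int num_processors;

/*
 * Register one CPU. Returns 0 on success and -1 once the counter has
 * reached the limit (assumed stand-in for the kernel's rejection path).
 * increments_per_call is 1 for the intended code, 2 for the bad merge.
 */
static int register_cpu(int nr_cpu_ids, int increments_per_call)
{
	if (num_processors >= nr_cpu_ids)
		return -1;

	num_processors += increments_per_call;
	return 0;
}

/* Count how many of 'present' CPUs get accepted under the given limit. */
static int count_accepted(int present, int nr_cpu_ids, int increments_per_call)
{
	int accepted = 0;
	int i;

	num_processors = 0;
	for (i = 0; i < present; i++) {
		if (register_cpu(nr_cpu_ids, increments_per_call) == 0)
			accepted++;
	}
	return accepted;
}
```

With 8 CPUs present and a limit of 8, count_accepted(8, 8, 1) accepts all 8, while count_accepted(8, 8, 2) accepts only 4, because the counter hits the limit after half of the registrations.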

So it's my fault that I did not merge x86/urgent into x86/apic before I
added the nodeid bits. And of course because the thing did not reject and
the merge of it into master gave no conflicts I did not notice ....

Here is a patch against tip/master which fixes the issue at least for
Boris. I'm going to merge that other commit into x86/apic and fix it up so
we don't end up with that mess again.

Thanks,

tglx

diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 46bb29958509..f266b8a92a9e 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2171,8 +2171,6 @@ int __generic_processor_info(int apicid, int version, bool enabled)
 		return -ENOSPC;
 	}

-	num_processors++;
-
 	/*
 	 * Validate version
 	 */