Re: [PATCH] random: Fix kernel panic due to system_wq use before init

From: Waiman Long
Date: Mon Sep 19 2016 - 10:48:29 EST


On 09/19/2016 08:43 AM, Matt Fleming wrote:
On Sun, 18 Sep, at 11:09:08PM, Waiman Long wrote:
On 09/14/2016 03:19 PM, Linus Torvalds wrote:
On Wed, Sep 14, 2016 at 12:14 PM, Waiman Long <waiman.long@xxxxxxx> wrote:
In the stack backtrace above, the kernel hadn't even reached SMP boot after
about 50s. That was extremely slow. I tried the 4.7.3 kernel and it booted
up fine. So I suspect that there may be too many interrupts going on and it
consumes most of the CPU cycles. The prime suspect is the random driver, I
think.
Any chance of bisecting it at least partially? The random driver
doesn't do interrupts itself, it just gets called by other drivers
doing interrupts. So if there are too many of them, that would be
something else..

Linus
I have finally finished bisecting the problem. I was wrong in saying that
the 4.7.3 kernel had no problem. It did have. There were some slight
differences between the 4.8 and 4.7 kernel config files that I used. After
some further testing, it was found that the bootup problem only happened
when the following kernel config option was defined:

CONFIG_EFI_MIXED=y

Could you try this patch? It won't be the final version, because it
doesn't address the root cause of the crash, which looks like page
table corruption of some kind, but it should at least confirm that
this is the buggy code.

---

diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index 677e29e29473..8dd3784eb075 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -245,7 +245,7 @@ int __init efi_setup_page_tables(unsigned long pa_memmap, unsigned num_pages)
* text and allocate a new stack because we can't rely on the
* stack pointer being < 4GB.
*/
- if (!IS_ENABLED(CONFIG_EFI_MIXED))
+ if (!IS_ENABLED(CONFIG_EFI_MIXED) || efi_is_native())
return 0;

/*

With this patch applied, I am able to successfully boot both the 16-socket 12TB and 8-socket 6TB configurations without problems.

Tested-by: Waiman Long <Waiman.Long@xxxxxxx>
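
For anyone reading the thread later, the effect of the one-line change above can be modeled in user space. This is only a sketch, not kernel code: the two function names and their boolean parameters are hypothetical stand-ins for IS_ENABLED(CONFIG_EFI_MIXED) and efi_is_native(), and it assumes efi_is_native() reports that the firmware and the kernel have the same bitness.

```c
#include <stdbool.h>

/*
 * Hypothetical user-space model of the early-return test in
 * efi_setup_page_tables(). The parameters stand in for
 * IS_ENABLED(CONFIG_EFI_MIXED) and efi_is_native(); they are
 * assumptions for illustration, not kernel interfaces.
 */

/* Before the patch: the mixed-mode setup is skipped only when
 * CONFIG_EFI_MIXED is not built in. */
bool skips_mixed_setup_unpatched(bool efi_mixed_enabled)
{
    return !efi_mixed_enabled;
}

/* After the patch: the setup is also skipped when the firmware is
 * native, even if CONFIG_EFI_MIXED is built in. */
bool skips_mixed_setup_patched(bool efi_mixed_enabled, bool firmware_is_native)
{
    return !efi_mixed_enabled || firmware_is_native;
}
```

The only case where the two differ is CONFIG_EFI_MIXED=y on native firmware, which matches the configuration reported above: the unpatched test falls through into the mixed-mode setup, while the patched test returns early.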