Broken boot on tip/master (and now linux-next)

From: Gabriel C
Date: Fri Jan 12 2018 - 17:16:42 EST


Hey guys,

I have a Supermicro H11DSi-NT box with 2 x EPYC 7281 CPUs.

I noticed a while ago that something goes wrong when booting tip/master
on that box, but I didn't have time to investigate.


Today I did a linux-next build, which failed in the same way tip/master did,
so I did a tip/master build too, which failed to boot as well.

Some things I noticed:

With CONFIG_AMD_MEM_ENCRYPT=y and mem_encrypt=on the box hangs right after GRUB,
with no way to see what is going on.

With mem_encrypt=off the box boots up to a point, but something trashes the ACPI tables.

With:

CONFIG_AMD_MEM_ENCRYPT=n
CONFIG_RETPOLINE=n

The box boots up to a point, but with the same result: ACPI seems broken, e.g.:

...

[ 0.000000] ACPI: \xc0\xde\xdb\xc2 0x00000000C2DC8DA0 000000 (v10 ?(<- 00000000 C2DB56A0)
[ 0.000000] ACPI: 0x000000002D3C2808 000000 (v00 00000000 00000000)
[ 0.000000] ACPI BIOS Error (bug): Invalid table length 0x0 in RSDT/XSDT (20170831/tbutils-325)
[ 0.000000] No NUMA configuration found

...

From here on all hell breaks loose :)


I got a dmesg from the broken boot; it can be found here, along with the kernel config:

http://sigsegv.24-7.ro/~crazy/tip-master/dmesg-tip-master-broken-boot.txt
http://sigsegv.24-7.ro/~crazy/tip-master/config-4.15.0-rc7-00557-g16ccd38ce1c1

A good dmesg from Linus' tree + patches from this series https://marc.info/?l=linux-kernel&m=151561236821659&w=2 :

http://sigsegv.24-7.ro/~crazy/tip-master/dmesg-OK.txt
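
For anyone comparing the two logs, here is a rough sketch of how the damaged
ACPI table lines could be picked out of a saved dmesg. It is only a heuristic
and not part of any kernel tooling: the function name, the regular expressions
and the "four uppercase/digit/underscore characters" signature check are my
own assumptions, based on the line layout shown in the excerpt above.

```python
import re

# Assumed layout of an ACPI table line in dmesg, as in the excerpt above:
#   ACPI: <signature> 0x<16 hex digit address> <6 hex digit length> (v..
ACPI_LINE = re.compile(
    r"ACPI:\s+(?P<sig>.*?)\s*0x(?P<addr>[0-9A-Fa-f]{16})"
    r"\s+(?P<length>[0-9A-Fa-f]{6})\s+\("
)

# Heuristic (assumption): a sane table signature is four characters
# drawn from uppercase letters, digits and underscore.
PLAUSIBLE_SIG = re.compile(r"[A-Z0-9_]{4}")


def suspicious_acpi_lines(dmesg_text):
    """Return (line, reasons) pairs for ACPI table lines that look damaged."""
    hits = []
    for line in dmesg_text.splitlines():
        match = ACPI_LINE.search(line)
        if match is None:
            continue  # not an ACPI table line
        reasons = []
        if not PLAUSIBLE_SIG.fullmatch(match.group("sig")):
            reasons.append("bad signature")
        if int(match.group("length"), 16) == 0:
            reasons.append("zero length")
        if reasons:
            hits.append((line, reasons))
    return hits
```

Running it over the broken dmesg should flag the two table lines quoted above,
while well-formed entries in the good dmesg should pass untouched.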

Does anyone have an idea what could have broken this?

It would be nice to have some hints before I start bisecting.

Regards,

Gabriel C