Re: Boot regression with bacf6b499e11 ("x86/mm: Use a struct to reduce parameters for SME PGD mapping") on top of -rc8

From: Tom Lendacky
Date: Sat Jan 20 2018 - 12:35:14 EST


On 1/20/2018 10:52 AM, Laura Abbott wrote:
> On 01/20/2018 05:13 AM, Ingo Molnar wrote:
>>
>> * Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>>
>>> 2)
>>>
>>> using global variables, which is unsafe in early code if the kernel is
>>> relocatable.
>>>
>>> The bisected to commit uses a new sme_populate_pgd_data to collect
>>> variables that
>>> were already on the stack, which should be position independent and safe.
>>>
>>> But the other commits use sme_active(), which does:
>>>
>>> bool sme_active(void)
>>> {
>>> ÂÂÂÂÂÂÂÂ return sme_me_mask && !sev_enabled;
>>> }
>>> EXPORT_SYMBOL(sme_active);
>>>
>>> And that looks PIC-unsafe to me, as both are globals:
>>>
>>> u64 sme_me_mask __section(.data) = 0;
>>> EXPORT_SYMBOL(sme_me_mask);
>>>
>>> Does the code start working if you force sme_active() to 0 while
>>> keeping the
>>> function call, i.e. something like the hack below?
>>
>> BTW., this aspect of the boot code is really fragile, and depending on
>> compiler
>> there could be unsafe relocations generated without it being 'obvious'
>> from the
>> patch itself. It's also pretty compiler and code layout dependent ...
>>
>> A good way to check this I think would be to turn off
>> CONFIG_RELOCATABLE=y in the
>> .config - does that make the kernel boot again?
>>
>> If that makes a difference then we need to take a look at the
>> relocations in the
>> two key files, with CONFIG_RELOCATABLE=y turned back on:
>>
>> ÂÂ objdump -r arch/x86/kernel/head64.o
>> ÂÂ objdump -r arch/x86/mm/mem_encrypt.o
>>
>> There's three types of relocations that should be there normally:
>>
>> #define R_X86_64_64 1 /* Direct 64 bit */
>> #define R_X86_64_PC32ÂÂÂÂÂÂÂÂÂÂ 2ÂÂÂÂÂÂ /* PC relative 32 bit signed */
>> #define R_X86_64_32SÂÂÂÂÂÂÂÂÂÂÂ 11ÂÂÂÂÂ /* Direct 32 bit sign extended */
>>
>> Only R_X86_64_PC32 is safe as-is, R_X86_64_32S needs to be used via
>> fixup_pointer().
>>
>> What makes this difficult in the SME context is that the early boot
>> portion of
>> arch/x86/mm/mem_encrypt.c is not separated out, but mixed in with later
>> code.
>>
>> I missed this aspect when reviewing and merging this code :-(
>>
>> Maybe a diff of the list of relocations of the before/after commit
>> points would be
>> nice.
>>
>> I.e. does something like:
>>
>> ÂÂ git checkout <last_working_commit_sha1>
>>  objdump -r arch/x86/mm/mem_encrypt.o | grep R_X86 | cut -d' ' -f2- >
>> working.relocs
>>
>> ÂÂ git checkout <first_broken_commit_sha1>
>>  objdump -r arch/x86/mm/mem_encrypt.o | grep R_X86 | cut -d' ' -f2- >
>> broken.relocs
>>
>> ÂÂ diff -up working.relocs broken.relocs
>>
>> show any changes to the relocations?
>>
>> Side note:
>>
>> Regardless of whether it's the root cause for this regression we
>> definitely need
>> to improve the relocations robustness of early boot code: at minimum we
>> should
>> isolate all critical functionality into a separate section, and then add
>> tooling
>> checks to make sure all relocations are safe.
>>
>> Thanks,
>>
>> ÂÂÂÂIngo
>>
>
> For the previous question, changing it to sme_active _does_ make the
> kernel work. Unfortunately, I can't test without relocations since
> I need to boot with CONFIG_EFI_STUB, but the relocations did show
> something interesting:
>
> +R_X86_64_PC32ÂÂÂÂ __stack_chk_fail-0x0000000000000004
>
> There's a new call to __stack_chk_fail and if I dump the end of
> sme_encrypt_kernel I do see that stuck in there. I bet the size
> of struct sme_populate_pgd_data is now large enough to trigger
> a stack check. If I add __nostackprotector to sme_encrypt_kernel
> like sme_enable has, it boots fine. This would explain why that
> particular commit showed as the problem in bisection.

Great find Laura. It must have something to do with compiler levels
since my level didn't insert that check.

Thanks,
Tom

>
> Thanks,
> Laura