Re: [PATCH] kexec: allocate kernel above bzImage's pref_address

From: Dave Hansen
Date: Fri Dec 15 2023 - 17:09:26 EST


On 12/15/23 13:38, Chris Koch wrote:
> On Fri, Dec 15, 2023 at 1:17 PM Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
...
>> Are you reporting a bug and is this a bug fix? It's not super clear
>> from the changelog.
>
> I reported it as a bug yesterday in
> https://lkml.org/lkml/2023/12/14/1529 -- I'm happy to reword this in
> some way that indicates it's a bug fix.

Ahh, thanks for the link!

Please do include things like this as a "Link:" tag in the changelog
(and use lore.kernel.org URLs when possible).

>>> diff --git a/Documentation/arch/x86/boot.rst b/Documentation/arch/x86/boot.rst
>>> index 22cc7a040dae..49bea8986620 100644
>>> --- a/Documentation/arch/x86/boot.rst
>>> +++ b/Documentation/arch/x86/boot.rst
>>> @@ -878,7 +878,8 @@ Protocol: 2.10+
>>> address if possible.
>>>
>>> A non-relocatable kernel will unconditionally move itself and to run
>>> - at this address.
>>> + at this address. A relocatable kernel will move itself to this address if it
>>> + loaded below this address.
>>
>> I think we should avoid saying the same things over and over again in
>> different spots.
>>
>> Here, it doesn't really help to enumerate the different interpretations
>> of 'pref_address'. All that matters is that the bootloader can avoid
>> the overhead of a later copy if it can place the kernel at
>> 'pref_address'. The exact reasons that various kernels might decide to
>> relocate are unimportant here.
>
> I think it's important documentation for bootloader authors. It's not
> about avoiding overhead, it's about avoiding clobbering areas of
> memory that may be reserved in e820 / EFI memory map, which the kernel
> will do when it relocates itself to pref_address without checking
> what's reserved and what's not. It emphasizes the importance of
> choosing an address above pref_address. Happy to reword some way to
> reflect that.

Could we give this as a direction to bootloader authors? They should
read 'pref_address' as a big, fat, "the kernel may blow this address
away" message.

...
>>> In normal kdump cases one does not have to set/change this option
>>> as now bzImage can be compiled as a completely relocatable image
>>> diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c
>>> index a61c12c01270..5dcd232d58bf 100644
>>> --- a/arch/x86/kernel/kexec-bzimage64.c
>>> +++ b/arch/x86/kernel/kexec-bzimage64.c
>>> @@ -498,7 +498,10 @@ static void *bzImage64_load(struct kimage *image, char *kernel,
>>> kbuf.bufsz = kernel_len - kern16_size;
>>> kbuf.memsz = PAGE_ALIGN(header->init_size);
>>> kbuf.buf_align = header->kernel_alignment;
>>> - kbuf.buf_min = MIN_KERNEL_LOAD_ADDR;
>>> + if (header->pref_address < MIN_KERNEL_LOAD_ADDR)
>>> + kbuf.buf_min = MIN_KERNEL_LOAD_ADDR;
>>> + else
>>> + kbuf.buf_min = header->pref_address;
>>> kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
>>> ret = kexec_add_buffer(&kbuf);
>>> if (ret)
>>
>> Comment, please.
>>
>> It isn't clear from this hunk why or how this fixes the bug. How does
>> this manage to avoid clobbering reserved areas?
>
> When allocated above pref_address, the kernel will not relocate itself
> to an area that potentially overlaps with reserved memory. I'll add a
> comment.

Sounds good.

> Not sure what the etiquette is on immediately sending a patch v2, or
> waiting for more comments. I'll err on waiting on a couple more
> comments before sending v2. Thanks for the review

Please wait another few days.