Re: [PATCH] x86/kexec: set MIN_KERNEL_LOAD_ADDR to 0x01000000

From: Baoquan He
Date: Tue Nov 14 2023 - 09:17:31 EST


Hi John,

On 10/23/23 at 02:54pm, John Sperbeck wrote:
> On Sun, Oct 22, 2023 at 7:42 PM H. Peter Anvin <hpa@xxxxxxxxx> wrote:
......
> > >---
> > > arch/x86/kernel/kexec-bzimage64.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > >diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c
> > >index a61c12c01270..d6bf6c13dab1 100644
> > >--- a/arch/x86/kernel/kexec-bzimage64.c
> > >+++ b/arch/x86/kernel/kexec-bzimage64.c
> > >@@ -36,7 +36,7 @@
> > > */
> > > #define MIN_PURGATORY_ADDR 0x3000
> > > #define MIN_BOOTPARAM_ADDR 0x3000
> > >-#define MIN_KERNEL_LOAD_ADDR 0x100000
> > >+#define MIN_KERNEL_LOAD_ADDR 0x1000000
> > > #define MIN_INITRD_LOAD_ADDR 0x1000000
> > >
> > > /*
> >
> > This doesn't make any sense to me. There is already a high water mark for how much memory the kernel needs until an initrd or setup_data item can appear. This is just a hack, please fix it properly.
>
> The startup_64() code in head_64.S changes behavior based on whether
> it's running below or above LOAD_PHYSICAL_ADDR:
>
> #ifdef CONFIG_RELOCATABLE
> leaq startup_32(%rip) /* - $startup_32 */, %rbp
> movl BP_kernel_alignment(%rsi), %eax
> decl %eax
> addq %rax, %rbp
> notq %rax
> andq %rax, %rbp
> cmpq $LOAD_PHYSICAL_ADDR, %rbp
> jae 1f
> #endif
> movq $LOAD_PHYSICAL_ADDR, %rbp
> 1:
>
> In my example, we were running from address 0x00400000. The %rbp
> register will start with 0x00400000, but will be changed to 0x01000000
> after the check against LOAD_PHYSICAL_ADDR fails.
>
> The 0x01000000 value in %rbp is passed to extract_kernel as the
> 'output' argument. Unless choose_random_location() decides
> differently, this will be where the kernel is decompressed to. The
> size of the kernel is large enough in my example that the
> decompression overruns the initrd.
>
> If the startup_64() code didn't have the LOAD_PHYSICAL_ADDR check and
> used %rbp as is, then there would be no issue. The decompression
> would have been to 0x00400000 and would have completed before reaching
> the initrd memory.
>
> That is, the kexec code is being careful to ensure that the kernel and
> initrd memory doesn't overlap, but isn't paying attention to what
> happens if the kernel memory is below LOAD_PHYSICAL_ADDR (the kernel
> address is effectively changed to a different location). My proposed
> change makes it aware, and avoids such addresses.
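
To restate the quoted startup_64() check in C, here is a rough sketch for
the CONFIG_RELOCATABLE case. The helper name is hypothetical, and
LOAD_PHYSICAL_ADDR is hard-coded to the 0x01000000 value from the example
above; in a real build it is derived from the kernel config.

```c
#include <assert.h>
#include <stdint.h>

/* Value taken from the example in this thread; config-dependent in reality. */
#define LOAD_PHYSICAL_ADDR 0x1000000ULL

/*
 * Hypothetical helper mirroring the quoted assembly: round the address
 * the kernel is running from up to kernel_alignment, then fall back to
 * LOAD_PHYSICAL_ADDR if the result is below it.
 */
static uint64_t decompress_output_addr(uint64_t load_addr,
				       uint32_t kernel_alignment)
{
	uint64_t mask, addr;

	/* The rounding trick only works for power-of-two alignments. */
	assert(kernel_alignment &&
	       !(kernel_alignment & (kernel_alignment - 1)));

	mask = (uint64_t)kernel_alignment - 1;	/* decl %eax */
	addr = (load_addr + mask) & ~mask;	/* addq, notq, andq */

	if (addr < LOAD_PHYSICAL_ADDR)		/* cmpq + jae not taken */
		addr = LOAD_PHYSICAL_ADDR;

	return addr;
}
```

Assuming a 2M kernel_alignment (my assumption, it is not stated above),
decompress_output_addr(0x400000, 0x200000) returns 0x1000000, which is
the relocation described in the quoted text.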

I'm wondering why the kexec-ed kernel is located below 0x1000000. The
loading code searches physical memory regions bottom up for an available
one. Usually, the kexec kernel will be loaded above 16M.
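
As a toy model of a bottom-up search (not the actual kexec_file code; the
struct and function names below are made up for illustration), the first
range that can hold the image at or above a minimum address wins:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one free RAM range, end exclusive. */
struct mem_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Walk the ranges from low to high addresses (the array is assumed to be
 * sorted ascending) and return the first aligned address >= min_addr
 * where 'size' bytes fit. Returns 0 if nothing fits.
 */
static uint64_t find_hole_bottom_up(const struct mem_range *ranges, size_t n,
				    uint64_t min_addr, uint64_t size,
				    uint64_t align)
{
	size_t i;

	/* Alignment must be a power of two for the mask arithmetic. */
	assert(align && !(align & (align - 1)));

	for (i = 0; i < n; i++) {
		uint64_t start = ranges[i].start > min_addr ?
				 ranges[i].start : min_addr;

		start = (start + align - 1) & ~(align - 1);
		if (start < ranges[i].end && ranges[i].end - start >= size)
			return start;
	}

	return 0;
}
```

In this model, a free range below 16M that is big enough for the image is
picked when the minimum address is 0x100000, and skipped when the minimum
is raised to 0x1000000.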

I have posted a patchset to load the kernel at the top of system RAM for
kexec_file load, just as kexec_load has been doing. Do you think it would
help here?

[PATCH 0/2] kexec_file: Load kernel at top of system RAM if required
https://lore.kernel.org/all/20231114091658.228030-1-bhe@xxxxxxxxxx/T/#u

Thanks
Baoquan