Re: [PATCH] kexec: allocate kernel above bzImage's pref_address

From: Dave Hansen
Date: Fri Dec 15 2023 - 16:17:36 EST


On 12/15/23 11:05, Chris Koch wrote:
> A relocatable kernel will relocate itself to pref_address if it is
> loaded below pref_address. This means a booted kernel may be relocating
> itself to an area with reserved memory on modern systems, potentially
> clobbering arbitrary data that may be important to the system.
>
> This is often the case, as the default value of PHYSICAL_START is
> 0x1000000 and kernels are typically loaded at 0x100000 or above by
> bootloaders like iPXE or kexec. GRUB behaves like this patch does.
>
> Also fixes the documentation around pref_address and PHYSICAL_START to
> be accurate.

Are you reporting a bug and is this a bug fix? It's not super clear
from the changelog.


> diff --git a/Documentation/arch/x86/boot.rst b/Documentation/arch/x86/boot.rst
> index 22cc7a040dae..49bea8986620 100644
> --- a/Documentation/arch/x86/boot.rst
> +++ b/Documentation/arch/x86/boot.rst
> @@ -878,7 +878,8 @@ Protocol: 2.10+
> address if possible.
>
> A non-relocatable kernel will unconditionally move itself and to run
> - at this address.
> + at this address. A relocatable kernel will move itself to this address if it
> + loaded below this address.

I think we should avoid saying the same things over and over again in
different spots.

Here, it doesn't really help to enumerate the different interpretations
of 'pref_address'. All that matters is that the bootloader can avoid
the overhead of a later copy if it can place the kernel at
'pref_address'. The exact reasons that various kernels might decide to
relocate are unimportant here.

> ============ =======
> Field name: init_size
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 3762f41bb092..1370f43328d7 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -2109,11 +2109,11 @@ config PHYSICAL_START
> help
> This gives the physical address where the kernel is loaded.
>
> - If kernel is a not relocatable (CONFIG_RELOCATABLE=n) then
> - bzImage will decompress itself to above physical address and
> - run from there. Otherwise, bzImage will run from the address where
> - it has been loaded by the boot loader and will ignore above physical
> - address.
> + If the kernel is not relocatable (CONFIG_RELOCATABLE=n) then bzImage
> + will decompress itself to above physical address and run from there.
> + Otherwise, bzImage will run from the address where it has been loaded
> + by the boot loader. The only exception is if it is loaded below the
> + above physical address, in which case it will relocate itself there.

I kinda dislike how this is written. It's written almost like code
where you're spelling out the conditions. I prefer something much
higher-level.

This gives a minimum physical address at which the kernel can be
loaded.

CONFIG_RELOCATABLE=n kernels will be decompressed to and must
run at PHYSICAL_START exactly.

CONFIG_RELOCATABLE=y kernels can run at any address above
PHYSICAL_START. If a kernel is loaded below PHYSICAL_START, it
will relocate itself to PHYSICAL_START.

> In normal kdump cases one does not have to set/change this option
> as now bzImage can be compiled as a completely relocatable image
> diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c
> index a61c12c01270..5dcd232d58bf 100644
> --- a/arch/x86/kernel/kexec-bzimage64.c
> +++ b/arch/x86/kernel/kexec-bzimage64.c
> @@ -498,7 +498,10 @@ static void *bzImage64_load(struct kimage *image, char *kernel,
> kbuf.bufsz = kernel_len - kern16_size;
> kbuf.memsz = PAGE_ALIGN(header->init_size);
> kbuf.buf_align = header->kernel_alignment;
> - kbuf.buf_min = MIN_KERNEL_LOAD_ADDR;
> + if (header->pref_address < MIN_KERNEL_LOAD_ADDR)
> + kbuf.buf_min = MIN_KERNEL_LOAD_ADDR;
> + else
> + kbuf.buf_min = header->pref_address;
> kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
> ret = kexec_add_buffer(&kbuf);
> if (ret)

Comment, please.

It isn't clear from this hunk why or how this fixes the bug. How does
this manage to avoid clobbering reserved areas?