Re: [PATCH v2 00/15] x86/boot: Rework PE header generation

From: Ard Biesheuvel
Date: Mon Oct 23 2023 - 07:23:31 EST


On Tue, 3 Oct 2023 at 04:03, Jan Hendrik Farr <kernel@xxxxxxxx> wrote:
>
> On 12 09:00:51, Ard Biesheuvel wrote:
> > From: Ard Biesheuvel <ardb@xxxxxxxxxx>
> >
> > Now that the EFI stub boot flow no longer relies on memory that is
> > executable and writable at the same time, we can reorganize the PE/COFF
> > view of the kernel image and expose the decompressor binary's code and
> > r/o data as a .text section and data/bss as a .data section, using 4k
> > alignment and limited permissions.
> >
> > Doing so is necessary for compatibility with hardening measures that are
> > being rolled out on x86 PCs built to run Windows (i.e., the majority of
> > them). The EFI boot environment that the Linux EFI stub executes in is
> > especially sensitive to safety issues, given that a vulnerability in the
> > loader of one OS can be abused to attack another.
>
> This split is also useful for the work of kexecing the next kernel as an
> EFI application. With the current EFI stub I have to set the memory both
> writable and executable which results in W^X warnings with a default
> config.
>
> What made this more confusing was that the flags of the .text section in
> current EFI stub bzImages are set to
> IMAGE_SCN_MEM_EXECUTE | IMAGE_SCN_MEM_READ. So if you load that section
> according to those flags the EFI stub will quickly run into issues.
>
> I assume current firmware on x86 machines does not set any restricted
> permissions on the memory. Can someone enlighten me on their behavior?
>

No, current x86 firmware does not use restricted permissions at all.
All memory is mapped with both writable and executable permissions,
except maybe the stack.

The x86 Linux kernel has been depending on this behavior too, up until
recently (fixes are in -rc now for the v6.6 release). Before this, it
would copy its own executable image around in memory.

So EFI-based kexec will need to support this behavior if it targets
older x86 kernels, although I am skeptical that this is a useful
design goal.
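
To illustrate the two behaviors discussed above, here is a minimal
sketch of how a loader might translate PE/COFF section characteristics
into mapping permissions, with a fallback that mimics current firmware
by mapping everything RWX. The IMAGE_SCN_MEM_* values come from the
PE/COFF specification; the LDR_PROT_* bits, the legacy_rwx switch and
both function names are hypothetical and not taken from the patch
series or from any existing kexec code.

```c
#include <stdbool.h>
#include <stdint.h>

/* PE/COFF section characteristics (values from the PE/COFF spec). */
#define IMAGE_SCN_MEM_EXECUTE 0x20000000u
#define IMAGE_SCN_MEM_READ    0x40000000u
#define IMAGE_SCN_MEM_WRITE   0x80000000u

/* Hypothetical loader-side protection bits, for illustration only. */
#define LDR_PROT_R 0x1u
#define LDR_PROT_W 0x2u
#define LDR_PROT_X 0x4u

/*
 * Translate section characteristics into protection bits.
 *
 * legacy_rwx models the firmware behavior described above: the
 * section flags are ignored and the memory is mapped readable,
 * writable and executable. An older kernel image would need this.
 */
static unsigned int section_prot(uint32_t characteristics, bool legacy_rwx)
{
	unsigned int prot = 0;

	if (legacy_rwx)
		return LDR_PROT_R | LDR_PROT_W | LDR_PROT_X;

	if (characteristics & IMAGE_SCN_MEM_READ)
		prot |= LDR_PROT_R;
	if (characteristics & IMAGE_SCN_MEM_WRITE)
		prot |= LDR_PROT_W;
	if (characteristics & IMAGE_SCN_MEM_EXECUTE)
		prot |= LDR_PROT_X;

	return prot;
}

/* True if a mapping is writable and executable at the same time. */
static bool violates_wx(unsigned int prot)
{
	return (prot & LDR_PROT_W) && (prot & LDR_PROT_X);
}
```

With legacy_rwx set to false, a .text section flagged
IMAGE_SCN_MEM_EXECUTE | IMAGE_SCN_MEM_READ maps as R+X and passes the
W^X check. With legacy_rwx set to true, the same section maps as RWX
and violates_wx() reports it, which corresponds to the W^X warnings
mentioned earlier in the thread.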

I have been experimenting with running the EFI stub code in user space
all the way until ExitBootServices(). The same might work for UKI if
it is layered cleanly on top of the EFI APIs (rather than poking into
system registers or page tables under the hood).

How this would work with signed images etc. is TBD but I quite like the
idea of running everything in user space and having a minimal
purgatory (or none at all) if we can simply populate the entire
address space while running unprivileged, and just branch to it in the
kexec() syscall. I imagine this being something like a userspace
helper that is signed/trusted itself, and gets invoked by the kernel
to run EFI images that are trusted and tagged as being executable
unprivileged.