[RFC PATCH 0/5] x86: Build the core kernel using PIC codegen

From: Ard Biesheuvel
Date: Mon Jan 22 2024 - 04:12:51 EST


From: Ard Biesheuvel <ardb@xxxxxxxxxx>

Originally, only arch/x86/kernel/head64.c had some code that required
special care because it executes very early from the 1:1 mapping of the
kernel rather than the ordinary kernel virtual mapping.

This is no longer the case, and there is a lot of SEV related code that
is reachable from the primary startup path, with no guarantees that the
toolchain will produce code that runs correctly. This is especially
problematic when it comes to things like string literals, which are
emitted by the compiler as data objects, and subsequently referenced via
an absolute address that is not mapped yet this early in the boot [0].
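
To make the failure mode concrete, here is a minimal user-space sketch of
the address arithmetic involved. The helper name to_identity_mapped() and
any addresses passed to it are hypothetical and chosen only for
illustration; none of this is taken from the series itself.

```c
#include <stdint.h>

/*
 * Hypothetical helper (illustration only): translate an address that
 * the linker assigned in the kernel virtual mapping into the address
 * of the same byte in the 1:1 mapping, given the base the image was
 * linked at and the base it was actually loaded at.
 */
static inline uint64_t to_identity_mapped(uint64_t link_addr,
					  uint64_t link_base,
					  uint64_t load_base)
{
	/* The offset into the image is identical in both mappings. */
	return link_addr - link_base + load_base;
}
```

An absolute reference to a string literal uses the link-time address
directly, which is not usable before the kernel virtual mapping exists.
A RIP-relative reference is computed from the current instruction
pointer instead, so it lands in whichever mapping the code is executing
from and needs no such translation.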

Kevin has been looking into failures caused by Clang behaving slightly
differently from GCC in this regard, and has addressed them by
selectively applying PIC codegen to the objects in question [1].
However, while this fixes the observed issues, it does not offer any
guarantees, given that the set of code reachable from startup_64() does
not appear to be bounded when running on SEV hardware.

Instead of applying this change piecemeal to objects that happen to have
caused issues in the past, this series converts the core kernel to PIC
codegen entirely.

Note that this does not entirely solve the problem of the unbounded set
of reachable code from the early SEV entrypoint: there might be code
that attempts to access global objects via their kernel virtual address
(which is not mapped yet). But at least all implicit accesses will be
made via the same translation that the code is running from.

This does result in a slight increase in code size (see below) but it
also reduces the size of the KASLR relocation table (applied by the
decompressor) by roughly half.


Before

$ size -x vmlinux
     text      data       bss        dec      hex  filename
0x1b78ec1  0xdde145  0x381000   47022086  2cd8006  vmlinux

After

$ size -x vmlinux
     text      data       bss        dec      hex  filename
0x1b8371b  0xde0d1d  0x370000   47006776  2cd4438  vmlinux
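
The per-section change can be read off the two size listings with a bit
of arithmetic. The sketch below shows one way to compute it; the struct
and function names are hypothetical, and only the numbers fed into it
come from the output above.

```c
#include <stdint.h>

/* Section sizes as reported by size(1); the names are illustrative. */
struct image_size {
	uint64_t text;
	uint64_t data;
	uint64_t bss;
};

/* Sum of all three sections, i.e. the "dec" column of size(1). */
static inline uint64_t image_total(struct image_size s)
{
	return s.text + s.data + s.bss;
}

/* Signed change in bytes; positive means the section grew. */
static inline int64_t size_delta(uint64_t before, uint64_t after)
{
	return (int64_t)after - (int64_t)before;
}
```

Applied to the figures above, text grows by 43098 bytes and data by
11224 bytes, while bss shrinks by 69632 bytes, for a net change of
-15310 bytes in the total.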


[0] arch/x86/mm/mem_encrypt_identity.c has some nice examples of this,
where RIP-relative references are emitted using inline asm.
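
The general pattern that footnote [0] refers to can be sketched in user
space roughly as follows. This is an illustrative sketch and not the
kernel's code: demo_banner and demo_banner_rip_relative() are
hypothetical names, and the fallback branch exists only so that the
sketch still builds on targets other than x86-64 ELF.

```c
/* Example global object; the name is hypothetical. */
char demo_banner[] = "early boot message";

/*
 * Return the address of demo_banner without relying on an absolute
 * address: on x86-64 ELF targets the LEA computes it relative to the
 * current instruction pointer. Other targets fall back to plain C.
 */
const char *demo_banner_rip_relative(void)
{
#if defined(__x86_64__) && defined(__ELF__)
	const char *p;

	/* The symbol is named in the template so the operand is RIP-relative. */
	__asm__("leaq demo_banner(%%rip), %0" : "=r" (p));
	return p;
#else
	return demo_banner;
#endif
}
```

The point of spelling out the reference by hand is that the result does
not depend on which addressing mode the compiler would have chosen for
an ordinary C reference to the same object.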

[1] https://lkml.kernel.org/r/20240111223650.3502633-1-kevinloughlin%40google.com

Cc: Kevin Loughlin <kevinloughlin@xxxxxxxxxx>
Cc: Tom Lendacky <thomas.lendacky@xxxxxxx>
Cc: Dionna Glaze <dionnaglaze@xxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Borislav Petkov <bp@xxxxxxxxx>
Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
Cc: Andy Lutomirski <luto@xxxxxxxxxx>
Cc: Arnd Bergmann <arnd@xxxxxxxx>
Cc: Martin KaFai Lau <martin.lau@xxxxxxxxx>
Cc: Nathan Chancellor <nathan@xxxxxxxxxx>
Cc: Nick Desaulniers <ndesaulniers@xxxxxxxxxx>
Cc: Justin Stitt <justinstitt@xxxxxxxxxx>
Cc: linux-kernel@xxxxxxxxxxxxxxx
Cc: linux-arch@xxxxxxxxxxxxxxx
Cc: bpf@xxxxxxxxxxxxxxx
Cc: llvm@xxxxxxxxxxxxxxx

Ard Biesheuvel (5):
  kallsyms: Avoid weak references for kallsyms symbols
  vmlinux: Avoid weak reference to notes section
  btf: Avoid weak external references
  x86/head64: Replace pointer fixups with PIE codegen
  x86: Build the core kernel with position independent codegen

 arch/x86/Makefile                 |  18 ++-
 arch/x86/boot/compressed/Makefile |   2 +-
 arch/x86/entry/vdso/Makefile      |   2 +-
 arch/x86/include/asm/init.h       |   2 -
 arch/x86/include/asm/setup.h      |   2 +-
 arch/x86/kernel/head64.c          | 117 +++++++-------------
 arch/x86/realmode/rm/Makefile     |   1 +
 include/asm-generic/vmlinux.lds.h |  23 ++++
 kernel/bpf/btf.c                  |   4 +-
 kernel/kallsyms.c                 |   6 -
 kernel/kallsyms_internal.h        |  30 ++---
 kernel/ksysfs.c                   |   4 +-
 lib/buildid.c                     |   4 +-
 13 files changed, 104 insertions(+), 111 deletions(-)

--
2.43.0.429.g432eaa2c6b-goog