[PATCH 00/46] gcc-LTO support for the kernel

From: Jiri Slaby (SUSE)
Date: Mon Nov 14 2022 - 06:44:07 EST


Hi,

this is the first call for comments (and kbuild complaints) on support
for gcc (full) LTO in the kernel. Most of the patches come from Andi.
Martin and I rebased them onto newer kernels and fixed the issues known
to us. I also updated most of the commit logs and reordered the patches
into groups with similar intent.

The very first patch comes from Alexander and is already pending in some
x86 queue (I believe). I am attaching it only for completeness. Without
it, the kernel does not boot (LTO reorders a lot).

In our measurements, the performance differences are negligible.

The kernel is bigger with gcc LTO due to more inlining. The next step
might be to play with non-static functions as we export everything, so
the compiler cannot actually drop anything (esp. inlined and no longer
needed functions).
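
For readers not familiar with the annotations used throughout the series,
here is a minimal sketch of the two underlying gcc attributes. The demo_*
names are hypothetical and the macros are only stand-ins; the actual
definitions of __visible_on_lto and __noreorder are in the patches and may
differ. The sketch assumes gcc and falls back to empty macros elsewhere.

```c
/*
 * Hypothetical stand-ins, NOT the definitions from the series.
 * externally_visible: tell gcc the symbol is referenced from outside
 *                     the code LTO can see (e.g. from assembly).
 * no_reorder:         keep marked definitions in source order relative
 *                     to each other.
 */
#if defined(__GNUC__) && !defined(__clang__)
# define demo_visible   __attribute__((externally_visible))
# define demo_noreorder __attribute__((no_reorder))
#else
# define demo_visible
# define demo_noreorder
#endif

/* Assume this is called only from assembly, so LTO sees no C caller. */
demo_visible int demo_called_from_asm(int x)
{
	return x + 1;
}

/* Assume other code relies on these two staying in source order. */
demo_noreorder int demo_first = 1;
demo_noreorder int demo_second = 2;

/* Plain C user of the above, so the example has observable behavior. */
int demo_sum(void)
{
	return demo_called_from_asm(demo_first) + demo_second;
}
```

The attributes do not change what the code computes; they only constrain
what the link-time optimizer may drop or reorder, which is why most of the
"Mark ... as __visible" patches touch only a declaration.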

Cc: Alexander Potapenko <glider@xxxxxxxxxx>
Cc: Alexander Shishkin <alexander.shishkin@xxxxxxxxxxxxxxx>
Cc: Alexei Starovoitov <ast@xxxxxxxxxx>
Cc: Alexey Makhalov <amakhalov@xxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Andrey Konovalov <andreyknvl@xxxxxxxxx>
Cc: Andrey Ryabinin <ryabinin.a.a@xxxxxxxxx>
Cc: Andrii Nakryiko <andrii@xxxxxxxxxx>
Cc: Andy Lutomirski <luto@xxxxxxxxxx>
Cc: Ard Biesheuvel <ardb@xxxxxxxxxx>
Cc: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
Cc: Ben Segall <bsegall@xxxxxxxxxx>
Cc: Borislav Petkov <bp@xxxxxxxxx>
Cc: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
Cc: Daniel Bristot de Oliveira <bristot@xxxxxxxxxx>
Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
Cc: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
Cc: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
Cc: Don Zickus <dzickus@xxxxxxxxxx>
Cc: Hao Luo <haoluo@xxxxxxxxxx>
Cc: H.J. Lu <hjl.tools@xxxxxxxxx>
Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
Cc: Huang Rui <ray.huang@xxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Jan Hubicka <jh@xxxxxxx>
Cc: Jason Baron <jbaron@xxxxxxxxxx>
Cc: Jiri Kosina <jikos@xxxxxxxxxx>
Cc: Jiri Olsa <jolsa@xxxxxxxxxx>
Cc: Joe Lawrence <joe.lawrence@xxxxxxxxxx>
Cc: John Fastabend <john.fastabend@xxxxxxxxx>
Cc: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
Cc: Juergen Gross <jgross@xxxxxxxx>
Cc: Juri Lelli <juri.lelli@xxxxxxxxxx>
Cc: KP Singh <kpsingh@xxxxxxxxxx>
Cc: Mark Rutland <mark.rutland@xxxxxxx>
Cc: Martin KaFai Lau <martin.lau@xxxxxxxxx>
Cc: Martin Liska <mliska@xxxxxxx>
Cc: Masahiro Yamada <masahiroy@xxxxxxxxxx>
Cc: Mel Gorman <mgorman@xxxxxxx>
Cc: Miguel Ojeda <ojeda@xxxxxxxxxx>
Cc: Michal Marek <michal.lkml@xxxxxxxxxxx>
Cc: Miroslav Benes <mbenes@xxxxxxx>
Cc: Namhyung Kim <namhyung@xxxxxxxxxx>
Cc: Nick Desaulniers <ndesaulniers@xxxxxxxxxx>
Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@xxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Petr Mladek <pmladek@xxxxxxxx>
Cc: "Rafael J. Wysocki" <rafael@xxxxxxxxxx>
Cc: Richard Biener <RGuenther@xxxxxxxx>
Cc: Sedat Dilek <sedat.dilek@xxxxxxxxx>
Cc: Song Liu <song@xxxxxxxxxx>
Cc: Stanislav Fomichev <sdf@xxxxxxxxxx>
Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>
Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Valentin Schneider <vschneid@xxxxxxxxxx>
Cc: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
Cc: Vincenzo Frascino <vincenzo.frascino@xxxxxxx>
Cc: Viresh Kumar <viresh.kumar@xxxxxxxxxx>
Cc: VMware PV-Drivers Reviewers <pv-drivers@xxxxxxxxxx>
Cc: Yonghong Song <yhs@xxxxxx>

Alexander Lobakin (1):
x86/boot: robustify calling startup_{32,64}() from the decompressor code

Andi Kleen (36):
Compiler Attributes, lto: introduce __noreorder
tracepoint, lto: Mark static call functions as __visible
static_call, lto: Mark static keys as __visible
static_call, lto: Mark static_call_return0() as __visible
static_call, lto: Mark func_a() as __visible_on_lto
x86/alternative, lto: Mark int3_*() as global and __visible
x86/paravirt, lto: Mark native_steal_clock() as __visible_on_lto
x86/preempt, lto: Mark preempt_schedule_*thunk() as __visible
x86/xen, lto: Mark xen_vcpu_stolen() as __visible
x86, lto: Mark gdt_page and native_sched_clock() as __visible
amd, lto: Mark amd pmu and pstate functions as __visible_on_lto
entry, lto: Mark raw_irqentry_exit_cond_resched() as __visible
export, lto: Mark __kstrtab* in EXPORT_SYMBOL() as global and __visible
softirq, lto: Mark irq_enter/exit_rcu() as __visible
btf, lto: Make all BTF IDs global on LTO
init.h, lto: mark initcalls as __noreorder
bpf, lto: mark interpreter jump table as __noreorder
sched, lto: mark sched classes as __noreorder
linkage, lto: use C version for SYSCALL_ALIAS() / cond_syscall()
scripts, lto: re-add gcc-ld
scripts, lto: use CONFIG_LTO for many LTO specific actions
Kbuild, lto: Add Link Time Optimization support
x86/purgatory, lto: Disable gcc LTO for purgatory
x86/realmode, lto: Disable gcc LTO for real mode code
x86/vdso, lto: Disable gcc LTO for the vdso
scripts, lto: disable gcc LTO for some mod sources
Kbuild, lto: disable gcc LTO for bounds+asm-offsets
lib/string, lto: disable gcc LTO for string.o
Compiler attributes, lto: disable __flatten with LTO
Kbuild, lto: don't include weak source file symbols in System.map
x86, lto: Disable relative init pointers with gcc LTO
x86/livepatch, lto: Disable live patching with gcc LTO
x86/lib, lto: Mark 32bit mem{cpy,move,set} as __used
scripts, lto: check C symbols for modversions
scripts/bloat-o-meter, lto: handle gcc LTO
x86, lto: Finally enable gcc LTO for x86

Jiri Slaby (5):
kbuild: pass jobserver to cmd_ld_vmlinux.o
compiler.h: introduce __visible_on_lto
compiler.h: introduce __global_on_lto
btf, lto: pass scope as strings
x86/apic, lto: Mark apic_driver*() as __noreorder

Martin Liska (4):
kbuild: lto: preserve MAKEFLAGS for module linking
x86/sev, lto: Mark cpuid_table_copy as __visible_on_lto
mm/kasan, lto: Mark kasan mem{cpy,move,set} as __used
kasan, lto: remove extra BUILD_BUG() in memory_is_poisoned

Documentation/kbuild/index.rst | 2 +
Documentation/kbuild/lto-build.rst | 76 +++++++++++++++++++++++++++++
Kbuild | 3 ++
Makefile | 6 ++-
arch/Kconfig | 52 ++++++++++++++++++++
arch/x86/Kconfig | 5 +-
arch/x86/boot/compressed/head_32.S | 2 +-
arch/x86/boot/compressed/head_64.S | 2 +-
arch/x86/boot/compressed/misc.c | 16 +++---
arch/x86/entry/vdso/Makefile | 2 +
arch/x86/events/amd/core.c | 2 +-
arch/x86/include/asm/apic.h | 4 +-
arch/x86/include/asm/preempt.h | 4 +-
arch/x86/kernel/alternative.c | 5 +-
arch/x86/kernel/cpu/common.c | 2 +-
arch/x86/kernel/paravirt.c | 2 +-
arch/x86/kernel/sev-shared.c | 2 +-
arch/x86/kernel/tsc.c | 2 +-
arch/x86/lib/memcpy_32.c | 6 +--
arch/x86/purgatory/Makefile | 2 +
arch/x86/realmode/Makefile | 1 +
drivers/cpufreq/amd-pstate.c | 15 +++---
drivers/xen/time.c | 2 +-
include/asm-generic/vmlinux.lds.h | 2 +-
include/linux/btf_ids.h | 24 ++++-----
include/linux/compiler.h | 8 +++
include/linux/compiler_attributes.h | 15 ++++++
include/linux/export.h | 6 ++-
include/linux/init.h | 2 +-
include/linux/linkage.h | 16 +++---
include/linux/static_call.h | 12 ++---
include/linux/tracepoint.h | 4 +-
kernel/bpf/core.c | 2 +-
kernel/entry/common.c | 2 +-
kernel/kallsyms.c | 2 +-
kernel/livepatch/Kconfig | 1 +
kernel/sched/sched.h | 1 +
kernel/softirq.c | 4 +-
kernel/static_call.c | 2 +-
kernel/static_call_inline.c | 6 +--
kernel/time/posix-stubs.c | 19 +++++++-
lib/Makefile | 2 +
mm/kasan/generic.c | 2 +-
mm/kasan/shadow.c | 6 +--
scripts/Makefile.build | 17 ++++---
scripts/Makefile.lib | 2 +-
scripts/Makefile.lto | 43 ++++++++++++++++
scripts/Makefile.modfinal | 2 +-
scripts/Makefile.vmlinux | 3 +-
scripts/Makefile.vmlinux_o | 6 +--
scripts/bloat-o-meter | 2 +-
scripts/gcc-ld | 40 +++++++++++++++
scripts/link-vmlinux.sh | 9 ++--
scripts/mksysmap | 2 +
scripts/mod/Makefile | 3 ++
scripts/module.lds.S | 2 +-
56 files changed, 384 insertions(+), 100 deletions(-)
create mode 100644 Documentation/kbuild/lto-build.rst
create mode 100644 scripts/Makefile.lto
create mode 100755 scripts/gcc-ld

--
2.38.1