Re: [PATCH v9] pgo: add clang's Profile Guided Optimization infrastructure

From: Bill Wendling
Date: Sat Jun 12 2021 - 13:26:29 EST


On Sat, Jun 12, 2021 at 9:59 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Wed, Apr 07, 2021 at 02:17:04PM -0700, Bill Wendling wrote:
> > From: Sami Tolvanen <samitolvanen@xxxxxxxxxx>
> >
> > Enable the use of clang's Profile-Guided Optimization[1]. To generate a
> > profile, the kernel is instrumented with PGO counters, a representative
> > workload is run, and the raw profile data is collected from
> > /sys/kernel/debug/pgo/profraw.
> >
> > The raw profile data must be processed by clang's "llvm-profdata" tool
> > before it can be used during recompilation:
> >
> > $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
> > $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw
> >
> > Multiple raw profiles may be merged during this step.
> >
> > The data can now be used by the compiler:
> >
> > $ make LLVM=1 KCFLAGS=-fprofile-use=vmlinux.profdata ...
> >
> > This initial submission is restricted to x86, as that's the platform we
> > know works. This restriction can be lifted once other platforms have
> > been verified to work with PGO.
>
> *sigh*, and not a single x86 person on Cc, how nice :-/
>
The infrastructure is generic: although it's enabled for x86 first, it
contains no x86-specific code. We're restricting it to x86 only because
that's the platform we tested on.

> > Note that this method of profiling the kernel is clang-native, unlike
> > the clang support in kernel/gcov.
> >
> > [1] https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization
>
> Also, and I don't see this answered *anywhere*, why are you not using
> perf for this? Your link even mentions Sampling Profilers (and I happen
> to know there's been significant effort to make perf output work as
> input for the PGO passes of the various compilers).
>
Instrumentation-based (non-sampling) profiling gives us a better
context-sensitive profile, making PGO more impactful. It can also be
used for coverage, which sampling profiles cannot.
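
For reference, the instrumentation workflow from the commit message can
be sketched as a small shell helper. This is only an illustration, not
part of the patch: the function names (run, pgo_collect, pgo_merge) and
the DRY_RUN variable are made up here. The debugfs path and the
llvm-profdata invocation are the ones quoted above.

```shell
#!/bin/sh
# Hypothetical helpers wrapping the steps from the commit message.
# With DRY_RUN=1 the commands are printed instead of executed.

run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Copy the raw profile out of debugfs after the workload has run.
pgo_collect() {
    run cp /sys/kernel/debug/pgo/profraw "$1"
}

# Merge one or more raw profiles into an indexed profile.
# usage: pgo_merge <output.profdata> <input.profraw>...
pgo_merge() {
    if [ "$#" -lt 2 ]; then
        echo "usage: pgo_merge <output.profdata> <input.profraw>..." >&2
        return 1
    fi
    profdata=$1
    shift
    run llvm-profdata merge --output="$profdata" "$@"
}

# Example (run on the instrumented kernel):
#   pgo_collect vmlinux.profraw
#   pgo_merge vmlinux.profdata vmlinux.profraw
#   make LLVM=1 KCFLAGS=-fprofile-use=vmlinux.profdata ...
```

Passing several .profraw files to pgo_merge covers the "multiple raw
profiles may be merged" case, e.g. one file per workload.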

> > Signed-off-by: Sami Tolvanen <samitolvanen@xxxxxxxxxx>
> > Co-developed-by: Bill Wendling <morbo@xxxxxxxxxx>
> > Signed-off-by: Bill Wendling <morbo@xxxxxxxxxx>
> > Tested-by: Nick Desaulniers <ndesaulniers@xxxxxxxxxx>
> > Reviewed-by: Nick Desaulniers <ndesaulniers@xxxxxxxxxx>
> > Reviewed-by: Fangrui Song <maskray@xxxxxxxxxx>
> > ---
> > Documentation/dev-tools/index.rst | 1 +
> > Documentation/dev-tools/pgo.rst | 127 +++++++++
> > MAINTAINERS | 9 +
> > Makefile | 3 +
> > arch/Kconfig | 1 +
> > arch/x86/Kconfig | 1 +
> > arch/x86/boot/Makefile | 1 +
> > arch/x86/boot/compressed/Makefile | 1 +
> > arch/x86/crypto/Makefile | 4 +
> > arch/x86/entry/vdso/Makefile | 1 +
> > arch/x86/kernel/vmlinux.lds.S | 2 +
> > arch/x86/platform/efi/Makefile | 1 +
> > arch/x86/purgatory/Makefile | 1 +
> > arch/x86/realmode/rm/Makefile | 1 +
> > arch/x86/um/vdso/Makefile | 1 +
> > drivers/firmware/efi/libstub/Makefile | 1 +
> > include/asm-generic/vmlinux.lds.h | 34 +++
> > kernel/Makefile | 1 +
> > kernel/pgo/Kconfig | 35 +++
> > kernel/pgo/Makefile | 5 +
> > kernel/pgo/fs.c | 389 ++++++++++++++++++++++++++
> > kernel/pgo/instrument.c | 189 +++++++++++++
> > kernel/pgo/pgo.h | 203 ++++++++++++++
> > scripts/Makefile.lib | 10 +
> > 24 files changed, 1022 insertions(+)
> > create mode 100644 Documentation/dev-tools/pgo.rst
> > create mode 100644 kernel/pgo/Kconfig
> > create mode 100644 kernel/pgo/Makefile
> > create mode 100644 kernel/pgo/fs.c
> > create mode 100644 kernel/pgo/instrument.c
> > create mode 100644 kernel/pgo/pgo.h
>
> > --- a/Makefile
> > +++ b/Makefile
> > @@ -660,6 +660,9 @@ endif # KBUILD_EXTMOD
> > # Defaults to vmlinux, but the arch makefile usually adds further targets
> > all: vmlinux
> >
> > +CFLAGS_PGO_CLANG := -fprofile-generate
> > +export CFLAGS_PGO_CLANG
> > +
> > CFLAGS_GCOV := -fprofile-arcs -ftest-coverage \
> > $(call cc-option,-fno-tree-loop-im) \
> > $(call cc-disable-warning,maybe-uninitialized,)
>
> And which of the many flags in noinstr disables this?
>
These flags aren't used with PGO, so there's no need to disable them.

> Basically I would like to NAK this whole thing until someone can
> adequately explain the interaction with noinstr and why we need those
> many lines of kernel code and can't simply use perf for this.

-bw