Re: [PATCH v2] pgo: add clang's Profile Guided Optimization infrastructure

From: Nick Desaulniers
Date: Tue Jan 12 2021 - 12:37:56 EST


On Mon, Jan 11, 2021 at 9:14 PM Bill Wendling <morbo@xxxxxxxxxx> wrote:
>
> From: Sami Tolvanen <samitolvanen@xxxxxxxxxx>
>
> Enable the use of clang's Profile-Guided Optimization[1]. To generate a
> profile, the kernel is instrumented with PGO counters, a representative
> workload is run, and the raw profile data is collected from
> /sys/kernel/debug/pgo/profraw.
>
> The raw profile data must be processed by clang's "llvm-profdata" tool
> before it can be used during recompilation:
>
> $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
> $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw
>
> Multiple raw profiles may be merged during this step.
>
> The data can now be used by the compiler:
>
> $ make LLVM=1 KCFLAGS=-fprofile-use=vmlinux.profdata ...
>
> This initial submission is restricted to x86, as that's the platform we

Please drop all changes to arch/* that are not to arch/x86/ then; we
can cross that bridge when we get to each arch. For example, there's
no point disabling PGO for architectures LLVM doesn't even have a
backend for.

> know works. This restriction can be lifted once other platforms have
> been verified to work with PGO.
>
> Note that this method of profiling the kernel is clang-native and isn't
> compatible with clang's gcov support in kernel/gcov.

Then the Kconfig option should depend on !GCOV so that the two are
mutually exclusive and can't be selected together accidentally, for
example by bots doing randconfig tests.

<large snip>

> +static inline int inst_prof_popcount(unsigned long long value)
> +{
> + value = value - ((value >> 1) & 0x5555555555555555ULL);
> + value = (value & 0x3333333333333333ULL) +
> + ((value >> 2) & 0x3333333333333333ULL);
> + value = (value + (value >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
> +
> + return (int)((unsigned long long)(value * 0x0101010101010101ULL) >> 56);
> +}

The kernel already has a portable popcount implementation, hweight64,
which is normally pulled in via #include <linux/bitops.h>; does that
work here?
https://en.wikipedia.org/wiki/Hamming_weight
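
For what it's worth, here is a rough userspace sketch (not kernel code)
that compares the quoted routine against a naive bit-counting loop. The
names ref_popcount and check_popcount and the sample values are made up
for illustration; the assumption is that hweight64 returns the same
counts, which is what would make it a drop-in replacement.

```c
#include <assert.h>
#include <stddef.h>

/* Bit-twiddling popcount, copied from the quoted patch. */
static inline int inst_prof_popcount(unsigned long long value)
{
	value = value - ((value >> 1) & 0x5555555555555555ULL);
	value = (value & 0x3333333333333333ULL) +
		((value >> 2) & 0x3333333333333333ULL);
	value = (value + (value >> 4)) & 0x0F0F0F0F0F0F0F0FULL;

	return (int)((unsigned long long)(value * 0x0101010101010101ULL) >> 56);
}

/* Hypothetical reference: count set bits one at a time. */
static int ref_popcount(unsigned long long value)
{
	int count = 0;

	while (value) {
		value &= value - 1;	/* clear the lowest set bit */
		count++;
	}
	return count;
}

/* Hypothetical self-check over a few arbitrary sample values. */
static void check_popcount(void)
{
	static const unsigned long long samples[] = {
		0ULL, 1ULL, 0xFFULL, 0x8000000000000000ULL,
		0xDEADBEEFCAFEF00DULL, ~0ULL,
	};
	size_t i;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
		assert(inst_prof_popcount(samples[i]) ==
		       ref_popcount(samples[i]));
}
```

Calling check_popcount() from a small main() should pass silently if
the two routines agree on every sample.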
--
Thanks,
~Nick Desaulniers