Re: [PATCH v2] pgo: add clang's Profile Guided Optimization infrastructure

From: Fāng-ruì Sòng
Date: Tue Jan 12 2021 - 12:46:12 EST


On Tue, Jan 12, 2021 at 9:37 AM 'Nick Desaulniers' via Clang Built
Linux <clang-built-linux@xxxxxxxxxxxxxxxx> wrote:
>
> On Mon, Jan 11, 2021 at 9:14 PM Bill Wendling <morbo@xxxxxxxxxx> wrote:
> >
> > From: Sami Tolvanen <samitolvanen@xxxxxxxxxx>
> >
> > Enable the use of clang's Profile-Guided Optimization[1]. To generate a
> > profile, the kernel is instrumented with PGO counters, a representative
> > workload is run, and the raw profile data is collected from
> > /sys/kernel/debug/pgo/profraw.
> >
> > The raw profile data must be processed by clang's "llvm-profdata" tool
> > before it can be used during recompilation:
> >
> > $ cp /sys/kernel/debug/pgo/profraw vmlinux.profraw
> > $ llvm-profdata merge --output=vmlinux.profdata vmlinux.profraw
> >
> > Multiple raw profiles may be merged during this step.
> >
> > The data can now be used by the compiler:
> >
> > $ make LLVM=1 KCFLAGS=-fprofile-use=vmlinux.profdata ...
> >
> > This initial submission is restricted to x86, as that's the platform we
>
> Please drop all changes to arch/* that are not to arch/x86/ then; we
> can cross that bridge when we get to each arch. For example, there's
> no point disabling PGO for architectures LLVM doesn't even have a
> backend for.
>
> > know works. This restriction can be lifted once other platforms have
> > been verified to work with PGO.
> >
> > Note that this method of profiling the kernel is clang-native and isn't
> > compatible with clang's gcov support in kernel/gcov.
>
> Then the Kconfig option should depend on !GCOV so that they are
> mutually exclusive and can't be selected together accidentally, such
> as by bots doing randconfig tests.

The profile formats (Clang PGO, Clang gcov, GCC gcov/PGO) are all
different, but Clang's PGO instrumentation can be combined with Clang's
gcov implementation in the same build:

$ clang -fprofile-generate --coverage a.cc; ./a.out
=> default*.profraw + a.gcda

> <large snip>
>
> > +static inline int inst_prof_popcount(unsigned long long value)
> > +{
> > +	value = value - ((value >> 1) & 0x5555555555555555ULL);
> > +	value = (value & 0x3333333333333333ULL) +
> > +		((value >> 2) & 0x3333333333333333ULL);
> > +	value = (value + (value >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
> > +
> > +	return (int)((unsigned long long)(value * 0x0101010101010101ULL) >> 56);
> > +}
>
> The kernel has a portable popcnt implementation called hweight64 if
> you #include <asm-generic/bitops/hweight.h>; does that work here?
> https://en.wikipedia.org/wiki/Hamming_weight
> --
> Thanks,
> ~Nick Desaulniers
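
On the popcount question quoted above: as a rough userspace sanity
check (a sketch only, not part of the patch), the routine from the patch
can be compared against a naive bit-by-bit count. The names
naive_popcount and popcount_selfcheck are made up for this
illustration. Whether hweight64 can actually be used in that part of
the instrumentation code is a separate question that this does not
answer.

```c
#include <assert.h>

/* Bit-twiddling popcount, copied from the quoted patch. */
static inline int inst_prof_popcount(unsigned long long value)
{
	value = value - ((value >> 1) & 0x5555555555555555ULL);
	value = (value & 0x3333333333333333ULL) +
		((value >> 2) & 0x3333333333333333ULL);
	value = (value + (value >> 4)) & 0x0F0F0F0F0F0F0F0FULL;

	return (int)((unsigned long long)(value * 0x0101010101010101ULL) >> 56);
}

/* Hypothetical reference: count set bits one at a time. */
static int naive_popcount(unsigned long long value)
{
	int n = 0;

	while (value) {
		n += (int)(value & 1ULL);
		value >>= 1;
	}
	return n;
}

/*
 * Hypothetical self-check: compare both routines on edge cases and on
 * a stream of xorshift-generated values. Assumes a 64-bit
 * unsigned long long, as the masks in the patch do.
 */
static void popcount_selfcheck(void)
{
	unsigned long long x = 0x9E3779B97F4A7C15ULL;
	int i;

	assert(inst_prof_popcount(0ULL) == 0);
	assert(inst_prof_popcount(~0ULL) == 64);

	for (i = 0; i < 100000; i++) {
		/* xorshift64 step to get a varied bit pattern */
		x ^= x << 13;
		x ^= x >> 7;
		x ^= x << 17;
		assert(inst_prof_popcount(x) == naive_popcount(x));
	}
}
```

Calling popcount_selfcheck() from a small test main() and building
without -DNDEBUG should abort on the first mismatch. It only shows that
the open-coded version computes a Hamming weight, which is what a
generic helper would be expected to return as well.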