Re: BPF skels in perf .Re: [GIT PULL] perf tools changes for v6.4

From: Arnaldo Carvalho de Melo
Date: Fri May 05 2023 - 09:33:24 EST


On Fri, May 05, 2023 at 01:03:14AM +0200, Jiri Olsa wrote:
> On Thu, May 04, 2023 at 03:03:42PM -0700, Ian Rogers wrote:
> > On Thu, May 4, 2023 at 2:48 PM Arnaldo Carvalho de Melo <acme@xxxxxxxxxx> wrote:
> > >
> > > On Thu, May 04, 2023 at 04:07:29PM -0300, Arnaldo Carvalho de Melo wrote:
> > > > On Thu, May 04, 2023 at 11:50:07AM -0700, Andrii Nakryiko wrote:
> > > > > On Thu, May 4, 2023 at 10:52 AM Arnaldo Carvalho de Melo <acme@xxxxxxxxxx> wrote:
> > > > > > Andrii, can you add some more information about the usage of vmlinux.h
> > > > > > instead of using kernel headers?
> > > >
> > > > > I'll just say that vmlinux.h is not a hard requirement to build BPF
> > > > > programs, it's more a convenience allowing easy access to definitions
> > > > > of both UAPI and kernel-internal structures for tracing needs and
> > > > > marking them relocatable using BPF CO-RE machinery. Lots of real-world
> > > > > applications just check-in pregenerated vmlinux.h to avoid build-time
> > > > > dependency on up-to-date host kernel and such.
> > > >
> > > > > If vmlinux.h generation and usage is causing issues, though, given
> > > > > that perf's BPF programs don't seem to be using many different kernel
> > > > > types, it might be a better option to just use UAPI headers for public
> > > > > kernel type definitions, and just define CO-RE-relocatable minimal
> > > > > definitions locally in perf's BPF code for the other types necessary.
> > > > > E.g., if perf needs only pid and tgid from task_struct, this would
> > > > > suffice:
> > > >
> > > > > struct task_struct {
> > > > > > 	int pid;
> > > > > > 	int tgid;
> > > > > } __attribute__((preserve_access_index));
> > > >
> > > > Yeah, that seems like a way better approach: no vmlinux.h involved.
> > > > libbpf CO-RE notices that task_struct differs from this two-integer
> > > > version (of course) and relocates each access to where the field is
> > > > in the running kernel by using /sys/kernel/btf/vmlinux.
> > >
> > > I did it for one of the skels: build tested, runtime untested, and
> > > without using any vmlinux or BTF to help. It is not that bad: more
> > > verbose, but at least we state which fields we actually use, and the
> > > attribute documents that those offsets will be recorded for future use.
> > >
> > > Namhyung, can you please check that this works?
> > >
> > > Thanks,
> > >
> > > - Arnaldo
> > >
> > > diff --git a/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c b/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> > > index 6a438e0102c5a2cb..f376d162549ebd74 100644
> > > --- a/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> > > +++ b/tools/perf/util/bpf_skel/bperf_cgroup.bpf.c
> > > @@ -1,11 +1,40 @@
> > > // SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> > > // Copyright (c) 2021 Facebook
> > > // Copyright (c) 2021 Google
> > > -#include "vmlinux.h"
> > > +#include <linux/types.h>
> > > +#include <linux/bpf.h>
> >
> > Compared to vmlinux.h, here be dragons. It is easy to start dragging in
> > all of libc, and that may not work due to missing #ifdefs, etc. Could
> > we check in a vmlinux.h like libbpf-tools does?
> > https://github.com/iovisor/bcc/tree/master/libbpf-tools#vmlinuxh-generation
> > https://github.com/iovisor/bcc/tree/master/libbpf-tools/arm64
> >
> > This would also remove some of the errors that could be introduced by
> > copy+pasting enums, etc., and would surface renames as build-time
> > rather than runtime failures.
>
> we already have to deal with that, right? doing checks on fields in
> structs like mm_struct___old
>
> > Could this be some shared resource for the different linux tools
> > projects using a vmlinux.h? e.g. tools/lib/vmlinuxh with an
> > install_headers target that builds a vmlinux.h.
>
> I tried to do the minimal header and it's not too big,
> I pushed it in here:
> https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git/log/?h=perf/vmlinux_h
>
> compile tested so far

I see it, and it keeps the change minimal, which is good at the current
stage. But I wonder if it wouldn't be better for us to define just the
types that are not in UAPI and get the rest from #include <linux/bpf.h>
and <linux/perf_event.h>, as I did in the patches I posted here, at least
one of which Namhyung tested. That way the added vmlinux.h file gets even
smaller by not including things like:

[acme@quaco perf-tools]$ egrep -w '(perf_event_sample_format|bpf_perf_event_value|perf_sample_weight|perf_mem_data_src) {' include/uapi/linux/*.h
include/uapi/linux/bpf.h:struct bpf_perf_event_value {
include/uapi/linux/perf_event.h:enum perf_event_sample_format {
include/uapi/linux/perf_event.h:union perf_mem_data_src {
include/uapi/linux/perf_event.h:union perf_mem_data_src {
include/uapi/linux/perf_event.h:union perf_sample_weight {
[acme@quaco perf-tools]$
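
To illustrate, here is a minimal sketch (not taken from any of the posted
patches) of code that gets those types straight from the UAPI headers, with
no local copy of the definitions. The three helper names are hypothetical and
exist only for this example; it assumes the installed kernel headers provide
<linux/bpf.h> and <linux/perf_event.h>.

```c
#include <linux/bpf.h>        /* struct bpf_perf_event_value */
#include <linux/perf_event.h> /* enum perf_event_sample_format, union perf_mem_data_src */

/* Hypothetical helper: does this sample_type mask request PERF_SAMPLE_ADDR? */
static inline int sample_has_addr(__u64 sample_type)
{
	return (sample_type & PERF_SAMPLE_ADDR) != 0;
}

/* Hypothetical helper: is the memory operation recorded in data_src a load? */
static inline int sample_is_load(union perf_mem_data_src src)
{
	return (src.mem_op & PERF_MEM_OP_LOAD) != 0;
}

/* Hypothetical helper: the event ran for less time than it was enabled. */
static inline int counter_was_multiplexed(const struct bpf_perf_event_value *v)
{
	return v->running < v->enabled;
}
```

None of these types needs to be repeated in a hand-written vmlinux.h, since
the UAPI headers already carry them.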

Also, why do we need these:

+struct mm_struct {
+} __attribute__((preserve_access_index));
+
+struct raw_spinlock {
+} __attribute__((preserve_access_index));
+
+typedef struct raw_spinlock raw_spinlock_t;
+
+struct spinlock {
+} __attribute__((preserve_access_index));
+
+typedef struct spinlock spinlock_t;
+
+struct sighand_struct {
+	spinlock_t siglock;
+} __attribute__((preserve_access_index));

We don't use them, they are just the types behind pointers you kept in:

+struct task_struct {
+	struct css_set *cgroups;
+	pid_t pid;
+	pid_t tgid;
+	char comm[16];
+	struct mm_struct *mm;
+	struct sighand_struct *sighand;
+	unsigned int flags;
+} __attribute__((preserve_access_index));

Those definitions, with the preserve_access_index attribute, aren't
needed; we need just the fields that we access in the tools, right?
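
Something like the sketch below is what I have in mind. It is only an
illustration: it assumes a tool that reads nothing but cgroups, pid, tgid and
comm, and both the CORE_RELOC macro and the helper at the end are
hypothetical names made up for this example. A type that is only ever held
as a pointer gets a forward declaration and nothing else.

```c
/*
 * preserve_access_index is only understood when building for the BPF
 * target, so apply it conditionally; elsewhere the macro expands to nothing.
 */
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif
#if __has_attribute(preserve_access_index)
#define CORE_RELOC __attribute__((preserve_access_index))
#else
#define CORE_RELOC
#endif

/* Only handled as an opaque pointer here: a forward declaration suffices. */
struct css_set;

/* Local, minimal view listing only the fields this hypothetical tool reads. */
struct task_struct {
	struct css_set *cgroups;
	int pid;
	int tgid;
	char comm[16];
} CORE_RELOC;

/* Hypothetical helper: a task is the group leader when pid == tgid. */
static inline int is_thread_group_leader(const struct task_struct *t)
{
	return t->pid == t->tgid;
}
```

When built for BPF, the field accesses are the ones recorded for relocation
against the running kernel's BTF; the local layout above does not have to
match the real one.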

- Arnaldo