Re: [RFC PATCH 0/5] mm: Select victim memcg using BPF_OOM_POLICY

From: Alexei Starovoitov
Date: Thu Jul 27 2023 - 11:57:20 EST


On Thu, Jul 27, 2023 at 4:45 AM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote:
>
> On 27/07/2023 08:36, Chuyi Zhou wrote:
> > This patchset tries to add a new bpf prog type and use it to select
> > a victim memcg when global OOM is invoked. The mainly motivation is
> > the need to customizable OOM victim selection functionality so that
> > we can protect more important app from OOM killer.
> >
>
> It's a nice use case, but at a high level, the approach pursued here
> is, as I understand it, discouraged for new BPF program development.
> Specifically, adding a new BPF program type with semantics like this
> is not preferred. Instead, can you look at using something like
>
> - using "fmod_ret" instead of a new program type
> - use BPF kfuncs instead of helpers.
> - add selftests in tools/testing/selftests/bpf not samples.

+1 to what Alan said above and below.

Also as Michal said there needs to be a design doc with pros and cons.
We see far too often that people attempt to use BPF in places where it
shouldn't be.
If programmability is not strictly necessary then BPF is not a good fit.

> There's some examples of how solutions have evolved from the traditional
> approach (adding a new program type, helpers etc) to using kfuncs etc on
> this list - for example HID-BPF and the BPF scheduler series - which
> should help orient you. There are presentations from Linux Plumbers 2022
> that walk through some of this too.
>
> Judging by the sample program example, all you should need here is a way
> to override the return value of bpf_oom_set_policy() - a noinline
> function that by default returns a no-op. It can then be overridden by
> an "fmod_ret" BPF program.
>
> One thing you lose is cgroup specificity at BPF attach time, but you can
> always add predicates based on the cgroup to your BPF program if needed.
>
> Alan
>
> > Chuyi Zhou (5):
> > bpf: Introduce BPF_PROG_TYPE_OOM_POLICY
> > mm: Select victim memcg using bpf prog
> > libbpf, bpftool: Support BPF_PROG_TYPE_OOM_POLICY
> > bpf: Add a new bpf helper to get cgroup ino
> > bpf: Sample BPF program to set oom policy
> >
> > include/linux/bpf_oom.h | 22 ++++
> > include/linux/bpf_types.h | 2 +
> > include/linux/memcontrol.h | 6 ++
> > include/uapi/linux/bpf.h | 21 ++++
> > kernel/bpf/core.c | 1 +
> > kernel/bpf/helpers.c | 17 +++
> > kernel/bpf/syscall.c | 10 ++
> > mm/memcontrol.c | 50 +++++++++
> > mm/oom_kill.c | 185 +++++++++++++++++++++++++++++++++
> > samples/bpf/Makefile | 3 +
> > samples/bpf/oom_kern.c | 42 ++++++++
> > samples/bpf/oom_user.c | 128 +++++++++++++++++++++++
> > tools/bpf/bpftool/common.c | 1 +
> > tools/include/uapi/linux/bpf.h | 21 ++++
> > tools/lib/bpf/libbpf.c | 3 +
> > tools/lib/bpf/libbpf_probes.c | 2 +
> > 16 files changed, 514 insertions(+)
> > create mode 100644 include/linux/bpf_oom.h
> > create mode 100644 samples/bpf/oom_kern.c
> > create mode 100644 samples/bpf/oom_user.c
> >