Re: [PATCH] bpf: Unregister fentry when bpf_trampoline_unlink_prog fails to update image

From: Chen Zhongjin
Date: Thu Apr 27 2023 - 21:06:22 EST




On 2023/4/28 2:54, Yonghong Song wrote:

...



Fix this when bpf_trampoline_update failed in bpf_trampoline_unlink_prog,
unregister fentry to disable the trampoline. Then other progs on the
trampoline can be unlinked safely and finally the trampoline will be
released.

Do we still leak tr->cur_image here?

No, bpf_tramp_image_put() will free everything when all progs_cnt decline to zero in bpf_trampoline_update(). It is a release function, but called 'put'.

Okay, I see. But it depends on where memory allocation failure
happens. For example, see below:

    static int bpf_trampoline_update(struct bpf_trampoline *tr,
    bool lock_direct_mutex)
    {
        struct bpf_tramp_image *im;
        struct bpf_tramp_links *tlinks;
        u32 orig_flags = tr->flags;
        bool ip_arg = false;
        int err, total;

        tlinks = bpf_trampoline_get_progs(tr, &total, &ip_arg);
        if (IS_ERR(tlinks))
                return PTR_ERR(tlinks);

        if (total == 0) {
                err = unregister_fentry(tr, tr->cur_image->image);
                bpf_tramp_image_put(tr->cur_image);
                tr->cur_image = NULL;
                tr->selector = 0;
                goto out;
        }

        im = bpf_tramp_image_alloc(tr->key, tr->selector);
        if (IS_ERR(im)) {
                err = PTR_ERR(im);
                goto out;
        }

    ...

In bpf_trampoline_get_progs(),

    static struct bpf_tramp_links *
    bpf_trampoline_get_progs(const struct bpf_trampoline *tr, int *total, bool *ip_arg)
    {
        struct bpf_tramp_link *link;
        struct bpf_tramp_links *tlinks;
        struct bpf_tramp_link **links;
        int kind;

        *total = 0;
        tlinks = kcalloc(BPF_TRAMP_MAX, sizeof(*tlinks), GFP_KERNEL);

If we have memory allocation failure, PTR_ERR(tlinks) will be returned
and there is no chance to reach 'total == 0' so tr->cur_image will not
be freed.

But if the memory allocation failure happens in bpf_tramp_image_alloc(...), yes, it is possible eventually when
all progs are unlinked, 'total' could be 0, unregister_fentry()
will be called again (hopefully no side effect) and the image
will be freed.



Fixes: 88fd9e5352fe ("bpf: Refactor trampoline update code")

If the above is a refactoring patch, you should not use that
as 'Fixes' patch, you should pick one truely introduced the issue.

Signed-off-by: Chen Zhongjin <chenzhongjin@xxxxxxxxxx>
---
  kernel/bpf/trampoline.c | 5 ++++-
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index d0ed7d6f5eec..6daa93b30e81 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -604,7 +604,10 @@ static int __bpf_trampoline_unlink_prog(struct bpf_tramp_link *link, struct bpf_
      }
      hlist_del_init(&link->tramp_hlist);
      tr->progs_cnt[kind]--;
-    return bpf_trampoline_update(tr, true /* lock_direct_mutex */);
+    err =  bpf_trampoline_update(tr, true /* lock_direct_mutex */);
+    if (err && tr->cur_image)
+        unregister_fentry(tr, tr->cur_image->image);

If error happens for the all subsequent programs,
unregister_fentry() will be called multiple times. Any side effect?
It will fail with no side effect. Actually if there is no error, modify_fentry() will fail in update() as well. The fentry is available until all progs are unlinked and the broken image is freed by bpf_tramp_image_put().

This will cause a sudden behavior change if there are other active
programs. For example, prog1 and prog2, prog1 unlink triggered
memory allocation failure and caused unregister_fentry which restored
the original ip. Then prog2 even does not have a chance to run any
more even if memory suddenly not an issue any more.


However with an extra state to record this happens, it's possible to re-register the fentry with new image when the next link/unlink calls update(). It will generate a new image and replace/free the error one.

Overall, I think this is an extreme corner case which happens
when kernel memory is extreme tight. If this is the case, not
sure whether it is worthwhile to fix it or not.

Yes, it's a really rare case. I'm just not sure whether it needs some best-effort to avoid kernel panic at this point.

If you think it's not necessary. Just let it go.

At least with current patch, even we could avoid kernel panic, the
user expected behavior might change. To handle all error conditions
in bpf_trampoline_update() might be a big task. So I suggest to just
let it go.

You are right. It's difficult to handle all errors, especially when
problem occurs inside the ftrace process.
So just keep it as is until there is better solution.


Thanks for your time!

Best,
Chen
+    return err;
  }
  /* bpf_trampoline_unlink_prog() should never fail. */