Re: [PATCH] error-injection: Add prompt for function error injection

From: Benjamin Tissoires
Date: Fri Dec 02 2022 - 09:56:46 EST




On 12/1/22 22:13, Linus Torvalds wrote:
On Thu, Dec 1, 2022 at 8:59 AM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:

The hid-bpf framework depends on it.

Ok, this is completely unacceptably disgusting hack.

[foreword: I have read the other replies, just replying to this one
because it is the explicit ask for a fix]


That needs fixing.

Either hid-bpf or bpf core can add
"depends on FUNCTION_ERROR_INJECTION"

No, it needs to be narrowed down a lot. Nobody sane wants error
injection just because they want some random HID thing.

And no, BPF shouldn't need it either.

This needs to be narrowed down to the point where HID can say "I want
*this* particular call to be able to be a bpf call.

So, would the following be OK? I still have a few concerns I'll explain
after the patch.

---
From 1290561304eb3e48e1e6d727def13c16698a26f1 Mon Sep 17 00:00:00 2001
From: Benjamin Tissoires <benjamin.tissoires@xxxxxxxxxx>
Date: Fri, 2 Dec 2022 12:40:29 +0100
Subject: [PATCH] bpf: do not rely on ALLOW_ERROR_INJECTION for fmod_ret

The current way of expressing that a non-bpf kernel component is willing
to accept that bpf programs can be attached to it and that they can change
the return value is to abuse ALLOW_ERROR_INJECTION.
This is debated in the link below, and the result is that it is not a
reasonable thing to do.

An easy fix is to keep the list of valid functions in the BPF verifier
in the same way we keep the non-sleepable ALLOW_ERROR_INJECTION ones.
However, this kind of defeat the point of being able to add bpf APIs in
non-BPF subsystems, so we probably need to rethink that part.


Link: https://lore.kernel.org/all/20221121104403.1545f9b5@xxxxxxxxxxxxxxxxxx/
Suggested-by: Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@xxxxxxxxxx>
---
drivers/hid/bpf/hid_bpf_dispatch.c | 2 --
drivers/hid/bpf/hid_bpf_jmp_table.c | 1 -
kernel/bpf/verifier.c | 20 +++++++++++++++++++-
3 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/drivers/hid/bpf/hid_bpf_dispatch.c b/drivers/hid/bpf/hid_bpf_dispatch.c
index 3c989e74d249..d1f6a1d4ae60 100644
--- a/drivers/hid/bpf/hid_bpf_dispatch.c
+++ b/drivers/hid/bpf/hid_bpf_dispatch.c
@@ -44,7 +44,6 @@ __weak noinline int hid_bpf_device_event(struct hid_bpf_ctx *ctx)
{
return 0;
}
-ALLOW_ERROR_INJECTION(hid_bpf_device_event, ERRNO);
u8 *
dispatch_hid_bpf_device_event(struct hid_device *hdev, enum hid_report_type type, u8 *data,
@@ -105,7 +104,6 @@ __weak noinline int hid_bpf_rdesc_fixup(struct hid_bpf_ctx *ctx)
{
return 0;
}
-ALLOW_ERROR_INJECTION(hid_bpf_rdesc_fixup, ERRNO);
u8 *call_hid_bpf_rdesc_fixup(struct hid_device *hdev, u8 *rdesc, unsigned int *size)
{
diff --git a/drivers/hid/bpf/hid_bpf_jmp_table.c b/drivers/hid/bpf/hid_bpf_jmp_table.c
index 579a6c06906e..207972b028d9 100644
--- a/drivers/hid/bpf/hid_bpf_jmp_table.c
+++ b/drivers/hid/bpf/hid_bpf_jmp_table.c
@@ -103,7 +103,6 @@ __weak noinline int __hid_bpf_tail_call(struct hid_bpf_ctx *ctx)
{
return 0;
}
-ALLOW_ERROR_INJECTION(__hid_bpf_tail_call, ERRNO);
int hid_bpf_prog_run(struct hid_device *hdev, enum hid_bpf_prog_type type,
struct hid_bpf_ctx_kern *ctx_kern)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 225666307bba..4eac440d659f 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -24,6 +24,7 @@
#include <linux/bpf_lsm.h>
#include <linux/btf_ids.h>
#include <linux/poison.h>
+#include <linux/hid_bpf.h>
#include "disasm.h"
@@ -14827,6 +14828,20 @@ static int check_non_sleepable_error_inject(u32 btf_id)
return btf_id_set_contains(&btf_non_sleepable_error_inject, btf_id);
}
+/* Manually tag fmod_ret functions to not misuse ALLOW_ERROR_INJECTION */
+BTF_SET_START(btf_modify_return)
+#if CONFIG_HID_BPF
+BTF_ID(func, hid_bpf_device_event)
+BTF_ID(func, hid_bpf_rdesc_fixup)
+BTF_ID(func, __hid_bpf_tail_call)
+#endif /* CONFIG_HID_BPF */
+BTF_SET_END(btf_modify_return)
+
+static int check_function_modify_return(u32 btf_id)
+{
+ return btf_id_set_contains(&btf_modify_return, btf_id);
+}
+
int bpf_check_attach_target(struct bpf_verifier_log *log,
const struct bpf_prog *prog,
const struct bpf_prog *tgt_prog,
@@ -15047,7 +15062,10 @@ int bpf_check_attach_target(struct bpf_verifier_log *log,
bpf_log(log, "can't modify return codes of BPF programs\n");
return -EINVAL;
}
- ret = check_attach_modify_return(addr, tname);
+ ret = -EINVAL;
+ if (!check_function_modify_return(btf_id) ||
+ check_attach_modify_return(addr, tname))
+ ret = 0;
if (ret) {
bpf_log(log, "%s() is not modifiable\n", tname);
return ret;
--
2.38.1
---

While this patch removes the need for ALLOW_ERROR_INJECTION it has a
couple of drawbacks:
- suddenly we lose the nice separation of concerns between bpf core and
its users (HID in my case)
- it would need to be changed in 6.3 simply because of the previous
point, so it is just a temporary fix.

So I am not sure if this would qualify HID-BPF for 6.2. Please speak up.


Stop this crazy "bpf / hid needs error injection" when that then
implies a _lot_ more than that, plus is documented to be something
entirely different anyway.

I realize that HID has mis-used the "we could just use error injection
here to instead insert random bpf code", but that's a complete hack.

Just to be fair, HID only happens to be the first on the front line for
this particular problem. I was told to use that mis-use because that was
probably what was available, and adding that decorator didn't seem to be
such a big deal to me.

But with a bigger picture in mind, I am happy we get to that point now
before it is merged because I know that behind me there are a few other
people in other subsystems also willing to take advantage of BPF in
their own subsystem. And at Plumbers, everybody was saying: look at the
HID-BPF patch series.

Cheers,
Benjamin


Plus it seems to happily not even have made it into mainline anyway,
and only exists in linux-next. Let's head that disgusting hack off at
the pass.

I'm going to apply Steven's patch, because honestly, we need to fix
this disgusting mess *before* it gets to mainline, rather than say
"oh, we already have broken users in next, so let's bend over
backwards for that".

The code is called "error injection", not "random bpf extension"

Linus