Re: [PATCH bpf-next v3 6/7] bpf: Let bpf_iter_task_new accept null task ptr

From: Chuyi Zhou
Date: Sun Oct 01 2023 - 04:33:23 EST


Hello,

在 2023/9/28 07:37, Andrii Nakryiko 写道:
On Mon, Sep 25, 2023 at 3:56 AM Chuyi Zhou <zhouchuyi@xxxxxxxxxxxxx> wrote:

When using task_iter to iterate all threads of a specific task, we enforce
that the user must pass a valid task pointer to ensure safety. However,
when iterating all threads/process in the system, BPF verifier still
require a valid ptr instead of "nullable" pointer, even though it's
pointless, which is a kind of surprising from usability standpoint. It
would be nice if we could let that kfunc accept a explicit null pointer
when we are using BPF_TASK_ITER_ALL/BPF_TASK_ITER_PROC and a valid pointer
when using BPF_TASK_ITER_THREAD.

Given a trival kfunc:
__bpf_kfunc void FN(struct TYPE_A *obj)

BPF Prog would reject a nullptr for obj. The error info is:
"arg#x pointer type xx xx must point to scalar, or struct with scalar"
reported by get_kfunc_ptr_arg_type(). The reg->type is SCALAR_VALUE and
the btf type of ref_t is not scalar or scalar_struct which leads to the
rejection of get_kfunc_ptr_arg_type.

This patch reuse the __opt annotation which was used to indicate that
the buffer associated with an __sz or __szk argument may be null:
__bpf_kfunc void FN(struct TYPE_A *obj__opt)
Here __opt indicates obj can be optional, user can pass a explicit nullptr
or a normal TYPE_A pointer. In get_kfunc_ptr_arg_type(), we will detect
whether the current arg is optional and register is null, If so, return
a new kfunc_ptr_arg_type KF_ARG_PTR_TO_NULL and skip to the next arg in
check_kfunc_args().

Signed-off-by: Chuyi Zhou <zhouchuyi@xxxxxxxxxxxxx>
---
kernel/bpf/task_iter.c | 7 +++++--
kernel/bpf/verifier.c | 13 ++++++++++++-
2 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c
index 9bcd3f9922b1..7ac007f161cc 100644
--- a/kernel/bpf/task_iter.c
+++ b/kernel/bpf/task_iter.c
@@ -867,7 +867,7 @@ struct bpf_iter_task_kern {
unsigned int type;
} __attribute__((aligned(8)));

-__bpf_kfunc int bpf_iter_task_new(struct bpf_iter_task *it, struct task_struct *task, unsigned int type)
+__bpf_kfunc int bpf_iter_task_new(struct bpf_iter_task *it, struct task_struct *task__opt, unsigned int type)
{
struct bpf_iter_task_kern *kit = (void *)it;
BUILD_BUG_ON(sizeof(struct bpf_iter_task_kern) != sizeof(struct bpf_iter_task));
@@ -877,14 +877,17 @@ __bpf_kfunc int bpf_iter_task_new(struct bpf_iter_task *it, struct task_struct *
switch (type) {
case BPF_TASK_ITER_ALL:
case BPF_TASK_ITER_PROC:
+ break;
case BPF_TASK_ITER_THREAD:
+ if (!task__opt)
+ return -EINVAL;
break;
default:
return -EINVAL;
}

if (type == BPF_TASK_ITER_THREAD)
- kit->task = task;
+ kit->task = task__opt;
else
kit->task = &init_task;
kit->pos = kit->task;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index a065e18a0b3a..a79204c75a90 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -10331,6 +10331,7 @@ enum kfunc_ptr_arg_type {
KF_ARG_PTR_TO_CALLBACK,
KF_ARG_PTR_TO_RB_ROOT,
KF_ARG_PTR_TO_RB_NODE,
+ KF_ARG_PTR_TO_NULL,
};

enum special_kfunc_type {
@@ -10425,6 +10426,12 @@ static bool is_kfunc_bpf_rcu_read_unlock(struct bpf_kfunc_call_arg_meta *meta)
return meta->func_id == special_kfunc_list[KF_bpf_rcu_read_unlock];
}

+static inline bool is_kfunc_arg_optional_null(struct bpf_reg_state *reg,
+ const struct btf *btf, const struct btf_param *arg)
+{
+ return register_is_null(reg) && is_kfunc_arg_optional(btf, arg);
+}
+
static enum kfunc_ptr_arg_type
get_kfunc_ptr_arg_type(struct bpf_verifier_env *env,
struct bpf_kfunc_call_arg_meta *meta,
@@ -10497,6 +10504,8 @@ get_kfunc_ptr_arg_type(struct bpf_verifier_env *env,
*/
if (!btf_type_is_scalar(ref_t) && !__btf_type_is_scalar_struct(env, meta->btf, ref_t, 0) &&
(arg_mem_size ? !btf_type_is_void(ref_t) : 1)) {
+ if (is_kfunc_arg_optional_null(reg, meta->btf, &args[argno]))
+ return KF_ARG_PTR_TO_NULL;

This nested check seems misplaced. Maybe we shouldn't reuse __opt
suffix which already has a different meaning (coupled with __sz). Why
not add "__nullable" convention and just check it separately?


IIUC, do you mean:

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index dbba2b806017..05d197365fcb 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -10458,6 +10458,8 @@ get_kfunc_ptr_arg_type(struct bpf_verifier_env *env,
if (is_kfunc_arg_callback(env, meta->btf, &args[argno]))
return KF_ARG_PTR_TO_CALLBACK;

+ if (is_kfunc_arg_nullable(meta->btf, &args[argno]) && register_is_null(reg))
+ return KF_ARG_PTR_TO_NULL;

if (argno + 1 < nargs &&
(is_kfunc_arg_mem_size(meta->btf, &args[argno + 1], &regs[regno + 1]) ||


OK, I would change in next version.

Thanks.