Re: [PATCH bpf-next v3 3/7] bpf: Introduce task open coded iterator kfuncs

From: Chuyi Zhou
Date: Sun Oct 01 2023 - 04:21:43 EST


Hello, Andrii

在 2023/9/30 05:27, Andrii Nakryiko 写道:
On Wed, Sep 27, 2023 at 8:29 PM Chuyi Zhou <zhouchuyi@xxxxxxxxxxxxx> wrote:

Hello,

在 2023/9/28 07:20, Andrii Nakryiko 写道:
On Mon, Sep 25, 2023 at 3:56 AM Chuyi Zhou <zhouchuyi@xxxxxxxxxxxxx> wrote:

This patch adds kfuncs bpf_iter_task_{new,next,destroy} which allow
creation and manipulation of struct bpf_iter_task in open-coded iterator
style. BPF programs can use these kfuncs or through bpf_for_each macro to
iterate all processes in the system.

The API design keep consistent with SEC("iter/task"). bpf_iter_task_new()
accepts a specific task and iterating type which allows:
1. iterating all process in the system

2. iterating all threads in the system

3. iterating all threads of a specific task
Here we also resuse enum bpf_iter_task_type and rename BPF_TASK_ITER_TID
to BPF_TASK_ITER_THREAD, rename BPF_TASK_ITER_TGID to BPF_TASK_ITER_PROC.

The newly-added struct bpf_iter_task has a name collision with a selftest
for the seq_file task iter's bpf skel, so the selftests/bpf/progs file is
renamed in order to avoid the collision.

Signed-off-by: Chuyi Zhou <zhouchuyi@xxxxxxxxxxxxx>
---
include/linux/bpf.h | 8 +-
kernel/bpf/helpers.c | 3 +
kernel/bpf/task_iter.c | 96 ++++++++++++++++---
.../testing/selftests/bpf/bpf_experimental.h | 5 +
.../selftests/bpf/prog_tests/bpf_iter.c | 18 ++--
.../{bpf_iter_task.c => bpf_iter_tasks.c} | 0
6 files changed, 106 insertions(+), 24 deletions(-)
rename tools/testing/selftests/bpf/progs/{bpf_iter_task.c => bpf_iter_tasks.c} (100%)




[...]

+get_next_task:
+ kit->pos = next_task(kit->pos);
+ kit->task = kit->pos;
+ if (kit->pos == &init_task)
+ kit->pos = NULL;

I can't say I completely follow the logic (e.g., for
BPF_TASK_ITER_PROC, why do we do next_task() on first next() call)?
Can you elabore the expected behavior for various combinations of
types and starting task argument?


Thanks for the review.

The expected behavior of current implementation is:

BPF_TASK_ITER_PROC:

init_task->first_process->second_process->...->last_process->init_task

We would exit before visiting init_task again.

ah, ok, so in this case it's more like BPF_TASK_ITER_ALL_PROCS, i.e.,
we iterate all processes in the system. Input `task` that we provide
is ignored/meaningless, right? Maybe we should express it as
ALL_PROCS?


BPF_TASK_ITER_THREAD:

group_task->first_thread->second_thread->...->last_thread->group_task

We would exit before visiting group_task again.


And this one is iterating threads of a process specified by given
`task`, right? This is where my confusion comes from. ITER_PROC and
ITER_THREAD, by their name, seems to be very similar, but in reality
ITER_PROC is more like ITER_ALL (except process vs thread iteration),
while ITER_THREAD is parameterized by input `task`.

I'm not sure what's the least confusing way to name and organize
everything, but I think it's quite confusing right now, unfortunately.
I wonder if you or someone else have a better suggestion on making
this more straightforward?


Maybe here we can introduce new enums and not reuse or rename BPF_TASK_ITER_TID/BPF_TASK_ITER_TGID?

{
BPF_TASK_ITER_ALL_PROC,
BPF_TASK_ITER_ALL_THREAD,
BPF_TASK_ITER_THREAD
}

BPF_TASK_ITER_TID/BPF_TASK_ITER_TGID are inner flags. Looking at the
example usage of SEC("iter/task"), unlike BPF_CGROUP_ITER_DESCENDANTS_PRE/BPF_CGROUP_ITER_DESCENDANTS_POST, we actually don't use BPF_TASK_ITER_TID/BPF_TASK_ITER_TGID directly. When using SEC("iter/task"), we just set pid/tid for struct bpf_iter_link_info. Exposing new enums to users for open coded task_iters will not confuse users.

Thanks.