Re: [PATCH v2 1/1] bpf: Drop unprotected find_vpid() in favour of find_get_pid()

From: Yonghong Song
Date: Wed Aug 10 2022 - 11:09:41 EST




On 8/10/22 4:03 AM, Lee Jones wrote:
On Tue, 09 Aug 2022, Alexei Starovoitov wrote:

On Mon, Aug 8, 2022 at 11:50 PM Lee Jones <lee@xxxxxxxxxx> wrote:

On Thu, 04 Aug 2022, Alexei Starovoitov wrote:

On Wed, Aug 3, 2022 at 6:48 AM Lee Jones <lee@xxxxxxxxxx> wrote:

The documentation for find_pid() clearly states:

"Must be called with the tasklist_lock or rcu_read_lock() held."

Presently we do neither.

Let's use find_get_pid() which searches for the vpid, then takes a
reference to it preventing early free, all within the safety of
rcu_read_lock(). Once we have our reference we can safely make use of
it up until the point it is put.

Cc: Alexei Starovoitov <ast@xxxxxxxxxx>
Cc: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
Cc: John Fastabend <john.fastabend@xxxxxxxxx>
Cc: Andrii Nakryiko <andrii@xxxxxxxxxx>
Cc: Martin KaFai Lau <martin.lau@xxxxxxxxx>
Cc: Song Liu <song@xxxxxxxxxx>
Cc: Yonghong Song <yhs@xxxxxx>
Cc: KP Singh <kpsingh@xxxxxxxxxx>
Cc: Stanislav Fomichev <sdf@xxxxxxxxxx>
Cc: Hao Luo <haoluo@xxxxxxxxxx>
Cc: Jiri Olsa <jolsa@xxxxxxxxxx>
Cc: bpf@xxxxxxxxxxxxxxx
Fixes: 41bdc4b40ed6f ("bpf: introduce bpf subcommand BPF_TASK_FD_QUERY")
Signed-off-by: Lee Jones <lee@xxxxxxxxxx>
---

v1 => v2:
* Commit log update - no code differences

kernel/bpf/syscall.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 83c7136c5788d..c20cff30581c4 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -4385,6 +4385,7 @@ static int bpf_task_fd_query(const union bpf_attr *attr,
const struct perf_event *event;
struct task_struct *task;
struct file *file;
+ struct pid *ppid;
int err;

if (CHECK_ATTR(BPF_TASK_FD_QUERY))
@@ -4396,7 +4397,9 @@ static int bpf_task_fd_query(const union bpf_attr *attr,
if (attr->task_fd_query.flags != 0)
return -EINVAL;

- task = get_pid_task(find_vpid(pid), PIDTYPE_PID);
+ ppid = find_get_pid(pid);
+ task = get_pid_task(ppid, PIDTYPE_PID);
+ put_pid(ppid);

rcu_read_lock/unlock around this line
would be a cheaper and faster alternative than pid's
refcount inc/dec.

This was already discussed here:

https://lore.kernel.org/all/YtsFT1yFtb7UW2Xu@krava/

Since several people thought about rcu_read_lock instead of your
approach it means that it's preferred.
Sooner or later somebody will send a patch to optimize
refcnt into rcu_read_lock.
So let's avoid the churn and do it now.

I'm not wed to either approach. Please discuss it with Yonghong and
Jiri and I'll do whatever is agreed upon.

Hi, Lee, Let us just do rcu_read_lock() approach then. I am okay with that. Thanks!