Re: [PATCH v2 1/1] bpf: Drop unprotected find_vpid() in favour of find_get_pid()

From: Jiri Olsa
Date: Wed Aug 10 2022 - 07:53:02 EST


On Wed, Aug 10, 2022 at 12:03:33PM +0100, Lee Jones wrote:
> On Tue, 09 Aug 2022, Alexei Starovoitov wrote:
>
> > On Mon, Aug 8, 2022 at 11:50 PM Lee Jones <lee@xxxxxxxxxx> wrote:
> > >
> > > On Thu, 04 Aug 2022, Alexei Starovoitov wrote:
> > >
> > > > On Wed, Aug 3, 2022 at 6:48 AM Lee Jones <lee@xxxxxxxxxx> wrote:
> > > > >
> > > > > The documentation for find_pid() clearly states:
> > > > >
> > > > > "Must be called with the tasklist_lock or rcu_read_lock() held."
> > > > >
> > > > > Presently we do neither.
> > > > >
> > > > > Let's use find_get_pid() which searches for the vpid, then takes a
> > > > > reference to it preventing early free, all within the safety of
> > > > > rcu_read_lock(). Once we have our reference we can safely make use of
> > > > > it up until the point it is put.
> > > > >
> > > > > Cc: Alexei Starovoitov <ast@xxxxxxxxxx>
> > > > > Cc: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
> > > > > Cc: John Fastabend <john.fastabend@xxxxxxxxx>
> > > > > Cc: Andrii Nakryiko <andrii@xxxxxxxxxx>
> > > > > Cc: Martin KaFai Lau <martin.lau@xxxxxxxxx>
> > > > > Cc: Song Liu <song@xxxxxxxxxx>
> > > > > Cc: Yonghong Song <yhs@xxxxxx>
> > > > > Cc: KP Singh <kpsingh@xxxxxxxxxx>
> > > > > Cc: Stanislav Fomichev <sdf@xxxxxxxxxx>
> > > > > Cc: Hao Luo <haoluo@xxxxxxxxxx>
> > > > > Cc: Jiri Olsa <jolsa@xxxxxxxxxx>
> > > > > Cc: bpf@xxxxxxxxxxxxxxx
> > > > > Fixes: 41bdc4b40ed6f ("bpf: introduce bpf subcommand BPF_TASK_FD_QUERY")
> > > > > Signed-off-by: Lee Jones <lee@xxxxxxxxxx>
> > > > > ---
> > > > >
> > > > > v1 => v2:
> > > > > * Commit log update - no code differences
> > > > >
> > > > > kernel/bpf/syscall.c | 5 ++++-
> > > > > 1 file changed, 4 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> > > > > index 83c7136c5788d..c20cff30581c4 100644
> > > > > --- a/kernel/bpf/syscall.c
> > > > > +++ b/kernel/bpf/syscall.c
> > > > > @@ -4385,6 +4385,7 @@ static int bpf_task_fd_query(const union bpf_attr *attr,
> > > > > const struct perf_event *event;
> > > > > struct task_struct *task;
> > > > > struct file *file;
> > > > > + struct pid *ppid;
> > > > > int err;
> > > > >
> > > > > if (CHECK_ATTR(BPF_TASK_FD_QUERY))
> > > > > @@ -4396,7 +4397,9 @@ static int bpf_task_fd_query(const union bpf_attr *attr,
> > > > > if (attr->task_fd_query.flags != 0)
> > > > > return -EINVAL;
> > > > >
> > > > > - task = get_pid_task(find_vpid(pid), PIDTYPE_PID);
> > > > > + ppid = find_get_pid(pid);
> > > > > + task = get_pid_task(ppid, PIDTYPE_PID);
> > > > > + put_pid(ppid);
> > > >
> > > > rcu_read_lock/unlock around this line
> > > > would be a cheaper and faster alternative than pid's
> > > > refcount inc/dec.
> > >
> > > This was already discussed here:
> > >
> > > https://lore.kernel.org/all/YtsFT1yFtb7UW2Xu@krava/
> >
> > Since several people thought about rcu_read_lock instead of your
> > approach it means that it's preferred.
> > Sooner or later somebody will send a patch to optimize
> > refcnt into rcu_read_lock.
> > So let's avoid the churn and do it now.
>
> I'm not wed to either approach. Please discuss it with Yonghong and
> Jiri and I'll do whatever is agreed upon.

yea, I thought using rcu_read_lock would be better, but I did not
have strong feelings against doing the pid's refcount inc/dec when
Yonghong supported that.. now with Alexei it's 2 against 1 in favour
of using rcu_read_lock ;-)

jirka