Re: [PATCH 2/5] bpf: Define new BPF_MAP_TYPE_USER_RINGBUF map type

From: David Vernet
Date: Fri Aug 12 2022 - 12:21:14 EST


On Thu, Aug 11, 2022 at 04:22:50PM -0700, Andrii Nakryiko wrote:
> On Mon, Aug 8, 2022 at 8:54 AM David Vernet <void@xxxxxxxxxxxxx> wrote:
> >
> > We want to support a ringbuf map type where samples are published from
> > user-space to BPF programs. BPF currently supports a kernel -> user-space
> > circular ringbuffer via the BPF_MAP_TYPE_RINGBUF map type. We'll need to
> > define a new map type for user-space -> kernel, as none of the helpers
> > exported for BPF_MAP_TYPE_RINGBUF will apply to a user-space producer
> > ringbuffer, and we'll want to add one or more helper functions that would
> > not apply for a kernel-producer ringbuffer.
> >
> > This patch therefore adds a new BPF_MAP_TYPE_USER_RINGBUF map type
> > definition. The map type is useless in its current form, as there is no way
> > to access or use it for anything until we add more BPF helpers. A follow-on
> > patch will therefore add a new helper function that allows BPF programs to
> > run callbacks on samples that are published to the ringbuffer.
> >
> > Signed-off-by: David Vernet <void@xxxxxxxxxxxxx>
> > ---
> > include/linux/bpf_types.h | 1 +
> > include/uapi/linux/bpf.h | 1 +
> > kernel/bpf/ringbuf.c | 70 +++++++++++++++++++++++++++++-----
> > kernel/bpf/verifier.c | 3 ++
> > tools/include/uapi/linux/bpf.h | 1 +
> > tools/lib/bpf/libbpf.c | 1 +
> > 6 files changed, 68 insertions(+), 9 deletions(-)
> >
>
> [...]
>
> >
> > -static int ringbuf_map_mmap(struct bpf_map *map, struct vm_area_struct *vma)
> > +static int ringbuf_map_mmap(struct bpf_map *map, struct vm_area_struct *vma,
> > + bool kernel_producer)
> > {
> > struct bpf_ringbuf_map *rb_map;
> >
> > rb_map = container_of(map, struct bpf_ringbuf_map, map);
> >
> > if (vma->vm_flags & VM_WRITE) {
> > - /* allow writable mapping for the consumer_pos only */
> > - if (vma->vm_pgoff != 0 || vma->vm_end - vma->vm_start != PAGE_SIZE)
> > + if (kernel_producer) {
> > + /* allow writable mapping for the consumer_pos only */
> > + if (vma->vm_pgoff != 0 || vma->vm_end - vma->vm_start != PAGE_SIZE)
> > + return -EPERM;
> > + /* For user ringbufs, disallow writable mappings to the
> > + * consumer pointer, and allow writable mappings to both the
> > + * producer position, and the ring buffer data itself.
> > + */
> > + } else if (vma->vm_pgoff == 0)
> > return -EPERM;
>
> the asymmetrical use of {} in one if branch and not using them in
> another is extremely confusing, please don't do that

Ah, sorry, I misread the style guide and thought this was stipulated there,
but I now see that it's explicitly stated that you should include a brace
if only one branch is in a single statement. I'll fix this (by splitting
these into their own implementations, as mentioned below).

> the way you put big comment inside the wrong if branch also throws me
> off, maybe move it before return -EPERM instead with proper
> indentation?

Yeah, fair enough.

> sorry for nitpicks, but I've been stuck for a few minutes trying to
> figure out what exactly is happening here :)

Not a problem at all, sorry for the less-than-readable code.

> > } else {
> > vma->vm_flags &= ~VM_MAYWRITE;
> > @@ -242,6 +271,16 @@ static int ringbuf_map_mmap(struct bpf_map *map, struct vm_area_struct *vma)
> > vma->vm_pgoff + RINGBUF_PGOFF);
> > }
> >
> > +static int ringbuf_map_mmap_kern(struct bpf_map *map, struct vm_area_struct *vma)
> > +{
> > + return ringbuf_map_mmap(map, vma, true);
> > +}
> > +
> > +static int ringbuf_map_mmap_user(struct bpf_map *map, struct vm_area_struct *vma)
> > +{
> > + return ringbuf_map_mmap(map, vma, false);
> > +}
>
> I wouldn't mind if you just have two separate implementations of
> ringbuf_map_mmap for _kern and _user cases, tbh, probably would be
> clearer as well

Yeah, I can do this. I was trying to avoid any copy-pasta at all cost, but
I think it's doing more harm than good. I'll just split them into totally
separate implementations.

> > +
> > static unsigned long ringbuf_avail_data_sz(struct bpf_ringbuf *rb)
> > {
> > unsigned long cons_pos, prod_pos;
> > @@ -269,7 +308,7 @@ const struct bpf_map_ops ringbuf_map_ops = {
> > .map_meta_equal = bpf_map_meta_equal,
> > .map_alloc = ringbuf_map_alloc,
> > .map_free = ringbuf_map_free,
> > - .map_mmap = ringbuf_map_mmap,
> > + .map_mmap = ringbuf_map_mmap_kern,
> > .map_poll = ringbuf_map_poll,
> > .map_lookup_elem = ringbuf_map_lookup_elem,
> > .map_update_elem = ringbuf_map_update_elem,
> > @@ -278,6 +317,19 @@ const struct bpf_map_ops ringbuf_map_ops = {
> > .map_btf_id = &ringbuf_map_btf_ids[0],
> > };
> >
>
> [...]