Re: [PATCH 2/5] bpf: Define new BPF_MAP_TYPE_USER_RINGBUF map type

From: David Vernet
Date: Fri Aug 12 2022 - 12:23:36 EST


On Thu, Aug 11, 2022 at 04:29:02PM -0700, Andrii Nakryiko wrote:

[...]

> > - /* Consumer and producer counters are put into separate pages to allow
> > - * mapping consumer page as r/w, but restrict producer page to r/o.
> > - * This protects producer position from being modified by user-space
> > - * application and ruining in-kernel position tracking.
> > + /* Consumer and producer counters are put into separate pages to
> > + * allow each position to be mapped with different permissions.
> > + * This prevents a user-space application from modifying the
> > + * position and ruining in-kernel tracking. The permissions of the
> > + * pages depend on who is producing samples: user-space or the
> > + * kernel.
> > + *
> > + * Kernel-producer
> > + * ---------------
> > + * The producer position and data pages are mapped as r/o in
> > + * userspace. For this approach, bits in the header of samples are
> > + * used to signal to user-space, and to other producers, whether a
> > + * sample is currently being written.
> > + *
> > + * User-space producer
> > + * -------------------
> > + * Only the page containing the consumer position, and whether the
> > + * ringbuffer is currently being consumed via a 'busy' bit, are
> > + * mapped r/o in user-space. Sample headers may not be used to
> > + * communicate any information between kernel consumers, as a
> > + * user-space application could modify its contents at any time.
> > */
> > - unsigned long consumer_pos __aligned(PAGE_SIZE);
> > + struct {
> > + unsigned long consumer_pos;
> > + atomic_t busy;
>
> one more thing, why does busy have to be exposed into user-space
> mapped memory at all? Can't it be just a private variable in
> bpf_ringbuf?

It could be moved elsewhere in the struct. I put it here to avoid
increasing the size of struct bpf_ringbuf unnecessarily, as we already had
all of this unused space on the consumer_pos page. Specifically, I was
trying to avoid taxing kernel-producer ringbuffers. If you'd prefer, I can
move it out of the mapped page.
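
To make the two placements concrete, here is a minimal userspace sketch,
not the actual kernel code. It uses C11 atomics in place of the kernel's
atomic_t, and every name in it (SKETCH_PAGE_SIZE, rb_shared_busy,
rb_private_busy, rb_try_claim, rb_release) is hypothetical. The real
struct bpf_ringbuf has other fields that are omitted here, and a page size
of 4096 is assumed.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096 /* assumed page size, illustration only */

/* Placement A: busy shares the page-aligned consumer page, so it lives
 * in memory that is also mapped into user space.
 */
struct rb_shared_busy {
	alignas(SKETCH_PAGE_SIZE) struct {
		unsigned long consumer_pos;
		atomic_int busy;
	} consumer;
	alignas(SKETCH_PAGE_SIZE) unsigned long producer_pos;
};

/* Placement B: busy is a private field that sits before the
 * page-aligned positions, outside the pages mapped into user space.
 */
struct rb_private_busy {
	atomic_int busy;
	alignas(SKETCH_PAGE_SIZE) unsigned long consumer_pos;
	alignas(SKETCH_PAGE_SIZE) unsigned long producer_pos;
};

/* In both layouts each position starts on its own page. */
static_assert(offsetof(struct rb_shared_busy, producer_pos) ==
	      SKETCH_PAGE_SIZE, "producer_pos must start a new page");
static_assert(offsetof(struct rb_private_busy, consumer_pos) %
	      SKETCH_PAGE_SIZE == 0, "consumer_pos must be page aligned");

/* Try to become the sole consumer. Returns true only for the caller
 * that flips busy from 0 to 1.
 */
static bool rb_try_claim(atomic_int *busy)
{
	int expected = 0;

	return atomic_compare_exchange_strong_explicit(busy, &expected, 1,
						       memory_order_acquire,
						       memory_order_relaxed);
}

/* Drop the claim so that another consumer can take over. */
static void rb_release(atomic_int *busy)
{
	atomic_store_explicit(busy, 0, memory_order_release);
}
```

The claim/release logic is the same for either placement. What differs is
whether busy lands in a page that user space can see, and whether the
private placement grows the struct at all depends on how much padding
already precedes the page-aligned fields.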