Re: [Resend v1 1/5] linux/bitqueue.h: add the bit queue implementation

From: Alexander Potapenko
Date: Wed Jul 12 2023 - 06:25:19 EST


On Tue, Jul 11, 2023 at 9:20 PM Yury Norov <yury.norov@xxxxxxxxx> wrote:
>
> + Andy and Rasmus
>
> On Tue, Jul 11, 2023 at 04:42:29PM +0200, Alexander Potapenko wrote:
> > struct bitq represents a bit queue with external storage.
> >
> > Its purpose is to easily pack sub-byte values, which can be used, for
> > example, to implement RLE algorithms.
>
> Whatever it is, it's not a queue. The queue implies O(1) for insertion
> and deletion, but your 'dequeue' is clearly an O(n) procedure.

Thanks for spotting this!
I have indeed done a poor job implementing the dequeue method.

> I'm not sure if I completely understand the purpose of the series,

To implement tag compression, we need to serialize/deserialize "bit
fields" looking e.g. like this:

int largest_idx : 6;
unsigned char tags[N] : 4*N;
unsigned char sizes[N-1] : 7*(N-1);

to/from a byte array. This actually needs to be done only once, and
enqueue()/dequeue() operations do not interleave, so there is no need
for an actual queue.

I'll try to come up with something simple - maybe reimplement it as a
ring buffer, or even skip the "ring" part, because it is not needed
for my purpose.
(The struct may end up being less generic - in that case I'll move it
from include/linux to arch/arm64/mm/)


> but
> from this description:
> enqueueing/dequeueing of sub-byte values
>
> I think, the simplest solution would be a circular queue (ringbuffer)
> based on bitmaps:


> > +/**
> > + * bitq_init - initialize an empty bit queue.
> > + * @q: struct bitq to be initialized.
> > + * @data: external data buffer to use.
> > + * @size: capacity in bytes.
> > + *
> > + * Return: 0 in the case of success, -1 if either of the pointers is NULL.
>
> EINVAL?

Ack, better use the common error values.

>
> > + */
> > +static inline int bitq_init(struct bitq *q, u8 *data, int size)
> > +{
> > + if (!q || !data)
> > + return -1;
>
> This is a useless check. Erroneous code may (and often does) pass a
> broken pointer other than NULL.

I am actually a fan of defensive programming, but it's a good point
that a NULL check does not defend against invalid non-NULL pointers,
and NULL is an unexpected input value anyway.

>
> > + q->data = data;
> > + q->size = size;
> > + memset(data, 0, size);
>
> Useless memset?

An overly cautious one, that lets us fetch values from partially
initialized bytes. This code will be removed anyway.

> > +static inline int bitq_init_full(struct bitq *q, u8 *data, int size)
> > +{
> > + if (!q || !data)
> > + return -1;
> > + q->data = data;
> > + q->size = size;
> > + q->bit_pos = q->size * 8;
> > + return 0;
> > +}
>
> This all should not reside in a header.

There's a handful of examples in include/linux where meaningful code
is written in the headers, but I agree that in this particular case it
is probably not justified by performance reasons.

> > + if (!q || (bits < 1) || (bits > 8))
> > + return -1;
>
> Pushing 0 elements in queue is usually not an error. Implementations
> usually return and do nothing. From the malloc() man page:

Agreed.

> If size is 0, then malloc() returns a unique pointer value that
> can later be successfully passed to free().
>
> > + max_pos = q->size * 8;
> > + if ((max_pos - q->bit_pos) < bits)
> > + return -1;
>
> ENOMEM? Or probably better to resize the queue.

This "queue" relies on external storage that may not be easily
resizable (e.g. when we are using a local u64 as storage).
ENOMEM sounds better (if we stick to this interface).
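
To make the error handling discussed above concrete, here is a minimal
userspace sketch of an append helper that treats a zero-length push as
a no-op, returns -EINVAL for an unsupported width and -ENOMEM when the
fixed external storage is full. The names struct bitpack and
bitpack_put() are hypothetical and not part of the patch; the sketch
also assumes the caller has zeroed the buffer, and it uses assert()
only to document an internal invariant.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical write-only packer; not the struct bitq from the patch. */
struct bitpack {
	uint8_t *data;   /* external storage, assumed zeroed by the caller */
	size_t size;     /* capacity in bytes */
	size_t bit_pos;  /* number of bits written so far */
};

/* Append the low @bits bits of @value, most significant bit first. */
static int bitpack_put(struct bitpack *p, uint8_t value, unsigned int bits)
{
	size_t byte_pos;
	unsigned int left_in_byte;

	if (bits == 0)
		return 0;        /* nothing to append: not an error */
	if (bits > 8)
		return -EINVAL;
	if (p->size * 8 - p->bit_pos < bits)
		return -ENOMEM;  /* fixed storage cannot be resized */

	value &= (uint8_t)(0xff >> (8 - bits));
	byte_pos = p->bit_pos / 8;
	left_in_byte = 8 - (unsigned int)(p->bit_pos % 8);

	if (bits <= left_in_byte) {
		p->data[byte_pos] |= (uint8_t)(value << (left_in_byte - bits));
	} else {
		/* Split @value between the current and the following byte. */
		unsigned int spill = bits - left_in_byte;

		p->data[byte_pos] |= (uint8_t)(value >> spill);
		p->data[byte_pos + 1] |= (uint8_t)(value << (8 - spill));
	}
	p->bit_pos += bits;
	assert(p->bit_pos <= p->size * 8);
	return 0;
}
```

The capacity check runs before any byte is touched, so a failed call
leaves the buffer and the write position unchanged.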


> > + /*
> > + * @value needs to be split between the current and the
> > + * following bytes.
> > + */
> > + hi = value >> (bits - left_in_byte);
> > + q->data[byte_pos] |= hi;
> > + byte_pos++;
> > + lo = value << (8 - (bits - left_in_byte));
> > + q->data[byte_pos] |= lo;
> > + }
>
> This piece should be a bitmap_append() function, like:
> bitmap_append(addr, 3, 2, 0b11) would append 0b11 to the bitmap at
> offset 3. We already have bitmap_{set,get}_value8, so I suggest
> to extend the interface for unaligned offsets and lengths up to
> BITS_PER_LONG.

Interesting. Yeah, this could be part of bitmap.h instead.
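
For reference, a rough userspace sketch of what such a helper might
look like, operating on an array of unsigned long with the LSB-first
bit numbering that kernel bitmaps use. The names sketch_bitmap_write()
and sketch_bitmap_read() and the SKETCH_BITS_PER_LONG macro are made up
for this illustration, and the sketch assumes the caller guarantees
that the addressed range fits in the map.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SKETCH_BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Store the low @nbits bits of @value at bit offset @start of @map.
 * Bit 0 is the least significant bit of map[0];
 * 1 <= @nbits <= SKETCH_BITS_PER_LONG.
 */
static void sketch_bitmap_write(unsigned long *map, unsigned long value,
				size_t start, unsigned int nbits)
{
	size_t index = start / SKETCH_BITS_PER_LONG;
	unsigned int offset = (unsigned int)(start % SKETCH_BITS_PER_LONG);
	unsigned int space = SKETCH_BITS_PER_LONG - offset;
	unsigned long mask;

	assert(nbits >= 1 && nbits <= SKETCH_BITS_PER_LONG);
	mask = ~0UL >> (SKETCH_BITS_PER_LONG - nbits);
	value &= mask;

	/* Clear the target bits in the first word, then set them. */
	map[index] = (map[index] & ~(mask << offset)) | (value << offset);
	/* The remaining high bits of @value spill into the next word. */
	if (nbits > space)
		map[index + 1] = (map[index + 1] & ~(mask >> space)) |
				 (value >> space);
}

/* Read back @nbits bits starting at bit offset @start of @map. */
static unsigned long sketch_bitmap_read(const unsigned long *map,
					size_t start, unsigned int nbits)
{
	size_t index = start / SKETCH_BITS_PER_LONG;
	unsigned int offset = (unsigned int)(start % SKETCH_BITS_PER_LONG);
	unsigned int space = SKETCH_BITS_PER_LONG - offset;
	unsigned long value;

	assert(nbits >= 1 && nbits <= SKETCH_BITS_PER_LONG);
	value = map[index] >> offset;
	if (nbits > space)
		value |= map[index + 1] << space;
	return value & (~0UL >> (SKETCH_BITS_PER_LONG - nbits));
}
```

With these, sketch_bitmap_write(map, 0b11, 3, 2) corresponds to the
bitmap_append(addr, 3, 2, 0b11) call from the example above, modulo the
argument order.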


> > + /*
> > + * Shift every byte in the queue to the left by @bits, carrying over to
> > + * the previous byte.
> > + */
> > + for (i = 0; i < q->size - 1; i++) {
> > + q->data[i] = (q->data[i] << bits) |
> > + (q->data[i + 1] >> rem_bits);
> > + }
>
> As I already mentioned, this is O(N), which is wrong for queues. Add a
> pointer to the head in the bitq structure to avoid shifting every
> byte.
>
> BTW, we've got bitmap_shift_{left,right} for this.

Ack.
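
A minimal userspace sketch of the head-pointer idea, assuming values
were packed most significant bit first into a byte array: the reader
keeps the index of the next unread bit and advances it, so fetching a
value costs the same regardless of the buffer size and no bytes are
shifted. struct bitreader and bitreader_get() are hypothetical names,
and returning -ERANGE when too few bits remain is an arbitrary choice
for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical read-side cursor over an MSB-first packed buffer. */
struct bitreader {
	const uint8_t *data;
	size_t nbits;  /* number of valid bits in @data */
	size_t head;   /* index of the next unread bit */
};

/* Fetch the next @bits (1..8) bits into @out and advance the head. */
static int bitreader_get(struct bitreader *r, unsigned int bits, uint8_t *out)
{
	size_t byte_pos;
	unsigned int used, window;

	if (bits < 1 || bits > 8)
		return -EINVAL;
	if (r->nbits - r->head < bits)
		return -ERANGE;  /* not enough unread bits left */

	byte_pos = r->head / 8;
	used = (unsigned int)(r->head % 8);

	/* 16-bit window: the current byte and, if needed, the next one. */
	window = (unsigned int)r->data[byte_pos] << 8;
	if (used + bits > 8)
		window |= r->data[byte_pos + 1];

	*out = (uint8_t)((window >> (16 - used - bits)) & (0xffu >> (8 - bits)));
	r->head += bits;
	assert(r->head <= r->nbits);
	return 0;
}
```

Since enqueue and dequeue operations do not interleave in the tag
compression use case, a single forward-moving head like this would be
enough, without the wrap-around logic of a ring buffer.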