Re: [PATCH net-next 0/4] netlink: add 'bitmap' attribute type and API

From: Alexander Lobakin
Date: Mon Jul 25 2022 - 06:26:23 EST


From: Andy Shevchenko <andy.shevchenko@xxxxxxxxx>
Date: Fri, 22 Jul 2022 19:02:35 +0200

> On Friday, July 22, 2022, Alexander Lobakin <alexandr.lobakin@xxxxxxxxx>
> wrote:
>
> > From: Jakub Kicinski <kuba@xxxxxxxxxx>
> > Date: Thu, 21 Jul 2022 11:13:18 -0700
> >
> > > On Thu, 21 Jul 2022 17:59:46 +0200 Alexander Lobakin wrote:
> > > > BTW, Ethtool bitsets provide similar functionality, but it operates
> > > > with u32s (u64 is more convenient and optimal on most platforms) and
> > > > Netlink bitmaps is a generic interface providing policies and data
> > > > verification (Ethtool bitsets are declared simply as %NLA_BINARY),
> > > > generic getters/setters etc.
> > >
> > > Are you saying we don't need the other two features ethtool bitmaps
> > > provide? Masking and compact vs named representations?
> >
> > Nah I didn't say that. I'm not too familiar with Ethtool bitsets,
> > just know that they're represented as arrays of u32s.
> >
> > >
> > > I think that straight up bitmap with a fixed word is awkward and leads
> > > to too much boilerplate code. People will avoid using it. What about
> > > implementing a bigint type instead? Needing more than 64b is extremely
> > > rare, so in 99% of the cases the code outside of parsing can keep using
> > > its u8/u16/u32.
> >
> > In-kernel code can still use single unsigned long for some flags if
> > it wouldn't need more than 64 bits in a couple decades and not
> > bother with the bitmap API. Same with userspace -- a single 64 is
> > fine for that API, just pass a pointer to it to send it as a bitmap
> > to the kernel.
> >
> > Re 64b vs extremely rare -- I would say so 5 years go, but now more
> > and more bitfields run out of 64 bits. Link modes, netdev features,
> > ...
> >
> > Re bigint -- do you mean implementing u128 as a union, like
> >
> > typedef union __u128 {
> > struct {
> > u32 b127_96;
> > u32 b95_64;
> > u32 b63_32;
> > u32 b31_0;
> > };
> > struct {
> > u64 b127_64;
> > u64 b63_0;
> > };
> > #ifdef __HAVE_INT128
> > __int128 b127_0;
> > #endif
> > } u128;
> >
> > ?
>
>
> This looks not good (besides union aliasing, have you thought about BE64?).

It was just an example to vizualize the thought.

>
>
>
> >
> >
> > We have similar feature in one of our internal trees and planning
> > to present generic u128 soon, but this doesn't work well for flags
> > I think.
> > bitmap API and bitops are widely used and familiar to tons of folks,
> > most platforms define their own machine-optimized bitops
> > implementation, arrays of unsigned longs are native...
> >
> > Re awkward -- all u64 <-> bitmap conversion is implemented in the
> > core code in 4/4 and users won't need doing anything besides one
> > get/set. And still use bitmap/bitops API. Userspace, as I said,
> > can use a single __u64 as long as it fits into 64 bits.
> >
> > Summarizing, I feel like bigints would lead to much more boilerplate
> > in both kernel and user spaces and need to implement a whole new API
> > instead of using the already existing and well-used bitmap one.
> > Continuation of using single objects with fixed size like %NLA_U* or
> > %NLA_BITFIELD_U32 will lead to introducing a new Netlink attr every
> > 32/64 bits (or even 16 like with IP tunnels, that was the initial
> > reason why I started working on those 3 series). As Jake wrote me
> > in PM earlier,
> >
> > "I like the concept of an NLA_BITMAP. I could have used this for
> > some of the devlink interfaces we've done, and it definitely feels
> > a bit more natural than being forced to a single u32 bitfield."
>
>
> Netlink is an ABI, how would you naturally convert unsigned long from
> 64-bit kernel to the unsigned long on 32-bit user space, esp. on BE
> architectures?

Uhm, that's what this patchset does. Cover letter says: we can't
transfer longs between userspace and the kernel, so in Netlink
messages they're represented as arrays of u64s, then Netlink core
in the kernel code takes care of converting them to longs when you
call getter (and vice versa for setter)...
On LE and 64-bit BE architectures this conversion is a noop (see
bitmap_{from,to}_arr64()), that's why u64s were chosen, not u32s
like in Ethtool for example.

>
>
> >
> > Thanks,
> > Olek
> >
>
>
> --
> With Best Regards,
> Andy Shevchenko

Thanks,
Olek