Re: [PATCH rdma-next v1 10/15] RDMA/cm: Use an attribute_group on the ib_port_attribute instead of kobj's

From: Greg KH
Date: Fri Jun 11 2021 - 04:17:13 EST


On Fri, Jun 11, 2021 at 07:25:46AM +0000, Haakon Bugge wrote:
>
>
> > On 7 Jun 2021, at 14:50, Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:
> >
> > On Mon, Jun 07, 2021 at 02:39:45PM +0200, Greg KH wrote:
> >> On Mon, Jun 07, 2021 at 09:14:11AM -0300, Jason Gunthorpe wrote:
> >>> On Mon, Jun 07, 2021 at 12:25:03PM +0200, Greg KH wrote:
> >>>> On Mon, Jun 07, 2021 at 11:17:35AM +0300, Leon Romanovsky wrote:
> >>>>> From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> >>>>>
> >>>>> This code is trying to attach a list of counters grouped into 4 groups to
> >>>>> the ib_port sysfs. Instead of creating a bunch of kobjects simply express
> >>>>> everything naturally as an ib_port_attribute and add a single
> >>>>> attribute_groups list.
> >>>>>
> >>>>> Remove all the naked kobject manipulations.
> >>>>
> >>>> Much nicer.
> >>>>
> >>>> But why do you need your counters to be atomic in the first place? What
> >>>> are they counting that requires this?
> >>>
> >>> The write side of the counter is being updated from concurrent kernel
> >>> threads without locking, so this is an atomic because the write side
> >>> needs atomic_add().
> >>
> >> So the atomic write forces a lock :(
> >
> > Of course, but a single atomic is cheaper than the double atomic in a
> > full spinlock.
> >
> >>> Making them a naked u64 will cause significant corruption on the write
> >>> side, and packet counters that are not accurate after quiescence are
> >>> not very useful things.
> >>
> >> How "accurate" do these have to be?
> >
> > They have to be accurate. They are networking packet counters. What is
> > the point of burning CPU cycles keeping track of inaccurate data?
>
> Consider a CPU with a 32-bit wide datapath to memory, which reads and writes the most significant 4-byte word first:

What CPU is that?

> Memory           CPU1             CPU2
> MSW  LSW         MSW  LSW         MSW  LSW
> 0x0  0xffffffff
> 0x0  0xffffffff  0x0
> 0x0  0xffffffff  0x0  0xffffffff
> 0x0  0xffffffff  0x1  0x0                          cpu1 has incremented its register
> 0x1  0xffffffff  0x1  0x0                          cpu1 has written msw
> 0x1  0xffffffff  0x1  0x0         0x1              cpu2 has read msw
> 0x1  0xffffffff  0x1  0x0         0x1  0xffffffff
> 0x1  0x0         0x1  0x0         0x2  0x0
> 0x2  0x0         0x1  0x0         0x2  0x0
> 0x2  0x0         0x1  0x0         0x2  0x0
>
>
> I would say that 0x200000000 vs. 0x100000001 is more than inaccurate!

True, then maybe these should just be 32bit counters :)

thanks,

greg k-h
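
For reference, the interleaving in the quoted table can be replayed
deterministically in plain C. This is only a sketch of the hypothetical
32-bit machine described in the quote, not kernel code: the types
split_counter and cpu_regs and every helper name below are made up for
illustration, and the "CPUs" are just local variables stepped by hand in
the same order as the table rows.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit counter stored as two 32-bit words (hypothetical model). */
struct split_counter {
	uint32_t msw;
	uint32_t lsw;
};

/* Register pair of one CPU in the model (hypothetical). */
struct cpu_regs {
	uint32_t msw;
	uint32_t lsw;
};

static void load_msw(struct cpu_regs *r, const struct split_counter *m)
{
	r->msw = m->msw;
}

static void load_lsw(struct cpu_regs *r, const struct split_counter *m)
{
	r->lsw = m->lsw;
}

static void store_msw(struct split_counter *m, const struct cpu_regs *r)
{
	m->msw = r->msw;
}

static void store_lsw(struct split_counter *m, const struct cpu_regs *r)
{
	m->lsw = r->lsw;
}

/* 64-bit increment performed entirely inside the CPU's registers. */
static void increment(struct cpu_regs *r)
{
	uint64_t v = ((uint64_t)r->msw << 32) | r->lsw;

	v++;
	r->msw = (uint32_t)(v >> 32);
	r->lsw = (uint32_t)v;
}

/* Replay the table row by row; returns the final value in memory. */
uint64_t replay_torn_increments(void)
{
	struct split_counter mem = { .msw = 0x0, .lsw = 0xffffffff };
	struct cpu_regs cpu1 = { 0, 0 }, cpu2 = { 0, 0 };

	load_msw(&cpu1, &mem);	/* cpu1 reads msw: 0x0 */
	load_lsw(&cpu1, &mem);	/* cpu1 reads lsw: 0xffffffff */
	increment(&cpu1);	/* cpu1 registers: 0x1 0x0 */
	store_msw(&mem, &cpu1);	/* memory: 0x1 0xffffffff (torn) */
	load_msw(&cpu2, &mem);	/* cpu2 reads the new msw: 0x1 */
	load_lsw(&cpu2, &mem);	/* cpu2 reads the stale lsw: 0xffffffff */
	increment(&cpu2);	/* cpu2 registers: 0x2 0x0 */
	store_lsw(&mem, &cpu1);	/* memory: 0x1 0x0 */
	store_msw(&mem, &cpu2);	/* memory: 0x2 0x0 */
	store_lsw(&mem, &cpu2);	/* memory: 0x2 0x0 */

	return ((uint64_t)mem.msw << 32) | mem.lsw;
}

/* Sanity check against the numbers in the quoted message. */
void check_replay(void)
{
	/* Two increments of 0xffffffff should give 0x100000001. */
	assert(replay_torn_increments() != 0x100000001ULL);
	assert(replay_torn_increments() == 0x200000000ULL);
}
```

Calling check_replay() from a main() passes both assertions: the model
ends at 0x200000000 where two correct increments of 0x00000000ffffffff
would give 0x100000001. A counter that fits in one 32-bit word is
written with a single store in this model, so it cannot be torn this
way.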