Re: [PATCH] RDMA/device: Fix a race between mad_client and cm_client init

From: Jason Gunthorpe
Date: Fri Jan 05 2024 - 09:19:57 EST


On Fri, Jan 05, 2024 at 04:15:18PM +0800, Shifeng Li wrote:
> On 2024/1/4 20:37, Jason Gunthorpe wrote:
> > On Thu, Jan 04, 2024 at 02:48:14PM +0800, Shifeng Li wrote:
> >
> > > The root cause is that mad_client and cm_client may init concurrently
> > > when devices_rwsem write semaphore is downgraded in enable_device_and_get() like:
> >
> > That can't be true, the module loader infrastructure ensures those two
> > things are sequential.
> >
>
> I'm a bit confused about how the module loader infrastructure ensures that mad_client.add() and
> cm_client.add() are sequential. Could you explain in more detail
> please?

ib_cm has a symbol dependency on ib_mad, so the module loader will not
allow ib_cm to start running until all its symbol dependencies have
completed loading.
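
As a rough illustration of that ordering, here is a toy userspace model,
not kernel code. The toy_* names and the dependency table are
hypothetical; the only fact taken from this thread is that ib_cm depends
on ib_mad. Recording the name in init_order stands in for running the
module's init function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_MODS 8

enum toy_state { TOY_UNLOADED, TOY_LOADING, TOY_LIVE };

struct toy_module {
	const char *name;
	const char *deps[4];	/* NULL-terminated list of symbol providers */
	enum toy_state state;
};

/* Hypothetical table: ib_cm uses symbols exported by ib_mad. */
static struct toy_module mods[] = {
	{ "ib_mad", { NULL }, TOY_UNLOADED },
	{ "ib_cm", { "ib_mad", NULL }, TOY_UNLOADED },
};

static const char *init_order[MAX_MODS];
static int n_inited;

static struct toy_module *toy_find(const char *name)
{
	for (size_t i = 0; i < sizeof(mods) / sizeof(mods[0]); i++)
		if (strcmp(mods[i].name, name) == 0)
			return &mods[i];
	return NULL;
}

/* Load a module only after every module it depends on is live. */
static int toy_load(const char *name)
{
	struct toy_module *m = toy_find(name);

	if (!m)
		return -1;
	if (m->state == TOY_LIVE)
		return 0;
	if (m->state == TOY_LOADING)
		return -1;	/* dependency cycle */

	m->state = TOY_LOADING;
	for (int i = 0; m->deps[i]; i++)
		if (toy_load(m->deps[i]) != 0)
			return -1;

	assert(n_inited < MAX_MODS);
	init_order[n_inited++] = m->name;	/* stands in for module init */
	m->state = TOY_LIVE;
	return 0;
}
```

In this model, toy_load("ib_cm") records ib_mad first and ib_cm second,
so the dependent module's init never overlaps with its provider's init.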

> We know that the ib_cm driver and mlx5_ib driver can load concurrently.

Yes, this is possible.

Jason