Re: [RFC PATCH] fpga: remove module reference counting from core components

From: Xu Yilun
Date: Thu Nov 09 2023 - 00:09:23 EST


On Wed, Nov 08, 2023 at 05:20:53PM +0100, Greg Kroah-Hartman wrote:
> On Wed, Nov 08, 2023 at 11:52:52PM +0800, Xu Yilun wrote:
> > > >>
> > > >> In fpga_region_get() / fpga_region_put(): call get_device() before
> > > >> acquiring the mutex and put_device() after having released the mutex
> > > >> to avoid races.
>
> Why do you need another reference count with a lock? You already have
> that with the calls to get/put_device().

The low-level driver module could still be possibly unloaded at the same
time, if so, when FPGA core run some callbacks provided by low-level driver
module, its referenced page of code is unmapped...

I actually got this idea from this mail thread:

https://lore.kernel.org/all/20231010003746.GN800259@ZenIV/

And I see get/put of "struct file_operations.owner" in many framework
code for this purpose, to ensure no fops->read/write/ioctl() is ongoing
when module unloading.

>
> > > > Could you help elaborate more about the race?
> > > >
> > >
> > > I accidentally misused the word race. My concern was that memory might
> > > be released after the last put_device(), causing mutex_unlock() to be
> > > called on a mutex that does not exist anymore. It should not happen
> > > for the moment since the region does not use devres, but I think it
> > > still makes the code more brittle.
> >
> > It makes sense.
> >
> > But I dislike the mutex itself. The purpose is to exclusively grab the
> > device, but a mutex is too much heavy for this.
>
> Why "heavy"? Is this a fast-path? Have you measured it?

It's not a fast-path. But I didn't make it clear, see below...

>
> > The lock/unlock of mutex
> > scattered in different functions is also an uncommon style. Maybe some
> > atomic count should be enough.
>
> Don't make a fake lock with an atomic variable, use real locks if you
> need it.

I mean, here it doesn't not looks like a "locking" senario, although it
works.

The purpose here is to declare a device state, which says the device is
exclusively used by a user, no other user is allowed. But usually we use
mutex to protect against critical code blocks, not to represent a long
live device state.

I'm still OK for the existing mutex usage as it doesn't break anything
and if we don't want change much.

>
> Or don't even care about module unloading at all! Why does it matter?
> It can never happen without explicit intervention and it's something
> that a lot of the time, just will cause problems. Don't do a lot of
> extra work for something that doesn't matter.

mm.. as mentioned above some fundamental subsystems do care about
module unloading, and I tend to keep it same way. But mm... I'm OK to
make things easier if you insist.

Thanks,
Yilun

>
> thanks,
>
> greg k-h