Re: [PATCH] component: try_module_get() to prevent unloading while in use

From: Richard Fitzgerald
Date: Thu Jul 28 2022 - 06:50:27 EST


On 26/07/2022 18:28, Greg KH wrote:
On Tue, Jul 26, 2022 at 11:32:28AM +0100, Richard Fitzgerald wrote:
On 25/07/2022 19:09, Greg KH wrote:
On Mon, Jul 25, 2022 at 05:08:59PM +0100, Richard Fitzgerald wrote:
Call try_module_get() on a component before attempting to call its
bind() function, to ensure that a loadable module cannot be
unloaded while we are executing its bind().

How can bind be called while the module is unloaded?


I didn't say it could. What I said is "unloaded while we are executing
its bind()". Maybe that's already guaranteed to be safe somehow. It's
actually the problem below that I was trying to fix but placing the
try_module_get() before the bind() rather than after bind() seemed a
trivial extra safety.

It should be safe, bind() can't race with module remove as the driver
core locks will handle this.

If the bind is successful the module_put() is called only after it
has been unbound. This ensures that the module cannot be unloaded
while it is in use as an aggregate device.

That's almost never the correct thing to do, what problem is this
solving?


What I see is that when a loadable module has been made part of an
aggregate it is still possible to rmmod'd it.

An alternative workaround would be for the parent to softdep to every
driver that _might_ provide the aggregated components. Softdeps aren't
unusual (we use it in some drivers that are directly related but don't
directly link into each other). But to me this feels like a hack when
used with the component framework - isn't the idea that the parent
doesn't know (or doesn't need to know) which drivers will be aggregated?
Wouldn't it be better that when a component driver is bound into an
aggregate its module is automatically marked in-use?

If there's a better way to mark the module in-use while is it bound
into an aggregate, let me know and I'll look at implementing it.

No module references should be incremented if a device is bound to a
driver, that's the old (1990's) way of thinking. If a module wants to
be unloaded, let it, and clean up everything that it was
controlling/talking to before the module remove is finished.

That's the way all busses should be working, you don't increment a
module count when a driver binds to a device, otherwise how would you
unload a module that was being used at all?

So just remove the components controlled by the module properly when it
is removed and all should be good.

Do you have example code in the kernel tree today that does not properly
do this? Why not just fix that instead?


The actual code I'm working on isn't upstream yet, but it's a derivative
of the way these two interoperate:
sound/pci/hda/patch_realtek
sound/pci/hda/cs35l41_hda.c

In these systems the host audio interface is HDA but the amps are not
HDA devices. Audio goes through the Realtek HDA codec to the amp but
the amp is on a different bus (i2c or spi). The modules in the HDA stack
all get marked as in-use, but the amp driver doesn't. So if it's
unloaded the audio system is left in a limbo state where ALSA and the
HDA stack are still up but the amp driver code has gone.

However I realised that my try_module_get() isn't a fix anyway.
It's claiming use of the module implementing the component but not
of the bus that owns that module. I assume that's what you were
referring to by having to deal with an unload instead of trying to
prevent the unload.

(And yes, I'm aware that in that patch_realtek.c it's missing
locking around the shared struct to prevent it being accessed during a
bind and unbind.)

thanks,

greg k-h