Re: [PATCH 0/3] Collapse vimc into single monolithic driver

From: Shuah Khan
Date: Tue Aug 13 2019 - 19:22:54 EST


On 8/13/19 6:36 AM, Laurent Pinchart wrote:
Hi Helen,

On Tue, Aug 13, 2019 at 09:25:01AM -0300, Helen Koike wrote:
On 8/13/19 6:56 AM, Laurent Pinchart wrote:
On Mon, Aug 12, 2019 at 08:41:33PM -0300, Helen Koike wrote:
On 8/12/19 7:14 PM, Shuah Khan wrote:
On 8/12/19 1:10 PM, Shuah Khan wrote:
On 8/12/19 12:52 PM, Andrà Almeida wrote:
On 8/12/19 11:08 AM, Shuah Khan wrote:
On 8/9/19 9:51 PM, Helen Koike wrote:
On 8/9/19 9:24 PM, Andrà Almeida wrote:
On 8/9/19 9:17 PM, Shuah Khan wrote:
On 8/9/19 5:52 PM, Andrà Almeida wrote:
On 8/9/19 6:45 PM, Shuah Khan wrote:
vimc uses Component API to split the driver into functional
components.
The real hardware resembles a monolith structure than component and
component structure added a level of complexity making it hard to
maintain without adding any real benefit.
ÂÂÂÂÂÂ The sensor is one vimc component that would makes sense to be a
separate
module to closely align with the real hardware. It would be easier to
collapse vimc into single monolithic driver first and then split the
sensor off as a separate module.

This patch series emoves the component API and makes minimal
changes to
the code base preserving the functional division of the code
structure.
Preserving the functional structure allows us to split the sensor off
as a separate module in the future.

Major design elements in this change are:
ÂÂÂÂÂÂ - Use existing struct vimc_ent_config and struct
vimc_pipeline_config
ÂÂÂÂÂÂÂÂ to drive the initialization of the functional components.
ÂÂÂÂÂÂ - Make vimc_ent_config global by moving it to vimc.h
ÂÂÂÂÂÂ - Add two new hooks add and rm to initialize and register,
unregister
ÂÂÂÂÂÂÂÂ and free subdevs.
ÂÂÂÂÂÂ - All component API is now gone and bind and unbind hooks are
modified
ÂÂÂÂÂÂÂÂ to do "add" and "rm" with minimal changes to just add and rm
subdevs.
ÂÂÂÂÂÂ - vimc-core's bind and unbind are now register and unregister.
ÂÂÂÂÂÂ - vimc-core invokes "add" hooks from its
vimc_register_devices().
ÂÂÂÂÂÂÂÂ The "add" hooks remain the same and register subdevs. They
don't
ÂÂÂÂÂÂÂÂ create platform devices of their own and use vimc's
pdev.dev as
ÂÂÂÂÂÂÂÂ their reference device. The "add" hooks save their
vimc_ent_device(s)
ÂÂÂÂÂÂÂÂ in the corresponding vimc_ent_config.
ÂÂÂÂÂÂ - vimc-core invokes "rm" hooks from its unregister to
unregister
subdevs
ÂÂÂÂÂÂÂÂ and cleanup.
ÂÂÂÂÂÂ - vimc-core invokes "add" and "rm" hooks with pointer to struct
vimc_device
ÂÂÂÂÂÂÂÂ and the corresponding struct vimc_ent_config pointer.
ÂÂÂÂÂÂ The following configure and stream test works on all devices.
ÂÂÂÂÂÂ ÂÂÂÂ media-ctl -d platform:vimc -V '"Sensor
A":0[fmt:SBGGR8_1X8/640x480]'
ÂÂÂÂÂÂ media-ctl -d platform:vimc -V '"Debayer
A":0[fmt:SBGGR8_1X8/640x480]'
ÂÂÂÂÂÂ media-ctl -d platform:vimc -V '"Sensor
B":0[fmt:SBGGR8_1X8/640x480]'
ÂÂÂÂÂÂ media-ctl -d platform:vimc -V '"Debayer
B":0[fmt:SBGGR8_1X8/640x480]'
ÂÂÂÂÂÂ ÂÂÂÂ v4l2-ctl -z platform:vimc -d "RGB/YUV Capture" -v
width=1920,height=1440
ÂÂÂÂÂÂ v4l2-ctl -z platform:vimc -d "Raw Capture 0" -v
pixelformat=BA81
ÂÂÂÂÂÂ v4l2-ctl -z platform:vimc -d "Raw Capture 1" -v
pixelformat=BA81
ÂÂÂÂÂÂ ÂÂÂÂ v4l2-ctl --stream-mmap --stream-count=100 -d /dev/video1
ÂÂÂÂÂÂ v4l2-ctl --stream-mmap --stream-count=100 -d /dev/video2
ÂÂÂÂÂÂ v4l2-ctl --stream-mmap --stream-count=100 -d /dev/video3

The third patch in the series fixes a general protection fault found
when rmmod is done while stream is active.

I applied your patch on top of media_tree/master and I did some
testing.
Not sure if I did something wrong, but just adding and removing the
module generated a kernel panic:

Thanks for testing.

Odd. I tested modprobe and rmmod both.I was working on Linux 5.3-rc2.
I will apply these to media latest and work from there. I have to
rebase these on top of the reverts from Lucas and Helen

Ok, please let me know if I succeeded to reproduce.


~# modprobe vimc
~# rmmod vimc
[ÂÂ 16.452974] stack segment: 0000 [#1] SMP PTI
[ÂÂ 16.453688] CPU: 0 PID: 2038 Comm: rmmod Not tainted 5.3.0-rc2+ #36
[ÂÂ 16.454678] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.12.0-20181126_142135-anatol 04/01/2014
[ÂÂ 16.456191] RIP: 0010:kfree+0x4d/0x240

<registers values...>

[ÂÂ 16.469188] Call Trace:
[ÂÂ 16.469666]Â vimc_remove+0x35/0x90 [vimc]
[ÂÂ 16.470436]Â platform_drv_remove+0x1f/0x40
[ÂÂ 16.471233]Â device_release_driver_internal+0xd3/0x1b0
[ÂÂ 16.472184]Â driver_detach+0x37/0x6b
[ÂÂ 16.472882]Â bus_remove_driver+0x50/0xc1
[ÂÂ 16.473569]Â vimc_exit+0xc/0xca0 [vimc]
[ÂÂ 16.474231]Â __x64_sys_delete_module+0x18d/0x240
[ÂÂ 16.475036]Â do_syscall_64+0x43/0x110
[ÂÂ 16.475656]Â entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ÂÂ 16.476504] RIP: 0033:0x7fceb8dafa4b

<registers values...>

[ÂÂ 16.484853] Modules linked in: vimc(-) videobuf2_vmalloc
videobuf2_memops v4l2_tpg videobuf2_v4l2 videobuf2_common
[ÂÂ 16.486187] ---[ end trace 91e5e0894e254d49 ]---
[ÂÂ 16.486758] RIP: 0010:kfree+0x4d/0x240

<registers values...>

fish: ârmmod vimcâ terminated by signal SIGSEGV (Address boundary
error)

I just added the module after booting, no other action was made.
Here is
how my `git log --oneline` looks like:

897d708e922b media: vimc: Fix gpf in rmmod path when stream is active
2e4a5ad8ad6d media: vimc: Collapse component structure into a single
monolithic driver
7c8da1687e92 media: vimc: move private defines to a common header
97299a303532 media: Remove dev_err() usage after platform_get_irq()
25a3d6bac6b9 media: adv7511/cobalt: rename driver name to adv7511-v4l2

I couldn't reproduce the error, my tree looks the same:

[I] koike@floko ~/m/o/linux> git log --oneline
e3345155c8ed (HEAD) media: vimc: Fix gpf in rmmod path when stream is
active
43e9e2fe761f media: vimc: Collapse component structure into a single
monolithic driver
8a6d0b9adde0 media: vimc: move private defines to a common header
97299a303532 (media/master) media: Remove dev_err() usage after
platform_get_irq()
25a3d6bac6b9 media: adv7511/cobalt: rename driver name to adv7511-v4l2

Thanks Helen for trying to reproduce and sharing the result.

Me and Helen found out what is the problem. If you follow this call trace:

vimc_ent_sd_unregister()
v4l2_device_unregister_subdev()
v4l2_subdev_release()

You'll notice that this last function calls the `release` callback
implementation of the subdevice. For instance, the `release` of
vimc-sensor is this one:

static void vimc_sen_release(struct v4l2_subdev *sd)
{
ÂÂÂÂstruct vimc_sen_device *vsen =
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ container_of(sd, struct vimc_sen_device, sd);

ÂÂÂÂv4l2_ctrl_handler_free(&vsen->hdl);
ÂÂÂÂtpg_free(&vsen->tpg);
ÂÂÂÂkfree(vsen);
}

And then you can see that `vsen` has been freed. Back to
vimc_ent_sd_unregister(), after v4l2_device_unregister_subdev(), the
function will call vimc_pads_cleanup(). This is basically a
kfree(ved->pads), but `ved` has just been freed at
v4l2_subdev_release(), producing a memory fault.

To fix that, we found two options:

- place the kfree(ved->pads) inside the release callback of each
subdevice and removing vimc_pads_cleanup() from
vimc_ent_sd_unregister()
- use a auxiliary variable to hold the address of the pads, for instance:

void vimc_ent_sd_unregister(...)
{
ÂÂÂÂ struct media_pad *pads = ved->pads;
ÂÂÂÂ ...
ÂÂÂÂ vimc_pads_cleanup(pads);
}



I fixed a problem in the thirds patch. vimc-capture uses the first
approach - placing the kfree(ved->pads) inside the release callback.

I am debugging another such problem in unbind path while streaming.
I am working on v2 and I will look for the rmmod problem and fix it.

thanks again for testing and finding the root cause.
-- Shuah

Hi Andre,

Here is what's happening.

Before this change, you can't really do rmmod vimc, because vimc is in
use by other component drivers. With the collapse, now you can actually
do rmmod on vimc and this problem in vimc_ent_sd_unregister() that frees
pads first and the does v4l2_device_unregister_subdev().

I fixed this in the 3/3 patch. I can reproduce the problem with patches 1 and 2, and patch 3 fixes it.

Did you test with the third patch in this series?

yes, we tested with 3/3, but the new problem now is when doing the following
in this order:

v4l2_device_unregister_subdev(sd);
vimc_pads_cleanup(ved->pads);


v4l2_device_unregister_subdev() calls the release function of the subdevice that
frees the ved object, so ved->pads is not valid anymore. That is why Andrà suggested
a temporary variable to hold ved->pads and to be able to free it later:

struct media_pad *pads = ved->pads;

v4l2_device_unregister_subdev(sd);
vimc_pads_cleanup(pads); // So we don't use the ved object here anymore.

Can't you simply call vimc_pads_cleanup() in the release function of the
subdevice before freeing the ved object ?

Yes we can, that is the other option Andre suggested.
The advantage of doing it inside vimc_ent_sd_unregister() is that
who allocated the memory in the first place was vimc_ent_sd_register(), and also,
this is a common code to all subdevs, so letting it in vimc_ent_sd_unregister()
will make sure no subdevs ever forget to free this memory.
But saving the pointer to a variable to free it later is not that pretty
either.

The release function of a subdevice is meant to be used as a destructor,
it should free all the resources allocated for the subdevice. Your
subdevices (such as vimc_sca_device) are essentially objects derived
from a common vimc subdevice object (vimc_ent_device), I thus suggest
providing four functions to deal with vimc_ent_device:

- vimc_ent_sd_init() to initialise the base vimc_ent_device object (most
of the code is currently in vimc_ent_sd_register()), to be called by
the init function of each subdevice type (the bind functions in the
current code)

- vimc_ent_sd_register() to only register the vimc_ent_device once it
has been fully initialised

- vimc_end_sd_unregister() to unregister the vimc_ent_device

- vimc_end_sd_cleanup() to free the resources allocated by
vimc_ent_sd_init() (including calling vimc_pads_cleanup()), to be
called by the subdevice release handler


We kind of mostly there. We have release handlers free'ing resources
with an exception of pads. I changed the release handlers to release
pads and with a comment where pads_init is called to say that pads
should be released from the release handler.

vimc-capture does this already. I changed the rest. rmmod works with
no issues now. rmmod when streaming comes back with in use.

There is one outstanding issue remaining in unbind path. I will fix
that in a separate patch. I decided to collapse the first two patches
into one and added common defines (vimc_device and vimc_ent_config) to
vimc-common.h based on our discussion.

I will send the patch series out soon.

thanks,
-- Shuah



thanks,
-- Shuah