Re: [PATCH v7 45/49] media: core: Add bitmap manage bufs array entries

From: Benjamin Gaignard
Date: Thu Sep 21 2023 - 15:05:09 EST



Le 21/09/2023 à 14:46, Benjamin Gaignard a écrit :

Le 21/09/2023 à 14:13, Hans Verkuil a écrit :
On 21/09/2023 14:05, Benjamin Gaignard wrote:
Le 21/09/2023 à 12:24, Hans Verkuil a écrit :
On 21/09/2023 11:28, Benjamin Gaignard wrote:
Le 20/09/2023 à 16:56, Hans Verkuil a écrit :
On 20/09/2023 16:30, Benjamin Gaignard wrote:
<snip>

         num_buffers = min_t(unsigned int, num_buffers,
                     q->max_allowed_buffers - vb2_get_num_buffers(q));
     -    first_index = vb2_get_num_buffers(q);
+    first_index = bitmap_find_next_zero_area(q->bufs_map, q->max_allowed_buffers,
+                         0, num_buffers, 0);
           if (first_index >= q->max_allowed_buffers)
             return 0;
@@ -675,7 +678,13 @@ static void __vb2_queue_free(struct vb2_queue *q, unsigned int buffers)
       struct vb2_buffer *vb2_get_buffer(struct vb2_queue *q, unsigned int index)
     {
-    if (index < q->num_buffers)
+    if (!q->bufs_map || !q->bufs)
+        return NULL;
I don't think this can ever happen.
I got kernel crash without them.
I will keep them.
What is the backtrace? How can this happen? It feels wrong that this can be
called with a vb2_queue that apparently is not properly initialized.
I have this log when adding dump_stack() in vb2_get_buffer() if !q->bufs_bitmap:

[   18.924627] Call trace:
[   18.927090]  dump_backtrace+0x94/0xec
[   18.930787]  show_stack+0x18/0x24
[   18.934137]  dump_stack_lvl+0x48/0x60
[   18.937833]  dump_stack+0x18/0x24
[   18.941166]  __vb2_queue_cancel+0x23c/0x2f0
[   18.945365]  vb2_core_queue_release+0x24/0x6c
[   18.949740]  vb2_queue_release+0x10/0x1c
[   18.953677]  v4l2_m2m_ctx_release+0x20/0x40
[   18.957892]  hantro_release+0x20/0x54
[   18.961584]  v4l2_release+0x74/0xec
[   18.965110]  __fput+0xb4/0x274
[   18.968205]  __fput_sync+0x50/0x5c
[   18.971626]  __arm64_sys_close+0x38/0x7c
[   18.975562]  invoke_syscall+0x48/0x114
[   18.979329]  el0_svc_common.constprop.0+0xc0/0xe0
[   18.984068]  do_el0_svc+0x1c/0x28
[   18.987402]  el0_svc+0x40/0xe8
[   18.990470]  el0t_64_sync_handler+0x100/0x12c
[   18.994842]  el0t_64_sync+0x190/0x194

This happen at boot time when hantro driver is open and close without other actions.
Ah, now I see the problem. q->bufs and q->bufs_map are allocated in
vb2_core_create_bufs and vb2_core_reqbufs, but they should be allocated
in vb2_queue_init: that's the counterpart of vb2_core_queue_release.

Hans,
I think we are doing loops in your comment :-)
https://patchwork.kernel.org/comment/25496456/

Regards,
Benjamin


With that change you shouldn't have to check for q->bufs/bufs_map anymore.
It is a better solution but even like this vb2_core_queue_release() is called
at least 2 times on the same vivid queue and without testing q->bufs_bitmap
makes kernel crash.
Do you have a stacktrace for that? Perhaps vb2_core_queue_release should check
for q->bufs/q->bufs_map and return if those are NULL. But it could also be a
bug that it is called twice, it just was never noticed because it was harmless
before.

I have added some printk to log that when running test-media on vivid:

[  130.497426] vb2_core_queue_init queue cap-0000000050d195ab allocate q->bufs 00000000dc2c15ed and q->bufs_bitmap 000000008173fc5a
...
[  130.733967] vb2_core_queue_release queue cap-0000000050d195ab release q->bufs and q->bufs_bitmap
[  133.866345] vb2_get_buffer queue cap-0000000050d195ab q->bufs_bitmap is NULL
[  133.873454] CPU: 1 PID: 321 Comm: v4l2-ctl Not tainted 6.6.0-rc1+ #542
[  133.879997] Hardware name: NXP i.MX8MQ EVK (DT)
[  133.884536] Call trace:
[  133.886988]  dump_backtrace+0x94/0xec
[  133.890673]  show_stack+0x18/0x24
[  133.894002]  dump_stack_lvl+0x48/0x60
[  133.897681]  dump_stack+0x18/0x24
[  133.901009]  __vb2_queue_cancel+0x250/0x31c
[  133.905209]  vb2_core_queue_release+0x24/0x88
[  133.909580]  _vb2_fop_release+0xb0/0xbc
[  133.913428]  vb2_fop_release+0x2c/0x58
[  133.917187]  vivid_fop_release+0x80/0x388 [vivid]
[  133.921948]  v4l2_release+0x74/0xec
[  133.925452]  __fput+0xb4/0x274
[  133.928520]  __fput_sync+0x50/0x5c
[  133.931934]  __arm64_sys_close+0x38/0x7c
[  133.935868]  invoke_syscall+0x48/0x114
[  133.939630]  el0_svc_common.constprop.0+0x40/0xe0
[  133.944349]  do_el0_svc+0x1c/0x28
[  133.947677]  el0_svc+0x40/0xe8
[  133.950741]  el0t_64_sync_handler+0x100/0x12c
[  133.955109]  el0t_64_sync+0x190/0x194

and later I have a call to reqbufs on the same queue without call to vb2_core_queue_init before

[   58.696812] __vb2_queue_alloc queue cap- 0000000050d195abq->bufs_bitmap is NULL
[   58.704148] CPU: 1 PID: 319 Comm: v4l2-compliance Not tainted 6.6.0-rc1+ #544
[   58.711291] Hardware name: NXP i.MX8MQ EVK (DT)
[   58.715826] Call trace:
[   58.718274]  dump_backtrace+0x94/0xec
[   58.721951]  show_stack+0x18/0x24
[   58.725274]  dump_stack_lvl+0x48/0x60
[   58.728946]  dump_stack+0x18/0x24
[   58.732268]  __vb2_queue_alloc+0x4a8/0x50c
[   58.736374]  vb2_core_reqbufs+0x274/0x46c
[   58.740391]  vb2_ioctl_reqbufs+0xb0/0xe8
[   58.744320]  vidioc_reqbufs+0x50/0x64 [vivid]
[   58.748717]  v4l_reqbufs+0x50/0x64
[   58.752125]  __video_do_ioctl+0x164/0x3c8
[   58.756140]  video_usercopy+0x200/0x668
[   58.759982]  video_ioctl2+0x18/0x28
[   58.763475]  v4l2_ioctl+0x40/0x60
[   58.766798]  __arm64_sys_ioctl+0xac/0xf0
[   58.770730]  invoke_syscall+0x48/0x114
[   58.774487]  el0_svc_common.constprop.0+0x40/0xe0
[   58.779199]  do_el0_svc+0x1c/0x28
[   58.782520]  el0_svc+0x40/0xe8
[   58.785580]  el0t_64_sync_handler+0x100/0x12c
[   58.789942]  el0t_64_sync+0x190/0x194


Regards,

    Hans

Regards,

     Hans

+
+    return (bitmap_weight(q->bufs_map, q->max_allowed_buffers) > 0);
How about:

       return vb2_get_num_buffers(q) > 0;
vb2_get_num_buffers is defined in videobuf2-core.c, I'm not sure that
an inline function could depend of a module function.
Not a problem. E.g. v4l2-ctrls.h is full of such static inlines.

Regards,

      Hans