Re: [PATCH] media: verisilicon: Additional fix for the crash when opening the driver

From: Nicolas Dufresne
Date: Tue May 23 2023 - 19:22:31 EST


Le mardi 23 mai 2023 à 14:36 -0300, Ezequiel Garcia a écrit :
> Hi guys,
>
> After reviewing the format logic (hantro_reset_encoded_fmt and
> hantro_reset_raw_fmt).
> It seems to me trying to support Decoders, Encoders and so many
> different SoC Variants, is getting increasingly fragile.
> This driver is becoming a big fat monolith. Regressions like this will
> be increasingly frequent.
>
> The only codec that supports encoding right now is JPEG, so I think
> it's a good idea to remove it for good,
> and split it to its own driver.
>
> Anyone volunteering? :-)

We won't have that luxury with VP8 and H.264, as the decoder and encoder shares
the same cache memory. They must be time sliced. Note that this driver is only
missing VP8/H.264 encoding before it becomes maintenance only (there won't be
any interesting feature left, so I would not start on big refactoring, as this
may cause more trouble then good. Anything newer like VC8000 or VC9000 should be
a new driver, and with encoder/decoder split.

regards,
Nicolas

p.s. this is my personal opinion, in general, we should improve the helpers if
there is too much boilerplate, rather then creating monolithic drivers, and on
that, I believe I agree, but the H1/G1 combo have hardware dependencies which
has been solve that way, and changing that now is a big amount of work for a
relative quite driver. Feel free to split G2 away from that driver, that would
make sense, its not sharing anything.

>
> Thanks,
> Ezequiel
>
> On Tue, May 23, 2023 at 2:06 PM Michael Tretter
> <m.tretter@xxxxxxxxxxxxxx> wrote:
> >
> > On Tue, 23 May 2023 18:36:09 +0200, Benjamin Gaignard wrote:
> > >
> > > Le 23/05/2023 à 18:25, Benjamin Gaignard a écrit :
> > > > This fixes the following issue observed on Odroid-M1 board:
> > > >
> > > > Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
> > > > Mem abort info:
> > > > ...
> > > > Modules linked in: crct10dif_ce hantro_vpu snd_soc_simple_card snd_soc_simple_card_utils v4l2_vp9 v4l2_h264 rockchip_saradc v4l2_mem2mem videobuf2_dma_contig videobuf2_memops rtc_rk808 videobuf2_v4l2 industrialio_triggered_buffer rockchip_thermal dwmac_rk stmmac_platform stmmac videodev kfifo_buf display_connector videobuf2_common pcs_xpcs mc rockchipdrm analogix_dp dw_mipi_dsi dw_hdmi drm_display_helper panfrost drm_shmem_helper gpu_sched ip_tables x_tables ipv6
> > > > CPU: 3 PID: 176 Comm: v4l_id Not tainted 6.3.0-rc7-next-20230420 #13481
> > > > Hardware name: Hardkernel ODROID-M1 (DT)
> > > > pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > > > pc : hantro_try_fmt+0xa0/0x278 [hantro_vpu]
> > > > lr : hantro_try_fmt+0x94/0x278 [hantro_vpu]
> > > > ...
> > > > Call trace:
> > > > hantro_try_fmt+0xa0/0x278 [hantro_vpu]
> > > > hantro_set_fmt_out+0x3c/0x298 [hantro_vpu]
> > > > hantro_reset_raw_fmt+0x98/0x128 [hantro_vpu]
> > > > hantro_set_fmt_cap+0x240/0x254 [hantro_vpu]
> > > > hantro_reset_encoded_fmt+0x94/0xcc [hantro_vpu]
> > > > hantro_reset_fmts+0x18/0x38 [hantro_vpu]
> > > > hantro_open+0xd4/0x20c [hantro_vpu]
> > > > v4l2_open+0x80/0x120 [videodev]
> > > > chrdev_open+0xc0/0x22c
> > > > do_dentry_open+0x13c/0x48c
> > > > vfs_open+0x2c/0x38
> > > > path_openat+0x550/0x934
> > > > do_filp_open+0x80/0x12c
> > > > do_sys_openat2+0xb4/0x168
> > > > __arm64_sys_openat+0x64/0xac
> > > > invoke_syscall+0x48/0x114
> > > > el0_svc_common+0x100/0x120
> > > > do_el0_svc+0x3c/0xa8
> > > > el0_svc+0x40/0xa8
> > > > el0t_64_sync_handler+0xb8/0xbc
> > > > el0t_64_sync+0x190/0x194
> > > > Code: 97fc8a7f f940aa80 52864a61 72a686c1 (b9400800)
> > > > ---[ end trace 0000000000000000 ]---
> > > >
> > > > Fixes: db6f68b51e5c ("media: verisilicon: Do not set context src/dst formats in reset functions")
> >
> > This patch partially reverts the previous commit. I wonder whether the reason
> > for resetting the context format only if the targeted queue is not busy still
> > stands.
> >
> > > >
> > > > Signed-off-by: Benjamin Gaignard <benjamin.gaignard@xxxxxxxxxxxxx>
> >
> > Tested-by: Michael Tretter <m.tretter@xxxxxxxxxxxxxx>
> >
> > > > ---
> > >
> > > Diederick, Marek, Michael,
> > > I have tested this patch on my boards and I see no regressions on
> > > decoder part and no more crash when probing the encoder.
> > > Could you test it on your side to confirm it is ok ?
> > >
> > > Thorsten, I try/test regzbot commands, please tell me if it is correct.
> > >
> > > #regzbot ^introduced db6f68b51e5c
> > > #regzbot title media: verisilicon: null pointer dereference in try_fmt
> > > #regzbot ignore-activity
> > >
> > >
> > > > drivers/media/platform/verisilicon/hantro_v4l2.c | 6 ++++--
> > > > 1 file changed, 4 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/drivers/media/platform/verisilicon/hantro_v4l2.c b/drivers/media/platform/verisilicon/hantro_v4l2.c
> > > > index 835518534e3b..61cfaaf4e927 100644
> > > > --- a/drivers/media/platform/verisilicon/hantro_v4l2.c
> > > > +++ b/drivers/media/platform/verisilicon/hantro_v4l2.c
> > > > @@ -397,10 +397,12 @@ hantro_reset_raw_fmt(struct hantro_ctx *ctx, int bit_depth)
> > > > if (!raw_vpu_fmt)
> > > > return -EINVAL;
> > > > - if (ctx->is_encoder)
> > > > + if (ctx->is_encoder) {
> > > > encoded_fmt = &ctx->dst_fmt;
> > > > - else
> > > > + ctx->vpu_src_fmt = raw_vpu_fmt;
> > > > + } else {
> > > > encoded_fmt = &ctx->src_fmt;
> > > > + }
> > > > hantro_reset_fmt(&raw_fmt, raw_vpu_fmt);
> > > > raw_fmt.width = encoded_fmt->width;
> > >