Re: mainline build failure due to cf21f328fcaf ("media: nxp: Add i.MX8 ISI driver")

From: Mauro Carvalho Chehab
Date: Wed May 10 2023 - 04:05:41 EST


Hi Linus,

On Mon, 8 May 2023 09:27:28 -0700
Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Mon, May 8, 2023 at 3:55 AM Linux regression tracking #adding
> (Thorsten Leemhuis) <regressions@xxxxxxxxxxxxx> wrote:
> >
> > Thanks for the report. The fixes (see the mail from Laurent) apparently
> > are still not mainlined (or am I missing something?), so let me add this
> > report to the tracking to ensure this is not forgotten:
>
> Gaah. I was intending to apply the patch directly before rc1, but then
> I forgot about this issue.
>
> Mauro: I'm currently really *really* fed up with the media tree. This
> exact same thing happened last merge window, where the media tree
> caused pointless build errors, and it took way too long to get the
> fixes the proper ways.
>
> If something doesn't even build, it should damn well be fixed ASAP.
>
> Last release it was imx290.c and PM support being disabled, and I had
> to apply the fix manually because it continued to not come in the
> proper way.
>
> See commit 7b50567bdcad ("media: i2c: imx290: fix conditional function
> defintions").
>
> But also see commit b928db940448 ("media: i2c: imx290: fix conditional
> function definitions"), which you *did* commit, but note this on that
> commit:
>
> AuthorDate: Tue Feb 7 17:13
> CommitDate: Sat Mar 18 08:44
>
> so it took you a MONTH AND A HALF to react to a build failure.
>
> And see this:
>
> git name-rev b928db940448
> b928db940448 tags/v6.4-rc1~161^2~458
>
> ie that build fix that you finally committed came in *AFTER* the 6.3
> release, even though the bug it fixes was introduced in the 6.3 merge
> window:
>
> git name-rev 02852c01f654
> 02852c01f654 tags/v6.3-rc1~72^2~2^2~193
>
> and now we're in the *EXACT*SAME* situation, with me applying a build
> fix directly, because you couldn't get it fixed in a timely manner.

Sorry for the mess. I'll work on improving the process to keep this
from happening again.

FYI, in order to reduce build issues, we have a Jenkins instance
running gcc and clang builds on the media stage tree, before patches are
even merged into the main media development tree. The builds use
allyesconfig on x86_64, with W=1:

https://builder.linuxtv.org/job/media_stage_clang/
https://builder.linuxtv.org/job/media_stage_gcc/

Another CI job tests for bisect breakages as I receive pull requests,
applying the patches one by one and building with both allyesconfig and
allmodconfig, also on x86_64 with W=1:

https://builder.linuxtv.org/job/patchwork/

The rule is not to merge anything into the media tree if any of those
jobs fails. I also fast-track patches whose subject says they fix a
build failure.

To help with that, under normal circumstances I take about one week to
merge patches from media_stage into media_tree, rebasing media_stage if
needed to avoid git bisect build breakages in media_tree (which is where
I send my update PRs to you from).

Unfortunately, we currently don't have the resources to run multiple
randconfig builds on Jenkins, as the build machines on the server are
very slow. Still, I'll add a build with CONFIG_PM disabled to the test
set, as it seems to be a recurrent source of trouble these days. I'll
also try to identify a couple of other randconfigs that would help catch
problems like this earlier. If any other problematic Kconfig variables
come to mind, please feel free to suggest them for us to add to the CI
automation.
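
As a rough sketch of what a CONFIG_PM-disabled build could look like
(this is not the actual Jenkins job; the fragment path and the helper
function name are made up for the example, and it assumes it is run from
the top of a kernel tree):

```shell
#!/bin/sh
# Sketch only: FRAGMENT's default path and write_pm_off_fragment are
# hypothetical names chosen for this example.
FRAGMENT="${FRAGMENT:-/tmp/media-pm-off.config}"

# Write a Kconfig fragment that asks for CONFIG_PM to be disabled.
write_pm_off_fragment() {
    printf '# CONFIG_PM is not set\n' > "$1"
}

write_pm_off_fragment "$FRAGMENT"

# Only attempt the build when run from the top of a kernel tree.
if [ -f Kconfig ] && [ -f Makefile ]; then
    # KCONFIG_ALLCONFIG makes allyesconfig start from the preset values
    # in the fragment instead of forcing everything to "y".
    KCONFIG_ALLCONFIG="$FRAGMENT" make allyesconfig

    # Check that the option really stayed off before spending time on
    # the build.
    grep -q '^# CONFIG_PM is not set' .config || exit 1

    # Build only the media drivers, with extra warnings enabled.
    make W=1 drivers/media/
fi
```

If some enabled option selects PM, Kconfig may turn it back on, which is
why the sketch checks .config before starting the build.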

-

In the specific case of this fixup patch, I didn't identify it as a build
fix, so it followed the usual workflow. We have a huge number of patches
for media, and it usually takes some time to handle all of them. This one
just followed the normal flow, as it neither broke the Jenkins builds nor
mentioned build breakage in its subject.

Regards,
Mauro