Re: [PATCH v4] Makefile.compiler: replace cc-ifversion with compiler-specific macros

From: Shreeya Patel
Date: Mon Jun 12 2023 - 06:35:08 EST


Hi Masahiro,


On 24/05/23 02:57, Nick Desaulniers wrote:
On Tue, May 23, 2023 at 3:27 AM Shreeya Patel
<shreeya.patel@xxxxxxxxxxxxx> wrote:
Hi Nick and Masahiro,

On 23/05/23 01:22, Nick Desaulniers wrote:
On Mon, May 22, 2023 at 9:52 AM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
On Mon, May 22, 2023 at 12:09:34PM +0200, Ricardo Cañuelo wrote:
On Fri, May 19 2023 at 08:57:24, Nick Desaulniers <ndesaulniers@xxxxxxxxxx> wrote:
It could be; if the link order was changed, it's possible that this
target may be hitting something along the lines of:
https://isocpp.org/wiki/faq/ctors#static-init-order i.e. the "static
initialization order fiasco"

I'm struggling to think of how this appears in C codebases, but I
swear years ago I had a discussion with GKH (maybe?) about this. I
think I was playing with converting Kbuild to use Ninja rather than
Make; the resulting kernel image wouldn't boot because I had modified
the order the object files were linked in. If you were to randomly
shuffle the object files in the kernel, I recall some hazard that may
prevent boot.
I thought that was specifically a C++ problem? But then again, the
kernel docs explicitly say that the ordering of obj-y goals in kbuild is
significant in some instances [1]:
Yes, it matters, you can not change it. If you do, systems will break.
It is the only way we have of properly ordering our init calls within
the same "level".
Ah, right it was the initcall ordering. Thanks for the reminder.

(There's a joke in there similar to the use of regexes to solve a
problem resulting in two new problems; initcalls have levels for
ordering, but we still have (unexpressed) dependencies between calls
of the same level; brittle!).
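
As a rough illustration of the point above, here is a toy model (not kernel code) of how the run order could fall out of level plus link order. It assumes initcalls run level by level and, within one level, in the order their objects were linked. All function names and level numbers below are made up for the example.

```python
from typing import List, Tuple

# (function name, initcall level), listed in link order.
# Names and levels are hypothetical, chosen only for illustration.
Initcall = Tuple[str, int]


def run_order(linked: List[Initcall]) -> List[str]:
    """Return names in the order they would run: by level, then link order."""
    # sorted() is stable, so entries with the same level keep their link order.
    return [name for name, _level in sorted(linked, key=lambda c: c[1])]


# Same set of initcalls, but two objects of the same level swapped at link time.
link_a = [("bus_init", 4), ("driver_init", 6), ("helper_init", 6)]
link_b = [("bus_init", 4), ("helper_init", 6), ("driver_init", 6)]
```

If driver_init silently depends on helper_init having run first, link_b works and link_a does not, although nothing in the level numbers says so.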

+Maksim, since that might be relevant info for the BOLT+Kernel work.

Ricardo,
https://elinux.org/images/e/e8/2020_ELCE_initcalls_myjosserand.pdf
mentions that there's a kernel command line param `initcall_debug`.
Perhaps that can be used to see if
5750121ae7382ebac8d47ce6d68012d6cd1d7926 somehow changed initcall
ordering, resulting in a config that cannot boot?

Here are the links to the LAVA jobs run with initcall_debug added to the
kernel command line.

1. Where regression happens (5750121ae7382ebac8d47ce6d68012d6cd1d7926)
https://lava.collabora.dev/scheduler/job/10417706

2. With a revert of the commit 5750121ae7382ebac8d47ce6d68012d6cd1d7926
https://lava.collabora.dev/scheduler/job/10418012
Thanks!

Yeah, I can see a diff in the initcall ordering as a result of
commit 5750121ae738 ("kbuild: list sub-directories in ./Kbuild")

https://gist.github.com/nickdesaulniers/c09db256e42ad06b90842a4bb85cc0f4

Not just different orderings, but some initcalls seem to appear only in
the before log or only in the after log, which is troubling (for example,
init_events and init_fs_sysctls, respectively).
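
For anyone who wants to repeat this kind of comparison on their own boot logs, a minimal sketch follows. It assumes the logs contain the "calling  <symbol>+0x.../0x... @ <pid>" lines that initcall_debug prints. The helper names (initcall_order, compare) are hypothetical, and this is not necessarily how the gist above was produced.

```python
import re
from typing import List, Set, Tuple

# Matches lines such as "[    0.12] calling  init_events+0x0/0x50 @ 1".
# The matching "initcall ... returned ..." lines are deliberately not matched.
CALLING_RE = re.compile(r"calling\s+([A-Za-z_][\w.]*)\+0x")


def initcall_order(log_text: str) -> List[str]:
    """Extract initcall symbol names in the order they were called."""
    return CALLING_RE.findall(log_text)


def compare(before: str, after: str) -> Tuple[Set[str], Set[str], bool]:
    """Return (only in before, only in after, whether shared calls moved)."""
    a, b = initcall_order(before), initcall_order(after)
    common = set(a) & set(b)
    # Compare relative order using only the initcalls present in both logs.
    reordered = [n for n in a if n in common] != [n for n in b if n in common]
    return set(a) - set(b), set(b) - set(a), reordered
```

Usage would be along the lines of compare(open("bad.log").read(), open("good.log").read()), where the file names are placeholders for the two saved boot logs.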

That isn't conclusive evidence that changes to initcall ordering are
to blame, but I suspect that confirming it precisely would be very
time-consuming.

Masahiro, what are your thoughts on reverting 5750121ae738? There are
conflicts in Kbuild and Makefile when reverting 5750121ae738 on
mainline.

I'm not sure whether you followed the conversation, but we are still seeing
this regression with the latest kernel builds. Do you plan to revert
5750121ae738?


Thanks,
Shreeya Patel
