Re: [PATCH] ARM: do not assemble iwmmxt.S with LLVM toolchain

From: Nick Desaulniers
Date: Mon Apr 13 2020 - 16:45:46 EST


On Fri, Apr 10, 2020 at 6:09 AM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
>
> On Fri, 10 Apr 2020 at 14:33, Russell King - ARM Linux admin
> <linux@xxxxxxxxxxxxxxx> wrote:
> >
> > On Fri, Apr 10, 2020 at 01:15:08PM +0200, Ard Biesheuvel wrote:
> > > On Fri, 10 Apr 2020 at 11:56, Arnd Bergmann <arnd@xxxxxxxx> wrote:
> > > >
> > > > On Fri, Apr 10, 2020 at 1:28 AM Jian Cai <caij2003@xxxxxxxxx> wrote:
> > > > >
> > > > > iwmmxt.S contains XScale instructions LLVM ARM backend does not support.
> > > > > Skip this file if LLVM integrated assemmbler or LLD is used to build ARM
> > > > > kernel.
> > > > >
> > > > > Signed-off-by: Jian Cai <caij2003@xxxxxxxxx>
> > > >
> > > > It clearly makes sense to limit the Kconfig option to compilers that
> > > > can actually build it.
> > > > A few questions though:
> > > >
> > > > - Given that Armada XP with its PJ4B was still marketed until fairly
> > > > recently[1],
> > > > wouldn't it make sense to still add support for it? Is it a lot of work?
> > > >
> > >
> > > The part of that file that the assembler chokes on hasn't been touched
> > > by anyone since Nico added it 15+ years ago. It can only be built in
> > > ARM mode, and it disassembles to the sequence below (the ld/st fe/fp
> > > mnemonics are not document in recent versions of the ARM ARM, and
> > > aren't understood by Clang either)
> >
> > For older CPUs, it doesn't matter what the latest ARM ARM says, the
> > appropriate version of the ARM ARM is the one relevant for the CPU
> > architecture. This is a mistake frequently made, and it's been pointed
> > out by Arm Ltd in the past (before ARMv6 even came on the scene) that
> > keeping older revisions is necessary if you want to be interested in
> > the older architectures.
> >
> > However, there's an additional complication here: DEC's license from
> > Arm Ltd back in the days of StrongARM allowed them to make changes to
> > the architecture - that was passed over to Intel when they bought that
> > part of DEC. Consequently, these "non-Arm vendor" cores contain
> > extensions that are not part of the ARM ARM. iWMMXT is one such
> > example, which first appeared in the Intel PXA270 SoC (an ARMv5
> > derived CPU).
> >
> > In fact, several of the features found in later versions of the ARM
> > architecture came from DEC and Intel enhancements.
> >
> > If your compiler/assembler only implements what is in the latest ARM
> > ARM, then it is not going to be suitable for these older CPUs and
> > alternate vendor "ARM compatible" CPUs.
> >
>
> Indeed, and I'm a bit disappointed at the willingness to leave stuff
> by the wayside, especially since Clang's integrated assembler has no
> other benefit to it than being built into the compiler.

I don't disagree. I also wish LLVM had a backend for every
architecture that GCC does. But resources are finite and there are
more fires than firemen. It gets really hard to justify a high
priority for certain things over others. Doubly-so for hardware no
longer in production. Triply-so when the ISA vendor doesn't provide
information in available reference manuals. I'm happy to push for
more investment in LLVM to support the Linux kernel from Google
internally; maybe you can help do so from ARM? That was my appeal to
ARM back in February; support for newest ISA extensions is great,
support for existing instructions is great, too. (And not having to
choose between one or the other is preferrable, given the amount of
resources available).

My thoughts on the benefits of this approach to using Clang's
integrated assembler:
1. continuous integration and randconfigs. We need CI to help us spot
where things are still broken, and help us from regressing the ground
we've fought for. We can't expect kernel developers to test with
LLVM. Currently, we have LLVM builds in numerous kernel continuous
integration services (KernelCI, Kbuild test robot "0day bot", Linaro's
TCWG, Syzcaller, and our own CI). For services that are bisecting and
notifying authors, they are currently harassing authors for
pre-existing conditions that the service uncovered via randconfig.
This is very very dangerous territory to be in. If authors start
ignoring build reports due to false positives or false negatives, it
becomes a weak signal that tends to be ignored. Then when real bugs
are uncovered, the actual bugs get ignored. We don't want that. If a
canary dies in a coal mine, but no one notices, was it for naught?

2. It helps us quantify how broken we are, via `grep` and `wc`. LLVM
is in no way a perfect substitute for GNU utilities, but it's not too
far off either. As an imperfect substitute, there's a lot of work
both on the toolchain side and sources of various codebases in terms
of toolchain portability. Being able to quantify what doesn't work
let's us be clear in which ways LLVM is not a perfect substitute, but
also where and how much resources we need to get closer. That helps
then with planning and prioritization. The proper thing to do is to
bury the dead but at this point we only have resources to collect dog
tags and keep moving. That doesn't rule out revisiting implementing
iWMMXT in the future.

3. Testing Clang's assembler allows for us to do kernel builds without
binutils. This work is helping uncover places within the kernel where
the build is not hermetic. We're still a long ways away from hermetic
reproducible kernel builds I suspect, but my main worry is when people
have multiple versions of a toolchain in their path, that only one is
used. Otherwise, it leads to spooky hard to reproduce bug reports. I
don't think I need to argue about build hermiticity, but it's
important for user trust and verification.

4. Improving toolchain portability of assembler in LLVM itself.
There's plenty of subtle differences, but missing full on instructions
(or are they psuedo's?) is pretty bad.

I value the feedback from you, Russell, and Arnd even when I disagree.
These are just my thoughts on *why* things are the way they are, FWIW.
If there's thoughts on how we might better prioritize one thing over
another, I would appreciate it.
--
Thanks,
~Nick Desaulniers