Re: [PATCH] riscv: avoid enabling vectorized code generation

From: Saleem Abdulrasool
Date: Thu Dec 22 2022 - 10:25:45 EST


On Thu, Dec 22, 2022 at 1:41 AM Bin Meng <bmeng.cn@xxxxxxxxx> wrote:
>
> Hi,
>
> On Thu, Dec 22, 2022 at 1:39 AM Saleem Abdulrasool <abdulras@xxxxxxxxxx> wrote:
> >
> > On Wed, Dec 21, 2022 at 8:17 AM Bin Meng <bmeng.cn@xxxxxxxxx> wrote:
> > >
> > > Hi,
> > >
> > > On Sat, Dec 17, 2022 at 3:12 AM Saleem Abdulrasool <abdulras@xxxxxxxxxx> wrote:
> > > >
> > > > The compiler is free to generate vectorized operations for zero'ing
> > > > memory. The kernel does not use the vector unit on RISCV, similar to
> > > > architectures such as x86 where we use `-mno-mmx` et al to prevent the
> > > > implicit vectorization. Perform a similar check for
> > > > `-mno-implicit-float` to avoid this on RISC-V targets.
> > > >
> > > > Signed-off-by: Saleem Abdulrasool <abdulras@xxxxxxxxxx>
> > > > ---
> > > > arch/riscv/Makefile | 4 ++++
> > > > 1 file changed, 4 insertions(+)
> > > >
> > > > diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
> > > > index 0d13b597cb55..68433476a96e 100644
> > > > --- a/arch/riscv/Makefile
> > > > +++ b/arch/riscv/Makefile
> > > > @@ -89,6 +89,10 @@ KBUILD_AFLAGS_MODULE += $(call as-option,-Wa$(comma)-mno-relax)
> > > > # architectures. It's faster to have GCC emit only aligned accesses.
> > > > KBUILD_CFLAGS += $(call cc-option,-mstrict-align)
> > > >
> > > > +# Ensure that we do not vectorize the kernel code when the `v` extension is
> > > > +# enabled. This mirrors the `-mno-mmx` et al on x86.
> > > > +KBUILD_CFLAGS += $(call cc-option,-mno-implicit-float)
> > >
> > > This looks like an LLVM flag, but not GCC.
> >
> > Correct, this is a clang flag, though I imagine that GCC will need a
> > similar flag once it receives support for the V extension.
> >
> > > Can you elaborate what exact combination (compiler flag and source)
> > > would cause an issue?
> >
> > The particular case that I was using was simply `clang -target
> > riscv64-unknown-linux-musl -march=rv64gcv` off of main.
> >
> > > From your description, I guess it's that when enabling V extension in
> > > LLVM, the compiler tries to use vector instructions to zero memory,
> > > correct?
> >
> > Correct.
>
> Thanks for the confirmation.
>
> >
> > > Can you confirm LLVM does not emit any float instructions (like F/D
> > > extensions) because the flag name suggests something like "float"?
> >
> > The `-mno-implicit-float` should disable any such emission. I assume
> > that you are worried about the case without the flag? I'm not 100%
> > certain without this flag, but the RISCV build with this flag has been
> > running smoothly locally for a while.
> >
> >
>
> I still have some questions about the `-mno-implicit-float` option's behavior.
>
> - If this option is not on, does the compiler emit any F/D extension
> instruction for zero'ing memory when -march=rv64g? I want to know
> whether the `-mno-implicit-float` option only takes effect when "v"
> appears on the -march string.

AFAIK, and from a quick test, no, it will not. That also makes sense
since the F/D/Q handling is not as likely to be useful for generating
a 0-filled array. No, the use of `-mno-implicit-float` is not guarded
by the use of the vector extension, but it does only impact the
vectorized code generation (the loop vectorizer, load/store
vectorizer, and SLP vectorizer).

> - If the answer to the above question is no, I wonder why the option
> is called `-mno-implicit-float` as float suggests the FPU usage, but
> actually it is about vectorization. The Clang documentation says
> almost nothing about this option.

The flag itself is from GCC, it was added for the ARM architecture, to
prefer using the scalar core over the VFP register set as ARM uses the
VFP for vectorized operations. As it so happens, internally in LLVM,
the loop vectorizer uses the (internal) `NoImplicitFloat` function
attribute to prevent the loop from being vectorized, and the flag that
controls this is exposed as `-mimplicit-float` and
`-mno-implicit-float`.

> > > > +
> > > > ifeq ($(CONFIG_STACKPROTECTOR_PER_TASK),y)
> > > > prepare: stack_protector_prepare
> > > > stack_protector_prepare: prepare0
> > > > --
>
> Regards,
> Bin