Re: [PATCH v3] arm64: sysreg: make mrs_s and msr_s macros work with Clang and LTO

From: Nick Desaulniers
Date: Mon Apr 15 2019 - 13:22:42 EST


On Mon, Apr 15, 2019 at 10:06 AM Mark Rutland <mark.rutland@xxxxxxx> wrote:
>
> On Mon, Apr 15, 2019 at 07:38:21AM -0700, Kees Cook wrote:
> > From: Alex Matveev <alxmtvv@xxxxxxxxx>
> >
> > Clang's integrated assembler does not allow assembly macros defined
> > in one inline asm block using the .macro directive to be used across
> > separate asm blocks. LLVM developers consider this a feature and not a
> > bug, recommending code refactoring:
> >
> > https://bugs.llvm.org/show_bug.cgi?id=19749
> >
> > As binutils doesn't allow macros to be redefined, this change uses
> > UNDEFINE_MRS_S and UNDEFINE_MSR_S to define corresponding macros
> > in-place and workaround gcc and clang limitations on redefining macros
> > across different assembler blocks.
> >
> > Specifically, the current state after preprocessing looks like this:
> >
> > asm volatile(".macro mXX_s ... .endm");
> > void f()
> > {
> > asm volatile("mXX_s a, b");
> > }
> >
> > With GCC, it gives macro redefinition error because sysreg.h is included
> > in multiple source files, and assembler code for all of them is later
> > combined for LTO (I've seen an intermediate file with hundreds of
> > identical definitions).
> >
> > With clang, it gives macro undefined error because clang doesn't allow
> > sharing macros between inline asm statements.
> >
> > I also seem to remember catching another sort of undefined error with
> > GCC due to reordering of macro definition asm statement and generated
> > asm code for function that uses the macro.
> >
> > The solution with defining and undefining for each use, while certainly
> > not elegant, satisfies both GCC and clang, LTO and non-LTO.
> >
> > Signed-off-by: Alex Matveev <alxmtvv@xxxxxxxxx>
> > Signed-off-by: Yury Norov <ynorov@xxxxxxxxxxxxxxxxxx>
> > Signed-off-by: Sami Tolvanen <samitolvanen@xxxxxxxxxx>
> > Signed-off-by: Kees Cook <keescook@xxxxxxxxxxxx>
> > ---
> > v3: split out patch as stand-alone, added more uses in irqflags,
> > updated commit log, based on discussion in
> > https://lore.kernel.org/patchwork/patch/851580/
> > ---
> > arch/arm64/include/asm/irqflags.h | 12 +++++--
> > arch/arm64/include/asm/kvm_hyp.h | 8 +++--
> > arch/arm64/include/asm/sysreg.h | 55 +++++++++++++++++++++----------
> > 3 files changed, 53 insertions(+), 22 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
> > index 43d8366c1e87..06d3987d1546 100644
> > --- a/arch/arm64/include/asm/irqflags.h
> > +++ b/arch/arm64/include/asm/irqflags.h
> > @@ -43,7 +43,9 @@ static inline void arch_local_irq_enable(void)
> > asm volatile(ALTERNATIVE(
> > "msr daifclr, #2 // arch_local_irq_enable\n"
> > "nop",
> > + DEFINE_MSR_S
> > "msr_s " __stringify(SYS_ICC_PMR_EL1) ",%0\n"
> > + UNDEFINE_MSR_S
>
> If we do need this, can we wrap this in a larger CPP macro that does the
> whole sequence of defining, using, and undefining the asm macros?

I agree that would be cleaner. For example, __mrs_s/__msr_s can
ALMOST do this (be used in arch/arm64/include/asm/irqflags.h) except
for the input/output constraints. Maybe the constraints can be
removed from the cpp macro definition, then the constraints be left to
the macro expansion sites? That way the DEFINE_MXX_S/UNDEFINE_MXX_S
can be hidden in the cpp macro itself.

>
> It would be nice if we could simply rely on a more recent binutils these
> days, which supports the generic S<op0>_<op1>_<cn>_<Cm>_<op2> sysreg
> definition. That would mean we could get rid of the whole msr_s/mrs_s
> hack by turning that into a CPP macro which built that name.
>
> It looks like binutils has been able to do that since September 2014...
>
> Are folk using toolchains older than that to compile kernels?

Do you have a link to a commit? If we can pinpoint the binutils
version, that might help.

Also, I look forward to this patch for use of Clang's integrated
assembler (regardless of LTO). I remember getting frustrated trying
to figure out how to resolve this for both assemblers, and I had
forgotten this solution existed.
--
Thanks,
~Nick Desaulniers