Re: [PATCH] arm64: fix the address of syscall in arch_syscall_addr if CFI is enabled

From: Sami Tolvanen
Date: Fri Dec 03 2021 - 11:07:35 EST


On Fri, Dec 3, 2021 at 3:12 AM Mark Rutland <mark.rutland@xxxxxxx> wrote:
>
> Hi Ben,
>
> Please Cc the arm64 maintainers for arm64 patches, you can find the relevant
> folk in the MAINTAINERS file, and you can use the get_maintainer.pl script to
> find them automatically (e.g. with `./scripts/get_maintainer.pl ${PATCH}` or
> `./scripts/get_maintainer.pl -f ${FILE}`).
>
> For arm64 patches, you should also see LAKML
> (linux-arm-kernel@xxxxxxxxxxxxxxxxxxx).
>
> I've added Will and Catalin now, and Ard and Peter too since they've btoh been
> fighting with things in this area.
>
> On Fri, Dec 03, 2021 at 01:29:08PM +0800, Ben Dai wrote:
> > With CONFIG_CFI_CLANG, the addresses in sys_call_table[] actually point to
> > jump instructions like "b __arm64_sys_*", and if CONFIG_LTO_CLANG_FULL is
> > enabled, the compiler will not generate a symbol for each jump. It causes
> > syscall tracer can't get symbol name in find_syscall_meta() and fail to
> > initialize.
> >
> > To fix this problem, implement an strong definition of arch_syscall_addr()
> > to get the actual addresses of system calls.
> >
> > Signed-off-by: Ben Dai <ben.dai@xxxxxxxxxx>

I don't think this is the correct solution. I believe Nick fixed the
issue with missing symbols in https://reviews.llvm.org/D107934, so we
should probably just require a newer compiler version here.

> > ---
> > arch/arm64/kernel/syscall.c | 21 +++++++++++++++++++++
> > 1 file changed, 21 insertions(+)
> >
> > diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
> > index 50a0f1a38e84..2b911603966b 100644
> > --- a/arch/arm64/kernel/syscall.c
> > +++ b/arch/arm64/kernel/syscall.c
> > @@ -12,6 +12,8 @@
> > #include <asm/debug-monitors.h>
> > #include <asm/exception.h>
> > #include <asm/fpsimd.h>
> > +#include <asm/insn.h>
> > +#include <asm/patching.h>
> > #include <asm/syscall.h>
> > #include <asm/thread_info.h>
> > #include <asm/unistd.h>
> > @@ -19,6 +21,25 @@
> > long compat_arm_syscall(struct pt_regs *regs, int scno);
> > long sys_ni_syscall(void);
> >
> > +#ifdef CONFIG_CFI_CLANG
> > +unsigned long __init arch_syscall_addr(int nr)
> > +{
> > + u32 insn;
> > + unsigned long addr = (unsigned long)sys_call_table[nr];
> > +
> > + /*
> > + * Clang's CFI will replace the address of each system call function
> > + * with the address of a jump table entry. In this case, the jump
> > + * target address is the actual address of the system call.
> > + */
> > + aarch64_insn_read((void *)addr, &insn);
> > + if (likely(aarch64_insn_is_b(insn)))
> > + addr += aarch64_get_branch_offset(insn);
>
> The problemn is not your fault, but I absolutely hate this, because we're
> bodging around the compiler having broken stuff for us in the first place.
> We've been encountering more of these cases over time, and had we been aware of
> them to begin with we'd have strongly pushed back on marging the CFI patches
> until the compiler offered a better solution.
>
> My view is that the current clang CFI scheme is broken by design here, and we
> should mark tracing as incompatible with it for now, and work with the compiler
> folk to get a scheme that actually works, rather than trying to bodge around
> it.

We are actively fixing compiler issues when they're discovered. In the
meanwhile, Android uses both CFI and tracing without any major issues,
so I don't see why we should mark the combination broken.

Sami