Re: [PATCH v7 2/3] arm64: implement ftrace with regs

From: Torsten Duwe
Date: Wed Apr 03 2019 - 09:05:32 EST


On Wed, Apr 03, 2019 at 03:48:43AM +0100, Mark Rutland wrote:
> Hi Torsten,
>
> Sorry for the long delay prior to this reply.

I was hoping you would come up with some code to speed things up :(

(For the record, v8 was the latest I sent, but nothing in the locations
mentioned here has changed)

> On Fri, Jan 18, 2019 at 05:39:08PM +0100, Torsten Duwe wrote:
> > --- a/arch/arm64/include/asm/ftrace.h
> > +++ b/arch/arm64/include/asm/ftrace.h
> > @@ -14,9 +14,24 @@
> > #include <asm/insn.h>
> >
> > #define HAVE_FUNCTION_GRAPH_FP_TEST
> > -#define MCOUNT_ADDR ((unsigned long)_mcount)
> > #define MCOUNT_INSN_SIZE AARCH64_INSN_SIZE
> >
> > +/*
> > + * DYNAMIC_FTRACE_WITH_REGS is implemented by adding 2 NOPs at the beginning
> > + * of each function, with the second NOP actually calling ftrace. In contrary
> > + * to a classic _mcount call, the call instruction to be modified is thus
> > + * the second one, and not the only one.
> > + */
> > +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
> > +#define ARCH_SUPPORTS_FTRACE_OPS 1
> > +#define REC_IP_BRANCH_OFFSET AARCH64_INSN_SIZE
> > +/* All we need is some magic value. Simply use "_mCount:" */
> > +#define MCOUNT_ADDR (0x5f6d436f756e743a)
>
> I'm really not keen on having a magic constant, and I'd really like to
> see MCOUNT_ADDR disappear entirely when the compiler doesn't generate an
> mcount call. I'm concerned that this is confusing and fragile.

Confusing: agreed. Fragile? don't think so.

> I think that it would be better to make the core ftrace API agnostic of
> mcount, and to teach it that there's a one-time initialization step for
> callsites (which is not necessarily patching a call with a NOP).
>
> We currently have:
>
> ftrace_make_nop(mod, rec, addr)
> ftrace_make_call(rec, addr)
> ftrace_modify_call(rec, old_addr, new_addr)
>
> ... whereas we could have:
>
> ftrace_call_init(mod, rec)
> ftrace_call_enable(rec, addr)
> ftrace_call_disable(rec, addr)
> ftrace_call_modify(rec, old_addr, new_addr)
>
> ... so we wouldn't need to special-case anything for initialization (and
> don't need MCOUNT_ADDR at all), and it would be clearer as to what was
> happening at each stage.

I'm fully on your side here, BUT...

This is the dynamic ftrace with regs implementation for aarch64 with the
_current_ dynamic ftrace API. Changing the latter is IMO a different issue.
This implementation has been tested, up to the point of some preliminary live
patching. I propose to commit this and only then change the API for aarch64
and s390 simultaneously, avoiding breakage on ppc64le and x86, of course.
(others to be investigated)

> > +
> > + /* The program counter just after the ftrace call site */
> > + str lr, [sp, #S_PC]
>
> For consistency with the existing ftrace assembly, please use 'x30'
> rather than 'lr'.

You could have raised that concern already along with the "fp" issue for v6.
I don't have a strong preference here; personally I find fp and lr more
readable, but with x29 and x30 one knows where they go.
This is only just a waste of time.

> > - module_disable_ro(mod);
> > - *mod->arch.ftrace_trampoline = trampoline;
> > - module_enable_ro(mod, true);
> > + /* Check against our well-known list of ftrace entry points */
> > + if (addr == FTRACE_ADDR || addr == FTRACE_REGS_ADDR) {
>
> These checks exist within install_ftrace_trampoline(), so we don't need
> to duplicate them here.

True. Code evolution at work.

Any other opinions on the dynamic ftrace API change, anyone?

Torsten