Re: [PATCH 1/3] x86/alternatives: Teach text_poke_bp() to emulate instructions

From: Peter Zijlstra
Date: Thu Oct 03 2019 - 07:01:26 EST


On Thu, Oct 03, 2019 at 10:27:51AM +0200, Peter Zijlstra wrote:
> On Thu, Oct 03, 2019 at 02:00:50PM +0900, Masami Hiramatsu wrote:
>
> > > This fits almost all text_poke_bp() users, except
> > > arch_unoptimize_kprobe() which restores random text, and for that site
> > > we have to build an explicit emulate instruction.
> >
> > OK, and in this case, I would like to change RELATIVEJUMP_OPCODE
> > to JMP32_INSN_OPCODE for readability. (or at least
> > making RELATIVEJUMP_OPCODE as an alias of JMP32_INSN_OPCODE)
>
> > > @@ -448,12 +447,18 @@ void arch_optimize_kprobes(struct list_h
> > > void arch_unoptimize_kprobe(struct optimized_kprobe *op)
> > > {
> > > u8 insn_buff[RELATIVEJUMP_SIZE];
> > > + u8 emulate_buff[RELATIVEJUMP_SIZE];
> > >
> > > /* Set int3 to first byte for kprobes */
> > > insn_buff[0] = BREAKPOINT_INSTRUCTION;
> > > memcpy(insn_buff + 1, op->optinsn.copied_insn, RELATIVE_ADDR_SIZE);
> > > +
> > > + emulate_buff[0] = RELATIVEJUMP_OPCODE;
> > > + *(s32 *)(&emulate_buff[1]) = (s32)((long)op->optinsn.insn -
> > > + ((long)op->kp.addr + RELATIVEJUMP_SIZE));
>
> I'm halfway through a patch introducing:
>
> union text_poke_insn {
> u8 code[POKE_MAX_OPCODE_SUZE];
> struct {
> u8 opcode;
> s32 disp;
> } __attribute__((packed));
> };
>
> to text-patching.h to unify all such custom unions we have all over the
> place. I'll mob up the above in that.
>
> > > +
> > > text_poke_bp(op->kp.addr, insn_buff, RELATIVEJUMP_SIZE,
> > > - op->optinsn.insn);
> > > + emulate_buff);
> > > }
>
> As argued in a previous thread, text_poke_bp() is broken when it changes
> more than a single instruction at a time.
>
> Now, ISTR optimized kprobes does something like:
>
> poke INT3

Hmm, it does this using text_poke(), but lacks a
on_each_cpu(do_sync_core, NULL, 1), which I suppose is OK-ish IFF you do
that synchronize_rcu_tasks() after it, but less so if you don't.

That is, without either, you can't really tell if the kprobe is in
effect or not.

Also, I think text_poke_bp(INT3) is broken, although I don't think
anybody actually does that. Still, let me fix that.

> synchronize_rcu_tasks() /* waits for all tasks to schedule
> guarantees instructions after INT3
> are unused */
> install optimized probe /* overwrites multiple instrctions with
> JMP.d32 */
>
> And the above then undoes that by:
>
> poke INT3 on top of the optimzed probe
>
> poke tail instructions back /* guaranteed safe because the
> above INT3 poke ensures the
> JMP.d32 instruction is unused */
>
> poke head byte back
>
> Is this correct? If so, we should probably put a comment in there
> explaining how all this is unusual but safe.