Re: TLB flushes on fixmap changes

From: Masami Hiramatsu
Date: Sun Aug 26 2018 - 23:03:15 EST


On Sun, 26 Aug 2018 11:09:58 +0200
Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> On Sat, Aug 25, 2018 at 09:21:22PM -0700, Andy Lutomirski wrote:
> > I just re-read text_poke(). It's, um, horrible. Not only is the
> > implementation overcomplicated and probably buggy, but it's SLOOOOOW.
> > It's totally the wrong API -- poking one instruction at a time
> > basically can't be efficient on x86. The API should either poke lots
> > of instructions at once or should be text_poke_begin(); ...;
> > text_poke_end();.
>
> I don't think anybody ever cared about performance here. Only
> correctness. That whole text_poke_bp() thing is entirely tricky.

Agreed. Self-modification is a special event.

> FWIW, before text_poke_bp(), text_poke() would only be used from
> stop_machine, so all the other CPUs would be stuck busy-waiting with
> IRQs disabled. These days, yeah, that's lots more dodgy, but yes
> text_mutex should be serializing all that.

I'm still not sure whether a speculative page-table walk can happen
across the mutex-protected section. Also, if the fixmap area is only
for aliasing pages (which are always mapped to memory), what kind of
security issue can happen?

Anyway, from the viewpoint of kprobes, either a per-cpu fixmap or
changing CR3 sounds good to me. I think we don't even need per-cpu;
it could call a thread/function on a dedicated core (like the boot
processor) and wait :) This may prevent the pte change from leaking
to other cores.

> And on that, I so hate comments like: "must be called under foo_mutex",
> we have lockdep_assert_held() for that.

Indeed. I also think that text_poke() should not call BUG_ON();
its caller should decide whether the failure is recoverable or not.
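
To illustrate what I mean, here is a rough userspace sketch, not the
actual kernel code. All sketch_* names are made up for illustration:
the buffer stands in for kernel text, the flag stands in for
text_mutex, and assert() plays the role that lockdep_assert_held()
would play in the kernel. The point is only that the poke helper
returns an error and the caller picks the policy.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for text_mutex; not the kernel lock. */
static bool sketch_text_mutex_held;

/* Hypothetical stand-in for kernel text: a plain writable buffer. */
static unsigned char sketch_text[16];

/*
 * Sketch of a text_poke()-like helper that reports failure to its
 * caller instead of calling BUG_ON(). The locking rule is checked in
 * code rather than stated in a comment.
 */
static int sketch_text_poke(size_t off, const void *opcode, size_t len)
{
	assert(sketch_text_mutex_held);

	if (!opcode || len == 0)
		return -EINVAL;
	if (off > sizeof(sketch_text) || len > sizeof(sketch_text) - off)
		return -EFAULT;

	memcpy(sketch_text + off, opcode, len);
	return 0;
}

/*
 * Caller policy (an assumption for this sketch): a failed poke is
 * treated as recoverable, so the error is simply passed up.
 */
static int sketch_arm_probe(size_t off)
{
	static const unsigned char int3 = 0xcc;
	int ret;

	sketch_text_mutex_held = true;
	ret = sketch_text_poke(off, &int3, sizeof(int3));
	sketch_text_mutex_held = false;

	return ret;
}
```

A caller that cannot recover could still escalate on a non-zero
return, but that decision would then live in the caller and not in
the poke helper.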

Thank you,

--
Masami Hiramatsu <mhiramat@xxxxxxxxxx>