Re: [PATCH v2 5/6] x86/xen: Add a Xen-specific sync_core() implementation

From: Linus Torvalds
Date: Fri Dec 02 2016 - 13:11:53 EST


On Fri, Dec 2, 2016 at 9:38 AM, Andy Lutomirski <luto@xxxxxxxxxx> wrote:
>
> apply_alternatives, unfortunately. It's performance-critical because
> it's intensely stupid and does sync_core() for every single patch.
> Fixing that would be nice, too.

So looking at text_poke_early(), that's very much a case that really
shouldn't need any "sync_core()" at all as far as I can tell.

Only the current CPU is running, and for local CPU I$ coherence all
you need is a jump instruction, and even that is only on really old
CPUs. From the PPro onwards (maybe even Pentium?) the I$ is entirely
serialized as long as you change the data using the same linear
address.

So at most, that function could mark itself "noinline" just to
guarantee that it will cause a control flow change before returning.
The sync_core() seems entirely bogus.
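
As a rough illustration of that idea, here is a userspace sketch, not the
actual kernel code. poke_early_sketch() is a made-up stand-in for
text_poke_early(), and the GCC/Clang noinline attribute is assumed. It
only writes the bytes and returns, relying on the call/return pair for
the control flow change:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for text_poke_early(); the name and signature
 * are assumptions for illustration only.
 *
 * noinline forces a real call and return around the write, so there is
 * a control flow change before the caller can run the patched bytes.
 */
__attribute__((noinline))
void *poke_early_sketch(void *addr, const void *opcode, size_t len)
{
	/* Write through the same linear address the code is fetched from. */
	memcpy(addr, opcode, len);

	/* Deliberately no sync_core() equivalent here. */
	return addr;
}
```

This only shows the shape of the function. It says nothing about the
cross-CPU case, which is not what text_poke_early() is for.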

Same goes for optimize_nops() too.

Linus