Re: [PATCH] x86/retpoline/entry: Disable the entire SYSCALL64 fast path with retpolines on

From: Andy Lutomirski
Date: Thu Jan 25 2018 - 16:31:39 EST


On Thu, Jan 25, 2018 at 1:20 PM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Thu, Jan 25, 2018 at 1:08 PM, Andy Lutomirski <luto@xxxxxxxxxx> wrote:
>>
>> With retpoline, the retpoline in the trampoline sucks. I don't need
>> perf for that -- I've benchmarked it both ways. It sucks. I'll fix
>> it, but it'll be kind of complicated.
>
> Ahh, I'd forgotten about that (and obviously didn't see it in the profiles).
>
> But yeah, that is fixable even if it does require a page per CPU. Or
> did you have some clever scheme in mind?

Nothing clever. I was going to see if I could get actual
binutils-generated relocations to work in the trampoline. We already
have code to parse ELF relocations and turn them into a simple table,
and it shouldn't be *that* hard to run a separate pass on the entry
trampoline.

Another potentially useful, if rather minor, optimization would be to
rejigger the SYSCALL_DEFINE macros a bit. Currently we treat all
syscalls like this:

long func(long arg0, long arg1, long arg2, long arg3, long arg4, long arg5);

I wonder if we'd be better off doing:

long func(const struct pt_regs *regs);

and autogenerating:

static long SyS_read(const struct pt_regs *regs)
{
        return sys_read(regs->di, ...);
}