Re: [PATCH 4/6] Unsuck "x86/entry/64: Create a percpu SYSCALL entry trampoline"

From: Andy Lutomirski
Date: Sat Dec 02 2017 - 11:05:50 EST


On Sat, Dec 2, 2017 at 7:18 AM, Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:
> On Thu, Nov 30, 2017 at 10:29:44PM -0800, Andy Lutomirski wrote:
>> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
>> index caf74a1bb3de..28f4e7553c26 100644
>> --- a/arch/x86/entry/entry_64.S
>> +++ b/arch/x86/entry/entry_64.S
>> @@ -180,14 +180,24 @@ ENTRY(entry_SYSCALL_64_trampoline)
>>
>> /*
>> * x86 lacks a near absolute jump, and we can't jump to the real
>> - * entry text with a relative jump, so we fake it using retq.
>> + * entry text with a relative jump. We could push the target
>> + * address and then use retq, but this destroys the pipeline on
>> + * many CPUs (wasting over 20 cycles on Sandy Bridge). Instead,
>> + * spill RDI and restore it in a second-stage trampoline.
>> */
>> - pushq $entry_SYSCALL_64_after_hwframe
>> - retq
>> + pushq %rdi
>> + movq $entry_SYSCALL_64_stage2, %rdi
>> + jmp *%rdi
>> END(entry_SYSCALL_64_trampoline)
>>
>> .popsection
>>
>> +ENTRY(entry_SYSCALL_64_stage2)
>> + UNWIND_HINT_EMPTY
>> + popq %rdi
>> + jmp entry_SYSCALL_64_after_hwframe
>> +END(entry_SYSCALL_64_stage2)
>> +
>> ENTRY(entry_SYSCALL_64)
>> UNWIND_HINT_EMPTY
>> /*
>
> Another crazy idea:
>
>         call 1f
> 1:      movq $entry_SYSCALL_64_after_hwframe, (%rsp)
>         ret
>
> Does that fix the regression?

I suspect that's as bad or worse. The issue (I think) is that the CPU
has a little invisible internal stack that tracks calls and rets and
the CPU will speculate past a ret under the assumption that it returns
to the last call on the stack. If it doesn't, then the CPU has to
start over.
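
To make that reasoning concrete, here is a toy Python model of the return predictor described above. It is only a sketch under stated assumptions, not a description of any real microarchitecture: the class name, the helper functions, and the rule that only call and ret touch the internal stack are all hypothetical, and the model ignores the separate predictor that real CPUs use for indirect jumps. It counts how many times a ret lands somewhere other than where the last recorded call said it would.

```python
class ReturnStackModel:
    """Toy return predictor (hypothetical): calls push, rets pop and compare."""

    def __init__(self):
        self.stack = []        # the "invisible internal stack" of return addresses
        self.mispredicts = 0   # rets whose real target differed from the prediction

    def call(self, return_address):
        # A call records where the matching ret is expected to land.
        self.stack.append(return_address)

    def ret(self, actual_target):
        # A ret is predicted to go to the most recently recorded call site.
        predicted = self.stack.pop() if self.stack else None
        if predicted != actual_target:
            self.mispredicts += 1

    def indirect_jmp(self, target):
        # Assumption: a plain indirect jump does not consult this stack at all.
        pass


def push_ret(target):
    """Models: pushq $target; retq."""
    cpu = ReturnStackModel()
    # pushq is an ordinary store, so nothing is recorded for the predictor.
    cpu.ret(target)
    return cpu.mispredicts


def call_overwrite_ret(target):
    """Models: call 1f; 1: movq $target, (%rsp); ret."""
    cpu = ReturnStackModel()
    cpu.call("1f")     # predictor now expects the ret to come back to label 1
    # The movq rewrites the return address in memory; the predictor is not told.
    cpu.ret(target)
    return cpu.mispredicts


def spill_and_jmp(target):
    """Models: pushq %rdi; movq $target, %rdi; jmp *%rdi."""
    cpu = ReturnStackModel()
    cpu.indirect_jmp(target)
    return cpu.mispredicts
```

In this model both ret-based sequences mispredict once and the jmp-based one never involves the return predictor, which matches the suspicion above that the call/movq/ret variant would not help. Whether real hardware behaves this way would still need to be measured.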