Re: [PATCH 2/2] x86: Shorten RESET_CALL_DEPTH

From: Andrew . Cooper3
Date: Mon May 15 2023 - 05:47:59 EST


On 15/05/2023 10:28 am, Peter Zijlstra wrote:
> RESET_CALL_DEPTH is a pretty fat monster and blows up UNTRAIN_RET to
> 20 bytes:
>
> 19: 48 c7 c0 80 00 00 00 mov $0x80,%rax
> 20: 48 c1 e0 38 shl $0x38,%rax
> 24: 65 48 89 04 25 00 00 00 00 mov %rax,%gs:0x0
> 29: R_X86_64_32S pcpu_hot+0x10
>
> Shrink it by 4 bytes:
>
> 0:   31 c0                   xor    %eax,%eax
> 2:   48 0f ba e8 3f          bts    $0x3f,%rax
> 7:   65 48 89 04 25 00 00 00 00      mov    %rax,%gs:0x0
>
> Shrink RESET_CALL_DEPTH_FROM_CALL by 5 bytes by only setting al, the
> other bits are shifted out (the same could be done for
> RESET_CALL_DEPTH, but the xor+bts sequence has fewer dependencies due to
> the zeroing).
>
> Suggested-by: Andrew.Cooper3@xxxxxxxxxx

Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> ---
> arch/x86/include/asm/nospec-branch.h | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> --- a/arch/x86/include/asm/nospec-branch.h
> +++ b/arch/x86/include/asm/nospec-branch.h
> @@ -84,12 +84,12 @@
> movq $-1, PER_CPU_VAR(pcpu_hot + X86_call_depth);
>
> #define RESET_CALL_DEPTH \
> - mov $0x80, %rax; \
> - shl $56, %rax; \
> + xor %eax, %eax; \
> + bts $59, %rax; \

Should this be $63? 0x80 << 56 sets bit 63, not bit 59.

The disassembly above (bts $0x3f,%rax) looks correct.

~Andrew