Re: [PATCH v4 RESEND] x86/asm/entry/32, selftests: Add 'test_syscall_vdso' test

From: Denys Vlasenko
Date: Thu Sep 10 2015 - 15:04:49 EST


On 09/10/2015 12:01 AM, Andy Lutomirski wrote:
> On Mon, Sep 7, 2015 at 8:56 AM, Denys Vlasenko <dvlasenk@xxxxxxxxxx> wrote:
>> This new test checks that all x86 registers are preserved across
>> 32-bit syscalls. It tests syscalls through VDSO (if available)
>> and through INT 0x80, normally and under ptrace.
>>
>> If kernel is a 64-bit one, high registers (r8..r15) are poisoned
>> before the syscall is called and are checked afterwards.
>>
>> They must be either preserved, or cleared to zero (but r11 is special);
>> r12..15 must be preserved for INT 0x80.
>>
>> EFLAGS is checked for changes too, but change there is not
>> considered to be a bug (paravirt kernels do not preserve
>> arithmetic flags).
>>
>> Run-tested on 64-bit kernel:
>>
>> $ ./test_syscall_vdso_32
>> [RUN] Executing 6-argument 32-bit syscall via VDSO
>> [OK] Arguments are preserved across syscall
>> [NOTE] R11 has changed:0000000000200ed7 - assuming clobbered by SYSRET insn
>> [OK] R8..R15 did not leak kernel data
>> [RUN] Executing 6-argument 32-bit syscall via INT 80
>> [OK] Arguments are preserved across syscall
>> [OK] R8..R15 did not leak kernel data
>> [RUN] Running tests under ptrace
>> [RUN] Executing 6-argument 32-bit syscall via VDSO
>> [OK] Arguments are preserved across syscall
>> [OK] R8..R15 did not leak kernel data
>> [RUN] Executing 6-argument 32-bit syscall via INT 80
>> [OK] Arguments are preserved across syscall
>> [OK] R8..R15 did not leak kernel data
>>
>> On 32-bit paravirt kernel:
>>
>> $ ./test_syscall_vdso_32
>> [NOTE] Not a 64-bit kernel, won't test R8..R15 leaks
>> [RUN] Executing 6-argument 32-bit syscall via VDSO
>> [WARN] Flags before=0000000000200ed7 id 0 00 o d i s z 0 a 0 p 1 c
>> [WARN] Flags after=0000000000200246 id 0 00 i z 0 0 p 1
>> [WARN] Flags change=0000000000000c91 0 00 o d s 0 a 0 0 c
>> [OK] Arguments are preserved across syscall
>> [RUN] Executing 6-argument 32-bit syscall via INT 80
>> [OK] Arguments are preserved across syscall
>> [RUN] Running tests under ptrace
>> [RUN] Executing 6-argument 32-bit syscall via VDSO
>> [OK] Arguments are preserved across syscall
>> [RUN] Executing 6-argument 32-bit syscall via INT 80
>> [OK] Arguments are preserved across syscall
>>
>> Signed-off-by: Denys Vlasenko <dvlasenk@xxxxxxxxxx>
>> CC: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
>> CC: Steven Rostedt <rostedt@xxxxxxxxxxx>
>> CC: Ingo Molnar <mingo@xxxxxxxxxx>
>> CC: Borislav Petkov <bp@xxxxxxxxx>
>> CC: "H. Peter Anvin" <hpa@xxxxxxxxx>
>> CC: Andy Lutomirski <luto@xxxxxxxxxxxxxx>
>> CC: Oleg Nesterov <oleg@xxxxxxxxxx>
>> CC: Frederic Weisbecker <fweisbec@xxxxxxxxx>
>> CC: Alexei Starovoitov <ast@xxxxxxxxxxxx>
>> CC: Will Drewry <wad@xxxxxxxxxxxx>
>> CC: Kees Cook <keescook@xxxxxxxxxxxx>
>> CC: x86@xxxxxxxxxx
>> CC: linux-kernel@xxxxxxxxxxxxxxx
>
> Acked-by: Andy Lutomirski <luto@xxxxxxxxxx>
>
> with minor caveats below, none of which are show-stoppers...
>
>> + /* INT80 syscall entrypoint can be used by
>> + * 64-bit programs too, unlike SYSCALL/SYSENTER.
>> + * Therefore it must preserve R12+
>> + * (they are callee-saved registers in 64-bit C ABI).
>> + *
>> + * This was probably historically not intended,
>> + * but R8..11 are clobbered (cleared to 0).
>> + * IOW: they are the only registers which aren't
>> + * preserved across INT80 syscall.
>> + */
>> + if (*r64 == 0 && num <= 11)
>> + continue;
>
> Ugh. I'll change my big entry patchset to preserve these and maybe to
> preserve all of the 64-bit regs.

If you do that, this won't change the ABI: we don't _promise_
to save them. If we accidentally do, that means nothing.

If you do that, the test won't fail. The code above does
not require regs to be 0 - there is further code which
also allow them to be unchanged.

(I'm not very comfortable about additional six push/pops
which are necessary for this to happen. I'm surprised
maintainers tentatively agreed to that -
I was grilled and asked to prove with measurements
that *one* additional push+pop wasn't adding significant overhead).
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/