Re: [PATCH 1/2] x86_64,entry: Filter RFLAGS.NT on entry from userspace

From: Chuck Ebbert
Date: Tue Sep 30 2014 - 20:27:51 EST


On Tue, 30 Sep 2014 12:40:35 -0700
Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:

> The NT flag doesn't do anything in long mode other than causing IRET
> to #GP. Oddly, CPL3 code can still set NT using popf.
>
> Entry via hardware or software interrupt clears NT automatically, so
> the only relevant entries are fast syscalls.
>
> This patch programs the CPU to clear NT on entry via SYSCALL (both
> 32-bit and 64-bit, by my reading of the AMD APM). It also clears NT
> (and some other flags) in software on SYSENTER.
>
> I haven't touched anything on 32-bit kernels.
>
> If user code causes kernel code to run with NT set, then there's at
> least some (small) chance that it could cause trouble. For example,
> user code could cause a call to EFI code with NT set, and who knows
> what would happen. Apparently Wine sometimes does this (!), and, if
> an IRET return happens, Wine will segfault.
>
> I think that Wine should be fixed to stop setting NT when a syscall
> happens, but handling NT more gracefully is still nice.
>
> The syscall mask change comes from a variant of this patch by Anish
> Bhatt.
>
> Cc: stable@xxxxxxxxxx
> Reported-by: Anish Bhatt <anish@xxxxxxxxxxx>
> Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxxxxxx>
> ---
> arch/x86/ia32/ia32entry.S | 10 +++++++++-
> arch/x86/kernel/cpu/common.c | 2 +-
> 2 files changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/ia32/ia32entry.S b/arch/x86/ia32/ia32entry.S
> index 4299eb05023c..079f42a7ad58 100644
> --- a/arch/x86/ia32/ia32entry.S
> +++ b/arch/x86/ia32/ia32entry.S
> @@ -143,7 +143,15 @@ ENTRY(ia32_sysenter_target)
> pushq_cfi %r10
> CFI_REL_OFFSET rip,0
> pushq_cfi %rax
> - cld
> +
> + /*
> + * Sysenter doesn't filter flags, so we should filter them
> + * ourselves.
> + */
> + pushfq_cfi
> + andl $~(X86_EFLAGS_TF|X86_EFLAGS_DF|X86_EFLAGS_NT|X86_EFLAGS_IOPL),(%rsp)
> + popfq_cfi

Can't you just push a constant and pop that onto flags instead? It's
not like we care what's in there on entry to the kernel.
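
To make the question concrete, here is a rough user-space C model of the
two options (not kernel code). The flag values are the architectural
EFLAGS bit positions; the two function names are made up for
illustration and do not exist in the tree.

```c
#include <stdint.h>

/* Architectural EFLAGS bits. */
#define X86_EFLAGS_FIXED 0x00000002u /* bit 1, always set */
#define X86_EFLAGS_TF    0x00000100u /* trap flag */
#define X86_EFLAGS_DF    0x00000400u /* direction flag */
#define X86_EFLAGS_IOPL  0x00003000u /* I/O privilege level */
#define X86_EFLAGS_NT    0x00004000u /* nested task */

/*
 * Hypothetical model of the patch's pushfq / andl / popfq sequence:
 * clear only the listed bits and keep everything else as it arrived.
 */
static uint32_t sysenter_flags_masked(uint32_t entry_flags)
{
	return entry_flags & ~(X86_EFLAGS_TF | X86_EFLAGS_DF |
			       X86_EFLAGS_NT | X86_EFLAGS_IOPL);
}

/*
 * Hypothetical model of the alternative: ignore the incoming value and
 * load a known constant (only the always-set bit).
 */
static uint32_t sysenter_flags_constant(uint32_t entry_flags)
{
	(void)entry_flags;	/* deliberately unused */
	return X86_EFLAGS_FIXED;
}
```

In this model the masked version lets bits outside the mask (AC and the
arithmetic flags, for example) through unchanged, while the constant
version starts the kernel from the same flags value every time.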

> +
> SAVE_ARGS 0,1,0
> /* no need to do an access_ok check here because rbp has been
> 32bit zero extended */
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index e4ab2b42bd6f..31265580c38a 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -1184,7 +1184,7 @@ void syscall_init(void)
> /* Flags to clear on syscall */
> wrmsrl(MSR_SYSCALL_MASK,
> X86_EFLAGS_TF|X86_EFLAGS_DF|X86_EFLAGS_IF|
> - X86_EFLAGS_IOPL|X86_EFLAGS_AC);
> + X86_EFLAGS_IOPL|X86_EFLAGS_AC|X86_EFLAGS_NT);
> }
>
> /*
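
For reference, the effect of the MSR_SYSCALL_MASK change can be modeled
in plain C. This is only a sketch: it assumes the documented SYSCALL
behavior (RFLAGS is ANDed with the complement of the mask MSR), and
flags_after_syscall() plus the OLD/NEW macro names are invented for the
example.

```c
#include <stdint.h>

/* Architectural EFLAGS bits. */
#define X86_EFLAGS_TF   0x00000100u
#define X86_EFLAGS_IF   0x00000200u
#define X86_EFLAGS_DF   0x00000400u
#define X86_EFLAGS_IOPL 0x00003000u
#define X86_EFLAGS_NT   0x00004000u
#define X86_EFLAGS_AC   0x00040000u

/* Mask value before the patch (illustrative macro name). */
#define OLD_SYSCALL_MASK (X86_EFLAGS_TF | X86_EFLAGS_DF | X86_EFLAGS_IF | \
			  X86_EFLAGS_IOPL | X86_EFLAGS_AC)

/* Mask value after the patch: the same bits plus NT. */
#define NEW_SYSCALL_MASK (OLD_SYSCALL_MASK | X86_EFLAGS_NT)

/*
 * Hypothetical helper modeling what SYSCALL does to the flags:
 * every bit set in the mask is cleared on entry.
 */
static uint32_t flags_after_syscall(uint32_t user_flags, uint32_t mask)
{
	return user_flags & ~mask;
}
```

With user flags of 0x4202 (NT and IF set), the old mask leaves 0x4002 in
the kernel, so NT survives; the new mask leaves 0x0002.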
