Re: [PATCH v12 8/8] x86: Disallow vsyscall emulation when CET is enabled

From: Andy Lutomirski
Date: Wed Sep 23 2020 - 17:35:00 EST


On Tue, Sep 22, 2020 at 10:46 AM Yu, Yu-cheng <yu-cheng.yu@xxxxxxxxx> wrote:
>
> On 9/21/2020 4:48 PM, Andy Lutomirski wrote:
> > On Mon, Sep 21, 2020 at 3:37 PM Yu-cheng Yu <yu-cheng.yu@xxxxxxxxx> wrote:
> >>
> >> On Mon, 2020-09-21 at 09:22 -0700, Yu, Yu-cheng wrote:
>
> [...]
>
> >>
> >> Here is the patch:
> >>
> >> ------
> >>
> >> From dfdee39c795ee5dcee2c77f6ba344a61f4d8124b Mon Sep 17 00:00:00 2001
> >> From: Yu-cheng Yu <yu-cheng.yu@xxxxxxxxx>
> >> Date: Thu, 29 Nov 2018 14:15:38 -0800
> >> Subject: [PATCH 34/43] x86/vsyscall/64: Fixup Shadow Stack and Indirect Branch
> >> Tracking for vsyscall emulation
> >>
> >> Vsyscall entry points are effectively branch targets. Mark them with
> >> ENDBR64 opcodes. When emulating the RET instruction, unwind the shadow
> >> stack and reset IBT state machine.
> >>
> >> Signed-off-by: Yu-cheng Yu <yu-cheng.yu@xxxxxxxxx>
> >> ---
> >> arch/x86/entry/vsyscall/vsyscall_64.c | 29 +++++++++++++++++++++++
> >> arch/x86/entry/vsyscall/vsyscall_emu_64.S | 9 +++++++
> >> arch/x86/entry/vsyscall/vsyscall_trace.h | 1 +
> >> 3 files changed, 39 insertions(+)
> >>
> >> diff --git a/arch/x86/entry/vsyscall/vsyscall_64.c
> >> b/arch/x86/entry/vsyscall/vsyscall_64.c
> >> index 44c33103a955..0131c9f7f9c5 100644
> >> --- a/arch/x86/entry/vsyscall/vsyscall_64.c
> >> +++ b/arch/x86/entry/vsyscall/vsyscall_64.c
> >> @@ -38,6 +38,9 @@
> >> #include <asm/fixmap.h>
> >> #include <asm/traps.h>
> >> #include <asm/paravirt.h>
> >> +#include <asm/fpu/xstate.h>
> >> +#include <asm/fpu/types.h>
> >> +#include <asm/fpu/internal.h>
> >>
> >> #define CREATE_TRACE_POINTS
> >> #include "vsyscall_trace.h"
> >> @@ -286,6 +289,32 @@ bool emulate_vsyscall(unsigned long error_code,
> >> /* Emulate a ret instruction. */
> >> regs->ip = caller;
> >> regs->sp += 8;
> >> +
> >> + if (current->thread.cet.shstk_size ||
> >> + current->thread.cet.ibt_enabled) {
> >> + u64 r;
> >> +
> >> + fpregs_lock();
> >> + if (test_thread_flag(TIF_NEED_FPU_LOAD))
> >> + __fpregs_load_activate();
> >
> > Wouldn't this be nicer if you operated on the memory image, not the registers?
>
> Do you mean writing to the XSAVES area?

Yes.

>
> >
> >> +
> >> +#ifdef CONFIG_X86_INTEL_BRANCH_TRACKING_USER
> >> + /* Fixup branch tracking */
> >> + if (current->thread.cet.ibt_enabled) {
> >> + rdmsrl(MSR_IA32_U_CET, r);
> >> + wrmsrl(MSR_IA32_U_CET, r & ~CET_WAIT_ENDBR);
> >> + }
> >> +#endif
> >
> > Seems reasonable on first glance.
> >
> >> +
> >> +#ifdef CONFIG_X86_INTEL_SHADOW_STACK_USER
> >> + /* Unwind shadow stack. */
> >> + if (current->thread.cet.shstk_size) {
> >> + rdmsrl(MSR_IA32_PL3_SSP, r);
> >> + wrmsrl(MSR_IA32_PL3_SSP, r + 8);
> >> + }
> >> +#endif
> >
> > What happens if the result is noncanonical? A quick skim of the SDM
> > didn't find anything. This latter issue goes away if you operate on
> > the memory image, though -- writing a bogus value is just fine, since
> > the FP restore will handle it.
> >
>
> At this point, the MSR's value can still be valid or is already saved to
> memory. If we are going to write to memory, then the MSR must be saved
> first. So I chose to do __fpregs_load_activate() and write the MSR.
>
> Maybe we can check the address before writing it to the MSR?

Performance is almost irrelevant here, and the writing-to-XSAVES-area
approach should have the benefit that the exception handling and
signaling happens for free.