Re: [PATCH 2/6] x86/entry/64: Convert SYSRET validation tests to C

From: Brian Gerst
Date: Tue Jul 18 2023 - 10:26:16 EST


On Tue, Jul 18, 2023 at 10:17 AM Mika Penttilä <mpenttil@xxxxxxxxxx> wrote:
>
> Hi,
>
>
> On 18.7.2023 16.44, Brian Gerst wrote:
> > Signed-off-by: Brian Gerst <brgerst@xxxxxxxxx>
> > ---
> >  arch/x86/entry/common.c        | 50 ++++++++++++++++++++++++++++++-
> >  arch/x86/entry/entry_64.S      | 55 ++--------------------------------
> >  arch/x86/include/asm/syscall.h |  2 +-
> >  3 files changed, 52 insertions(+), 55 deletions(-)
> >
> > diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
> > index 6c2826417b33..afe79c3f1c5b 100644
> > --- a/arch/x86/entry/common.c
> > +++ b/arch/x86/entry/common.c
> > @@ -70,8 +70,12 @@ static __always_inline bool do_syscall_x32(struct pt_regs *regs, int nr)
> >  	return false;
> >  }
> >
> > -__visible noinstr void do_syscall_64(struct pt_regs *regs, int nr)
> > +/* Returns true to return using SYSRET, or false to use IRET */
> > +__visible noinstr bool do_syscall_64(struct pt_regs *regs, int nr)
> >  {
> > +	long rip;
> > +	unsigned int shift_rip;
> > +
> >  	add_random_kstack_offset();
> >  	nr = syscall_enter_from_user_mode(regs, nr);
> >
> > @@ -84,6 +88,50 @@ __visible noinstr void do_syscall_64(struct pt_regs *regs, int nr)
> >
> >  	instrumentation_end();
> >  	syscall_exit_to_user_mode(regs);
> > +
> > +	/*
> > +	 * Check that the register state is valid for using SYSRET to exit
> > +	 * to userspace. Otherwise use the slower but fully capable IRET
> > +	 * exit path.
> > +	 */
> > +
> > +	/* XEN PV guests always use IRET path */
> > +	if (cpu_feature_enabled(X86_FEATURE_XENPV))
> > +		return false;
> > +
> > +	/* SYSRET requires RCX == RIP and R11 == EFLAGS */
> > +	if (unlikely(regs->cx != regs->ip || regs->r11 != regs->flags))
> > +		return false;
> > +
> > +	/* CS and SS must match the values set in MSR_STAR */
> > +	if (unlikely(regs->cs != __USER_CS || regs->ss != __USER_DS))
> > +		return false;
> > +
> > +	/*
> > +	 * On Intel CPUs, SYSRET with non-canonical RCX/RIP will #GP
> > +	 * in kernel space. This essentially lets the user take over
> > +	 * the kernel, since userspace controls RSP.
> > +	 *
> > +	 * Change top bits to match most significant bit (47th or 56th bit
> > +	 * depending on paging mode) in the address.
> > +	 */
> > +	shift_rip = (64 - __VIRTUAL_MASK_SHIFT + 1);
>
> Should this be:
>
> shift_rip = (64 - __VIRTUAL_MASK_SHIFT - 1);
> ?

I removed a set of parentheses, which is where the sign changed from -1
to +1. I can put the parentheses back if that's less confusing.
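
For reference, here is a small userspace sketch of the arithmetic under
discussion. It is not kernel code. It assumes the parenthesized form was
64 - (__VIRTUAL_MASK_SHIFT + 1), and that __VIRTUAL_MASK_SHIFT is 47 or 56
to match the "47th or 56th bit" in the comment above. The helper names are
made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values of __VIRTUAL_MASK_SHIFT (4-level and 5-level paging). */
#define MASK_SHIFT_4LEVEL 47u
#define MASK_SHIFT_5LEVEL 56u

/* Parenthesized form (assumed to be the pre-patch expression). */
unsigned int shift_with_parens(unsigned int mask_shift)
{
	return 64 - (mask_shift + 1);
}

/* Form suggested in the review: parentheses removed, sign flipped. */
unsigned int shift_minus_one(unsigned int mask_shift)
{
	return 64 - mask_shift - 1;
}

/* Form as written in the patch hunk quoted above. */
unsigned int shift_plus_one(unsigned int mask_shift)
{
	return 64 - mask_shift + 1;
}

/*
 * Sign-extend addr from bit (63 - shift) by shifting left, then right.
 * This relies on signed right shift being arithmetic, which is
 * implementation-defined in C but holds for GCC and Clang.
 */
uint64_t canonicalize(uint64_t addr, unsigned int shift)
{
	return (uint64_t)((int64_t)(addr << shift) >> shift);
}

/* The first two forms always agree; the third is larger by 2. */
void check_shift_forms(unsigned int mask_shift)
{
	assert(shift_with_parens(mask_shift) == shift_minus_one(mask_shift));
	assert(shift_plus_one(mask_shift) == shift_minus_one(mask_shift) + 2);
}
```

Under these assumptions, a mask shift of 47 gives a shift count of 16 for
the first two forms and 18 for the third, so the third form sign-extends
from bit 45 instead of bit 47.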

Brian Gerst