Re: [PATCH 2/6] x86/entry/64: Convert SYSRET validation tests to C

From: Brian Gerst
Date: Tue Jul 18 2023 - 11:47:13 EST


On Tue, Jul 18, 2023 at 11:21 AM Brian Gerst <brgerst@xxxxxxxxx> wrote:
>
> On Tue, Jul 18, 2023 at 10:49 AM Mika Penttilä <mpenttil@xxxxxxxxxx> wrote:
> >
> >
> >
> > On 18.7.2023 17.25, Brian Gerst wrote:
> > > On Tue, Jul 18, 2023 at 10:17 AM Mika Penttilä <mpenttil@xxxxxxxxxx> wrote:
> > >>
> > >> Hi,
> > >>
> > >>
> > >> On 18.7.2023 16.44, Brian Gerst wrote:
> > >>> Signed-off-by: Brian Gerst <brgerst@xxxxxxxxx>
> > >>> ---
> > >>> arch/x86/entry/common.c | 50 ++++++++++++++++++++++++++++++-
> > >>> arch/x86/entry/entry_64.S | 55 ++--------------------------------
> > >>> arch/x86/include/asm/syscall.h | 2 +-
> > >>> 3 files changed, 52 insertions(+), 55 deletions(-)
> > >>>
> > >>> diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
> > >>> index 6c2826417b33..afe79c3f1c5b 100644
> > >>> --- a/arch/x86/entry/common.c
> > >>> +++ b/arch/x86/entry/common.c
> > >>> @@ -70,8 +70,12 @@ static __always_inline bool do_syscall_x32(struct pt_regs *regs, int nr)
> > >>> return false;
> > >>> }
> > >>>
> > >>> -__visible noinstr void do_syscall_64(struct pt_regs *regs, int nr)
> > >>> +/* Returns true to return using SYSRET, or false to use IRET */
> > >>> +__visible noinstr bool do_syscall_64(struct pt_regs *regs, int nr)
> > >>> {
> > >>> + long rip;
> > >>> + unsigned int shift_rip;
> > >>> +
> > >>> add_random_kstack_offset();
> > >>> nr = syscall_enter_from_user_mode(regs, nr);
> > >>>
> > >>> @@ -84,6 +88,50 @@ __visible noinstr void do_syscall_64(struct pt_regs *regs, int nr)
> > >>>
> > >>> instrumentation_end();
> > >>> syscall_exit_to_user_mode(regs);
> > >>> +
> > >>> + /*
> > >>> + * Check that the register state is valid for using SYSRET to exit
> > >>> + * to userspace. Otherwise use the slower but fully capable IRET
> > >>> + * exit path.
> > >>> + */
> > >>> +
> > >>> + /* XEN PV guests always use IRET path */
> > >>> + if (cpu_feature_enabled(X86_FEATURE_XENPV))
> > >>> + return false;
> > >>> +
> > >>> + /* SYSRET requires RCX == RIP and R11 == EFLAGS */
> > >>> + if (unlikely(regs->cx != regs->ip || regs->r11 != regs->flags))
> > >>> + return false;
> > >>> +
> > >>> + /* CS and SS must match the values set in MSR_STAR */
> > >>> + if (unlikely(regs->cs != __USER_CS || regs->ss != __USER_DS))
> > >>> + return false;
> > >>> +
> > >>> + /*
> > >>> + * On Intel CPUs, SYSRET with non-canonical RCX/RIP will #GP
> > >>> + * in kernel space. This essentially lets the user take over
> > >>> + * the kernel, since userspace controls RSP.
> > >>> + *
> > >>> + * Change top bits to match most significant bit (47th or 56th bit
> > >>> + * depending on paging mode) in the address.
> > >>> + */
> > >>> + shift_rip = (64 - __VIRTUAL_MASK_SHIFT + 1);
> > >>
> > >> Should this be:
> > >>
> > >> shift_rip = (64 - __VIRTUAL_MASK_SHIFT - 1);
> > >> ?
> > >
> > > I removed a set of parentheses, which switched the sign from -1 to +1.
> > > I could put it back if that's less confusing.
> > >
> >
> > I mean isn't it supposed to be:
> > shift_rip = (64 - 48) for 4 level, now it's
> > shift_rip = (64 - 46)
> >
> > __VIRTUAL_MASK_SHIFT == 47

My apologies, you were right. I had been sitting on this series for a
while before finally posting it, and I didn't catch that error.

>
> Original:
> (64 - (47 + 1)) = (64 - 48) = 16
>
> c5: 48 c1 e1 10 shl $0x10,%rcx
> c9: 48 c1 f9 10 sar $0x10,%rcx

This was wrong. I hastily compiled this after I had already reverted
to the original formula, so that output did not come from the patch as
posted.

> New:
> (64 - 47 - 1) = (17 - 1) = 16
>
> 18b: b9 10 00 00 00 mov $0x10,%ecx
> 193: 48 d3 e2 shl %cl,%rdx
> 196: 48 d3 fa sar %cl,%rdx
>
> Anyways, I'll switch it back to the original formula. I'm not going
> to argue any more about basic math.

I'll send a v2 later, once any further feedback has come in. Thanks.
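
For anyone following along, the arithmetic in question can be checked
in user space. The sketch below is not the kernel code: the names
VIRTUAL_MASK_SHIFT, SHIFT_ORIGINAL, SHIFT_POSTED and is_canonical() are
made up for illustration, 4-level paging (__VIRTUAL_MASK_SHIFT == 47) is
assumed, and it relies on the compiler doing an arithmetic right shift
on signed values, as GCC and Clang do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4-level paging, i.e. __VIRTUAL_MASK_SHIFT == 47. */
#define VIRTUAL_MASK_SHIFT 47

enum {
	/* Formula from the original assembly: 64 - (47 + 1) = 16 */
	SHIFT_ORIGINAL = 64 - (VIRTUAL_MASK_SHIFT + 1),
	/* Formula as posted in the patch: 64 - 47 + 1 = 18 */
	SHIFT_POSTED = 64 - VIRTUAL_MASK_SHIFT + 1,
};

/*
 * Hypothetical helper: shift the address left and then arithmetically
 * right by the same amount, so the top bits are replaced by copies of
 * the highest surviving bit, and compare against the input.
 */
static bool is_canonical(uint64_t addr, unsigned int shift)
{
	int64_t v;

	assert(shift < 64);
	/* Left shift on the unsigned value to avoid signed overflow. */
	v = (int64_t)(addr << shift);
	/* Arithmetic right shift (implementation-defined in ISO C). */
	v >>= shift;
	return (uint64_t)v == addr;
}
```

With a shift of 16, 0x00007fffffffe000 passes and 0x0000800000000000
is rejected, as expected. With a shift of 18 the sign bit is taken
from bit 45 instead of bit 47, so 0x00007fffffffe000 is also rejected
even though it is a valid user address.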

Brian Gerst