Re: [RFC][PATCH] objtool: STAC/CLAC validation

From: Andy Lutomirski
Date: Mon Feb 25 2019 - 10:36:29 EST




> On Feb 25, 2019, at 3:53 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
>> On Mon, Feb 25, 2019 at 11:51:44AM +0100, Peter Zijlstra wrote:
>>> On Fri, Feb 22, 2019 at 03:55:25PM -0800, Andy Lutomirski wrote:
>>> I'm wondering if we can just change the code that does getreg() and
>>> load_gs_index() so it doesn't do it with AC set. Also, what about
>>> paravirt kernels? They'll call into PV code for load_gs_index() with
>>> AC set.
>>
>> Paravirt can go bugger off. There's no sane way to fix that.
>
>> I don't fully understand that code at all; I also have no clue why GS
>> has paravirt bits on but the other segments do not.
>
> *sigh* SWAPGS
>
>> *thought*... we could delay the actual set_user_seg() thing until after
>> the get_user_catch(), would that work?
>
>
> How horrible / broken is this?
>
> ---
>
> diff --git a/arch/x86/ia32/ia32_signal.c b/arch/x86/ia32/ia32_signal.c
> index 321fe5f5d0e9..67c866943102 100644
> --- a/arch/x86/ia32/ia32_signal.c
> +++ b/arch/x86/ia32/ia32_signal.c
> @@ -60,17 +60,21 @@
> regs->seg = GET_SEG(seg) | 3; \
> } while (0)
>
> -#define RELOAD_SEG(seg) { \
> - unsigned int pre = GET_SEG(seg); \
> - unsigned int cur = get_user_seg(seg); \
> - pre |= 3; \
> - if (pre != cur) \
> - set_user_seg(seg, pre); \
> +#define LOAD_SEG(seg) { \
> + pre_##seg = 3 | GET_SEG(seg); \
> + cur_##seg = get_user_seg(seg); \
> +}
> +
> +#define RELOAD_SEG(seg) { \
> + if (pre_##seg != cur_##seg) \
> + set_user_seg(seg, pre_##seg); \
> }
>
> static int ia32_restore_sigcontext(struct pt_regs *regs,
> struct sigcontext_32 __user *sc)
> {
> + u16 pre_gs, pre_fs, pre_ds, pre_es;
> + u16 cur_gs, cur_fs, cur_ds, cur_es;
> unsigned int tmpflags, err = 0;
> void __user *buf;
> u32 tmp;
> @@ -85,10 +89,10 @@ static int ia32_restore_sigcontext(struct pt_regs *regs,
> * the handler, but does not clobber them at least in the
> * normal case.
> */
> - RELOAD_SEG(gs);
> - RELOAD_SEG(fs);
> - RELOAD_SEG(ds);
> - RELOAD_SEG(es);
> + LOAD_SEG(gs);
> + LOAD_SEG(fs);
> + LOAD_SEG(ds);
> + LOAD_SEG(es);
>
> COPY(di); COPY(si); COPY(bp); COPY(sp); COPY(bx);
> COPY(dx); COPY(cx); COPY(ip); COPY(ax);
> @@ -106,6 +110,11 @@ static int ia32_restore_sigcontext(struct pt_regs *regs,
> buf = compat_ptr(tmp);
> } get_user_catch(err);
>
> + RELOAD_SEG(gs);
> + RELOAD_SEG(fs);
> + RELOAD_SEG(ds);
> + RELOAD_SEG(es);
> +
> err |= fpu__restore_sig(buf, 1);
>
> force_iret();

I would call this pretty horrible. How about we do it without macros? :)

But yes, deferring the segment load until after the read seems fine to me. Frankly, we could also just copy_from_user the whole thing up front; this code is not really a serious fast path.
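
To make the "no macros, read first, load later" shape concrete, here is a rough userspace sketch. It is not the kernel change itself: the names seg_id, saved_segs, read_saved_segs(), reload_changed_segs() and the mock_* helpers are all hypothetical stand-ins, and the "frame" is a plain array rather than a __user pointer. It only illustrates splitting the work into a read-only phase (which in the kernel would sit inside the user-access region) and a load phase that runs afterwards. Unlike the quoted patch, it also reads the current selector in the second phase, which is an assumption about what would be acceptable.

```c
#include <stdint.h>

/* Hypothetical segment indices; not kernel identifiers. */
enum seg_id { SEG_GS, SEG_FS, SEG_DS, SEG_ES, SEG_COUNT };

/* Selectors captured from the signal frame during the read phase. */
struct saved_segs {
	uint16_t sel[SEG_COUNT];
};

/* Mock of the live segment registers and a counter of reloads. */
static uint16_t live_seg[SEG_COUNT];
static int seg_loads;

/* Stand-in for get_user_seg(): returns the mocked live selector. */
static uint16_t mock_get_user_seg(enum seg_id id)
{
	return live_seg[id];
}

/* Stand-in for set_user_seg(): updates the mock and counts the load. */
static void mock_set_user_seg(enum seg_id id, uint16_t sel)
{
	live_seg[id] = sel;
	seg_loads++;
}

/*
 * Phase 1: only read from the frame. Nothing here touches segment
 * registers, so in the kernel this part could run with AC set.
 * The "| 3" forces RPL 3, as the original RELOAD_SEG macro does.
 */
static void read_saved_segs(const uint16_t *frame, struct saved_segs *out)
{
	for (int i = 0; i < SEG_COUNT; i++)
		out->sel[i] = frame[i] | 3;
}

/*
 * Phase 2: runs after the read phase has finished. Reload only the
 * selectors that differ from the current ones.
 */
static void reload_changed_segs(const struct saved_segs *want)
{
	for (int i = 0; i < SEG_COUNT; i++) {
		if (want->sel[i] != mock_get_user_seg(i))
			mock_set_user_seg(i, want->sel[i]);
	}
}
```

A caller would invoke read_saved_segs() where the LOAD_SEG() lines sit in the quoted patch and reload_changed_segs() where the deferred RELOAD_SEG() lines sit, after get_user_catch(). With copy_from_user of the whole frame up front, phase 1 would simply read from the kernel copy instead.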