Re: [PATCH 05/35] x86/fpu/xstate: Introduce CET MSR and XSAVES supervisor states

From: Edgecombe, Rick P
Date: Tue Feb 08 2022 - 17:23:46 EST


On Mon, 2022-02-07 at 15:28 -0800, Dave Hansen wrote:
> On 1/30/22 13:18, Rick Edgecombe wrote:
> > From: Yu-cheng Yu <yu-cheng.yu@xxxxxxxxx>
> >
> > Control-flow Enforcement Technology (CET) introduces these MSRs:
> >
> > MSR_IA32_U_CET (user-mode CET settings),
> > MSR_IA32_PL3_SSP (user-mode shadow stack pointer),
> >
> > MSR_IA32_PL0_SSP (kernel-mode shadow stack pointer),
> > MSR_IA32_PL1_SSP (Privilege Level 1 shadow stack pointer),
> > MSR_IA32_PL2_SSP (Privilege Level 2 shadow stack pointer),
> > MSR_IA32_S_CET (kernel-mode CET settings),
> > MSR_IA32_INT_SSP_TAB (exception shadow stack table).
>
> To be honest, I'm not sure this is very valuable. It's *VERY* close
> to
> the exact information in the structure definitions. It's also not
> obviously related to XSAVE. It's more of the "what" this patch does
> than the "why". Good changelogs talk about "why".

Ok I'll look at re-wording this.

>
> > The two user-mode MSRs belong to XFEATURE_CET_USER. The first
> > three of
> > kernel-mode MSRs belong to XFEATURE_CET_KERNEL. Both XSAVES states
> > are
> > supervisor states. This means that there is no direct,
> > unprivileged access
> > to these states, making it harder for an attacker to subvert CET.

Oh, well I guess this *is* mentioned elsewhere, than in patch 3.

>
> Forgive me while I go into changelog lecture mode for a moment.
>
> I was constantly looking up at the list of MSRs and trying to
> reconcile
> them with this paragraph. Imagine if you had started out this
> changelog
> by saying:
>
> Shadow stack register state can be managed with XSAVE. The
> registers can logically be separated into two groups:
>
> * Registers controlling user-mode operation
> * Registers controlling kernel-mode operation
>
> The architecture has two new XSAVE state components: one for
> each group of registers. This _lets_ an OS manage them
> separately if it chooses. Linux chooses to ... <explain the
> design choice here, or why we don't care yet>.
>
> Both XSAVE state components are supervisor states, even the
> state controlling user-mode operation. This is a departure
> from
> earlier features like protection keys where the PKRU state is
> a normal user (non-supervisor) state. Having the user state be
>
> supervisor-managed ensures there is no direct, unprivileged
> access to it, making it harder for an attacker to subvert CET.
>
> Also, IBT gunk is in here too, right? Let's at least *mention* that
> in
> the changelog.

We can remove the IBT stuff if its better. I always appreciate finding
the unused features in headers when hacking around. But it all adds to
build time slightly I guess.

>
> ...
> > /* All supervisor states including supported and unsupported
> > states. */
> > #define XFEATURE_MASK_SUPERVISOR_ALL
> > (XFEATURE_MASK_SUPERVISOR_SUPPORTED | \
> > diff --git a/arch/x86/include/asm/msr-index.h
> > b/arch/x86/include/asm/msr-index.h
> > index 3faf0f97edb1..0ee77ce4c753 100644
> > --- a/arch/x86/include/asm/msr-index.h
> > +++ b/arch/x86/include/asm/msr-index.h
> > @@ -362,6 +362,26 @@
> >
> >
> > #define MSR_CORE_PERF_LIMIT_REASONS 0x00000690
> > +
> > +/* Control-flow Enforcement Technology MSRs */
> > +#define MSR_IA32_U_CET 0x000006a0 /* user mode
> > cet setting */
> > +#define MSR_IA32_S_CET 0x000006a2 /* kernel
> > mode cet setting */
> > +#define CET_SHSTK_EN BIT_ULL(0)
> > +#define CET_WRSS_EN BIT_ULL(1)
> > +#define CET_ENDBR_EN BIT_ULL(2)
> > +#define CET_LEG_IW_EN BIT_ULL(3)
> > +#define CET_NO_TRACK_EN BIT_ULL(4)
> > +#define CET_SUPPRESS_DISABLE BIT_ULL(5)
> > +#define CET_RESERVED (BIT_ULL(6) |
> > BIT_ULL(7) | BIT_ULL(8) | BIT_ULL(9))
>
> Would GENMASK_ULL() look any nicer here? I guess it's pretty clear
> as-is that bits 6->9 are reserved.

Hmm, visually I think it might be easier to catch that you need to
remove a reserved bit if it is being added after becoming unreserved
some day.

>
> > +#define CET_SUPPRESS BIT_ULL(10)
> > +#define CET_WAIT_ENDBR BIT_ULL(11)
>
> Are those bit fields common for both registers? It might be worth a
> comment to mention that.

Yes, I'll mention that.

>
> > +#define MSR_IA32_PL0_SSP 0x000006a4 /* kernel shadow
> > stack pointer */
> > +#define MSR_IA32_PL1_SSP 0x000006a5 /* ring-1 shadow
> > stack pointer */
> > +#define MSR_IA32_PL2_SSP 0x000006a6 /* ring-2 shadow
> > stack pointer */
>
> Are PL1/2 ever used in this implementation? If not, let's axe these
> definitions.

They are not used. Ok.

>
> > +#define MSR_IA32_PL3_SSP 0x000006a7 /* user shadow stack
> > pointer */
> > +#define MSR_IA32_INT_SSP_TAB 0x000006a8 /* exception
> > shadow stack table */
> > +
> > #define MSR_GFX_PERF_LIMIT_REASONS 0x000006B0
> > #define MSR_RING_PERF_LIMIT_REASONS 0x000006B1
> >
> > diff --git a/arch/x86/kernel/fpu/xstate.c
> > b/arch/x86/kernel/fpu/xstate.c
> > index 02b3ddaf4f75..44397202762b 100644
> > --- a/arch/x86/kernel/fpu/xstate.c
> > +++ b/arch/x86/kernel/fpu/xstate.c
> > @@ -50,6 +50,8 @@ static const char *xfeature_names[] =
> > "Processor Trace (unused)" ,
> > "Protection Keys User registers",
> > "PASID state",
> > + "Control-flow User registers" ,
> > + "Control-flow Kernel registers" ,
> > "unknown xstate feature" ,
> > "unknown xstate feature" ,
> > "unknown xstate feature" ,
> > @@ -73,6 +75,8 @@ static unsigned short xsave_cpuid_features[]
> > __initdata = {
> > [XFEATURE_PT_UNIMPLEMENTED_SO_FAR] = X86_FEATURE_INTEL_PT,
> > [XFEATURE_PKRU] = X86_FEATURE_PKU,
> > [XFEATURE_PASID] = X86_FEATURE_ENQCMD,
> > + [XFEATURE_CET_USER] = X86_FEATURE_SHSTK,
> > + [XFEATURE_CET_KERNEL] =
> > X86_FEATURE_SHSTK,
> > [XFEATURE_XTILE_CFG] =
> > X86_FEATURE_AMX_TILE,
> > [XFEATURE_XTILE_DATA] =
> > X86_FEATURE_AMX_TILE,
> > };
> > @@ -250,6 +254,8 @@ static void __init print_xstate_features(void)
> > print_xstate_feature(XFEATURE_MASK_Hi16_ZMM);
> > print_xstate_feature(XFEATURE_MASK_PKRU);
> > print_xstate_feature(XFEATURE_MASK_PASID);
> > + print_xstate_feature(XFEATURE_MASK_CET_USER);
> > + print_xstate_feature(XFEATURE_MASK_CET_KERNEL);
> > print_xstate_feature(XFEATURE_MASK_XTILE_CFG);
> > print_xstate_feature(XFEATURE_MASK_XTILE_DATA);
> > }
> > @@ -405,6 +411,7 @@ static __init void os_xrstor_booting(struct
> > xregs_state *xstate)
> > XFEATURE_MASK_BNDREGS | \
> > XFEATURE_MASK_BNDCSR | \
> > XFEATURE_MASK_PASID | \
> > + XFEATURE_MASK_CET_USER | \
> > XFEATURE_MASK_XTILE)
> >
> > /*
> > @@ -621,6 +628,8 @@ static bool __init
> > check_xstate_against_struct(int nr)
> > XCHECK_SZ(sz, nr, XFEATURE_PKRU, struct pkru_state);
> > XCHECK_SZ(sz, nr, XFEATURE_PASID, struct ia32_pasid_state);
> > XCHECK_SZ(sz, nr, XFEATURE_XTILE_CFG, struct xtile_cfg);
> > + XCHECK_SZ(sz, nr, XFEATURE_CET_USER, struct cet_user_state);
> > + XCHECK_SZ(sz, nr, XFEATURE_CET_KERNEL, struct
> > cet_kernel_state);
> >
> > /* The tile data size varies between implementations. */
> > if (nr == XFEATURE_XTILE_DATA)
> > @@ -634,7 +643,9 @@ static bool __init
> > check_xstate_against_struct(int nr)
> > if ((nr < XFEATURE_YMM) ||
> > (nr >= XFEATURE_MAX) ||
> > (nr == XFEATURE_PT_UNIMPLEMENTED_SO_FAR) ||
> > - ((nr >= XFEATURE_RSRVD_COMP_11) && (nr <=
> > XFEATURE_RSRVD_COMP_16))) {
> > + (nr == XFEATURE_RSRVD_COMP_13) ||
> > + (nr == XFEATURE_RSRVD_COMP_14) ||
> > + (nr == XFEATURE_RSRVD_COMP_16)) {
> > WARN_ONCE(1, "no structure for xstate: %d\n", nr);
> > XSTATE_WARN_ON(1);
> > return false;
>
> That if() is getting unweildy. While I generally despise macros
> implicitly modifying variables, this might be worth it. We could
> have a
> local function variable:
>
> bool feature_checked = false;
>
> and then muck with it in the macro:
>
> #define XCHECK_SZ(sz, nr, nr_macro, __struct) do {
> if (nr == nr_macro)) {
> feature_checked = true;
> if (WARN_ONCE(sz != sizeof(__struct), ... ) {
> __xstate_dump_leaves();
> }
> }
> } while (0)
>
> Then the if() just makes sure the feature was checked instead of
> checking for reserved features explicitly. We could also do:
>
> bool c = false;
>
> ...
>
> c |= XCHECK_SZ(sz, nr, XFEATURE_YMM, struct
> ymmh_struct);
> c |= XCHECK_SZ(sz, nr, XFEATURE_BNDREGS, struct ...
> c |= XCHECK_SZ(sz, nr, XFEATURE_BNDCSR, struct ...
> ...
>
> but that starts to run into 80 columns. Those are both nice because
> they mean you don't have to maintain a list of reserved features in
> the
> code. Another option would be to define a:
>
> bool xfeature_is_reserved(int nr)
> {
> switch (nr) {
> case XFEATURE_RSRVD_COMP_13:
> ...
>
> so the if() looks nicer and won't grow; the function will grow
> instead.
>
> Either way, I think this needs some refactoring.

Yes, this makes sense. I'll play around with it.