Re: [GIT PULL] x86/shstk for 6.4

From: Linus Torvalds
Date: Sat May 06 2023 - 16:12:24 EST


On Sat, May 6, 2023 at 12:34 PM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> And I'm about a quarter in, haven't even gotten to the meat yet, and
> I've already found a bug.

Ok, so despite this I'm going a bit further, just to familiarize
myself with this even if I can't pull it.

And this is horrendous: pte_wrprotect() doing

if (pte_dirty(pte))
pte = pte_mksaveddirty(pte);

which actually has two fundamental problems:

(a) it shouldn't be a conditional thing at all, it should just be bit
operations

(b) "pte_dirty()" isn't even the right thing to use, since it
includes the SW dirty bit.

so what this *should* do is likely to just do

pte.val |= (pte.val >> _PAGE_BIT_DIRTY) & 1) << _PAGE_BIT_SOFT_DIRTY;
pte.val &= ~_PAGE_DIRTY;

which the compiler should be able to turn into a nice unconditional
series. Smaller than any conditional.

(You could simplify the above by just using fixed bit numbers - the
above is written with two shifts just to work with "any bit pair", but
obviously it could be written to be simpler and more straightforward
by just shifting the bit right into place)

I think the compilers may be able to do that all the simplification
for you even with a

if (pte.val & _PAGE_DIRTY)
pte.val |= _PAGE_SOFT_DIRTY;
pte.val &= ~_PAGE_DIRTY;

but that is only true when there are *not* things like those
cpu_feature_enabled() tests in between those operations.

So I strongly suspect that all those

if (!cpu_feature_enabled(X86_FEATURE_USER_SHSTK))

around this area are only making things worse. You're replacing a
couple of simple bit operations with jump-arounds (and then doing the
bit operations anyway in one side). And you are limited the compiler
from doing obvious simplifications in the process.

Conditionals are really bad, even when they can be turned into static jumps.

As static jumps they just cause odd code flow, and lack of testing.
And compiler barriers.

All these bit operations are actually cheaper - and would get more
test coverage - if they are just done unconditionally with a couple of
shift-and-mask operations.

Now, my reaction here is

- the whole shadow stack notion of "dirty but not writable is a magic
marker" is *DISGUSTING*. It's wrong.

Whatever Intel designer that came up with that - instead of just
picking another bit for the *HARDWARE* to check - should be ashamed.

Now we have to pick a software bit instead, and play games for
this. BAD BAD BAD.

I'm assuming this is something where Microsoft went "we already
don't have that, and we want all the sw bits for sw, so do this". But
from a design standpoint it's just nasty.

- But if we have to play those games, just *play* them. Do it all
unconditionally, and make the x86-64 rules be that "dirty but not
writable" is something we should never have.

Having two different rules, and conditionals for them, is both more
complex for maintainers, *and* for compilers.

So just make that _PAGE_BIT_SOFT_DIRTY be unconditional, and let's
just live with it. But make the bit operations efficient.

Maybe I'm missing something, and the people who have been working on
this have a really good reason for this mess. But it really looks
horrible to me.

So maybe you can explain in small words why I'm wrong, but right now
my feeling is that not only did I find an arm bug in the series
(trivially fixed with a one-liner, but worrying, and triggered by the
series being done in a particularly fragile way).

But I also think there are some x86 things that are just not done the
way they should have been done.

But again: maybe I don't understand some of the problems.

Linus