Re: [GIT PULL] x86/shstk for 6.4

From: Edgecombe, Rick P
Date: Sat May 06 2023 - 20:18:26 EST


On Sat, 2023-05-06 at 13:09 -0700, Linus Torvalds wrote:
> > On Sat, May 6, 2023 at 12:34 PM Linus Torvalds
> > <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> > > >
> > > > And I'm about a quarter in, haven't even gotten to the meat
> > > > yet, > > and
> > > > I've already found a bug.
> >
> > Ok, so despite this I'm going a bit further, just to familiarize
> > myself with this even if I can't pull it.
> >
> > And this is horrendous: pte_wrprotect() doing
> >
> >     if (pte_dirty(pte))
> >         pte = pte_mksaveddirty(pte);
> >
> > which actually has two fundamental problems:
> >
> >  (a) it shouldn't be a conditional thing at all, it should just be
> > > bit
> > operations
> >
> >  (b) "pte_dirty()" isn't even the right thing to use, since it
> > includes the SW dirty bit.

pte_dirty() actually doesn't check the the SW dirty bit, but this
probably just re-enforces the over complexity here.

>
[ snip ]
> >
> > Now, my reaction here is
> >
> >  - the whole shadow stack notion of "dirty but not writable is a >
> > magic
> > marker" is *DISGUSTING*. It's wrong.
> >
> >    Whatever Intel designer that came up with that - instead of just
> > picking another bit for the *HARDWARE* to check - should be
> > ashamed.
> >
> >    Now we have to pick a software bit instead, and play games for
> > this. BAD BAD BAD.
> >
> >    I'm assuming this is something where Microsoft went "we already
> > don't have that, and we want all the sw bits for sw, so do this".
> > But
> > from a design standpoint it's just nasty.

I can't argue against this.

> >
> >  - But if we have to play those games, just *play* them. Do it all
> > unconditionally, and make the x86-64 rules be that "dirty but not
> > writable" is something we should never have.
> >
> >    Having two different rules, and conditionals for them, is both >
> > more
> > complex for maintainers, *and* for compilers.
> >
> > So just make that _PAGE_BIT_SOFT_DIRTY be unconditional, and let's
> > just live with it. But make the bit operations efficient.
> >
> > Maybe I'm missing something, and the people who have been working
> > on
> > this have a really good reason for this mess. But it really looks
> > horrible to me.
> >
> > So maybe you can explain in small words why I'm wrong, but right
> > now
> > my feeling is that not only did I find an arm bug in the series
> > (trivially fixed with a one-liner, but worrying, and triggered by
> > the
> > series being done in a particularly fragile way).
> >
> > But I also think there are some x86 things that are just not done
> > the
> > way they should have been done.
> >
> > But again: maybe I don't understand some of the problems.
> >

Reflecting on these points, I think there might have been some amount
of wishful thinking around wanting to draw lines around the ugliness
and keep it at bay.

I'm swayed by your argument to bite the bullet for the reduced
complexity and increased testing. The only caveat that I could point
out would be: with the existing solution we retain the ability to
compile out saved-dirty. It's a bit thorny and if anything goes wrong,
the whole thing can easily be disabled. So there might be an argument
that having two sets of logic is a better thing to start out with, even
if the long term solution is saved-dirty all-the-time.

I don't want to argue it too strongly though, because it's clear now
how you can look at this and think, this can't be right. And in any
case I didn't justify having the two modes in any of the logs.