Re: [PATCH v4 16/39] x86/mm: Check Shadow Stack page fault errors

From: Edgecombe, Rick P
Date: Wed Jan 04 2023 - 20:29:30 EST


On Wed, 2023-01-04 at 15:32 +0100, Borislav Petkov wrote:
> On Fri, Dec 02, 2022 at 04:35:43PM -0800, Rick Edgecombe wrote:
> > From: Yu-cheng Yu <yu-cheng.yu@xxxxxxxxx>
> >
> > The CPU performs "shadow stack accesses" when it expects to
> > encounter
> > shadow stack mappings. These accesses can be implicit (via CALL/RET
> > instructions) or explicit (instructions like WRSS).
> >
> > Shadow stacks accesses to shadow-stack mappings can see faults in
> > normal,
> > valid operation just like regular accesses to regular mappings.
> > Shadow
> > stacks need some of the same features like delayed allocation, swap
> > and
> > copy-on-write. The kernel needs to use faults to implement those
> > features.
> >
> > The architecture has concepts of both shadow stack reads and shadow
> > stack
> > writes. Any shadow stack access to non-shadow stack memory will
> > generate
> > a fault with the shadow stack error code bit set.
>
> You lost me here: by "shadow stack access to non-shadow stack memory"
> you mean
> the explicit one using WRU*SS?

Shadow stack accesses can be WR*SS, shadow stack pushes/pops from
call/ret or incssp. Basically the instructions that intend to read or
write to a shadow stack.

>
> > This means that, unlike normal write protection, the fault handler
> > needs
> > to create a type of memory that can be written to (with
> > instructions that
> > generate shadow stack writes), even to fulfill a read access. So in
> > the
> > case of COW memory, the COW needs to take place even with a shadow
> > stack
> > read.
>
> I guess I'm missing an example here: are we talking here about a user
> process
> getting its shadow stack pages allocated and them being COW first and
> on the
> first shstk operation, it would generate that fault?

So if you have a shadow stack, then fork() so the shadow stack PTEs
become read-only. Then you RET and the shadow stack get's popped to
compare it to the normal stack value. In this case it only needs to
read the shadow stack, but it does this with a "shadow stack read",
which will fault if memory is not shadow stack. So the fault is a read
fault, but it needs to make the PTE shadow stack in order to resolve
it. So it needs to trigger COW, otherwise the shared page would be
changeable from userspace. Make sense? I guess I can add an example if
you think it would help.

>
> > @@ -1331,6 +1345,30 @@ void do_user_addr_fault(struct pt_regs
> > *regs,
> >
> > perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
> >
> > + /*
> > + * When a page becomes COW it changes from a shadow stack
> > permissioned
>
> Unknown word [permissioned] in comment.

I can change it.

>
> > + * page (Write=0,Dirty=1) to (Write=0,Dirty=0,CoW=1), which is
> > simply
> > + * read-only to the CPU. When shadow stack is enabled, a RET
> > would
> > + * normally pop the shadow stack by reading it with a "shadow
> > stack
> > + * read" access. However, in the COW case the shadow stack
> > memory does
> > + * not have shadow stack permissions, it is read-only. So it
> > will
> > + * generate a fault.
> > + *
> > + * For conventionally writable pages, a read can be serviced
> > with a
> > + * read only PTE, and COW would not have to happen. But for
> > shadow
> > + * stack, there isn't the concept of read-only shadow stack
> > memory.
> > + * If it is shadow stack permissioned, it can be modified via
> > CALL and
>
> Ditto.
>
> > + * RET instructions. So COW needs to happen before any memory
> > can be
> > + * mapped with shadow stack permissions.
> > + *
> > + * Shadow stack accesses (read or write) need to be serviced
> > with
> > + * shadow stack permissioned memory, so in the case of a shadow
> > stack
>
> Is this some new formulation I haven't heard about yet?
>
> "Permissioned <something>"?

It looks like it's not an official dictionary word, but I've seen it
from time to time:
https://en.wiktionary.org/wiki/permissioned

>
> > + * read access, treat it as a WRITE fault so both COW will
> > happen and
> > + * the write fault path will tickle maybe_mkwrite() and map the
> > memory
> > + * shadow stack.
> > + */
> > + if (error_code & X86_PF_SHSTK)
> > + flags |= FAULT_FLAG_WRITE;
> > if (error_code & X86_PF_WRITE)
> > flags |= FAULT_FLAG_WRITE;
> > if (error_code & X86_PF_INSTR)
> > --
> > 2.17.1
> >
>
>