Re: [PATCH 00/35] Shadow stacks for userspace

From: Mike Rapoport
Date: Wed Feb 09 2022 - 06:56:39 EST


Hi Rick,

On Wed, Feb 09, 2022 at 02:18:42AM +0000, Edgecombe, Rick P wrote:
> On Tue, 2022-02-08 at 20:02 +0300, Cyrill Gorcunov wrote:
> > On Tue, Feb 08, 2022 at 08:21:20AM -0800, Andy Lutomirski wrote:
> > > > > But such a knob will immediately reduce the security value of
> > > > > the entire
> > > > > thing, and I don't have good ideas how to deal with it :(
> > > >
> > > > Probably a kind of latch in the task_struct which would trigger
> > > > off once
> > > > returt to a different address happened, thus we would be able to
> > > > jump inside
> > > > paratite code. Of course such trigger should be available under
> > > > proper
> > > > capability only.
> > >
> > > I'm not fully in touch with how parasite, etc works. Are we
> > > talking about save or restore?
> >
> > We use parasite code in question during checkpoint phase as far as I
> > remember.
> > push addr/lret trick is used to run "injected" code (code injection
> > itself is
> > done via ptrace) in compat mode at least. Dima, Andrei, I didn't look
> > into this code
> > for years already, do we still need to support compat mode at all?
> >
> > > If it's restore, what exactly does CRIU need to do? Is it just
> > > that CRIU needs to return
> > > out from its resume code into the to-be-resumed program without
> > > tripping CET? Would it
> > > be acceptable for CRIU to require that at least one shstk slot be
> > > free at save time?
> > > Or do we need a mechanism to atomically switch to a completely full
> > > shadow stack at resume?
> > >
> > > Off the top of my head, a sigreturn (or sigreturn-like mechanism)
> > > that is intended for
> > > use for altshadowstack could safely verify a token on the
> > > altshdowstack, possibly
> > > compare to something in ucontext (or not -- this isn't clearly
> > > necessary) and switch
> > > back to the previous stack. CRIU could use that too. Obviously
> > > CRIU will need a way
> > > to populate the relevant stacks, but WRUSS can be used for that,
> > > and I think this
> > > is a fundamental requirement for CRIU -- CRIU restore absolutely
> > > needs a way to write
> > > the saved shadow stack data into the shadow stack.
>
> Still wrapping my head around the CRIU save and restore steps, but
> another general approach might be to give ptrace the ability to
> temporarily pause/resume/set CET enablement and SSP for a stopped
> thread. Then injected code doesn't need to jump through any hoops or
> possibly run into road blocks. I'm not sure how much this opens things
> up if the thread has to be stopped...

IIRC, criu dump does something like this:
* Stop the process being dumped (victim) with ptrace
* Inject parasite code and data into the victim, again with ptrace.
Among other things the parasite data contains a sigreturn frame with
saved victim state.
* Resume the victim process, which will run parasite code now.
* When parasite finishes it uses that frame to sigreturn to normal victim
execution

So, my feeling is that for dump side WRUSS should be enough.

> Cyrill, could it fit into the CRIU pause and resume flow? What action
> causes the final resuming of execution of the restored process for
> checkpointing and for restore? Wondering if we could somehow make CET
> re-enable exactly then.
>
> And I guess this also needs a way to create shadow stack allocations at
> a specific address to match where they were in the dumped process. That
> is missing in this series.

Yes, criu restore will need to recreate shadow stack mappings. Currently,
we recreate the restored process (target) address space based on
/proc/pid/maps and /proc/pid/smaps. CRIU preserves the virtual addresses
and VMA flags. The relevant steps of restore process can be summarised as:
* Clone() the target process tree
* Recreate VMAs with the needed size and flags, but not necessarily at the
correct place yet
* Partially populate memory data from the saved images
* Move VMAs to their exact addresses
* Complete restoring the data
* Create a frame for sigreturn and jump to the target.

Here, the stack used after sigreturn contains the data that was captured
during dump and it entirely different from what shadow stack will contain.

There are several points when the target threads are stopped, so
pausing/resuming CET may help.

> > > So I think the only special capability that CRIU really needs is
> > > WRUSS, and
> > > we need to wire that up anyway.
> >
> > Thanks for these notes, Andy! I can't provide any sane answer here
> > since didn't
> > read tech spec for this feature yet :-)
>
>
>
>

--
Sincerely yours,
Mike.