Re: [PATCH 00/35] Shadow stacks for userspace

From: avagin@xxxxxxxxx
Date: Fri Feb 11 2022 - 02:41:24 EST


On Wed, Feb 09, 2022 at 06:37:53PM -0800, Andy Lutomirski wrote:
> On 2/8/22 18:18, Edgecombe, Rick P wrote:
> > On Tue, 2022-02-08 at 20:02 +0300, Cyrill Gorcunov wrote:
> > > On Tue, Feb 08, 2022 at 08:21:20AM -0800, Andy Lutomirski wrote:
> > > > > > But such a knob will immediately reduce the security value of
> > > > > > the entire
> > > > > > thing, and I don't have good ideas how to deal with it :(
> > > > >
> > > > > Probably a kind of latch in the task_struct which would trigger
> > > > > off once
> > > > > returt to a different address happened, thus we would be able to
> > > > > jump inside
> > > > > paratite code. Of course such trigger should be available under
> > > > > proper
> > > > > capability only.
> > > >
> > > > I'm not fully in touch with how parasite, etc works. Are we
> > > > talking about save or restore?
> > >
> > > We use parasite code in question during checkpoint phase as far as I
> > > remember.
> > > push addr/lret trick is used to run "injected" code (code injection
> > > itself is
> > > done via ptrace) in compat mode at least. Dima, Andrei, I didn't look
> > > into this code
> > > for years already, do we still need to support compat mode at all?
> > >
> > > > If it's restore, what exactly does CRIU need to do? Is it just
> > > > that CRIU needs to return
> > > > out from its resume code into the to-be-resumed program without
> > > > tripping CET? Would it
> > > > be acceptable for CRIU to require that at least one shstk slot be
> > > > free at save time?
> > > > Or do we need a mechanism to atomically switch to a completely full
> > > > shadow stack at resume?
> > > >
> > > > Off the top of my head, a sigreturn (or sigreturn-like mechanism)
> > > > that is intended for
> > > > use for altshadowstack could safely verify a token on the
> > > > altshdowstack, possibly
> > > > compare to something in ucontext (or not -- this isn't clearly
> > > > necessary) and switch
> > > > back to the previous stack. CRIU could use that too. Obviously
> > > > CRIU will need a way
> > > > to populate the relevant stacks, but WRUSS can be used for that,
> > > > and I think this
> > > > is a fundamental requirement for CRIU -- CRIU restore absolutely
> > > > needs a way to write
> > > > the saved shadow stack data into the shadow stack.
> >
> > Still wrapping my head around the CRIU save and restore steps, but
> > another general approach might be to give ptrace the ability to
> > temporarily pause/resume/set CET enablement and SSP for a stopped
> > thread. Then injected code doesn't need to jump through any hoops or
> > possibly run into road blocks. I'm not sure how much this opens things
> > up if the thread has to be stopped...
>
> Hmm, that's maybe not insane.
>
> An alternative would be to add a bona fide ptrace call-a-function mechanism.
> I can think of two potentially usable variants:
>
> 1. Straight call. PTRACE_CALL_FUNCTION(addr) just emulates CALL addr,
> shadow stack push and all.
>
> 2. Signal-style. PTRACE_CALL_FUNCTION_SIGFRAME injects an actual signal
> frame just like a real signal is being delivered with the specified handler.
> There could be a variant to opt-in to also using a specified altstack and
> altshadowstack.

I think this would be ideal. In CRIU, the parasite code is executed in
the "daemon" mode and returns back via sigreturn. Right now, CRIU needs
to generate a signal frame. If I understand your idea right, the signal
frame will be generated by the kernel.

>
> 2 would be more expensive but would avoid the need for much in the way of
> asm magic. The injected code could be plain C (or Rust or Zig or whatever).
>
> All of this only really handles save, not restore. I don't understand
> restore enough to fully understand the issue.

In a few words, it works like this: CRIU restores all required resources
and prepares a signal frame with a target process state, then it
switches to a small PIE blob, where it restores vma-s and calls
rt_sigreturn.

>
> --Andy