Re: [RFC v2] ptrace, pidfd: add pidfd_ptrace syscall

From: Jann Horn
Date: Mon Apr 27 2020 - 13:53:17 EST


On Mon, Apr 27, 2020 at 7:08 PM Christian Brauner
<christian.brauner@xxxxxxxxxx> wrote:
> On Sun, Apr 26, 2020 at 06:34:30PM +0200, Hagen Paul Pfeifer wrote:
> > Working on a safety-critical stress testing tool, using ptrace in an
> > rather uncommon way (stop, peeking memory, ...) for a bunch of
> > applications in an automated way I realized that once opened processes
> > where restarted and PIDs recycled. Resulting in monitoring and
> > manipulating the wrong processes.
> >
> > With the advent of pidfd we are now able to stick with one stable handle
> > to identifying processes exactly. We now have the ability to get this
> > race free. Sending signals now works like a charm, next step is to
> > extend the functionality also for ptrace.
> >
> > API:
> > long pidfd_ptrace(int pidfd, enum __ptrace_request request,
> > void *addr, void *data, unsigned flags);
>
> I'm in general not opposed to this if there's a clear need for this and
> users that are interested. But I think if people really prefer having
> this a new syscall then we should probably try to improve on the old
> one. Things that come to mind right away without doing a deep review are
> replacing the void *addr pointer with a dedicated struct ptract_args or
> union ptrace_args and a size argument. If we're not doing something
> like this or something more fundamental we can equally well either just
> duplicate all enums in the old ptrace syscall and append a _PIDFD to it
> where it makes sense.

Yeah, it seems like just adding pidfd flavors of PTRACE_ATTACH and
PTRACE_SEIZE should do the job.


And if we do make a new syscall, there is a bunch of de-crufting that
can be done... for example, just as some low-hanging fruit, a new
ptrace API probably shouldn't have
PTRACE_PEEKTEXT/PTRACE_PEEKDATA/PTRACE_POKETEXT/PTRACE_POKEDATA (we
have /proc/$pid/mem for that, which is much saner than doing peek/poke
in word-size units), and probably also shouldn't support all the weird
arch-specific register-accessing requests (e.g.
PTRACE_PEEKUSR/PTRACE_POKEUSR/PTRACE_GETREGS/PTRACE_SETREGS/PTRACE_GETFPREGS/...)
that are nowadays accessible via PTRACE_GETREGSET/PTRACE_SETREGSET.

(And there are also some more major changes that I think would be
sensible; for example, it'd be neat if you could have notifications
about the tracee delivered through a pollable file descriptor, and if
you could get the kernel to tell you in each notification which type
of ptrace stop you're dealing with (e.g. distinguishing
syscall-entry-stop vs syscall-exit-stop), and it would be great to be
able to directly inject syscalls into the child instead of having to
figure out where a syscall instruction you can abuse is and then
setting the instruction pointer to that, and it'd be helpful to be
able to have multiple tracers attached to a single process so that you
can e.g. have strace and gdb active on the same process
concurrently...)