ptrace/strace and freezer oddities and v5.2+ kernels

From: Bruce Ashfield
Date: Tue Oct 01 2019 - 12:14:34 EST


Hi all,

The Yocto Project has an upcoming release this fall, and I've been trying to
sort through some issues that show up with 5.2+ kernels. Although there is a
Yocto-specific kernel, I'm testing and seeing this with vanilla mainline
kernels as well.

I'm running into an issue that is *very* similar to the one discussed in the
[REGRESSION] ptrace broken from "cgroup: cgroup v2 freezer" (76f969e)
thread from this past May: https://lkml.org/lkml/2019/5/12/272

I can confirm that I have the proposed fix for the initial regression report
in my build (05b2892637 [signal: unconditionally leave the frozen state in
ptrace_stop()]), yet I'm still seeing runtimes of 3 or 4 minutes on a test
that used to take 3 or 4 seconds.

This isn't my normal area of kernel hacking, so I've so far come up empty at
either fixing it myself or figuring out a viable workaround. (Well, I can
"fix" it by removing the cgroup_enter_frozen() call in ptrace_stop(), but
obviously that is just me trying to figure out what could be causing the
issue.)

As part of the release, we run tests that come with various applications. The
ptrace test that is causing us issues can be boiled down to this:

$ cd /usr/lib/strace/ptest/tests
$ time ../strace -o log -qq -esignal=none -e/clock ./printpath-umovestr > ttt
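
For anyone trying to reproduce or bisect this, the slowdown can be turned
into a pass/fail check. The following is only a rough sketch: the
run_with_budget helper and the 30-second budget are hypothetical and not part
of the strace test suite. It measures wall-clock time in whole seconds,
discards the command's output, and ignores the command's own exit status.

```shell
#!/bin/sh
# Hypothetical helper (not part of strace or its ptest suite):
# run a command, report elapsed wall-clock seconds, and return
# non-zero if the run took longer than the given budget.
run_with_budget() {
    budget=$1            # allowed runtime in whole seconds
    shift
    start=$(date +%s)
    "$@" >/dev/null 2>&1 # output and exit status of the command are ignored
    end=$(date +%s)
    elapsed=$((end - start))
    echo "elapsed=${elapsed}s budget=${budget}s"
    [ "$elapsed" -le "$budget" ]
}

# Example use from /usr/lib/strace/ptest/tests (30s is an assumed budget):
#   run_with_budget 30 ../strace -o log -qq -esignal=none -e/clock \
#       ./printpath-umovestr && echo good || echo bad
```

A good kernel should come in well under the budget (3 or 4 seconds), while an
affected kernel should exceed it (3 or 4 minutes), so the return value could
serve as the good/bad verdict during a manual bisect.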

(I can provide as many details as needed, but I wanted to keep this initial
email relatively short).

I'll continue to debug and attempt to fix this myself, but I took the
recipient list from the May regression report to see if anyone has any ideas
or angles that I haven't covered in my search for a fix.

Cheers,

Bruce



--
- Thou shalt not follow the NULL pointer, for chaos and madness await
thee at its end
- "Use the force Harry" - Gandalf, Star Trek II