Re: INFO: rcu detected stall in ext4_write_checks

From: Paul E. McKenney
Date: Sun Jul 14 2019 - 15:30:06 EST


On Sun, Jul 14, 2019 at 03:05:22PM -0400, Theodore Ts'o wrote:
> On Sun, Jul 14, 2019 at 05:48:00PM +0300, Dmitry Vyukov wrote:
> > But short term I don't see any other solution than stop testing
> > sched_setattr because it does not check arguments enough to prevent
> > system misbehavior. Which is a pity because syzkaller has found some
> > bad misconfigurations that were oversights on the checking side.
> > Any other suggestions?
>
> Or maybe syzkaller can put its own limitations on what parameters are
> sent to sched_setattr? In practice, there are any number of ways a
> root user can shoot themselves in the foot when using sched_setattr or
> sched_setaffinity, for that matter. I imagine there must be some such
> constraints already --- or else syzkaller might have set a kernel
> thread to run with priority SCHED_BATCH, with similar catastrophic
> effects --- or do similar configurations to make system threads
> completely unschedulable.
>
> Real time administrators who know what they are doing --- and who know
> that their real-time threads are well behaved --- will always want to
> be able to do things that will be catastrophic if the real-time thread
> is *not* well behaved. I don't think it is possible to add safety checks
> which would allow the kernel to automatically detect and reject unsafe
> configurations.
>
> An apt analogy might be civilian versus military aircraft. Most
> airplanes are designed to be "inherently stable"; that way, modulo
> buggy/insane control systems like on the 737 Max, the airplane will
> automatically return to straight and level flight. On the other hand,
> some military planes (for example, the F-16, F-22, F-35, the
> Eurofighter, etc.) are sometimes designed to be unstable, since that
> way they can be more maneuverable.
>
> There are use cases for real-time Linux where this flexibility/power
> vs. stability tradeoff is going to argue for giving root the
> flexibility to crash the system. Some of these systems might
> literally involve using real-time Linux in military applications,
> something for which Paul and I have had some experience. :-)
>
> Speaking of sched_setaffinity, one thing which we can do is have
> syzkaller move all of the system threads so they run on the "system
> CPUs", and then move the syzkaller processes which are testing the
> kernel to be on the "system under test CPUs". Then regardless of
> what priority the syzkaller test programs try to run themselves at,
> they can't crash the system.
>
> Some real-time systems do actually run this way, and it's a
> recommended configuration which is much safer than letting the
> real-time threads take over the whole system:
>
> http://linuxrealtime.org/index.php/Improving_the_Real-Time_Properties#Isolating_the_Application

Good point! We might still have issues with some per-CPU kthreads,
but perhaps use of nohz_full would help at least reduce these sorts
of problems. (There could still be issues on CPUs with more than
one runnable thread.)
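
For concreteness, here is a rough userspace sketch of the CPU split
suggested above. It is only an illustration, not what syzkaller does:
split_cpus() and pin() are made-up helper names, the choice of one
"system" CPU is arbitrary, and only os.sched_getaffinity() and
os.sched_setaffinity() are real (Linux-only) calls.

```python
import os


def split_cpus(cpus, n_system=1):
    """Partition CPU ids into (system, test) sets.

    Hypothetical helper: the lowest-numbered n_system CPUs become the
    "system CPUs" and the rest become the "system under test CPUs".
    """
    ordered = sorted(cpus)
    if not 0 < n_system < len(ordered):
        raise ValueError("need at least one CPU on each side of the split")
    return set(ordered[:n_system]), set(ordered[n_system:])


def pin(pid, cpus):
    """Restrict pid to cpus and return the affinity the kernel reports."""
    os.sched_setaffinity(pid, cpus)
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    available = os.sched_getaffinity(0)  # CPUs this process may run on
    if len(available) >= 2:
        system_cpus, test_cpus = split_cpus(available)
        # Confine the current (test) process to the test CPUs.
        print("test process now on CPUs:", sorted(pin(0, test_cpus)))
```

A test harness would call pin() on each fuzzing process with the test
set and move movable system threads to the system set. Per-CPU kthreads
generally refuse such a move, which is the remaining exposure.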

Thanx, Paul