Re: [PATCH] [request for inclusion] Realtime LSM

From: utz lehmann
Date: Thu Jan 13 2005 - 21:39:21 EST


On Fri, 2005-01-14 at 13:08 +1100, Con Kolivas wrote:
> utz lehmann wrote:
> > On Thu, 2005-01-13 at 16:25 -0500, Lee Revell wrote:
> >
> >>On Thu, 2005-01-13 at 22:07 +0100, Arjan van de Ven wrote:
> >>
> >>>On Thu, Jan 13, 2005 at 03:04:26PM -0600, Jack O'Quin wrote:
> >>>
> >>>>(Probably, this simplistic analysis misses some other, more subtle,
> >>>>factors.)
> >>>
> >>>I think you can do nasty things to the locks held by those threads too
> >>>
> >>>
> >>>>RT threads should not do FS writes of their own. But, a badly broken
> >>>>or malicious one could, I suppose. So, that might provide a mechanism
> >>>>for losing more data than usual. Is that what you had in mind?
> >>>
> >>>basically yes.
> >>>note that "FS writes" can come from various things, including library calls
> >>>made and such. But I think you got my point; even though it might seem a bit
> >>>theoretical it sure is unpleasant.
> >>>
> >>
> >>I added Con to the cc: because this thread is starting to converge with
> >>an email discussion we've been having.
> >>
> >>The basic issue is that the current semantics of SCHED_FIFO seem make
> >>the deadlock/data corruption due to runaway RT thread issue difficult.
> >>The obvious solution is a new scheduling class equivalent to SCHED_FIFO
> >>but with a mechanism for the kernel to demote the offending thread to
> >>SCHED_OTHER in an emergency. The problem can be solved in userspace
> >>with a SCHED_FIFO watchdog thread that runs at a higher RT priority than
> >>all other RT processes.
> >>
> >>This all seems to imply that introducing an rlimit for MAX_RT_PRIO is an
> >>excellent solution. The RT watchdog thread could run as root, and the
> >>rlimit would be used to ensure than even nonroot users in the RT group
> >>could never preempt the watchdog thread.
> >
> >
> > Just an idea. What about throttling runaway RT tasks?
> > If the system spend more than 98% in RT tasks for 5s consider this as a
> > _fatal error_. Print an error message and throttle RT tasks by inserting
> > ticks where only SCHED_OTHER tasks allowed. For a limit of 98% this
> > means one SCHED_OTHER only tick all 50 ticks.
> >
> > The limit and timeout should be configurable and of course it can be
> > disabled.
> >
> > I know this is against RT task preempt all SCHED_OTHER but this is only
> > for a fatal system state to be able to recover sanely. A locked up
> > machine is is the worse alternative.
>
> There is a patch in -mm currently designed to use a sysrq key
> combination which converts all real time tasks to sched normal to save
> you if you desire in a lockup situation. We do want to preserve RT
> scheduling behaviour at all times without caveats for privileged users.

The sysrq is already in 2.6.10. I had to use it the last days a few
times. But it does help if you have no access to the console.

The RT throttling idea is not to change the behavior in normal
conditions. It's only for a fatal system state. If you have a runaway RT
task you can't guarantee the system is work properly anyway. It's
blocking vital kernel threads, filesystems, swap, keyboard, ...

It's a bit like out of memory. You can do nothing and panic. Or trying
something bad (killing processes) which is hopefully better as the
former.
btw: Are RT tasks excluded by the oom killer?


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/