Re: Renice X for cpu schedulers

From: Michael K. Edwards
Date: Thu Apr 19 2007 - 21:32:32 EST


On 4/19/07, Lee Revell <rlrevell@xxxxxxxxxxx> wrote:
> IMHO audio streamers should use SCHED_FIFO thread for time critical
> work. I think it's insane to expect the scheduler to figure out that
> these processes need low latency when they can just be explicit about
> it. "Professional" audio software does it already, on Linux as well
> as other OS...

It is certainly true that SCHED_FIFO is currently necessary in the
layers of an audio application lying closest to the hardware, if you
don't want to throw a monstrous hardware ring buffer at the problem.
See the alsa-devel archives for a patch to aplay (sched_setscheduler
plus some cleanups) that converts it from "unsafe at any speed" (on a
non-RT kernel) to a rock-solid 18ms round trip from PCM in to PCM out.
(The hardware and driver aren't terribly exotic for an SoC, and the
measurement was done with aplay -C | aplay -P -- on a
not-particularly-tuned CONFIG_PREEMPT kernel with a 12ms+ peak
scheduling latency according to cyclictest. A similar test via
/dev/dsp, run through a slightly modified OSS emulation layer to the
same driver, comes in at 40ms and is probably tuned too
conservatively.)
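
For the curious, the core of that aplay change is just the usual POSIX
incantation. A minimal sketch of the shape of it (the priority value
and the mlockall() call are my illustration here, not lifted from the
actual patch):

#include <sched.h>
#include <sys/mman.h>
#include <stdio.h>

/* Put the calling thread into SCHED_FIFO and pin its pages so that
 * page faults can't add to the round trip.  Needs CAP_SYS_NICE. */
static int go_realtime(int prio)
{
        struct sched_param sp = { .sched_priority = prio };

        if (sched_setscheduler(0, SCHED_FIFO, &sp) < 0) {
                perror("sched_setscheduler");
                return -1;
        }
        if (mlockall(MCL_CURRENT | MCL_FUTURE) < 0) {
                perror("mlockall");
                return -1;
        }
        return 0;
}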

Note that SCHED_FIFO may be less necessary on an -rt kernel, but I
haven't had that option on the embedded hardware I've been working
with lately. Ingo, please please pretty please pick a -stable branch
one of these days and provide a git repo with -rt integrated against
that branch. Then I could port our chip support (all of which will be
GPLed after the impending code review) to it, and I might then have a
prayer of strong-arming our chip vendor into porting their WiFi driver
onto -rt. It's really a much more interesting scheduler use
case than make -j200 under X, because it's a best-effort
SCHED_BATCH-ish load that wants to be temporally clustered for power
management reasons.

(Believe it or not, a stable -rt branch with a clock-scaling-aware
scheduler is the one thing that might lead to this major WiFi vendor's
GPLing their driver core. They're starting to see the light on the
biz dev side, and the nature of the devices their chip will go in
makes them somewhat less concerned about the regulatory fig leaf
aspect of a closed-source driver; but they would have to port off of
the third-party real-time executive embedded within the driver, and
mainline's task and timer granularity won't cut it. I can't even get
more detail about _why_ it won't cut it unless there's some remotely
supportable -rt base they could port to.)

But I think SCHED_FIFO on a chain of tasks is fundamentally not the
right way to handle low audio latency. The object with a low latency
requirement isn't the task, it's the device. When it's starting to
get urgent to deliver more data to the device, the task that it's
waiting on should slide up the urgency scale; and if it's waiting on
something else, that something else should slide up the scale; and so
forth. Similarly, responding to user input is urgent; so when user
input is available (by whatever mechanism), the task that's waiting
for it should slide up the urgency scale, etc.
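
In pseudo-C (hypothetical structures and field names; the real
scheduler keeps this state quite differently), the propagation I mean
is nothing more than a walk up the wait chain:

struct urgent_obj {
        int urgency;                    /* current effective urgency */
        struct urgent_obj *waiting_on;  /* what this object blocks on */
};

/* The device buffer is nearing underrun: raise the urgency of
 * whatever the device is waiting on, and of whatever *that* is
 * waiting on, stopping at anything already urgent enough. */
static void propagate_urgency(struct urgent_obj *obj, int urgency)
{
        for (; obj && obj->urgency < urgency; obj = obj->waiting_on)
                obj->urgency = urgency;
}

User input feeds the same mechanism from the other end: the event
source bumps its urgency when data arrives, and the boost flows to
whichever task is waiting on it.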

In practice, you probably don't want to burden desktop Linux with
priority inheritance where you don't have to. Priority queues with
algorithmically efficient decrease-key operations (Fibonacci heaps and
their ilk) are complicated to implement and have correspondingly high
constant factors. (However, a sufficiently clever heuristic for
assigning quasi-static task priorities would usually short-circuit the
priority cascade; if you can keep N small in the
tasks-with-unpredictable-priority queue, you can probably use a
simpler flavor with O(log N) decrease-key. Ask someone who knows more
about data structures than I do.)
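
By "a simpler flavor" I mean something like a binary min-heap with a
position index, which gets you decrease-key in O(log N) by sifting up
toward the root. A toy sketch, fixed-size arrays and all names purely
illustrative:

#define MAX_TASKS 256

static int key[MAX_TASKS];   /* urgency keys, lower = more urgent */
static int task[MAX_TASKS];  /* task id stored in each heap slot */
static int pos[MAX_TASKS];   /* heap slot currently holding each id */

static void swap_slots(int a, int b)
{
        int k = key[a], t = task[a];

        key[a] = key[b]; task[a] = task[b]; pos[task[a]] = a;
        key[b] = k;      task[b] = t;       pos[task[b]] = b;
}

/* O(log N): lower one task's key and sift it toward the root. */
static void decrease_key(int t, int new_key)
{
        int i = pos[t];

        key[i] = new_key;
        while (i > 0 && key[(i - 1) / 2] > key[i]) {
                swap_slots(i, (i - 1) / 2);
                i = (i - 1) / 2;
        }
}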

More importantly, non-real-time application coders aren't very smart
about grouping data structure accesses on one side or the other of a
system call that is likely to release a lock and let something else
run, flushing application data out of cache. (Kernel coders aren't
always smart about this either; see LKML threads a few weeks ago about
racy, and cache-stall-prone, f_pos handling in VFS.) So switching
tasks immediately on lock release is usually the wrong thing to do if
letting the task run a little longer would allow it to reach a point
where it has to block anyway.
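
To make the access-grouping point concrete (hypothetical helpers, of
course; the update functions stand in for whatever shared state the
application touches around the syscall):

#include <unistd.h>

struct state { long a, b; };

static void update_first_half(struct state *s)  { s->a++; }
static void update_second_half(struct state *s) { s->b++; }

/* Straddles a syscall that may release a lock and let something else
 * run: the second update may start from a cold cache. */
static void straddled(struct state *s, int fd, const void *buf, size_t len)
{
        update_first_half(s);
        (void)write(fd, buf, len);
        update_second_half(s);
}

/* Grouped: finish touching the working set, then block. */
static void grouped(struct state *s, int fd, const void *buf, size_t len)
{
        update_first_half(s);
        update_second_half(s);
        (void)write(fd, buf, len);
}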

Anyway, I already described the urgency-driven strategy to the extent
that I've thought it out, elsewhere in this thread. I only held this
draft back because I wanted to double-check my latency measurements.

Cheers,
- Michael