Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm2-V0.7.30-2

From: Ingo Molnar
Date: Mon Nov 22 2004 - 04:01:51 EST



* Eran Mann <emann@xxxxxxx> wrote:

> Ingo Molnar wrote:
> >i have released the -V0.7.30-2 Real-Time Preemption patch, which can be
> >downloaded from the usual place:
> >
> > http://redhat.com/~mingo/realtime-preempt/
>
> Hi,
> I'm seeing latencies of up to ~2000 microseconds. See the attached
> trace file for a small sample. I think I'm missing something obvious
> config-wise but I don't know what...

131 88000002 0.003ms (+0.000ms): deactivate_task (__schedule)
131 88000002 0.003ms (+1.687ms): dequeue_task (deactivate_task)
5506 80000002 1.691ms (+0.001ms): __switch_to (__schedule)

this seems to be hardware-generated. As you can see from the trace, the
codepath between __schedule()'s deactivate_task() and __switch_to() has
all interrupts and preemption disabled. The O(1) scheduler has constant
overhead there, and that codepath should take at most ~1 usec.
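
for reference, a heavily simplified sketch of that __schedule() section
(2.6-era O(1) scheduler, details elided - this is not the verbatim
kernel source):

	spin_lock_irq(&rq->lock);		/* irqs off from here on */

	if (prev->state)			/* task is going to sleep */
		deactivate_task(prev, rq);	/* calls dequeue_task()   */

	/* O(1) next-task selection: highest set bit in the prio bitmap */
	idx = sched_find_first_bit(array->bitmap);
	next = list_entry(array->queue[idx].next, task_t, run_list);

	context_switch(rq, prev, next);		/* ends up in __switch_to() */

nothing in there scales with load, hence the ~1 usec upper bound.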

(there is one exception, if both LATENCY_TRACING and RT_DEADLOCK_DETECT
are enabled in -V0.7.30-2 and later kernels then the overhead within
__schedule is O(nr_running), because the tracer adds entries for every
runnable task. But this is not the case for your trace because then
you'd see those entries in /proc/latency_trace.)
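
conceptually the extra pass is just this (names purely illustrative,
this is not the actual tracer code):

	/*
	 * illustrative sketch only: with LATENCY_TRACING plus
	 * RT_DEADLOCK_DETECT the tracer logs one entry per runnable
	 * task, which is what makes __schedule() O(nr_running).
	 */
	for_each_runnable_task(rq, p)		/* hypothetical iterator */
		__trace_task(p);		/* hypothetical helper   */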

> The 'load' during the traces consisted of a kernel build in a
> gnome-terminal, and 2 browser windows with a heavy site (Flash ads
> etc.) in each. This load causes a >1 ms latency every 5 minutes on
> average. After the kernel build ended the rate dropped dramatically to
> ~2 traces an hour.

this seems to imply IDE-DMA-related hardware overhead. Apparently what
happens is that with certain motherboards/chipsets, if IDE DMA happens
then that DMA transfer _completely locks up_ the system bus: nothing
else makes progress, and the CPU is in essence stalled until the DMA
request completes.

there's nothing the kernel can do about a hardware latency like that,
but you can try to work around it. Mark H. Johnson has reported up to
500 usec latencies with a pattern similar to yours, and he has
experimented with lower DMA modes (udma2?) via hdparm. YMMV, and be
careful with hdparm settings.
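
if you want to experiment, something along these lines (the device name
is just an example, and do read the hdparm manpage first):

	# example only: query the drive's current settings and modes
	hdparm -i /dev/hda

	# then try forcing a lower UDMA mode - risky on some chipsets,
	# so check the manpage notes on -X before using it
	hdparm -X udma2 /dev/hda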

> The traces were from V-0.7.29-5 but I've seen these latencies in all
> RT kernels I tested (2.6.9-mm1-RT-V0.2 was the first). I'll try
> V0.7.30-2 next. The machine is a PIII 733 MHz, 256MB RAM, IDE disks.

this very strongly implies some sort of hardware overhead. Btw., the
likely reason why this so often shows up within __schedule() is that 1)
it's a very common operation, especially on the -RT kernel, and 2) we
do a TLB flush there, which can be quite memory-intensive, so if the
system bus (the memory bus) is locked up, there is a high likelihood
that this function takes a cache miss.
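
the mechanism, roughly (simplified from the i386 switch_mm() path, this
is not the exact kernel code):

	/*
	 * simplified from the i386 switch_mm() path: reloading %cr3 on
	 * an mm switch implicitly flushes the TLB, so the page-table
	 * walks that follow go out to memory - and stall for as long
	 * as a DMA transfer keeps the memory bus locked up.
	 */
	if (prev_mm != next_mm)			/* sketch names, not kernel ones */
		load_cr3(next_mm->pgd);		/* implicit TLB flush on x86 */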

Ingo