RE: Debugging kernel semaphore contention and priority inversion

From: Davda, Bhavesh P (Bhavesh)
Date: Thu Aug 18 2005 - 09:53:32 EST


> -----Original Message-----
> From: Mike Galbraith [mailto:efault@xxxxxx]
> Sent: Wednesday, August 17, 2005 11:10 PM
> At 09:43 PM 8/17/2005 -0600, you wrote:
> > >
> > > Have you tried sysrq t? See the Documentation/sysrq.txt file.
> >
> >This is a headless system.
>
> You could try netconsole.

Haven't heard of it before. Will look into it. But I doubt it will help
pinpoint the semaphore holder, if all I can do is sysrq stuff.

>
> > >
> > > How stuck is the system?
> > >
> > > Keith
> >
> >Very. Only pingable, but can't login via
> telnet/ssh/anything. Reason is
> >the same reason the low priority mystery task is unable to run and
> >release the held semaphore.
>
> (hmm. I'm obviously missing some original context here)
>
> Sounds like there must be another player who is RT prio + spinning.
>
> -Mike

Very good! Yes, I left out that piece of detail in my original posting.
There is a real low priority (4) SCHED_FIFO (hence still higher than any
SCHED_OTHER) task spinning. But it is not the semaphore holder. I am
trying to identify which kernel thread (because that's most likely)
running at SCHED_OTHER real low priority (too nice) is holding the
semaphore, locking out a priority 50 SCHED_FIFO task in its sys_write()
as a result.

Thanks

- Bhavesh

Bhavesh P. Davda | Distinguished Member of Technical Staff | Avaya |
1300 West 120th Avenue | B3-B03 | Westminster, CO 80234 | U.S.A. |
Voice/Fax: 303.538.4438 | bhavesh@xxxxxxxxx
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/