Re: Scheduler: Spinning until tasks are STOPPED

From: Nick Piggin
Date: Sat May 07 2005 - 02:13:34 EST


Yuly Finkelberg wrote:
Hi,

I sent a message regarding this issue earlier, but after re-reading
it, I realized that it wasn't very clear. Hopefully, this will
clarify things a little bit:

I have a strange scheduling issue: a bunch of worker tasks are all waiting on a wait queue. Each task is woken up by the preceeding, does some work, wakes up the next one, and then sends a SIGSTOP to itself. The last task however
does not stop itself, but instead yield()s until all tasks have reached state
TASK_STOPPED.

The code looks like this (irrelevant parts cut out):
...
ret = wait_event_interruptible(waitq, next_in_line == myself);
...
(some work)
...
next_in_line = next;
ret = wakeup_next_one();
if (!last_one)
send_sig(SIGSTOP, current, 1);
else
spin_until_all_stopped()

When run with 50 tasks, normally this works well. However sometimes one of the
tasks (never the last one) gets stuck between calling wakeup_next_one() and between sending the signal. It accumulates system time, and its stack looks
like (no pending signals, ti_flags is clear):

c55e7ad0 00000086 c55e6000 c55e7a94 00000046 c55e6000 c55e7ad0 c0109c2d
00000000 c0497800 00000001 d38da344 0013bc9c c5632840 00071931 d3d93161
0013bc9c c55d546c c05d3960 0000270f c05d3960 c55e6000 c0106f25 c05d3960

Call Trace:
[<c0106f25>] need_resched+0x27/0x32

(yes, this is not a mistake: this is ALL the stack reported by show_stack())

Normally the spinning task will magically get released after "a while", where few seconds < "a while" < 10 minutes and sometimes even longer. So the mystery is -
1. Why does the task spin for so long ?
2. Where does it spin ? (the kernel stack doesn't hint on anything...)
3. How can I find out #2 ?
4. How to fix it ?
5. Is there a better way to make sure a specific task is STOPPED ?

Currently running 2.6.8.1 and 2.6.9 (UP, PREEMPT). I'd appreciate any
help here...

You're doing this in the *kernel*? It sounds like it should be done
in userspace or done a different way (ie. not with 50 tasks).

And using signals and spinning on yield for synchronisation and
process control in the kernel like this is fairly crazy.

Can't you use a semaphore or something?

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/