Re: 2.6.8-rc1-mm1 "Badness in schedule" on ppc32

From: Nick Piggin
Date: Fri Jul 16 2004 - 08:49:17 EST

Mikael Pettersson wrote:
> On Thu, 15 Jul 2004 13:27:05 -0700, Tom Rini wrote:

[ much needed cutting ]

>> On x86, could you force the PDC202XX_NEW to dump_stack in the function
>> in question? Perhaps there's a calling order issue on ppc. Thanks.
>
> I hacked pdc202xx_init_one() to dump_stack(), and upped ppc's
> log buffer size to capture all badness messages. The ppc boot
> log is a bit large, so I put both the ppc and x86 logs in
>
> All badness calls appear to emanate from sleeps/waits in init code
> called from init/main.c:init(), which itself runs in a kernel thread.
> It seems extremely fishy that the kernel considers the scheduler
> off-limits even though threads have been created and started.
>
> The init thread is itself created in init/main.c:rest_init():
>
> static void noinline rest_init(void)
> {
> 	kernel_thread(init, NULL, CLONE_FS | CLONE_SIGHAND);
> 	...
> }
>
> system_state is changed only after the init thread is created.
> Unless kernel_thread guarantees some execution ordering between
> parent and child, I don't see how this could be race-free.
>
> But I also don't see why ppc and x86 behave so differently here.

You must have missed my mail to the linuxppc list.

sched-clean-init-idle (which is in -mm) has the following hunk to
schedule() which should catch all unsafe calls to it, I think.

+	/*
+	 * The idle thread is not allowed to schedule!
+	 * Remove this check after it has been exercised a bit.
+	 */
+	if (unlikely(current == rq->idle) && current->state != TASK_RUNNING) {
+		printk(KERN_ERR "bad: scheduling from the idle thread!\n");
+		dump_stack();
+	}

So the system_state patch can be dropped.
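
For anyone who wants to poke at that condition outside the kernel, here is a
rough userspace sketch of it. This is not kernel code: struct task, struct
runqueue, the enum values and idle_schedule_is_bad() are all made-up stand-ins
for illustration, and only the comparison itself is taken from the hunk above.
The point it tries to show is that the check looks at who is calling
schedule() and in what state, so a sleeping init thread is not flagged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel's task states (values are arbitrary). */
enum task_state {
	TASK_RUNNING = 0,
	TASK_INTERRUPTIBLE = 1,
	TASK_UNINTERRUPTIBLE = 2
};

/* Hypothetical, heavily reduced task descriptor. */
struct task {
	const char *comm;
	enum task_state state;
};

/* Hypothetical, heavily reduced runqueue: only the idle pointer matters here. */
struct runqueue {
	struct task *idle;
};

/*
 * Model of the condition in the hunk: complain only when the caller is the
 * runqueue's idle task AND it is trying to sleep (state != TASK_RUNNING).
 * Returns true when the kernel check would have printed its warning.
 */
bool idle_schedule_is_bad(const struct runqueue *rq, const struct task *curr)
{
	assert(rq != NULL && curr != NULL);

	if (curr == rq->idle && curr->state != TASK_RUNNING) {
		fprintf(stderr, "bad: scheduling from the idle thread! (%s)\n",
			curr->comm);
		return true;
	}
	return false;
}
```

With this model, a non-idle task such as the init thread going to sleep never
trips the check, whatever a global boot-state flag says at that moment; only
the idle task trying to block does.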

