Re: [PATCH] kthreads: Fix startup synchronization boot crash

From: Ingo Molnar
Date: Tue Sep 01 2009 - 11:55:19 EST



* Oleg Nesterov <oleg@xxxxxxxxxx> wrote:

> On 09/01, Ingo Molnar wrote:
> >
> > * Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> >
> > > On 09/01, Ingo Molnar wrote:
> > > >
> > > > * Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> > > >
> > > > > Yes, this should work. But I _think_ we can make the better fix...
> > > > >
> > > > > I'll try to make the patch soon. Afaics we don't need
> > > > > kthreadd_task_init_done.
> > > >
> > > > ok.
> > >
> > > Just in case, the patch is ready. [...]
> >
> > yes - that's roughly the cleanup i referred to in the commit log.
> >
> > way too late for -rc8 though - the minimal fix i did _might_ be
> > eligible.
> >
> > agreed?
>
> Agreed. Then I will send the patch on top of this change.
>
> But. Maybe your minimal patch needs a small tweak?
>
> rest_init()->complete(&kthreadd_task_init_done) assumes that exactly
> _one_ caller of kthread_create() can race with kernel_thread(kthreadd).
> Perhaps we need complete_all()?
>
>
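
As an aside on the complete() versus complete_all() question above, here
is a minimal userspace sketch in C. It is a toy model that tracks only
the completion's counter, with no sleeping or locking, and it is not the
kernel implementation. All model_* names and struct completion_model are
made up for illustration, and representing "wake everyone" with UINT_MAX
is an assumption of the model.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Toy model: only the 'done' counter of a completion (hypothetical type). */
struct completion_model {
    unsigned int done;   /* number of waiters that may still get through */
};

/* Models complete(): allow exactly one more waiter through. */
static void model_complete(struct completion_model *c)
{
    if (c->done != UINT_MAX)
        c->done++;
}

/* Models complete_all(): allow every current and future waiter through. */
static void model_complete_all(struct completion_model *c)
{
    c->done = UINT_MAX;
}

/* Returns true if a waiter arriving now would get through. */
static bool model_try_wait(struct completion_model *c)
{
    if (c->done == 0)
        return false;        /* a real waiter would keep sleeping here */
    if (c->done != UINT_MAX)
        c->done--;           /* consume the single wakeup */
    return true;
}

/* Count how many of 'waiters' callers get past one completion event. */
static int model_waiters_released(bool use_complete_all, int waiters)
{
    struct completion_model c = { 0 };
    int released = 0;

    assert(waiters >= 0);
    if (use_complete_all)
        model_complete_all(&c);
    else
        model_complete(&c);

    for (int i = 0; i < waiters; i++)
        if (model_try_wait(&c))
            released++;
    return released;
}
```

In this model, model_waiters_released(false, 2) returns 1 and
model_waiters_released(true, 2) returns 2. That is the concern raised
above: a single complete() lets only one racing kthread_create() caller
through.
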
> But I must admit, now I don't understand what happens:
>
>     The modification of that variable is protected by the BKL, but
>     the _ordering_ of the initial task (which becomes the idle
>     thread of CPU0) and the init task (which is spawned by the
>     initial task) is not synchronized.
>
>     So we can occasionally end up with init running sooner than
>     rest_init()
>
> How? rest_init() can't be preempted and it holds BKL. And
> kernel_init() takes BKL before anything else. Confused...

it cannot be preempted, but it can schedule (block voluntarily)
anywhere - and whenever it does, the BKL is dropped silently.

This is one of the biggest dangers of the BKL - rescheduling
_somewhere_ in a huge codepath might change timings and 'break up
the critical section', breaking ancient assumptions.
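
To illustrate the point, here is a single-threaded toy model in C of a
BKL that is released when its holder schedules and re-taken when the
holder resumes. It is a sketch under stated assumptions, not kernel
code: every model_* name, struct task_model, and the kthreadd_ready flag
are hypothetical, and where exactly rest_init() sleeps is not specified
in this thread, so the model just takes it as a parameter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical task: lock_depth is -1 while the task does not hold the BKL. */
struct task_model {
    int lock_depth;
};

static struct task_model *bkl_owner;  /* NULL while the BKL is free */
static bool kthreadd_ready;           /* stands in for "kthreadd_task is set" */
static bool init_saw_kthreadd;        /* what the init task observed */

static void model_lock_kernel(struct task_model *t)
{
    /* Single-threaded model: the lock must be free or already ours. */
    assert(bkl_owner == NULL || bkl_owner == t);
    bkl_owner = t;
    t->lock_depth++;
}

static void model_unlock_kernel(struct task_model *t)
{
    if (--t->lock_depth < 0)
        bkl_owner = NULL;
}

/* The init task: takes the BKL first, then looks at the shared state. */
static void model_kernel_init(struct task_model *init)
{
    model_lock_kernel(init);   /* succeeds only if the BKL is free */
    init_saw_kthreadd = kthreadd_ready;
    model_unlock_kernel(init);
}

/* Scheduling away: the BKL is silently dropped, then re-taken on resume. */
static void model_schedule(struct task_model *prev, struct task_model *next)
{
    bool held = prev->lock_depth >= 0;

    if (held)
        bkl_owner = NULL;      /* dropped without the caller noticing */
    model_kernel_init(next);   /* the other task runs in the gap */
    if (held)
        bkl_owner = prev;      /* re-acquired before prev continues */
}

/* Returns true if init observed the setup done by the initial task. */
static bool model_rest_init(bool sleeps_before_setup)
{
    struct task_model boot = { -1 }, init = { -1 };

    bkl_owner = NULL;
    kthreadd_ready = false;
    init_saw_kthreadd = false;

    model_lock_kernel(&boot);
    if (sleeps_before_setup)
        model_schedule(&boot, &init);  /* init slips in here */
    kthreadd_ready = true;
    model_unlock_kernel(&boot);

    if (!sleeps_before_setup)
        model_kernel_init(&init);      /* the ordering everyone assumed */
    return init_saw_kthreadd;
}
```

In this model, model_rest_init(false) returns true and
model_rest_init(true) returns false: holding the BKL across the setup
only orders the two tasks as long as nothing in between sleeps.
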

Ingo