Re: [RFC 09/16] kgr: mark task_safe in some kthreads

From: Paul E. McKenney
Date: Wed May 14 2014 - 11:30:34 EST


On Wed, May 14, 2014 at 05:15:01PM +0200, Vojtech Pavlik wrote:
> On Wed, May 14, 2014 at 04:59:05PM +0200, Jiri Slaby wrote:
>
> > I see the worst case scenario. (For curious readers, it is for example
> > this kthread body:
> >   while (1) {
> >           some_paired_call(); /* invokes pre-patched code */
> >           if (kthread_should_stop()) { /* kgraft switches to the new code */
> >                   its_paired_function(); /* invokes patched code (wrong) */
> >                   break;
> >           }
> >           its_paired_function(); /* the same (wrong) */
> >   })
> >
> > What to do with that now? We have come up with a couple of possibilities.
> > Would you consider try_to_freeze() a good state-defining function? As it
> > is called when a kthread expects that weird things can happen, it should
> > be safe to switch to the patched version there, in our opinion.
> >
> > The other possibility is to patch every kthread loop (~300) and insert
> > kgr_task_safe() semi-manually at some proper place.
> >
> > Or, if you have any other suggestions, we would appreciate them.
>
> A heretic idea would be to convert all kernel threads into functions
> that do not sleep and exit after a single iteration and are called from
> a central kthread main loop function. That would get all of
> kthread_should_stop() and try_to_freeze() and kgr_task_safe() nicely
> into one place and at the same time put enough constraint on what the
> thread function can do to prevent it from breaking the assumptions of
> each of these calls.

Several of the kthreads I am aware of would need substantial
restructuring, because they call kthread_should_stop() inside their
loop bodies as well as in their loop conditions. Also, a number
of them block in wait_event() and similar primitives, which would mean
that the central kthread main loop function would need to know
about the wait queues and wait conditions and handle them properly.
See for example rcu_torture_barrier_cbs() and rcu_torture_barrier()
in kernel/rcu/rcutorture.c [*], which wait on each other in order to
test RCU's rcu_barrier() primitives.
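
To make the wait-condition problem concrete, here is a rough userspace
model of the single-iteration scheme, not kernel code. Every name in it
(barrier_state, step_thread, run_central_loop() and the step/ready
helpers) is hypothetical, and the two "threads" only loosely mimic a
pair of kthreads that wait on each other. The point it tries to show:
each wait_event() becomes a condition that the central loop must
evaluate, and a thread body that waits in mid-iteration has to be split
into phases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared state for two mutually waiting step functions. */
struct barrier_state {
	int request;     /* barrier side asks the callback side to act */
	int pending;     /* callback-side work not yet done */
	int rounds_done; /* completed barrier rounds */
	int phase;       /* barrier side: 0 = start round, 1 = waiting */
};

/* One former kthread, reduced to a non-sleeping step function. */
struct step_thread {
	const char *name;
	bool (*ready)(const struct barrier_state *s); /* models the wait_event() condition */
	void (*step)(struct barrier_state *s);        /* one iteration, must not sleep */
	int steps_run;
};

/* Callback side: runs only once a request has been posted. */
static bool cbs_ready(const struct barrier_state *s)
{
	return s->request != 0;
}

static void cbs_step(struct barrier_state *s)
{
	s->request = 0;
	s->pending--;
}

/* Barrier side: the wait in mid-body forces a split into two phases. */
static bool barrier_ready(const struct barrier_state *s)
{
	return s->phase == 0 || s->pending == 0;
}

static void barrier_step(struct barrier_state *s)
{
	if (s->phase == 0) {
		s->pending = 1;
		s->request = 1;
		s->phase = 1;
	} else {
		assert(s->pending == 0); /* guaranteed by barrier_ready() */
		s->rounds_done++;
		s->phase = 0;
	}
}

/* Central main loop: returns the number of completed barrier rounds. */
static int run_central_loop(int passes)
{
	struct barrier_state s = { 0, 0, 0, 0 };
	struct step_thread threads[] = {
		{ "barrier",     barrier_ready, barrier_step, 0 },
		{ "barrier_cbs", cbs_ready,     cbs_step,     0 },
	};
	const size_t n = sizeof(threads) / sizeof(threads[0]);

	for (int pass = 0; pass < passes; pass++) {
		for (size_t i = 0; i < n; i++) {
			/*
			 * The single place where the stop, freeze and
			 * "task safe" checks would live in this model.
			 */
			if (!threads[i].ready(&s))
				continue; /* the loop must know every wait condition */
			threads[i].step(&s);
			threads[i].steps_run++;
		}
	}
	return s.rounds_done;
}
```

In this model, run_central_loop(4) completes two rounds. Note that the
barrier side needed an explicit phase variable just to cross one wait
point, which is the sort of restructuring I have in mind.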

Thanx, Paul

* In older kernels, this is kernel/rcu/torture.c or kernel/rcutorture.c.
