Re: [RFC 09/16] kgr: mark task_safe in some kthreads

From: Jiri Slaby
Date: Wed May 14 2014 - 10:59:22 EST


Hi Tejun,

On 05/01/2014 11:09 PM, Tejun Heo wrote:
> On Thu, May 01, 2014 at 05:02:42PM -0400, Tejun Heo wrote:
>> Hello, Jiri.
>>
>> On Thu, May 01, 2014 at 10:17:44PM +0200, Jiri Kosina wrote:
>>> I agree that this expectation might really be somewhat implicit and is not
>>> probably properly documented anywhere. The basic observation is "whenever
>>> kthread_should_stop() is being called, all data structures are in a
>>> consistent state and don't need any further updates in order to achieve
>>> consistency, because we can exit the loop immediately here", as
>>> kthread_should_stop() is the very last thing every freezable kernel thread
>>
>> But kthread_should_stop() doesn't necessarily imply that "we can exit
>> the loop *immediately*" at all. It just indicates that it should
>> terminate in finite amount of time. I don't think it'd be too
>
> Just a bit of addition. Please note that kthread_should_stop(), along
> with the freezer test, is actually trickier than it seems. It's very
> easy to write code which works most of the time but misses wake up
> from kill when the timing is just right (or wrong). It should be
> interlocked with set_current_state() and other related queueing data
> structure accesses. This was several years ago but when I audited
> most kthread users in kernel, especially in combination with the
> freezer test which also has similar requirement, surprising percentage
> of users (at least several tens of pct) were getting it slightly
> wrong, so kthread_should_stop() really isn't used as "we can exit
> *immediately*". It just isn't that simple.

I see the worst-case scenario. (For curious readers, one example is
this kthread body:
    while (1) {
        some_paired_call(); /* invokes pre-patched code */
        if (kthread_should_stop()) { /* kgraft switches to the new code */
            its_paired_function(); /* invokes patched code (wrong) */
            break;
        }
        its_paired_function(); /* the same (wrong) */
    })
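
To make the mismatch concrete, here is a minimal user-space sketch of
one loop iteration. It is not kernel code: struct sim, the two counters
and iteration_is_balanced() are hypothetical stand-ins, and the "patch"
is assumed to move the paired bookkeeping to a different counter, so
that mixing the old and the new halves of the pair leaves both counters
unbalanced.

```c
#include <stdbool.h>

/* Hypothetical model, not a kernel API: each code version keeps its own count. */
struct sim {
    bool patched;   /* false: pre-patch code runs, true: patched code runs */
    int old_depth;  /* bookkeeping done by the pre-patch pair */
    int new_depth;  /* bookkeeping done by the patched pair */
};

/* First half of the pair; which version runs depends on the current universe. */
static void some_paired_call(struct sim *s)
{
    if (s->patched)
        s->new_depth++;
    else
        s->old_depth++;
}

/* Second half of the pair; it must undo what the matching first half did. */
static void its_paired_function(struct sim *s)
{
    if (s->patched)
        s->new_depth--;
    else
        s->old_depth--;
}

/*
 * Runs one loop iteration. switch_at_check models the task being switched
 * to the patched code at the kthread_should_stop() check, i.e. between
 * the two halves of the pair.
 */
static bool iteration_is_balanced(bool switch_at_check)
{
    struct sim s = { false, 0, 0 };

    some_paired_call(&s);          /* pre-patch version */
    if (switch_at_check)
        s.patched = true;          /* switch happens mid-pair */
    its_paired_function(&s);       /* patched version if switched */

    return s.old_depth == 0 && s.new_depth == 0;
}
```

In this model, iteration_is_balanced(false) holds, while
iteration_is_balanced(true) leaves old_depth at 1 and new_depth at -1,
which is the "wrong" case marked in the comments above.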

What to do with that now? We have come up with a couple of possibilities.
Would you consider try_to_freeze() a good state-defining function? As it
is called when a kthread expects that weird things can happen, it should
be safe to switch to the patched version there, in our opinion.

The other possibility is to patch every kthread loop (~300) and insert
kgr_task_safe() semi-manually at some proper place.
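
Both variants come down to switching a task only at a point that lies
between complete pairs. A minimal user-space sketch of that idea
follows; it is an illustration under assumptions, not the kgraft
implementation. struct task_sim, safe_point(), pair_begin(), pair_end()
and run_loop() are hypothetical names, and safe_point() merely stands
in for wherever try_to_freeze() or a manually inserted kgr_task_safe()
would sit in the loop.

```c
#include <stdbool.h>

/* Hypothetical per-task model, not a kernel structure. */
struct task_sim {
    bool patch_pending;  /* a patch was applied, task not yet switched */
    bool uses_patched;   /* which version of the pair this task runs */
    int old_depth;       /* bookkeeping done by the pre-patch pair */
    int new_depth;       /* bookkeeping done by the patched pair */
};

/* The only place where the task may move to the patched code. */
static void safe_point(struct task_sim *t)
{
    if (t->patch_pending) {
        t->uses_patched = true;
        t->patch_pending = false;
    }
}

static void pair_begin(struct task_sim *t)
{
    if (t->uses_patched)
        t->new_depth++;
    else
        t->old_depth++;
}

static void pair_end(struct task_sim *t)
{
    if (t->uses_patched)
        t->new_depth--;
    else
        t->old_depth--;
}

/*
 * Runs `iterations` loop passes. The patch is applied in the middle of
 * pass `patch_at`, between the two halves of the pair. Returns true if
 * all bookkeeping is balanced at the end; *migrated reports whether the
 * task reached a safe point after the patch was applied.
 */
static bool run_loop(int iterations, int patch_at, bool *migrated)
{
    struct task_sim t = { false, false, 0, 0 };

    for (int i = 0; i < iterations; i++) {
        safe_point(&t);              /* between complete pairs */
        pair_begin(&t);
        if (i == patch_at)
            t.patch_pending = true;  /* patch lands mid-pair */
        pair_end(&t);                /* still the version that began the pair */
    }

    *migrated = t.uses_patched;
    return t.old_depth == 0 && t.new_depth == 0;
}
```

In this model the pair stays balanced no matter when the patch arrives,
but the task is only switched once it reaches the next safe point. A
patch that arrives during the last pass leaves the task unswitched.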

Or, if you have any other suggestions, we would appreciate them.

thanks,
--
js
suse labs