Re: [RFC][PATCH 0/7] nested sleeps, fixes and debug infra

From: Peter Zijlstra
Date: Wed Aug 06 2014 - 04:32:03 EST


On Wed, Aug 06, 2014 at 11:51:29AM +0400, Ilya Dryomov wrote:

> OK, this one is a bit different.
>
> WARNING: CPU: 1 PID: 1744 at kernel/sched/core.c:7104 __might_sleep+0x58/0x90()
> do not call blocking ops when !TASK_RUNNING; state=1 set at [<ffffffff81070e10>] prepare_to_wait+0x50/0xa0

> [<ffffffff8105bc38>] __might_sleep+0x58/0x90
> [<ffffffff8148c671>] lock_sock_nested+0x31/0xb0
> [<ffffffff81498aaa>] sk_stream_wait_memory+0x18a/0x2d0

Urgh, tedious. It's not an actual bug as is. Due to the condition check
in sk_wait_event() we can call lock_sock() with ->state != TASK_RUNNING.

I'm not entirely sure what the cleanest way is to make this go away.
Possibly something like so:

---
include/net/sock.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/include/net/sock.h b/include/net/sock.h
index 156350745700..37902176c5ab 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -886,6 +886,7 @@ static inline void sock_rps_reset_rxhash(struct sock *sk)
 		if (!__rc) {						\
 			*(__timeo) = schedule_timeout(*(__timeo));	\
 		}							\
+		__set_current_state(TASK_RUNNING);			\
 		lock_sock(__sk);					\
 		__rc = __condition;					\
 		__rc;							\
