Re: [syzbot] BUG: sleeping function called from invalid context in _copy_to_iter

From: Shoaib Rao
Date: Mon Aug 09 2021 - 16:33:58 EST



On 8/9/21 1:09 PM, Eric Dumazet wrote:
On Mon, Aug 9, 2021 at 10:02 PM Eric Dumazet <edumazet@xxxxxxxxxx> wrote:
On Mon, Aug 9, 2021 at 9:40 PM Shoaib Rao <rao.shoaib@xxxxxxxxxx> wrote:

On 8/9/21 12:21 PM, Dmitry Vyukov wrote:
On Mon, 9 Aug 2021 at 21:16, Shoaib Rao <rao.shoaib@xxxxxxxxxx> wrote:
On 8/9/21 11:06 AM, Dmitry Vyukov wrote:
On Mon, 9 Aug 2021 at 19:33, Shoaib Rao <rao.shoaib@xxxxxxxxxx> wrote:
This seems like a false positive. 1) The function will not sleep because
it only calls copy routine if the byte is present. 2). There is no
difference between this new call and the older calls in
unix_stream_read_generic().
Hi Shoaib,

Thanks for looking into this.
Do you have any ideas on how to fix this tool's false positive? Tools
with false positives are order of magnitude less useful than tools w/o
false positives. E.g. do we turn it off on syzbot? But I don't
remember any other false positives from "sleeping function called from
invalid context" checker...
Before we take any action I would like to understand why the tool does
not single out other calls to recv_actor in unix_stream_read_generic().
The context in all cases is the same. I also do not understand why the
code would sleep, Let's assume the user provided address is bad, the
code will return EFAULT, it will never sleep,
I always assumed that it's because if user pages are swapped out, it
may need to read them back from disk.
Page faults occur all the time, the page may not even be in the cache or
the mapping is not there (mmap), so I would not consider this a bug. The
code should complain about all other calls as they are also copying to
user pages. I must not be following some semantics for the code to be
triggered but I can not figure that out. What is the recommended
interface to do user copy from kernel?
Are you aware of the difference between a mutex and a spinlock ?

When copying data from/to user, you can not hold a spinlock.


I am guessing that even your test would trigger the warning,
if you make sure to include CONFIG_DEBUG_ATOMIC_SLEEP=y in your kernel build.

Eric,

Thanks for the pointer, have you ever over looked at something when coding?

Shoaib


Shoaib

if the kernel provided
address is bad the system will panic. The only difference I see is that
the new code holds 2 locks while the previous code held one lock, but
the locks are acquired before the call to copy.

So please help me understand how the tool works. Even though I have
evaluated the code carefully, there is always a possibility that the
tool is correct.

Shoaib


On 8/8/21 4:38 PM, syzbot wrote:
Hello,

syzbot found the following issue on:

HEAD commit: c2eecaa193ff pktgen: Remove redundant clone_skb override
git tree: net-next
console output: https://urldefense.com/v3/__https://syzkaller.appspot.com/x/log.txt?x=12e3a69e300000__;!!ACWV5N9M2RV99hQ!fbn9ny5Bw51Jl6yrU93iULDBXa_DPjyVIgQuZWyQbCo5IRkAzvYs6JKlPHEdQcWD$
kernel config: https://urldefense.com/v3/__https://syzkaller.appspot.com/x/.config?x=aba0c23f8230e048__;!!ACWV5N9M2RV99hQ!fbn9ny5Bw51Jl6yrU93iULDBXa_DPjyVIgQuZWyQbCo5IRkAzvYs6JKlPLGp1-Za$
dashboard link: https://urldefense.com/v3/__https://syzkaller.appspot.com/bug?extid=8760ca6c1ee783ac4abd__;!!ACWV5N9M2RV99hQ!fbn9ny5Bw51Jl6yrU93iULDBXa_DPjyVIgQuZWyQbCo5IRkAzvYs6JKlPCORTNOH$
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.1
syz repro: https://urldefense.com/v3/__https://syzkaller.appspot.com/x/repro.syz?x=15c5b104300000__;!!ACWV5N9M2RV99hQ!fbn9ny5Bw51Jl6yrU93iULDBXa_DPjyVIgQuZWyQbCo5IRkAzvYs6JKlPAjhi2yc$
C reproducer: https://urldefense.com/v3/__https://syzkaller.appspot.com/x/repro.c?x=10062aaa300000__;!!ACWV5N9M2RV99hQ!fbn9ny5Bw51Jl6yrU93iULDBXa_DPjyVIgQuZWyQbCo5IRkAzvYs6JKlPNzAjzQJ$

The issue was bisected to:

commit 314001f0bf927015e459c9d387d62a231fe93af3
Author: Rao Shoaib <rao.shoaib@xxxxxxxxxx>
Date: Sun Aug 1 07:57:07 2021 +0000

af_unix: Add OOB support

bisection log: https://urldefense.com/v3/__https://syzkaller.appspot.com/x/bisect.txt?x=10765f8e300000__;!!ACWV5N9M2RV99hQ!fbn9ny5Bw51Jl6yrU93iULDBXa_DPjyVIgQuZWyQbCo5IRkAzvYs6JKlPK2iWt2r$
final oops: https://urldefense.com/v3/__https://syzkaller.appspot.com/x/report.txt?x=12765f8e300000__;!!ACWV5N9M2RV99hQ!fbn9ny5Bw51Jl6yrU93iULDBXa_DPjyVIgQuZWyQbCo5IRkAzvYs6JKlPKAb0dft$
console output: https://urldefense.com/v3/__https://syzkaller.appspot.com/x/log.txt?x=14765f8e300000__;!!ACWV5N9M2RV99hQ!fbn9ny5Bw51Jl6yrU93iULDBXa_DPjyVIgQuZWyQbCo5IRkAzvYs6JKlPNlW_w-u$

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+8760ca6c1ee783ac4abd@xxxxxxxxxxxxxxxxxxxxxxxxx
Fixes: 314001f0bf92 ("af_unix: Add OOB support")

BUG: sleeping function called from invalid context at lib/iov_iter.c:619
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 8443, name: syz-executor700
2 locks held by syz-executor700/8443:
#0: ffff888028fa0d00 (&u->iolock){+.+.}-{3:3}, at: unix_stream_read_generic+0x16c6/0x2190 net/unix/af_unix.c:2501
#1: ffff888028fa0df0 (&u->lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:354 [inline]
#1: ffff888028fa0df0 (&u->lock){+.+.}-{2:2}, at: unix_stream_read_generic+0x16d0/0x2190 net/unix/af_unix.c:2502
Preemption disabled at:
[<0000000000000000>] 0x0
CPU: 1 PID: 8443 Comm: syz-executor700 Not tainted 5.14.0-rc3-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:105
___might_sleep.cold+0x1f1/0x237 kernel/sched/core.c:9154
__might_fault+0x6e/0x180 mm/memory.c:5258
_copy_to_iter+0x199/0x1600 lib/iov_iter.c:619
copy_to_iter include/linux/uio.h:139 [inline]
simple_copy_to_iter+0x4c/0x70 net/core/datagram.c:519
__skb_datagram_iter+0x10f/0x770 net/core/datagram.c:425
skb_copy_datagram_iter+0x40/0x50 net/core/datagram.c:533
skb_copy_datagram_msg include/linux/skbuff.h:3620 [inline]
unix_stream_read_actor+0x78/0xc0 net/unix/af_unix.c:2701
unix_stream_recv_urg net/unix/af_unix.c:2433 [inline]
unix_stream_read_generic+0x17cd/0x2190 net/unix/af_unix.c:2504
unix_stream_recvmsg+0xb1/0xf0 net/unix/af_unix.c:2717
sock_recvmsg_nosec net/socket.c:944 [inline]
sock_recvmsg net/socket.c:962 [inline]
sock_recvmsg net/socket.c:958 [inline]
____sys_recvmsg+0x2c4/0x600 net/socket.c:2622
___sys_recvmsg+0x127/0x200 net/socket.c:2664
do_recvmmsg+0x24d/0x6d0 net/socket.c:2758
__sys_recvmmsg net/socket.c:2837 [inline]
__do_sys_recvmmsg net/socket.c:2860 [inline]
__se_sys_recvmmsg net/socket.c:2853 [inline]
__x64_sys_recvmmsg+0x20b/0x260 net/socket.c:2853
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x43ef39
Code: 28 c3 e8 2a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffca8776d68 EFLAGS: 00000246 ORIG_RAX: 000000000000012b
RAX: ffffffffffffffda RBX: 0000000000400488 RCX: 000000000043ef39
RDX: 0000000000000700 RSI: 0000000020001140 RDI: 0000000000000004
RBP: 0000000000402f20 R08: 0000000000000000 R09: 0000000000400488
R10: 0000000000000007 R11: 0000000000000246 R12: 0000000000402fb0
R13: 0000000000000000 R14: 00000000004ac018 R15: 0000000000400488

=============================
[ BUG: Invalid wait context ]
5.14.0-rc3-syzkaller #0 Tainted: G W
-----------------------------
syz-executor700/8443 is trying to lock:
ffff8880212b6a28 (&mm->mmap_lock#2){++++}-{3:3}, at: __might_fault+0xa3/0x180 mm/memory.c:5260
other info that might help us debug this:
context-{4:4}
2 locks held by syz-executor700/8443:
#0: ffff888028fa0d00 (&u->iolock){+.+.}-{3:3}, at: unix_stream_read_generic+0x16c6/0x2190 net/unix/af_unix.c:2501
#1: ffff888028fa0df0 (&u->lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:354 [inline]
#1: ffff888028fa0df0 (&u->lock){+.+.}-{2:2}, at: unix_stream_read_generic+0x16d0/0x2190 net/unix/af_unix.c:2502
stack backtrace:
CPU: 1 PID: 8443 Comm: syz-executor700 Tainted: G W 5.14.0-rc3-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:105
print_lock_invalid_wait_context kernel/locking/lockdep.c:4666 [inline]
check_wait_context kernel/locking/lockdep.c:4727 [inline]
__lock_acquire.cold+0x213/0x3ab kernel/locking/lockdep.c:4965
lock_acquire kernel/locking/lockdep.c:5625 [inline]
lock_acquire+0x1ab/0x510 kernel/locking/lockdep.c:5590
__might_fault mm/memory.c:5261 [inline]
__might_fault+0x106/0x180 mm/memory.c:5246
_copy_to_iter+0x199/0x1600 lib/iov_iter.c:619
copy_to_iter include/linux/uio.h:139 [inline]
simple_copy_to_iter+0x4c/0x70 net/core/datagram.c:519
__skb_datagram_iter+0x10f/0x770 net/core/datagram.c:425
skb_copy_datagram_iter+0x40/0x50 net/core/datagram.c:533
skb_copy_datagram_msg include/linux/skbuff.h:3620 [inline]
unix_stream_read_actor+0x78/0xc0 net/unix/af_unix.c:2701
unix_stream_recv_urg net/unix/af_unix.c:2433 [inline]
unix_stream_read_generic+0x17cd/0x2190 net/unix/af_unix.c:2504
unix_stream_recvmsg+0xb1/0xf0 net/unix/af_unix.c:2717
sock_recvmsg_nosec net/socket.c:944 [inline]
sock_recvmsg net/socket.c:962 [inline]
sock_recvmsg net/socket.c:958 [inline]
____sys_recvmsg+0x2c4/0x600 net/socket.c:2622
___sys_recvmsg+0x127/0x200 net/socket.c:2664
do_recvmmsg+0x24d/0x6d0 net/socket.c:2758
__sys_recvmmsg net/socket.c:2837 [inline]
__do_sys_recvmmsg net/socket.c:2860 [inline]
__se_sys_recvmmsg net/socket.c:2853 [inline]
__x64_sys_recvmmsg+0x20b/0x260 net/socket.c:2853
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x43ef39
Code: 28 c3 e8 2a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffca8776d68 EFLAGS: 00000246 ORIG_RAX: 000000000000012b
RAX: ffffffffffffffda RBX: 0000000000400488 RCX: 000000000043ef39
RDX: 0000000000000700 RSI: 0000000020001140 RDI: 0000000000000004
RBP: 0000000000402f20 R08: 0000000000000000 R09: 0000000000400488
R10: 0000000000000007 R11: 0000000000000246 R12: 0000


---
This report is generated by a bot. It may contain errors.
See https://urldefense.com/v3/__https://goo.gl/tpsmEJ__;!!ACWV5N9M2RV99hQ!fbn9ny5Bw51Jl6yrU93iULDBXa_DPjyVIgQuZWyQbCo5IRkAzvYs6JKlPG1UhbpZ$ for more information about syzbot.
syzbot engineers can be reached at syzkaller@xxxxxxxxxxxxxxxx.

syzbot will keep track of this issue. See:
https://urldefense.com/v3/__https://goo.gl/tpsmEJ*status__;Iw!!ACWV5N9M2RV99hQ!fbn9ny5Bw51Jl6yrU93iULDBXa_DPjyVIgQuZWyQbCo5IRkAzvYs6JKlPKlEx5v1$ for how to communicate with syzbot.
For information about bisection process see: https://urldefense.com/v3/__https://goo.gl/tpsmEJ*bisection__;Iw!!ACWV5N9M2RV99hQ!fbn9ny5Bw51Jl6yrU93iULDBXa_DPjyVIgQuZWyQbCo5IRkAzvYs6JKlPJk7KaIr$
syzbot can test patches for this issue, for details see:
https://urldefense.com/v3/__https://goo.gl/tpsmEJ*testing-patches__;Iw!!ACWV5N9M2RV99hQ!fbn9ny5Bw51Jl6yrU93iULDBXa_DPjyVIgQuZWyQbCo5IRkAzvYs6JKlPMhq2hD3$
--
You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group.
To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bugs+unsubscribe@xxxxxxxxxxxxxxxx.
To view this discussion on the web visit https://urldefense.com/v3/__https://groups.google.com/d/msgid/syzkaller-bugs/0c106e6c-672f-474e-5815-97b65596139d*40oracle.com__;JQ!!ACWV5N9M2RV99hQ!fbn9ny5Bw51Jl6yrU93iULDBXa_DPjyVIgQuZWyQbCo5IRkAzvYs6JKlPHjmYAGZ$ .