Re: [PATCH RFC net-next v3 0/8] virtio/vsock: support datagrams

From: Bobby Eshleman
Date: Tue Jun 06 2023 - 14:07:50 EST


On Mon, Jun 05, 2023 at 11:42:06PM +0300, Arseniy Krasnov wrote:
> Hello Bobby!
>
> Thanks for this patchset, really interesting!
>
> I applied it on head:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=d20dd0ea14072e8a90ff864b2c1603bd68920b4b
>
> And tried to run ./vsock_test (client in the guest, server in the host), I had the following crash:
>
> Control socket connected to 192.168.1.1:12345.
> 0 - SOCK_STREAM connection reset...
> [ 8.050215] BUG: kernel NULL pointer derefer
> [ 8.050960] #PF: supervisor read access in kernel mode
> [ 8.050960] #PF: error_code(0x0000) - not-present page
> [ 8.050960] PGD 0 P4D 0
> [ 8.050960] Oops: 0000 [#1] PREEMPT SMP PTI
> [ 8.050960] CPU: 0 PID: 109 Comm: vsock_test Not tainted 6.4.0-rc3-gd707c220a700
> [ 8.050960] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14
> [ 8.050960] RIP: 0010:static_key_count+0x0/0x20
> [ 8.050960] Code: 04 4c 8b 46 08 49 29 c0 4c 01 c8 4c 89 47 08 89 0e 89 56 04 4f
> [ 8.050960] RSP: 0018:ffffa9a1c021bdc0 EFLAGS: 00010202
> [ 8.050960] RAX: ffffffffac309880 RBX: ffffffffc02fc140 RCX: 0000000000000000
> [ 8.050960] RDX: ffff9a5eff944600 RSI: 0000000000000000 RDI: 0000000000000000
> [ 8.050960] RBP: ffff9a5ec2371900 R08: ffffa9a1c021bd30 R09: ffff9a5eff98e0c0
> [ 8.050960] R10: 0000000000001000 R11: 0000000000000000 R12: ffffa9a1c021be80
> [ 8.050960] R13: 0000000000000000 R14: 0000000000000002 R15: ffff9a5ec1cfca80
> [ 8.050960] FS: 00007fa9bf88c5c0(0000) GS:ffff9a5efe400000(0000) knlGS:00000000
> [ 8.050960] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 8.050960] CR2: 0000000000000000 CR3: 00000000023e0000 CR4: 00000000000006f0
> [ 8.050960] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 8.050960] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 8.050960] Call Trace:
> [ 8.050960] <TASK>
> [ 8.050960] once_deferred+0xd/0x30
> [ 8.050960] vsock_assign_transport+0xa2/0x1b0 [vsock]
> [ 8.050960] vsock_connect+0xb4/0x3a0 [vsock]
> [ 8.050960] ? var_wake_function+0x60/0x60
> [ 8.050960] __sys_connect+0x9e/0xd0
> [ 8.050960] ? _raw_spin_unlock_irq+0xe/0x30
> [ 8.050960] ? do_setitimer+0x128/0x1f0
> [ 8.050960] ? alarm_setitimer+0x4c/0x90
> [ 8.050960] ? fpregs_assert_state_consistent+0x1d/0x50
> [ 8.050960] ? exit_to_user_mode_prepare+0x36/0x130
> [ 8.050960] __x64_sys_connect+0x11/0x20
> [ 8.050960] do_syscall_64+0x3b/0xc0
> [ 8.050960] entry_SYSCALL_64_after_hwframe+0x4b/0xb5
> [ 8.050960] RIP: 0033:0x7fa9bf7c4d13
> [ 8.050960] Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 48
> [ 8.050960] RSP: 002b:00007ffdf2d96cc8 EFLAGS: 00000246 ORIG_RAX: 0000000000000a
> [ 8.050960] RAX: ffffffffffffffda RBX: 0000560c305d0020 RCX: 00007fa9bf7c4d13
> [ 8.050960] RDX: 0000000000000010 RSI: 00007ffdf2d96ce0 RDI: 0000000000000004
> [ 8.050960] RBP: 0000000000000004 R08: 0000560c317dc018 R09: 0000000000000000
> [ 8.050960] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> [ 8.050960] R13: 0000560c305ccc2d R14: 00007ffdf2d96ce0 R15: 00007ffdf2d96d70
> [ 8.050960] </TASK>
>
>
> I guess crash is somewhere near:
>
> old_info->transport->release(vsk); in vsock_assign_transport(). May be my config is wrong...
>
> Thanks, Arseniy

Thanks Arseniy!

I now see I broke the tests, but didn't break the stream/dgram socket
utility I was using in development.

I'll track this down and include a fix in the next rev.

I should have warned that this v3 is pretty under-tested. Being unsure
whether some of the design choices would be accepted at all, I didn't
want to spend too much time on testing until the design was accepted at
a high level.

Best,
Bobby