Re: [PATCH RFC net-next] vsock: Reschedule connect_work for O_NONBLOCK connect() requests

From: Stefano Garzarella
Date: Thu Aug 04 2022 - 02:59:41 EST


On Wed, Aug 03, 2022 at 07:09:25PM -0700, Peilin Ye wrote:
From: Peilin Ye <peilin.ye@xxxxxxxxxxxxx>

An O_NONBLOCK vsock_connect() request may try to reschedule
@connect_work. Consider the following vsock_connect() requests:

1. The 1st, non-blocking request schedules @connect_work, which will
expire after, say, 200 jiffies. Socket state is now SS_CONNECTING;

2. Later, the 2nd, blocking request gets interrupted by a signal after
5 jiffies while waiting for the connection to be established.
Socket state is back to SS_UNCONNECTED, and @connect_work will
expire after 100 jiffies;

3. Now, the 3rd, non-blocking request tries to schedule @connect_work
again, but @connect_work has already been scheduled, and will
expire in, say, 50 jiffies.

In this scenario, currently this 3rd request simply decreases the sock
reference count and returns. Instead, let it reschedules @connect_work
and resets the timeout back to @connect_timeout.

Signed-off-by: Peilin Ye <peilin.ye@xxxxxxxxxxxxx>
---
Hi all,

This patch is RFC because it bases on Stefano's WIP fix [1] for a bug [2]
reported by syzbot, and it won't apply on current net-next. I think it
solves a separate issue.

Nice, this is better!

Feel free to include my patch in this (inclunding also the Fixes tag and maybe senidng to syzbot and including its tag as well).

The last thing I was trying to figure out before sending the patch was whether to set sock->state = SS_UNCONNECTED in vsock_connect_timeout().
I think we should do that, otherwise a subsequent to connect() with O_NONBLOCK set would keep returning -EALREADY, even though the timeout has expired.

What do you think?

I don't think it changes anything for the bug raised by sysbot, so it could be a separate patch.

Thanks,
Stefano


Please advise, thanks!
Peilin Ye

[1] https://gitlab.com/sgarzarella/linux/-/commit/2d0f0b9cbbb30d58fdcbca7c1a857fd8f3110d61
[2] https://syzkaller.appspot.com/bug?id=cd9103dc63346d26acbbdbf5c6ba9bd74e48c860

net/vmw_vsock/af_vsock.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 194d22291d8b..417e4ad17c03 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1395,7 +1395,7 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr,
/* If the timeout function is already scheduled, ungrab
* the socket refcount to not leave it unbalanced.
*/
- if (!schedule_delayed_work(&vsk->connect_work, timeout))
+ if (mod_delayed_work(system_wq, &vsk->connect_work, timeout))
sock_put(sk);

/* Skip ahead to preserve error code set above. */
--
2.20.1