[PATCH] netlink: introduce netlink poll to resolve fast return issue

From: Jong eon Park
Date: Fri Nov 03 2023 - 03:23:03 EST


In very rare cases, a user's poll call waiting for a uevent on a netlink
socket returns immediately on every invocation, causing excessive CPU
usage. The scenario is as follows.

Once sk_rcvbuf becomes full, netlink_broadcast_deliver returns an error and
netlink_overrun is called. However, if netlink_overrun runs in one context
just before another context returns from poll and invokes recv, emptying
the receive queue, then sk->sk_err = ENOBUFS is written to the netlink
socket belatedly and the socket enters the NETLINK_S_CONGESTED state.
If the user does not check for POLLERR, sk_err is never consumed and
cleared, so every subsequent poll call returns immediately.
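
For illustration only, a user-space consumer that avoids the busy loop
even without this patch might look like the sketch below. It is not part
of the change; wait_and_read is a hypothetical helper name, and the sketch
assumes that calling recv on POLLERR is how the consumer chooses to pick
up the pending ENOBUFS.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not part of the patch): wait for one event on a
 * datagram socket and read it.
 *
 * Returns the number of bytes read, 0 on timeout or when the socket only
 * reported an overrun (ENOBUFS), and -1 on any other failure.
 */
static ssize_t wait_and_read(int fd, void *buf, size_t len, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	ssize_t n;
	int ready;

	ready = poll(&pfd, 1, timeout_ms);
	if (ready <= 0)
		return ready;	/* 0: timeout, -1: poll() failed */

	/* Neither data nor an error, e.g. POLLHUP or POLLNVAL. */
	if (!(pfd.revents & (POLLIN | POLLERR)))
		return -1;

	/*
	 * Call recv() for POLLERR as well as POLLIN. Skipping it on POLLERR
	 * leaves the pending error in place, so the next poll() would
	 * return immediately again.
	 */
	n = recv(fd, buf, len, MSG_DONTWAIT);
	if (n < 0 && (errno == ENOBUFS || errno == EAGAIN))
		return 0;	/* overrun reported, or nothing queued */

	return n;
}
```

A caller that gets 0 back after an overrun has lost messages and would
normally resynchronize its view of the devices before polling again.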

To address this issue, introduce a netlink-specific poll handler,
netlink_poll.

After calling datagram_poll, netlink_poll checks the NETLINK_S_CONGESTED
state and the receive queue. If the socket is congested and the queue is
empty, it reports the socket as readable once more, even though the user
has already emptied the receive queue. This lets the user consume
sk->sk_err through netlink_recvmsg, which avoids the situation described
above.

Co-developed-by: Dong ha Kang <dongha7.kang@xxxxxxxxxxx>
Signed-off-by: Jong eon Park <jongeon.park@xxxxxxxxxxx>
---
 net/netlink/af_netlink.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index eb086b06d60d..f08c10220041 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2002,6 +2002,20 @@ static int netlink_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
 	return err ? : copied;
 }
 
+static __poll_t netlink_poll(struct file *file, struct socket *sock,
+			     poll_table *wait)
+{
+	__poll_t mask = datagram_poll(file, sock, wait);
+	struct sock *sk = sock->sk;
+	struct netlink_sock *nlk = nlk_sk(sk);
+
+	if (test_bit(NETLINK_S_CONGESTED, &nlk->state) &&
+	    skb_queue_empty_lockless(&sk->sk_receive_queue))
+		mask |= EPOLLIN | EPOLLRDNORM;
+
+	return mask;
+}
+
 static void netlink_data_ready(struct sock *sk)
 {
 	BUG();
@@ -2803,7 +2817,7 @@ static const struct proto_ops netlink_ops = {
 	.socketpair = sock_no_socketpair,
 	.accept = sock_no_accept,
 	.getname = netlink_getname,
-	.poll = datagram_poll,
+	.poll = netlink_poll,
 	.ioctl = netlink_ioctl,
 	.listen = sock_no_listen,
 	.shutdown = sock_no_shutdown,
--
2.25.1