Re: Weird issue with epoll and kernel >= 5.0

From: Randy Dunlap
Date: Sun Mar 29 2020 - 11:55:19 EST


On 3/29/20 5:09 AM, David Laight wrote:
> From: Randy Dunlap
>> Sent: 28 March 2020 19:22
> ...
>> There have been around 10 changes in fs/eventpoll.c since v5.0 was
>> released in March, 2019, so it would be helpful if you could test
>> the latest mainline kernel to see if the problem is still present.
>
> Is there any info about the scenarios that the fixes affect?
> We've an application that can use epoll() or poll() and I wonder
> if I should not default to epoll() on 5.0+ kernels that might be dodgy.

Linux 5.0 was released on 2019-03-03. The following patches to
fs/eventpoll.c have been merged since then.

$ git log --oneline fs/eventpoll.c | more   ### latest patches first
1b53734bd0b2 epoll: fix possible lost wakeup on epoll_ctl() path
39220e8d4a2a eventpoll: support non-blocking do_epoll_ctl() calls
58e41a44c488 eventpoll: abstract out epoll_ctl() handler
339ddb53d373 fs/epoll: remove unnecessary wakeups of nested epoll
f6520c520842 epoll: simplify ep_poll_safewake() for CONFIG_DEBUG_LOCK_ALLOC
c8377adfa781 PM / wakeup: Show wakeup sources stats in sysfs
eec4844fae7c proc/sysctl: add shared variables for range check
b772434be089 signal: simplify set_user_sigmask/restore_user_sigmask
97abc889ee29 signal: remove the wrong signal_pending() check in restore_user_sigmask()
2874c5fd2842 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152
a218cc491420 epoll: use rwlock in order to reduce ep_poll_callback() contention
c3e320b61581 epoll: unify awaking of wakeup source on ep_poll_callback() path
c141175d011f epoll: make sure all elements in ready list are in FIFO order
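
For the "should I default to epoll() or poll()" question, here is a rough
sketch of a version-based fallback. Nothing in this thread establishes which
releases are actually affected or fixed, so the thresholds below
(first_suspect, first_fixed) are placeholders, and the function names are
made up for illustration.

```python
import os
import re
import select


def kernel_version(release=None):
    """Return (major, minor) parsed from a release string such as '5.4.0-42-generic'."""
    if release is None:
        release = os.uname().release
    m = re.match(r"(\d+)\.(\d+)", release)
    if not m:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(m.group(1)), int(m.group(2))


def choose_backend(release=None, first_suspect=(5, 0), first_fixed=None,
                   have_epoll=hasattr(select, "epoll")):
    """Return 'epoll' or 'poll'.

    first_suspect and first_fixed are ASSUMPTIONS supplied by the caller:
    the first release distrusted and (if known) the first release trusted again.
    """
    if not have_epoll:
        return "poll"
    ver = kernel_version(release)
    if ver < first_suspect:
        return "epoll"          # older than the suspect range
    if first_fixed is not None and ver >= first_fixed:
        return "epoll"          # caller asserts the problem is fixed from here on
    return "poll"               # inside the suspect range: take the conservative path
```

Leaving first_fixed as None keeps poll() on every release from first_suspect
onward, which is the conservative choice until someone has tested mainline.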


> It rather depends whether wakeups just get lost - but the next
> rx data will wake things up, or whether the linked lists get
> completely hosed and 'all hell' breaks out (or doesn't).
>
> In our case there is only one reader and the fd are all
> UDP sockets (added and removed when the socket is created/closed).
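
A small reproducer for that usage pattern (one reader, UDP sockets added on
creation and removed on close) would help when testing mainline. A minimal
sketch, assuming Python's selectors module as a stand-in for the real
application; the UdpReader class and its method names are hypothetical. It
uses epoll where available and falls back to poll() or select() otherwise.

```python
import selectors
import socket


class UdpReader:
    """Single reader over a changing set of UDP sockets."""

    def __init__(self, use_epoll=True):
        if use_epoll and hasattr(selectors, "EpollSelector"):
            self.sel = selectors.EpollSelector()
        elif hasattr(selectors, "PollSelector"):
            self.sel = selectors.PollSelector()
        else:
            self.sel = selectors.SelectSelector()

    def open_socket(self, host="127.0.0.1", port=0):
        """Create a non-blocking UDP socket and start watching it."""
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.setblocking(False)
        sock.bind((host, port))
        self.sel.register(sock, selectors.EVENT_READ)  # EPOLL_CTL_ADD under epoll
        return sock

    def close_socket(self, sock):
        """Stop watching the socket, then close it."""
        self.sel.unregister(sock)  # EPOLL_CTL_DEL under epoll; do this before close()
        sock.close()

    def read_ready(self, timeout=1.0):
        """Wait once; return [(sock, data, addr), ...], one datagram per ready socket."""
        out = []
        for key, _events in self.sel.select(timeout):
            sock = key.fileobj
            try:
                data, addr = sock.recvfrom(65535)
            except BlockingIOError:
                continue  # reported ready but nothing to read
            out.append((sock, data, addr))
        return out
```

Constructing it with use_epoll=False runs the same traffic through poll(),
which makes it easy to compare the two paths on the same kernel.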


--
~Randy