[patch] fs, epoll: short circuit fetching events if thread has been killed

From: David Rientjes
Date: Wed May 03 2017 - 20:23:06 EST


We've encountered zombies that are waiting for a thread to exit that are
looping in ep_poll() almost endlessly although there is a pending SIGKILL
as a result of a group exit.

This happens because we always find ep_events_available() and fetch more
events and never are able to check for signal_pending() that would break
from the loop and return -EINTR.

Special case fatal signals and break immediately to guarantee that we
loop to fetch more events and delay making a timely exit.

It would also be possible to simply move the check for signal_pending()
higher than checking for ep_events_available(), but there have been no
reports of delayed signal handling other than SIGKILL preventing zombies
from exiting that would be fixed by this.

Signed-off-by: David Rientjes <rientjes@xxxxxxxxxx>
---
fs/eventpoll.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -1748,6 +1748,16 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
* to TASK_INTERRUPTIBLE before doing the checks.
*/
set_current_state(TASK_INTERRUPTIBLE);
+ /*
+ * Always short-circuit for fatal signals to allow
+ * threads to make a timely exit without the chance of
+ * finding more events available and fetching
+ * repeatedly.
+ */
+ if (fatal_signal_pending(current)) {
+ res = -EINTR;
+ break;
+ }
if (ep_events_available(ep) || timed_out)
break;
if (signal_pending(current)) {