Re: [PATCH] epoll: remove ep_call_nested() from ep_eventpoll_poll()

From: Davidlohr Bueso
Date: Tue Oct 31 2017 - 10:59:25 EST


On Tue, 31 Oct 2017, Jason Baron wrote:

> ep_call_nested() is used in ep_eventpoll_poll(), which is the .poll
> routine for an epoll fd, to prevent excessively deep epoll nesting and
> to prevent circular paths. However, we already prevent these conditions
> during EPOLL_CTL_ADD. In terms of too-deep epoll chains, we do in fact
> allow deep nesting of the epoll fds themselves (deeper than
> EP_MAX_NESTS); however, we don't allow more than EP_MAX_NESTS when an
> epoll file descriptor is actually connected to a wakeup source. Thus, we
> do not require the use of ep_call_nested(), since ep_eventpoll_poll(),
> which is called via ep_scan_ready_list(), only continues nesting if
> there are events available. Since ep_call_nested() is implemented using
> a global lock, applications that make use of nested epoll can see large
> performance improvements with this change.
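
For readers less familiar with nested epoll, here is a minimal userspace
sketch of the two things the description relies on: a 2-level chain
(outer epoll -> inner epoll -> eventfd) and the circular-path check done at
EPOLL_CTL_ADD time. This is not taken from the patch or from the benchmark;
the function names nested_epoll_ready() and epoll_loop_errno() are made up
for illustration, and the eventfd simply stands in for any wakeup source.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: close an fd only if it was successfully opened. */
static void close_if_open(int fd)
{
	if (fd >= 0)
		close(fd);
}

/*
 * Build outer epoll -> inner epoll -> eventfd, make the eventfd readable,
 * and return what a non-blocking epoll_wait() on the outer fd reports.
 * Returns -1 if any setup step fails.
 */
int nested_epoll_ready(void)
{
	int ret = -1;
	int efd = eventfd(0, EFD_NONBLOCK);
	int inner = epoll_create1(0);
	int outer = epoll_create1(0);
	struct epoll_event ev = { .events = EPOLLIN };
	struct epoll_event out;
	uint64_t one = 1;

	if (efd < 0 || inner < 0 || outer < 0)
		goto done;

	/* Level 1: the inner epoll watches the wakeup source. */
	ev.data.fd = efd;
	if (epoll_ctl(inner, EPOLL_CTL_ADD, efd, &ev) < 0)
		goto done;

	/* Level 2: the outer epoll watches the inner epoll fd. */
	ev.data.fd = inner;
	if (epoll_ctl(outer, EPOLL_CTL_ADD, inner, &ev) < 0)
		goto done;

	/* Make the eventfd readable so the event propagates upwards. */
	if (write(efd, &one, sizeof(one)) != (ssize_t)sizeof(one))
		goto done;

	/* Polling the outer fd ends up polling the inner epoll's .poll. */
	ret = epoll_wait(outer, &out, 1, 0);
done:
	close_if_open(efd);
	close_if_open(inner);
	close_if_open(outer);
	return ret;
}

/*
 * Try to create a cycle (a watches b, b watches a) and return the errno
 * of the second EPOLL_CTL_ADD, 0 if it unexpectedly succeeded, or -1 on
 * setup failure.
 */
int epoll_loop_errno(void)
{
	int ret = -1;
	int a = epoll_create1(0);
	int b = epoll_create1(0);
	struct epoll_event ev = { .events = EPOLLIN };

	if (a < 0 || b < 0)
		goto done;

	ev.data.fd = b;
	if (epoll_ctl(a, EPOLL_CTL_ADD, b, &ev) < 0)
		goto done;

	/* Closing the loop is rejected at add time, not at poll time. */
	ev.data.fd = a;
	ret = epoll_ctl(b, EPOLL_CTL_ADD, a, &ev) < 0 ? errno : 0;
done:
	close_if_open(a);
	close_if_open(b);
	return ret;
}
```

Called from a small main() on Linux, nested_epoll_ready() should report one
ready event on the outer fd, and epoll_loop_errno() should come back with
ELOOP, which is the error epoll_ctl(2) documents for circular epoll chains.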

The improvements are quite obscene actually, e.g. for the following
epoll_wait() benchmark with 2-level nesting on an 80-core IvyBridge:

ncpus    vanilla      dirty        delta
    1    2447092    3028315      +23.75%
    4     231265    2986954    +1191.57%
    8     121631    2898796    +2283.27%
   16      59749    2902056    +4757.07%
   32      26837    2326314    +8568.30%
   64      12926    1341281   +10276.61%

(http://linux-scalability.org/epoll/epoll-test.c)

Thanks,
Davidlohr