Re: sys_epoll_wait high CPU load in 2.6.37

From: Shawn Bohrer
Date: Wed Jan 26 2011 - 12:20:23 EST


On Wed, Jan 26, 2011 at 05:13:29PM +0100, Eric Dumazet wrote:
> On Wednesday, January 26, 2011 at 16:59 +0100, Eric Dumazet wrote:
> > On Wednesday, January 26, 2011 at 07:52 -0800, Davide Libenzi wrote:
> >
> > > For "above", I meant the current epoll expire time calculation, which was
> > > described above in the message ;)
> >
> > Well, problem was not an overflow, but doing a loop 2.000.000 times ;)
> >
> > > The hint for a timespec_add_ms() was because we must be doing something
> > > similar in poll, don't we (/me got no code in front ATM)?
> >
> > Apparently it's done differently in poll(), using the
> > poll_select_set_timeout() helper.
> >
> >
> > Give me some minutes I'll try to cook an alternate patch
> >
>
> Here is the alternate patch, using poll_select_set_timeout() helper
>
> Thanks
>
> [PATCH v2] epoll: epoll_wait() should not use timespec_add_ns()
>
> commit 95aac7b1cd224f (epoll: make epoll_wait() use the hrtimer range
> feature) introduced a performance regression because it calls
> timespec_add_ns() with potentially very large 'ns' values.
>
> Use poll_select_set_timeout() helper like poll()/select()
>
> Reported-by: Simon Kirby <sim@xxxxxxxxxx>
> Signed-off-by: Eric Dumazet <eric.dumazet@xxxxxxxxx>
> CC: Shawn Bohrer <shawn.bohrer@xxxxxxxxx>
> CC: Davide Libenzi <davidel@xxxxxxxxxxxxxxx>
> CC: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> CC: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> ---
> fs/eventpoll.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/eventpoll.c b/fs/eventpoll.c
> index cc8a9b7..94d887b 100644
> --- a/fs/eventpoll.c
> +++ b/fs/eventpoll.c
> @@ -1125,8 +1125,8 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
>  	ktime_t expires, *to = NULL;
>  
>  	if (timeout > 0) {
> -		ktime_get_ts(&end_time);
> -		timespec_add_ns(&end_time, (u64)timeout * NSEC_PER_MSEC);
> +		poll_select_set_timeout(&end_time, timeout / MSEC_PER_SEC,
> +			NSEC_PER_MSEC * (timeout % MSEC_PER_SEC));
>  		slack = select_estimate_accuracy(&end_time);
>  		to = &expires;
>  		*to = timespec_to_ktime(end_time);

poll_select_set_timeout() jumps through some extra hoops that
aren't necessary in the epoll case, so I actually like your previous
patch better.
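
To illustrate the cost discussed in this thread, here is a minimal
user-space sketch, not kernel code. It assumes that the expensive path
normalizes nanoseconds by repeated subtraction (one pass per whole
second), and compares that with splitting the millisecond timeout using
/ and % first, as the arithmetic in the quoted patch does. The names
struct ts, add_ns_iterative() and add_ms_split() are hypothetical
stand-ins chosen for this example.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC  1000000000ULL
#define NSEC_PER_MSEC 1000000ULL
#define MSEC_PER_SEC  1000

/* Hypothetical stand-in for struct timespec. */
struct ts {
    int64_t sec;
    int64_t nsec;
};

/*
 * Iterative normalization (assumed model of the slow path): subtract one
 * second at a time until less than a second remains.
 * Returns the number of loop iterations so the cost can be observed.
 */
static uint64_t add_ns_iterative(struct ts *a, uint64_t ns)
{
    uint64_t iters = 0;

    ns += (uint64_t)a->nsec;
    while (ns >= NSEC_PER_SEC) {
        ns -= NSEC_PER_SEC;
        a->sec++;
        iters++;
    }
    a->nsec = (int64_t)ns;
    return iters;
}

/*
 * Split the millisecond timeout into whole seconds and leftover
 * nanoseconds first. After that, at most one carry is needed.
 */
static void add_ms_split(struct ts *a, int timeout_ms)
{
    assert(timeout_ms > 0);

    a->sec  += timeout_ms / MSEC_PER_SEC;
    a->nsec += (int64_t)(NSEC_PER_MSEC * (uint64_t)(timeout_ms % MSEC_PER_SEC));
    if (a->nsec >= (int64_t)NSEC_PER_SEC) {
        a->nsec -= (int64_t)NSEC_PER_SEC;
        a->sec++;
    }
}
```

With a timeout close to INT_MAX milliseconds, the iterative version
loops roughly two million times, while the split version does constant
work and yields the same end time.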

--
Shawn