Re: sys_epoll_wait high CPU load in 2.6.37

From: Eric Dumazet
Date: Wed Jan 26 2011 - 12:52:02 EST


On Wednesday, 26 January 2011 at 11:20 -0600, Shawn Bohrer wrote:
> On Wed, Jan 26, 2011 at 05:13:29PM +0100, Eric Dumazet wrote:
> > On Wednesday, 26 January 2011 at 16:59 +0100, Eric Dumazet wrote:
> > > On Wednesday, 26 January 2011 at 07:52 -0800, Davide Libenzi wrote:
> > >
> > > > For "above", I meant the current epoll expire time calculation, which was
> > > > described above in the message ;)
> > >
> > > Well, the problem was not an overflow, but a loop running 2,000,000 times ;)
> > >
> > > > The hint for a timespec_add_ms() was because we must be doing something
> > > > similar in poll, don't we (/me got no code in front ATM)?
> > >
> > > Apparently it's done differently in poll(), using the
> > > poll_select_set_timeout() helper.
> > >
> > >
> > > Give me a few minutes, I'll try to cook up an alternate patch
> > >
> >
> > Here is the alternate patch, using the poll_select_set_timeout() helper
> >
> > Thanks
> >
> > [PATCH v2] epoll: epoll_wait() should not use timespec_add_ns()
> >
> > commit 95aac7b1cd224f (epoll: make epoll_wait() use the hrtimer range
> > feature) added a performance regression because it uses
> > timespec_add_ns() with potentially very large 'ns' values.
> >
> > Use the poll_select_set_timeout() helper, like poll()/select() do.
> >
> > Reported-by: Simon Kirby <sim@xxxxxxxxxx>
> > Signed-off-by: Eric Dumazet <eric.dumazet@xxxxxxxxx>
> > CC: Shawn Bohrer <shawn.bohrer@xxxxxxxxx>
> > CC: Davide Libenzi <davidel@xxxxxxxxxxxxxxx>
> > CC: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > CC: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> > ---
> > fs/eventpoll.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/eventpoll.c b/fs/eventpoll.c
> > index cc8a9b7..94d887b 100644
> > --- a/fs/eventpoll.c
> > +++ b/fs/eventpoll.c
> > @@ -1125,8 +1125,8 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
> >  	ktime_t expires, *to = NULL;
> >
> >  	if (timeout > 0) {
> > -		ktime_get_ts(&end_time);
> > -		timespec_add_ns(&end_time, (u64)timeout * NSEC_PER_MSEC);
> > +		poll_select_set_timeout(&end_time, timeout / MSEC_PER_SEC,
> > +				NSEC_PER_MSEC * (timeout % MSEC_PER_SEC));
> >  		slack = select_estimate_accuracy(&end_time);
> >  		to = &expires;
> >  		*to = timespec_to_ktime(end_time);
>
> poll_select_set_timeout() jumps through some extra hoops that
> aren't necessary in the epoll case so I actually like your previous
> patch better.

Well, I don't mind either way; I'll let Davide decide, he is the boss ;)

This is a stable candidate, so adding timespec_add_ms() sounds like overkill.
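
For illustration, the cost difference discussed in this thread can be modelled
in userspace. The sketch below is not the kernel code: struct ts_model,
add_ns_iterative(), add_ms_split() and compare_models() are hypothetical names,
and the single-carry step in add_ms_split() is an assumption about how the
split values get added. It only shows why normalising a huge nanosecond count
by repeated subtraction takes about one iteration per second of timeout, while
splitting the millisecond timeout with / and % (as the v2 patch's arguments do)
needs no such loop.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC  1000000000ULL
#define NSEC_PER_MSEC 1000000ULL
#define MSEC_PER_SEC  1000

/* Hypothetical userspace stand-in for a normalised timespec. */
struct ts_model {
    uint64_t sec;
    uint64_t nsec;   /* always < NSEC_PER_SEC */
};

/*
 * Model of adding a large nanosecond count and normalising by repeated
 * subtraction. Returns the number of loop iterations, which grows by
 * one for every whole second contained in the sum.
 */
static uint64_t add_ns_iterative(struct ts_model *t, uint64_t ns)
{
    uint64_t loops = 0;

    ns += t->nsec;
    while (ns >= NSEC_PER_SEC) {
        ns -= NSEC_PER_SEC;
        t->sec++;
        loops++;
    }
    t->nsec = ns;
    return loops;
}

/*
 * Model of the split approach: convert the millisecond timeout into
 * whole seconds and leftover nanoseconds first. The leftover is below
 * one second, so at most one carry is needed (assumed here).
 * Precondition: timeout_ms > 0.
 */
static void add_ms_split(struct ts_model *t, int timeout_ms)
{
    t->sec  += (uint64_t)(timeout_ms / MSEC_PER_SEC);
    t->nsec += NSEC_PER_MSEC * (uint64_t)(timeout_ms % MSEC_PER_SEC);
    if (t->nsec >= NSEC_PER_SEC) {
        t->nsec -= NSEC_PER_SEC;
        t->sec++;
    }
}

/* Checks that both models agree and reports the iterative loop count. */
static uint64_t compare_models(struct ts_model start, int timeout_ms)
{
    struct ts_model a = start, b = start;
    uint64_t loops = add_ns_iterative(&a, (uint64_t)timeout_ms * NSEC_PER_MSEC);

    add_ms_split(&b, timeout_ms);
    assert(a.sec == b.sec && a.nsec == b.nsec);
    return loops;
}
```

In this model, a timeout of 2,000,000,000 ms makes the subtraction loop run
2,000,000 times, which matches the loop count mentioned earlier in the thread,
while the split version reaches the same end time with two integer divisions.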


