Re: Problems with timerfd()

From: Andrew Morton
Date: Mon Jul 23 2007 - 02:39:16 EST


On Mon, 23 Jul 2007 08:32:29 +0200 Michael Kerrisk <mtk-manpages@xxxxxxx> wrote:

> Andrew,
>
> The timerfd() syscall went into 2.6.22. While writing the man page for
> this syscall I've found some notable limitations of the interface, and I am
> wondering whether you and Linus would consider having this interface fixed
> for 2.6.23.
>
> On the one hand, these fixes would be an ABI change, which is of course
> bad. (However, as noted below, you have already accepted one of the ABI
> changes that I suggested into -mm, after Davide submitted a patch.)
>
> On the other hand, the interface has not yet made its way into a glibc
> release, and the change will not break applications. (The 2.6.22 version
> of the interface would just be "broken".)

I think if the need is sufficient we can do this: fix it in 2.6.23 and in
2.6.22.x. That means that there will be a few broken-on-new-glibc kernels
out in the wild, but very few I suspect.

> Details of my suggested changes are below. A complication in all of this
> is that on Friday, while I was part way though discussing this with Davide,
> he went on vacation for a month and is likely to have only limited email
> access during that time. (See my further thoughts about what to do while
> Davide is away at the end of this mail message.) Our last communication,
> after Davide had expressed reluctance about making some of the interface
> changes, was a more extensive note from me describing the problems of the
> interface.
>
> The problems of the 2.6.22 timerfd() interface are as follows:
>
> Problem 1
> ---------
>
> The value returned by read(2)ing from a timerfd file descriptor is the
> number of timer overruns. In 2.6.22, this value is 4 bytes, limiting the
> overrun count to 2^32. Consider an application where the timer frequency
> was 100 kHz (feasible in the not-too-distant future, I would guess), then
> the overrun counter would cycle after ~40000 seconds (~11 hours).
> Furthermore returning 4 bytes from the read() is inconsistent with eventfd
> file descriptors, which return 8 byte integers from a read().
>
> Davide has already submitted a patch to you to make read() from a timerfd
> file descriptor return an 8 byte integer, and I understand it to have been
> accepted into -mm.

argh. Nobody told me it was an ABI change! We'll need to consider merging
make-timerfd-return-a-u64-and-fix-the-__put_user.patch into 2.6.22.x as
well.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/