Re: [RESEND PATCH] timerfd: Allow TFD_TIMER_CANCEL_ON_SET with relative timeouts

From: John Stultz
Date: Mon Oct 19 2015 - 14:53:31 EST


On Fri, Oct 9, 2015 at 1:25 AM, Jesper Nilsson <jesper.nilsson@xxxxxxxx> wrote:
> Allow TFD_TIMER_CANCEL_ON_SET on timerfd_settime() with relative
> as well as absolute timeout.
>
> Signed-off-by: Jesper Nilsson <jesper.nilsson@xxxxxxxx>
> ---
> Resending after some discussion with Thomas Gleixner at ELCE,
> and Cc:ing John Stultz and Michael Kerrisk who may have comments.
>
> Longer background:
>
> One of the uses for TFD_TIMER_CANCEL_ON_SET is to get
> an event when the CLOCK_REALTIME changes (as by NTP or user action).
> In this case, the timeout is irrelevant, and the maximum
> available value would be selected to avoid mis-triggers.
>
> However, timerfd uses time_t for configuration, and the maximum
> value on a 32bit time_t system is actually a valid time
> (near 2038-01-19 03:14) in the 64bit ktime_t used internally in timerfd.
>
> One way of provoking this problem would be to set the time
> using "date '2038-01-19 03:14'" and letting the time roll over
> a few seconds later.
>
> After this time, a timerfd will continuously fire
> when configured with a maximum absolute timeout,
> potentially stealing all CPU and stopping the application
> from doing what it really should be doing.
> Which would be fine, unless the application is systemd
> and loops at startup, leaving the system in a state where
> the kernel is up, but nothing running in userspace. :-(
>
> This problem was further exposed in kernel v3.19 by
> commit a6d6e1c879efa4b77e250c34fe5fe1c34e6ef070
> which introduced 64bit time in the RTC subsystem.
> On an unconfigured RTC or an RTC with a flat/removed battery
> the date could be random, and in some cases past 2038.

My first impulse is: Yes, for now, 32bit systems are broken past y2038. :)


> Of course, the proposed patch only allows the setting of relative
> timeouts with TFD_TIMER_CANCEL_ON_SET, any application using
> it would also need to be patched to use the relative timer
> for this to solve the described problem.

So is this not more of a workaround? It basically just allows
applications on systems that can't handle y2038 to use different
interfaces, so the system can maybe crawl onward to the next
problem?

Miroslav had a patch recently to try to keep 32bit systems from
getting within a week of y2038, to stave off some of these ugly
problems. While I'm not totally against such patches (Miroslav's
concern of a DoS vector is reasonable), I also want to avoid giving
folks a false sense of confidence that the problems are resolved.

Arnd and others are working on getting real 64bit time interfaces
piped out to 32bit userspace (once the in-kernel usage is fixed up).
So having some clear motivation to get folks to migrate to the new
interfaces will be needed.

But yeah. At the same time I get that you want to avoid user-pain like
in the case of the badly initialized RTC, but in that case would
returning 0 for RTC reads greater than y2038 on 32 bit systems be a
more sane fix?
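
The check being suggested can be sketched as a small standalone function. This is not existing kernel code: the function name is hypothetical, and where such a check would live in the RTC code, and whether 0 is the right fallback, are left open.

```c
#include <stdint.h>

/*
 * Hypothetical helper: treat an RTC reading that does not fit in a
 * signed 32bit time_t as bogus and fall back to 0 (the epoch).
 * Readings at or below INT32_MAX are passed through unchanged.
 */
int64_t clamp_rtc_secs_for_time32(int64_t rtc_secs)
{
	if (rtc_secs > INT32_MAX)
		return 0;
	return rtc_secs;
}
```

With this, a reading of exactly INT32_MAX is still accepted, while one second later is reset to 0.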

thanks
-john