Re: [PATCH v2 2/2] iopoll: Do not use timekeeping in read_poll_timeout_atomic()

From: Arnd Bergmann
Date: Wed May 10 2023 - 09:36:02 EST


On Wed, May 10, 2023, at 15:23, Geert Uytterhoeven wrote:
> read_poll_timeout_atomic() uses ktime_get() to implement the timeout
> feature, just like its non-atomic counterpart. However, there are
> several issues with this, due to its use in atomic contexts:
>
> 1. When called in the s2ram path (as typically done by clock or PM
> domain drivers), timekeeping may be suspended, triggering the
> WARN_ON(timekeeping_suspended) in ktime_get():
>
> WARNING: CPU: 0 PID: 654 at kernel/time/timekeeping.c:843 ktime_get+0x28/0x78
>
> Calling ktime_get_mono_fast_ns() instead of ktime_get() would get
> rid of that warning. However, that would break timeout handling,
> as (at least on systems with an ARM architectured timer), the time
> returned by ktime_get_mono_fast_ns() does not advance while
> timekeeping is suspended.
> Interestingly, (on the same ARM systems) the time returned by
> ktime_get() does advance while timekeeping is suspended, despite
> the warning.
>
> 2. Depending on the actual clock source, and especially before a
> high-resolution clocksource (e.g. the ARM architectured timer)
> becomes available, time may not advance in atomic contexts, thus
> breaking timeout handling.
>
> Fix this by abandoning the idea that one can rely on timekeeping to
> implement timeout handling in all atomic contexts, and switch from a
> global time-based to a locally-estimated timeout handling. In most
> (all?) cases the timeout condition is exceptional and an error
> condition, hence any additional delays due to underestimating wall clock
> time are irrelevant.
>
> Signed-off-by: Geert Uytterhoeven <geert+renesas@xxxxxxxxx>

This looks reasonable to me,

Acked-by: Arnd Bergmann <arnd@xxxxxxxx>

I assume you sent this because you ran into the bug on a
particular driver. It might help to be more specific about
how this can be reproduced.

> ---
> Alternatively, one could use a mixed approach (use both
> ktime_get_mono_fast_ns() and a local (under)estimate, and timeout on the
> earliest occasion), but I think that would complicate things without
> much gain.

Agreed.

Arnd