Re: [PATCH v3 3/3] arm64: Early boot time stamps

From: Pavel Tatashin
Date: Fri Jan 04 2019 - 11:23:30 EST


Hi Marc,

Thank you for taking a look at this please see my replies below.

> > I think, given that on other platforms sched_clock() is already used
> > early, it is not a good idea to invent a different clock just for time
> > stamps.
>
> Square pegs vs round holes. Mimicking other architectures isn't always
> the right thing to do when faced with a different problem. We put a
> lot of effort in working around timer errata for a good reason, and
> feeding the rest of the system bogus timing information doesn't sound
> great.
>
> > We could limit arm64 approach only for chips where cntvct_el0 is
> > working: i.e. frequency is known, and the clock is stable, meaning
> > cannot go backward. Perhaps we would start early clock a little later,
> > but at least it will be available for the sane chips. The only
> > question, where during boot time this is known.
>
> How do you propose we do that? Defective timers can be a property of
> the implementation, of the integration, or both. In any case, it
> requires firmware support (DT, ACPI). All that is only available quite
> late, and moving it earlier is not easily doable.

OK, but could we at least whitelist something early with expectation
that the future chips won't be bogus?

> > Another approach is to modify sched_clock() in
> > kernel/time/sched_clock.c to never return backward value during boot.
> >
> > 1. Rename current implementation of sched_clock() to sched_clock_raw()
> > 2. New sched_clock() would look like this:
> >
> > u64 sched_clock(void)
> > {
> > if (static_branch(early_unstable_clock))
> > return sched_clock_unstable();
> > else
> > return sched_clock_raw();
> > }
> >
> > 3. sched_clock_unstable() would look like this:
> >
> > u64 sched_clock_unstable(void)
> > {
> > again:
> > static u64 old_clock;
> > u64 new_clock = sched_clock_raw();
> > static u64 old_clock_read = READ_ONCE(old_clock);
> > /* It is ok if time does not progress, but don't allow to go backward */
> > if (new_clock < old_clock_read)
> > return old_clock_read;
> > /* update the old_clock value */
> > if (cmpxchg64(&old_clock, old_clock_read, new_clock) != old_clock_read)
> > goto again;
> > return new_clock;
> > }
>
> You now have an "unstable" clock that is only allowed to move forward,
> until you switch to the real one. And at handover time, anything can
> happen.
>
> It is one thing to allow for the time stamping to be imprecise. But
> imposing the same behaviour on other parts of the kernel that have so
> far relied on a strictly monotonic sched_clock feels like a bad idea.

sched_clock() will still be strictly monotonic. During switch over we
will guarantee to continue from where the early clock left.

>
> What I'm proposing is that we allow architectures to override the hard
> tie between local_clock/sched_clock and kernel log time stamping, with
> the default being of course what we have today. This gives a clean
> separation between the two when the architecture needs to delay the
> availability of sched_clock until implementation requirements are
> discovered. It also keep sched_clock simple and efficient.
>
> To illustrate what I'm trying to argue for, I've pushed out a couple
> of proof of concept patches here[1]. I've briefly tested them in a
> guest, and things seem to work OK.

What I am worried is that decoupling time stamps from the
sched_clock() will cause uptime and other commands that show boot time
not to correlate with timestamps in dmesg with these changes. For them
to correlate we would still have to have a switch back to
local_clock() in timestamp_clock() after we are done with early boot,
which brings us back to using a temporarily unstable clock that I
proposed above but without adding an architectural hook for it. Again,
we would need to solve the problem of time continuity during switch
over, which is not a hard problem to solve, as we do it already in
sched_clock.c, and everytime clocksource changes.

During early boot time stamps project for x86 we were extra careful to
make sure that they stay the same.

Thank you,
Pasha