Re: [RFC PATCH 4/7] clocksource: arm_arch_timer: Export counter type, clocksource

From: Marc Zyngier
Date: Fri Jul 28 2023 - 04:12:01 EST


On Thu, 27 Jul 2023 11:22:11 +0100,
Peter Hilber <peter.hilber@xxxxxxxxxxxxxxx> wrote:
>
> On 03.07.23 10:13, Marc Zyngier wrote:
> > On Fri, 30 Jun 2023 18:10:47 +0100,
> > Peter Hilber <peter.hilber@xxxxxxxxxxxxxxx> wrote:
> >>
> >> Export helper functions to allow other code to
> >>
> >> - determine the counter type in use (virtual or physical, CP15 or memory),
> >>
> >> - get a pointer to the arm_arch_timer clocksource, which can be compared
> >> with the current clocksource.
> >>
> >> The virtio_rtc driver will require the clocksource pointer when using
> >> get_device_system_crosststamp(), and should communicate the actual Arm
> >> counter type to the Virtio RTC device (cf. spec draft [1]).
> >
> > I really don't see why you should poke at the clocksource backend:
> >
> > - the MMIO clocksource is only used in PM situations, which a virtio
> > driver has no business being involved with
> >
> > - only the virtual counter is relevant -- it is always at a 0-offset
> > from the physical one when userspace has an opportunity to run
> >
> > So it really looks like, out of the four combinations, only one is
> > relevant.
>
> Thanks for the explanation. Dropping arch_timer_counter_get_type() and
> assuming that the CP15 virtual counter is in use should also work for
> now. With the physical/virtual counter distinction, I tried to be
> future-proof, as per the following considerations:
>
> The intended consumer of arch_timer_counter_get_type() is the Virtio RTC
> device (draft spec [2], patch series [1]). If the Virtio device has
> optional cross-timestamp support, it must know the current Linux kernel
> view of the current clocksource counter. The Virtio driver tells the
> Virtio device the current counter type (in the Arm case, CNTVCT_EL0 or
> CNTPCT_EL0). It is intentionally left unspecified how the Virtio device
> would know the counter value. AFAIU, if the Linux kernel were a
> virtualization host itself, it would be better for the Virtio device to
> look at the physical counter, since the virtual counter could be set for
> a guest. And in other cases, the guest OSes use a virtual counter with
> an offset.

The physical counter can equally be offset (and KVM does offset it),
just like the virtual counter. With NV, the offsets themselves are
partially under control of the guest itself.

So both counters *as seen from the guest* are absolutely pointless
to an observer on the host. That observer sees a virtual counter that
is strictly equal to the physical counter.

> This was the rationale to come up with the physical/virtual counter
> distinction for the Virtio RTC device. Looking at extensions such as
> FEAT_ECV, where the CNTPCT_EL0 value can depend on the EL, or FEAT_NV*,
> it might be a bit simplistic.

Not just simplistic. It doesn't make sense. For this to work, you'd
need to know the global offset that KVM applies to the global counter,
plus the *virtualised* CNTPOFF/CNTVOFF values that the guest can
change at any time on a *per-CPU* basis. None of that is available
outside of KVM, nor would it make any sense anyway.
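
To illustrate the point above, here is a toy model in C. It is not
kernel code: every name in it is hypothetical, and stacking the offsets
by plain subtraction is a simplifying assumption. It shows two vCPUs
deriving different guest-visible values from the same global counter,
so an observer holding only the global value cannot tell what either
vCPU sees.

```c
#include <stdint.h>

#define MODEL_NR_VCPUS 2

/* Hypothetical model of the state that only KVM knows about. */
struct guest_counter_model {
	uint64_t kvm_global_offset;           /* offset applied for the whole VM */
	uint64_t vcpu_offset[MODEL_NR_VCPUS]; /* guest-controlled, per-vCPU offset */
};

/*
 * Counter value as one vCPU would see it, for a given global count.
 * Assumption: both offsets are simply subtracted from the global count.
 */
static uint64_t model_guest_view(const struct guest_counter_model *m,
				 unsigned int vcpu, uint64_t global_count)
{
	return global_count - m->kvm_global_offset - m->vcpu_offset[vcpu];
}
```

In this model the result changes whenever the guest rewrites its
per-vCPU offset, which is why a single "physical or virtual" flag
passed to the device does not identify a usable counter value.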

> Does this physical/virtual counter distinction sound like a good idea?
> Otherwise I would drop the arch_timer_counter_get_type() in the next
> iteration.

My take on this is that only the global counter value makes any sense.
That value is already available from the host as the virtual counter,
because we guarantee that CNTVOFF is 0 when outside of the guest
(well, technically, outside of the vcpu_load/vcpu_put section).
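
As a rough sketch of why that holds, the architecture defines the
virtual count as the physical count minus CNTVOFF_EL2. The C model
below is not kernel code, and all of its names are hypothetical; it
only mirrors that subtraction.

```c
#include <stdint.h>

/* Hypothetical model of what a host-side reader sees. */
struct host_counter_model {
	uint64_t phys_count; /* system (physical) counter value */
	uint64_t cntvoff;    /* models CNTVOFF_EL2 */
};

/* Virtual count = physical count - virtual offset. */
static uint64_t model_read_cntvct(const struct host_counter_model *m)
{
	return m->phys_count - m->cntvoff;
}
```

With cntvoff at 0, as is the case outside of the vcpu_load/vcpu_put
section, model_read_cntvct() returns phys_count unchanged, so the
virtual counter is the global counter value.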

>
> >
> > I'm not Cc'd on the rest of the series, so I can't even see in which
> > context this is used. But as it is, the approach looks wrong.
> >
>
> Sorry, I will Cc you on all relevant patches in the next iteration,
> which I will send out soon.
>
> The first patch series can be found at [1]. I think the second helper
> function in this patch, arch_timer_get_cs(), would still be needed, in
> order to supply the clocksource to get_device_system_crosststamp().

We already have to deal with the kvm_arch_ptp_get_crosststamp()
monstrosity (which I will forever regret having merged). Surely you
can reuse some of that?

Thanks,

M.

--
Without deviation from the norm, progress is not possible.