Re: [RFC PATCH 3/3] Enable ptp_kvm for arm64

From: Marc Zyngier
Date: Sat Sep 07 2019 - 05:15:55 EST


On Fri, 06 Sep 2019 12:58:15 +0100,
"Jianyong Wu (Arm Technology China)" <Jianyong.Wu@xxxxxxx> wrote:
>
> Hi Marc,
>
> Very sorry to have missed these comments.
>
> > -----Original Message-----
> > From: Marc Zyngier <maz@xxxxxxxxxx>
> > Sent: Thursday, August 29, 2019 6:33 PM
> > To: Jianyong Wu (Arm Technology China) <Jianyong.Wu@xxxxxxx>;
> > netdev@xxxxxxxxxxxxxxx; pbonzini@xxxxxxxxxx;
> > sean.j.christopherson@xxxxxxxxx; richardcochran@xxxxxxxxx; Mark Rutland
> > <Mark.Rutland@xxxxxxx>; Will Deacon <Will.Deacon@xxxxxxx>; Suzuki
> > Poulose <Suzuki.Poulose@xxxxxxx>
> > Cc: linux-kernel@xxxxxxxxxxxxxxx; Steve Capper <Steve.Capper@xxxxxxx>;
> > Kaly Xin (Arm Technology China) <Kaly.Xin@xxxxxxx>; Justin He (Arm
> > Technology China) <Justin.He@xxxxxxx>
> > Subject: Re: [RFC PATCH 3/3] Enable ptp_kvm for arm64
> >
> > On 29/08/2019 07:39, Jianyong Wu wrote:
> > > Currently in arm64 virtualization environment, there is no mechanism
> > > to keep time sync between guest and host. Time in guest will drift
> > > compared with host after boot up as they may both use third party time
> > > sources to correct their time respectively. The time deviation will be
> > > in order of milliseconds but some scenarios ask for higher time
> > > precision, like in cloud environment, where we want all the VMs running
> > > in the host to acquire the same level of accuracy from the host clock.
> > >
> > > Use of kvm ptp clock, which chooses the host clock source as a
> > > reference clock to sync time between guest and host, has been
> > > adopted by x86, which brings the time sync error from the order of
> > > milliseconds down to nanoseconds.
> > >
> > > This patch enables kvm ptp on arm64 and we get a clock drift similar
> > > to that found on x86 with kvm ptp.
> > >
> > > Test result comparison between with kvm ptp and without it in arm64
> > > is as follows. These results are derived from the output of the command
> > > 'chronyc sources'. We should take more care of the "Last sample" column,
> > > which shows the offset between the local clock and the source at the
> > > last measurement.
> > >
> > > no kvm ptp in guest:
> > > MS Name/IP address         Stratum Poll Reach LastRx Last sample
> > > ========================================================================
> > > ^* dns1.synet.edu.cn             2    6   377     13 +1040us[+1581us] +/- 21ms
> > > ^* dns1.synet.edu.cn             2    6   377     21 +1040us[+1581us] +/- 21ms
> > > ^* dns1.synet.edu.cn             2    6   377     29 +1040us[+1581us] +/- 21ms
> > > ^* dns1.synet.edu.cn             2    6   377     37 +1040us[+1581us] +/- 21ms
> > > ^* dns1.synet.edu.cn             2    6   377     45 +1040us[+1581us] +/- 21ms
> > > ^* dns1.synet.edu.cn             2    6   377     53 +1040us[+1581us] +/- 21ms
> > > ^* dns1.synet.edu.cn             2    6   377     61 +1040us[+1581us] +/- 21ms
> > > ^* dns1.synet.edu.cn             2    6   377      4  -130us[ +796us] +/- 21ms
> > > ^* dns1.synet.edu.cn             2    6   377     12  -130us[ +796us] +/- 21ms
> > > ^* dns1.synet.edu.cn             2    6   377     20  -130us[ +796us] +/- 21ms
> > >
> > > in host:
> > > MS Name/IP address         Stratum Poll Reach LastRx Last sample
> > > ========================================================================
> > > ^* 120.25.115.20                 2    7   377     72  -470us[ -603us] +/- 18ms
> > > ^* 120.25.115.20                 2    7   377     92  -470us[ -603us] +/- 18ms
> > > ^* 120.25.115.20                 2    7   377    112  -470us[ -603us] +/- 18ms
> > > ^* 120.25.115.20                 2    7   377      2  +872ns[-6808ns] +/- 17ms
> > > ^* 120.25.115.20                 2    7   377     22  +872ns[-6808ns] +/- 17ms
> > > ^* 120.25.115.20                 2    7   377     43  +872ns[-6808ns] +/- 17ms
> > > ^* 120.25.115.20                 2    7   377     63  +872ns[-6808ns] +/- 17ms
> > > ^* 120.25.115.20                 2    7   377     83  +872ns[-6808ns] +/- 17ms
> > > ^* 120.25.115.20                 2    7   377    103  +872ns[-6808ns] +/- 17ms
> > > ^* 120.25.115.20                 2    7   377    123  +872ns[-6808ns] +/- 17ms
> > >
> > > dns1.synet.edu.cn is the network reference clock for the guest and
> > > 120.25.115.20 is the network reference clock for the host. We can't get
> > > the clock error between guest and host directly, but a rough
> > > estimate is in the order of hundreds of us to ms.
> > >
> > > with kvm ptp in guest:
> > > chrony has been disabled in the host to remove disturbance from the network clock.
> >
> > Is that a realistic use case? Why should the host not use NTP?
> >
>
> Not really. NTP will change the host clock, which will contaminate the data of sync between
> host and guest. But in reality, we will keep NTP online.
>
> > >
> > > MS Name/IP address         Stratum Poll Reach LastRx Last sample
> > > ========================================================================
> > > * PHC0                           0    3   377      8    -7ns[  +1ns] +/- 3ns
> > > * PHC0                           0    3   377      8    +1ns[ +16ns] +/- 3ns
> > > * PHC0                           0    3   377      6    -4ns[  -0ns] +/- 6ns
> > > * PHC0                           0    3   377      6    -8ns[ -12ns] +/- 5ns
> > > * PHC0                           0    3   377      5    +2ns[  +4ns] +/- 4ns
> > > * PHC0                           0    3   377     13    +2ns[  +4ns] +/- 4ns
> > > * PHC0                           0    3   377     12    -4ns[  -6ns] +/- 4ns
> > > * PHC0                           0    3   377     11    -8ns[ -11ns] +/- 6ns
> > > * PHC0                           0    3   377     10   -14ns[ -20ns] +/- 4ns
> > > * PHC0                           0    3   377      8    +4ns[  +5ns] +/- 4ns
> > >
> > > PHC0 is the ptp clock, which chooses the host clock as its source
> > > clock. So we can say with confidence that the clock error between host
> > > and guest is in the order of ns.
> > >
> > > Signed-off-by: Jianyong Wu <jianyong.wu@xxxxxxx>
> > > ---
> > >  arch/arm64/include/asm/arch_timer.h  |  3 ++
> > >  arch/arm64/kvm/arch_ptp_kvm.c        | 76 ++++++++++++++++++++++++++++
> > >  drivers/clocksource/arm_arch_timer.c |  6 ++-
> > >  drivers/ptp/Kconfig                  |  2 +-
> > >  include/linux/arm-smccc.h            | 14 +++++
> > >  virt/kvm/arm/psci.c                  | 17 +++++++
> > >  6 files changed, 115 insertions(+), 3 deletions(-)
> > >  create mode 100644 arch/arm64/kvm/arch_ptp_kvm.c
> >
> > Please split this patch into two parts: the hypervisor code in a patch and the
> > guest code in another patch. Having both of them together is confusing.
> >
> OK, that is indeed better.
>
> > >
> > > diff --git a/arch/arm64/include/asm/arch_timer.h
> > > b/arch/arm64/include/asm/arch_timer.h
> > > index 6756178c27db..880576a814b6 100644
> > > --- a/arch/arm64/include/asm/arch_timer.h
> > > +++ b/arch/arm64/include/asm/arch_timer.h
> > > @@ -229,4 +229,7 @@ static inline int arch_timer_arch_init(void)
> > > return 0;
> > > }
> > >
> > > +extern struct clocksource clocksource_counter;
> > > +extern u64 arch_counter_read(struct clocksource *cs);
> >
> > I'm definitely not keen on exposing the internals of the arch_timer driver to
> > random subsystems. Furthermore, you seem to expect that the guest kernel
> > will only use the arch timer as a clocksource, and nothing really guarantees
> > that (in which case get_device_system_crosststamp will fail).
> >
> The code here is really ugly. I need a better solution to offer a clock source
> for the guest.
>
> > It looks to me that we'd be better off exposing a core timekeeping API that
> > populates a struct system_counterval_t based on the *current* timekeeper
> > monotonic clocksource. This would simplify the split between generic and
> > arch-specific code.
> >
> I think it is really necessary.
>
> > Whether or not tglx will be happy with the idea is another problem, but I'm
> > certainly not taking any change to the arch timer code based on this.
> >
> I can give it a try, but the details are not clear to me yet.

Something along those lines: