Re: [PATCH RFC] x86/tsc: Make recalibration default on for TSC_KNOWN_FREQ cases

From: Paul E. McKenney
Date: Fri Jun 02 2023 - 14:29:43 EST


On Mon, May 22, 2023 at 10:14:08AM +0200, Thomas Gleixner wrote:
> On Mon, May 22 2023 at 11:30, Feng Tang wrote:
> > Commit a7ec817d5542 ("x86/tsc: Add option to force frequency
> > recalibration with HW timer") was added to handle cases that the
> > firmware has bug and provides a wrong TSC frequency number, and it
> > is optional given that this kind of firmware issue rarely happens
> > (Paul reported once [1]).
> >
> > But Rui reported that some Sapphire Rapids platform met this issue
> > again recently, and as firmware is also a kind of 'software' which
> > can't be bug free, make the recalibration default on. When the
> > values from firmware and HW timer's calibration have big gap,
> > raise a warning and let vendor to check which side is broken.
>
> Sure firmware can have bugs, but if firmware validation does not even
> catch such a trivially to detect bug, then their validation is nothing
> else than rubber stamping. Seriously.
>
> Are any of these affected platforms shipping already or is this just
> Intel internal muck?
>
> > One downside is, many VMs also has X86_FEATURE_TSC_KNOWN_FREQ set,
> > and they will also do this recalibration.
>
> It's also pointless for those SoCs which lack legacy hardware.
>
> So why do you force this on everyone?

Just for the record, this patch could be helpful in allowing victims
of TSC mis-synchronization to more easily provide a more complete bug
report to the firmware people. There is of course no point if there is
already a fix available.

But it is not all that hard to work around not having this patch upstream.
The patch can be hand-applied as needed, NTP drift rates can be pressed
into service for those of us having atomic clocks near all our servers,
or the firmware guys can be tasked with figuring it out.

So this patch would be nice to have, but we could live without it.

Thanx, Paul