[PATCH v2] x86/tsc: Have tsc=recalibrate override things

From: Peter Zijlstra
Date: Wed Nov 01 2023 - 07:16:40 EST


Subject: x86/tsc: Have tsc=recalibrate override things
From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Date: Mon, 30 Oct 2023 17:00:50 +0100

My brand-spanking new SPR supermicro workstation was reporting NTP
failures:

Oct 30 13:00:26 spr ntpd[3517]: CLOCK: kernel reports TIME_ERROR: 0x41: Clock Unsynchronized
Oct 30 13:00:58 spr ntpd[3517]: CLOCK: time stepped by 32.316775
Oct 30 13:00:58 spr ntpd[3517]: CLOCK: frequency error 41699 PPM exceeds tolerance 500 PPM

CPUID provides:

Time Stamp Counter/Core Crystal Clock Information (0x15):
TSC/clock ratio = 200/2
nominal core crystal clock = 25000000 Hz
Processor Frequency Information (0x16):
Core Base Frequency (MHz) = 0x9c4 (2500)
Core Maximum Frequency (MHz) = 0x12c0 (4800)
Bus (Reference) Frequency (MHz) = 0x64 (100)

and the kernel believes this. Since commit a7ec817d5542 ("x86/tsc: Add
option to force frequency recalibration with HW timer") there is the
tsc=recalibrate option, which forces a recalibration against a HW timer.
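
For reference, the 2500 MHz the kernel ends up with follows directly from
the leaf 0x15 values quoted above: crystal frequency times the TSC/crystal
ratio. A minimal user-space sketch of that arithmetic; the function name is
made up for illustration and this is not the kernel's own code:

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): derive the TSC frequency
 * in kHz from CPUID leaf 0x15 values as crystal_hz * numer / denom.
 * Returns 0 when any input is zero, i.e. the leaf gives no usable value.
 */
uint32_t tsc_khz_from_cpuid15(uint32_t crystal_hz, uint32_t numer,
			      uint32_t denom)
{
	if (!crystal_hz || !numer || !denom)
		return 0;

	/* Widen to 64 bits first: 25000000 * 200 overflows 32 bits. */
	return (uint32_t)(((uint64_t)crystal_hz * numer / denom) / 1000);
}
```

With the values above (25000000 Hz, ratio 200/2) this gives 2500000 kHz,
matching the "Previous calibrated TSC freq" the kernel reports.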

With tsc=recalibrate the kernel duly reports:

Oct 30 12:42:39 spr kernel: tsc: Warning: TSC freq calibrated by CPUID/MSR differs from what is calibrated by HW timer, please check with vendor!!
Oct 30 12:42:39 spr kernel: tsc: Previous calibrated TSC freq: 2500.000 MHz
Oct 30 12:42:39 spr kernel: tsc: TSC freq recalibrated by [HPET]: 2399.967 MHz

but then continues running at 2500 MHz, offering no solace and keeping NTP
upset -- drifting ~30 seconds every 15 minutes.
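
The size of the problem can be checked from the two frequencies alone. A
rough user-space sketch (function names are made up for illustration, and
it assumes the HPET-derived value is the true rate):

```c
/*
 * Relative error, in parts per million, of assuming the counter runs at
 * assumed_khz when it really runs at measured_khz.
 */
double freq_error_ppm(double assumed_khz, double measured_khz)
{
	return (assumed_khz / measured_khz - 1.0) * 1e6;
}

/*
 * Seconds the clock falls behind over real_seconds of wall time when
 * timekeeping divides by assumed_khz but the counter ticks at
 * measured_khz.
 */
double seconds_lost(double assumed_khz, double measured_khz,
		    double real_seconds)
{
	return real_seconds * (1.0 - measured_khz / assumed_khz);
}
```

For 2500.000 MHz assumed and 2399.967 MHz measured this comes out at
roughly 41700 PPM, in line with what ntpd complains about, and at a few
tens of seconds lost per 15 minutes.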

Have tsc=recalibrate override the CPUID-provided numbers. This makes the
machine usable and keeps NTP 'happy':

Oct 30 16:48:44 spr ntpd[2200]: CLOCK: time stepped by -0.768679
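
The resulting behaviour can be modelled in user space roughly as below.
This is a simplified sketch, not the kernel code: the function and
parameter names are made up, 'known_freq' stands in for
X86_FEATURE_TSC_KNOWN_FREQ and 'force' for tsc_force_recalibrate.

```c
#include <stdbool.h>
#include <stdlib.h>

/*
 * Model of init_tsc_clocksource(): the refined calibration is skipped
 * only when the frequency is known and no recalibration was requested.
 */
bool needs_refinement(bool known_freq, bool force)
{
	return !known_freq || force;
}

/*
 * Model of tsc_refine_calibration_work(): pick the frequency to use
 * once the HW timer measurement 'freq' (kHz) is available. With a
 * known frequency we only get here under tsc=recalibrate, so the
 * measurement wins unconditionally. Otherwise the measurement is only
 * accepted when it is within 1% of the early calibration.
 */
long pick_tsc_khz(long tsc_khz, long freq, bool known_freq)
{
	if (!known_freq && labs(tsc_khz - freq) > tsc_khz / 100)
		return tsc_khz;

	return freq;
}
```

On this machine that means 2399967 kHz is used instead of 2500000 kHz,
even though the two differ by about 4%.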

Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
---
arch/x86/kernel/tsc.c | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)

--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1430,14 +1430,13 @@ static void tsc_refine_calibration_work(
 			hpet ? "HPET" : "PM_TIMER",
 			(unsigned long)freq / 1000,
 			(unsigned long)freq % 1000);
+	} else {
 
-		return;
+		/* Make sure we're within 1% */
+		if (abs(tsc_khz - freq) > tsc_khz/100)
+			goto out;
 	}
 
-	/* Make sure we're within 1% */
-	if (abs(tsc_khz - freq) > tsc_khz/100)
-		goto out;
-
 	tsc_khz = freq;
 	pr_info("Refined TSC clocksource calibration: %lu.%03lu MHz\n",
 		(unsigned long)tsc_khz / 1000,
@@ -1479,14 +1478,12 @@ static int __init init_tsc_clocksource(v
 	 * When TSC frequency is known (retrieved via MSR or CPUID), we skip
 	 * the refined calibration and directly register it as a clocksource.
 	 */
-	if (boot_cpu_has(X86_FEATURE_TSC_KNOWN_FREQ)) {
+	if (boot_cpu_has(X86_FEATURE_TSC_KNOWN_FREQ) && !tsc_force_recalibrate) {
 		if (boot_cpu_has(X86_FEATURE_ART))
 			art_related_clocksource = &clocksource_tsc;
 		clocksource_register_khz(&clocksource_tsc, tsc_khz);
 		clocksource_unregister(&clocksource_tsc_early);
-
-		if (!tsc_force_recalibrate)
-			return 0;
+		return 0;
 	}
 
 	schedule_delayed_work(&tsc_irqwork, 0);