Re: Re: [PATCH v2 2/2] KVM: x86: use x86_get_freq to get freq for kvmclock

From: Peter Zijlstra
Date: Thu Dec 02 2021 - 17:46:12 EST


On Thu, Dec 02, 2021 at 09:19:25AM +0200, Maxim Levitsky wrote:
> On Thu, 2021-12-02 at 13:26 +0800, zhenwei pi wrote:

> Note that on my Zen2 machine (3970X), aperf/mperf returns current cpu freqency,

Correct, and it computes it over a random period of history. IOW, it's a
random number generator.

> 1. It sucks that on AMD, the TSC frequency is calibrated from other
> clocksources like PIT/HPET, since the result is not exact and varies
> from boot to boot.

CPUID.15h is supposed to tell us the actual frequency; except even Intel
find it very hard to actually put the right (or any, really) number in
there :/ Bribe your friendly AMD engineer with beers or something.

> 2. In the guest on AMD, we mark the TSC as unsynchronized always due to the code
> in unsynchronized_tsc, unless invariant tsc is used in guest cpuid,
> which is IMHO not fair to AMD as we don't do this for Intel cpus.
> (look at unsynchronized_tsc function)

Possibly we could treat >= Zen similar to Intel there. Also that comment
there is hillarious, it talks about multi-socket and then tests
num_possible_cpus(). Clearly that code hasn't been touched in like
forever.

> 3. I wish the kernel would export the tsc frequency it found to userspace
> somewhere in /sys or /proc, as this would be very useful for userspace applications.
> Currently it can only be found in dmesg if I am not mistaken..
> I don't mind if such frequency would only be exported if the TSC is stable,
> always running, not affected by CPUfreq, etc.

Perf exposes it, it's not really convenient if you're not using perf,
but it can be found there.


---
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 2e076a459a0c..09da2935534a 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -29,6 +29,7 @@
#include <asm/intel-family.h>
#include <asm/i8259.h>
#include <asm/uv/uv.h>
+#include <asm/topology.h>

unsigned int __read_mostly cpu_khz; /* TSC clocks / usec, not used here */
EXPORT_SYMBOL(cpu_khz);
@@ -1221,9 +1222,20 @@ int unsynchronized_tsc(void)
* Intel systems are normally all synchronized.
* Exceptions must mark TSC as unstable:
*/
- if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL) {
+ switch (boot_cpu_data.x86_vendor) {
+ case X86_VENDOR_INTEL:
+ /* Really only Core and later */
+ break;
+
+ case X86_VENDOR_AMD:
+ case X86_VENDOR_HYGON:
+ if (boot_cpu_data.x86 >= 0x17) /* >= Zen */
+ break;
+ fallthrough;
+
+ default:
/* assume multi socket systems are not synchronized: */
- if (num_possible_cpus() > 1)
+ if (topology_max_packages() > 1)
return 1;
}