RE: [PATCH 0/2] clocksource/Hyper-V: Add Hyper-V specific sched clock function

From: Vitaly Kuznetsov
Date: Wed Aug 21 2019 - 04:54:35 EST


Vitaly Kuznetsov <vkuznets@xxxxxxxxxx> writes:

> Michael Kelley <mikelley@xxxxxxxxxxxxx> writes:
>
>> I talked to KY Srinivasan for any history about TSC page on 32-bit. He said
>> there was no technical reason not to implement it, but our focus was always
>> 64-bit Linux, so the 32-bit was much less important. Also, on 32-bit Linux,
>> the required 64x64 multiply and shift is more complex and takes more
>> cycles (compare 32-bit implementation of mul_u64_u64_shr vs.
>> the 64-bit implementation), so the win over a MSR read is less. I
>> don't know of any actual measurements being made to compare vs.
>> MSR read.
>
> VMExit is 1000 CPU cycles or so, I would guess that TSC page
> calculations are better. Let me try to build 32bit kernel and do some
> quick measurements.
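
For reference, the 64x64 multiply and shift mentioned above can be
sketched in portable C using only 32x32->64 multiplies, which is roughly
the work a 32-bit build has to do. This is only a minimal sketch, not the
kernel's mul_u64_u64_shr(): the names mul_u64_u64_hi() and
tsc_to_ref_time() are made up for illustration, the shift is fixed at 64,
the conversion assumes the usual ((tsc * scale) >> 64) + offset form, and
the sequence check a real TSC page reader performs is left out.

```c
#include <stdint.h>

/*
 * High 64 bits of the 128-bit product a * b, built from four
 * 32x32->64 multiplies (hypothetical helper, for illustration only).
 */
static uint64_t mul_u64_u64_hi(uint64_t a, uint64_t b)
{
	uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
	uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

	uint64_t ll = a_lo * b_lo;	/* contributes to bits   0..63  */
	uint64_t lh = a_lo * b_hi;	/* contributes to bits  32..95  */
	uint64_t hl = a_hi * b_lo;	/* contributes to bits  32..95  */
	uint64_t hh = a_hi * b_hi;	/* contributes to bits  64..127 */

	/* Sum everything landing in bits 32..63; this cannot overflow. */
	uint64_t mid = (ll >> 32) + (uint32_t)lh + (uint32_t)hl;

	/* Upper halves of the cross terms plus the carry out of 'mid'. */
	return hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
}

/*
 * Assumed TSC page conversion: scale the raw TSC value, keep the high
 * 64 bits of the product, then add the offset.
 */
static uint64_t tsc_to_ref_time(uint64_t tsc, uint64_t scale, int64_t offset)
{
	return mul_u64_u64_hi(tsc, scale) + (uint64_t)offset;
}
```

With a scale of 1ULL << 63 (i.e. one half), tsc_to_ref_time(1000, 1ULL << 63, 5)
gives 505. On a 64-bit build the same high half typically comes from a
single widening multiply, which is where the extra cost on 32-bit comes from.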

So I tried and the difference is HUGE.

For in-kernel clocksource reads (like sched_clock()), the testing code
was:

	before = rdtsc_ordered();
	for (i = 0; i < 1000; i++)
		(void)read_hv_sched_clock_msr();
	after = rdtsc_ordered();
	printk("MSR based clocksource: %u cycles\n", ((u32)(after - before))/1000);

	before = rdtsc_ordered();
	for (i = 0; i < 1000; i++)
		(void)read_hv_sched_clock_tsc();
	after = rdtsc_ordered();
	printk("TSC page clocksource: %u cycles\n", ((u32)(after - before))/1000);

The result (WS2016) is:
[ 1.101910] MSR based clocksource: 3361 cycles
[ 1.105224] TSC page clocksource: 49 cycles

For userspace reads the absolute difference is even bigger, because the
TSC page gives us a functional vDSO:

Testing code:
	before = rdtsc();
	for (i = 0; i < COUNT; i++)
		clock_gettime(CLOCK_REALTIME, &tp);
	after = rdtsc();
	printf("%llu\n", (unsigned long long)((after - before)/COUNT));

Result:

TSC page:
# ./gettime_cycles
131

MSR:
# ./gettime_cycles
5664
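
For anyone who wants to repeat the userspace measurement, a
self-contained variant of the loop could look roughly like the sketch
below. This is not the actual gettime_cycles source: read_ticks() and
time_gettime_loop() are hypothetical names, __rdtsc() from <x86intrin.h>
stands in for rdtsc(), and the non-x86 fallback to CLOCK_MONOTONIC
nanoseconds is an assumption added only so the code builds elsewhere.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <time.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
/* Raw, unordered TSC read, matching the rdtsc() used in the snippet. */
static uint64_t read_ticks(void)
{
	return __rdtsc();
}
#else
/* Assumption: no TSC available, so count monotonic nanoseconds instead. */
static uint64_t read_ticks(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}
#endif

/* Total ticks spent in 'count' back-to-back clock_gettime() calls. */
static uint64_t time_gettime_loop(unsigned long count)
{
	struct timespec tp;
	uint64_t before, after;
	unsigned long i;

	before = read_ticks();
	for (i = 0; i < count; i++)
		clock_gettime(CLOCK_REALTIME, &tp);
	after = read_ticks();

	return after - before;
}
```

Dividing the return value by the iteration count, e.g.
time_gettime_loop(COUNT) / COUNT, gives the per-call figure printed above.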

With all that, I see no reason not to enable the TSC page on 32-bit, even
if the number of users is negligible. It will also let us get rid of the
ugly #ifdef CONFIG_HYPERV_TSCPAGE in the code.

I'll send a patch for discussion.

--
Vitaly