Re: [PATCH 0/2] clocksource: Avoid incorrect hpet fallback

From: Feng Tang
Date: Wed Nov 10 2021 - 20:23:09 EST


Hi Waiman, Paul,

On Wed, Nov 10, 2021 at 05:17:30PM -0500, Waiman Long wrote:
> It was found that when an x86 system was being stressed by running
> various different benchmark suites, the clocksource watchdog might
> occasionally mark TSC as unstable and fall back to hpet which will
> have a significant impact on system performance.

We've seen similar cases while running 'netperf' and the 'lockbus/ioport'
cases of the 'stress-ng' tool.

In those scenarios, the clocksource used by the kernel is tsc, while
hpet is used as the watchdog. When the skew happens, we found that it
is mostly hpet's 'fault': when the system is under extreme pressure, a
read of hpet can take a long time, and even 2 consecutive reads of hpet
can have a big gap (up to 1ms+) in between. So the skew we saw is
actually caused by hpet instead of tsc, as a tsc read is a lightweight
cpu operation.

I tried the following patch to detect the skew of the watchdog itself,
and avoid wrongly judging the tsc to be unstable. It does help in
our tests, please help to review.
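
To illustrate the idea only (this is not the patch itself), here is a
minimal user-space sketch: read the watchdog twice back to back and
treat the sample as untrustworthy if the gap between the two reads is
too large. The names watchdog_read_is_sane() and fake_wd_read(), and the
100us WD_READ_GAP_MAX_NS threshold, are assumptions made for this
example and are not taken from the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical threshold: a watchdog read gap above 100us is suspect. */
#define WD_READ_GAP_MAX_NS 100000ULL

/* A reader returns the current counter value converted to nanoseconds. */
typedef uint64_t (*read_ns_fn)(void);

/*
 * Read the watchdog twice back to back. If the second read lands far
 * after the first, the watchdog read itself was delayed, so the sample
 * should not be used to judge the clocksource.
 * Returns true when the sample can be trusted.
 */
static bool watchdog_read_is_sane(read_ns_fn wd_read, uint64_t *wd_now)
{
	uint64_t first = wd_read();
	uint64_t second = wd_read();

	*wd_now = second;
	return (second - first) <= WD_READ_GAP_MAX_NS;
}

/* Simulated watchdog for experiments: each read costs fake_wd_step_ns. */
static uint64_t fake_wd_now_ns;
static uint64_t fake_wd_step_ns;

static uint64_t fake_wd_read(void)
{
	fake_wd_now_ns += fake_wd_step_ns;
	return fake_wd_now_ns;
}
```

With fake_wd_step_ns set to a few hundred nanoseconds the sample is
accepted; with a step above 1ms, as seen under heavy stress, it is
rejected and the check would be skipped for that round.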

One further idea is to also add 2 consecutive reads of the current
clocksource, compare their gap with the watchdog's, and skip the check
if the watchdog's gap is bigger.
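
A rough sketch of that comparison, assuming both readouts have already
been converted to nanoseconds. struct read_pair and should_skip_check()
are hypothetical names used only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Two back-to-back readouts of one counter, in nanoseconds. */
struct read_pair {
	uint64_t first_ns;
	uint64_t second_ns;
};

/*
 * Compare the gap between two consecutive clocksource reads with the
 * gap between two consecutive watchdog reads. If the watchdog's gap is
 * the bigger one, the delay is on the watchdog side and this round of
 * checking should be skipped.
 */
static bool should_skip_check(struct read_pair cs, struct read_pair wd)
{
	uint64_t cs_gap = cs.second_ns - cs.first_ns;
	uint64_t wd_gap = wd.second_ns - wd.first_ns;

	return wd_gap > cs_gap;
}
```

For example, a clocksource gap of 200ns against a watchdog gap of 1ms
would skip the check, while a watchdog gap of 150ns would not.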