Re: [PATCH] x86/PCI: Convert force_disable_hpet() to standard quirk

From: Yu Liao
Date: Thu Sep 29 2022 - 11:53:59 EST


On 2020/12/2 15:28, Zhang Rui wrote:
> On Mon, 2020-11-30 at 20:21 +0100, Thomas Gleixner wrote:
>> Feng,
>>
>> On Fri, Nov 27 2020 at 14:11, Feng Tang wrote:
>>> On Fri, Nov 27, 2020 at 12:27:34AM +0100, Thomas Gleixner wrote:
>>>> On Thu, Nov 26 2020 at 09:24, Feng Tang wrote:
>>>> Yes, that can happen. But OTOH, we should start to think about the
>>>> requirements for using the TSC watchdog.
>
> My original proposal is to stop using jiffies and refined-jiffies as the
> clocksource watchdog, because they are not reliable. It is better to use
> a clocksource backed by a hardware counter as the watchdog, as in the
> patch below, which I have not sent upstream.
>
> From cf9ce0ecab8851a3745edcad92e072022af3dbd9 Mon Sep 17 00:00:00 2001
> From: Zhang Rui <rui.zhang@xxxxxxxxx>
> Date: Fri, 19 Jun 2020 22:03:23 +0800
> Subject: [RFC PATCH] time/clocksource: do not use refined-jiffies as watchdog
>
> On IA platforms, if HPET is disabled, either via x86 early-quirks or
> via the kernel command line, refined-jiffies will be used as the
> clocksource watchdog in the early boot phase, before the acpi_pm timer
> is registered.
>
> This is not a problem if jiffies are accurate.
> But in some cases, for example when a serial console is enabled, writes
> to the console may frequently take several milliseconds with irqs
> disabled. Thus many ticks may become longer than they should be.
>
> Using refined-jiffies as watchdog in this case breaks the system because
> a) duration calculated by refined-jiffies watchdog is always consistent
> with the watchdog timeout issued using add_timer(), say, around 500ms.
> b) duration calculated by the running clocksource, usually TSC on IA
> platforms, reflects the real time cost, which may be much larger.
> This results in the running clocksource being disabled erroneously.
>
> This is reproduced on ICL because HPET is disabled in x86 early-quirks,
> and also reproduced on a KBL and a WHL platform when HPET is disabled
> via command line.
>
> BTW, commit fd329f276eca
> ("x86/mtrr: Skip cache flushes on CPUs with cache self-snooping") is
> another example where refined-jiffies causes the same problem, in that
> case because ticks become slow for some other reason.

Hi Zhang Rui, we have hit the same problem you describe above. I have
tested the modification below, and it solves the problem. Do you plan to
push it upstream?

Thanks,
Liao Yu
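
For anyone following the thread, here is a minimal user-space sketch of the
failure mode described in a) and b) above. It is not kernel code: the helper
names, the HZ value and the 62.5 ms skew limit are assumptions chosen only
for illustration. A tick-counting watchdog always reports the nominal
interval, while a hardware counter reports the real elapsed time, so any
delay that stretches the ticks shows up as skew on the watched clocksource.

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_MSEC 1000000LL

/* Assumed values for illustration only, not taken from any kernel config. */
#define HZ_ASSUMED              250
#define TICK_NSEC_ASSUMED       (1000 * NSEC_PER_MSEC / HZ_ASSUMED) /* 4 ms */
#define SKEW_LIMIT_NSEC_ASSUMED (1000 * NSEC_PER_MSEC / 16)         /* 62.5 ms */

/*
 * Interval as seen by a tick-counting watchdog (hypothetical helper):
 * number of ticks times the nominal tick length, regardless of how long
 * each tick really took.
 */
static int64_t jiffies_interval_ns(int64_t ticks)
{
	return ticks * TICK_NSEC_ASSUMED;
}

/*
 * Interval as seen by a free-running hardware counter (hypothetical
 * helper): nominal time plus the total delay that stretched the ticks.
 */
static int64_t hw_interval_ns(int64_t ticks, int64_t total_delay_ns)
{
	return ticks * TICK_NSEC_ASSUMED + total_delay_ns;
}

/*
 * Simplified form of the watchdog check: the watched clocksource is
 * flagged when the two measured intervals differ by more than the limit.
 */
static bool marked_unstable(int64_t wd_ns, int64_t cs_ns)
{
	int64_t diff = cs_ns - wd_ns;

	if (diff < 0)
		diff = -diff;
	return diff > SKEW_LIMIT_NSEC_ASSUMED;
}
```

With 125 ticks (nominally 500 ms) and 80 ms of accumulated delay, comparing
jiffies_interval_ns(125) against hw_interval_ns(125, 80 * NSEC_PER_MSEC)
flags the watched clocksource, although it is the one telling the truth.
Comparing two hardware-counter intervals for the same period does not,
which is the point of the patch quoted below.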

>
> IMO, the right solution is to only use a hardware clocksource as the
> watchdog. Then even if ticks are slow, both the running clocksource and
> the watchdog return the real time cost, and they still match.
>
> Signed-off-by: Zhang Rui <rui.zhang@xxxxxxxxx>
> ---
> kernel/time/clocksource.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
> index 02441ead3c3b..e7e703858fa6 100644
> --- a/kernel/time/clocksource.c
> +++ b/kernel/time/clocksource.c
> @@ -364,6 +364,10 @@ static void clocksource_select_watchdog(bool fallback)
>  		watchdog = NULL;
>
>  	list_for_each_entry(cs, &clocksource_list, list) {
> +		/* Do not use refined-jiffies as clocksource watchdog */
> +		if (cs->rating <= 2)
> +			continue;
> +
>  		/* cs is a clocksource to be watched. */
>  		if (cs->flags & CLOCK_SOURCE_MUST_VERIFY)
>  			continue;