Re: [PATCH] hrtimer: CPU and entry_time is added to a warning message in hrtimer_interrupt()

From: Rei Yamamoto
Date: Fri Dec 02 2022 - 05:20:44 EST


Hi Thomas Gleixner,

Thanks for your reply.

On Thu, 01 Dec 2022 22:30:41, Thomas Gleixner wrote:
> On Fri, Jun 24 2022 at 16:00, Rei Yamamoto wrote:
>> A warning message in hrtimer_interrupt() is output up to 5 times
>> by default, and CPU and entry_time are also shown.
>
> This describes to some extent _what_ the patch is doing, but not the
> why.
>
>> These changes are helpful that the function spending a lot of time is clear
>> by using ftrace:
>
> That's a constructed case. There are multiple reasons why this can
> happen, not just because a single hrtimer callback misbehaves.
>
>> @@ -2038,6 +2039,15 @@ static struct ctl_table kern_table[] = {
>> .extra1 = SYSCTL_ONE,
>> .extra2 = SYSCTL_INT_MAX,
>> },
>> +#endif
>> +#ifdef CONFIG_HIGH_RES_TIMERS
>> + {
>> + .procname = "hrtimer_interrupt_warnings",
>> + .data = &sysctl_hrtimer_interrupt_warnings,
>> + .maxlen = sizeof(int),
>> + .mode = 0644,
>> + .proc_handler = proc_dointvec,
>> + },
>
> So this adds a new sysctl, but the changelog does not tell anything
> about it. Aside of the dubious value of this sysctl, this lacks the
> required documentation for new sysctls.
>
>> + /*
>> + * If a message is output many times, the delayed funciton
>> + * may be identified by resetting sysctl_hrtimer_interrupt_warnings
>> + * and enabling ftrace.
>
> What has the reset of sysctl_hrtimer_interrupt_warnings to do with
> ftrace and how is that reset helpful to identify the root cause?
>
> Also repeating the printk 5 times does not add any value at all. The
> runaway detection already has logic to supress spurious events and if
> the problem persists then it can be observed by ftrace without any of
> these changes.

If the system performance degradation exists and this message is output
some times at the same time, my idea was that the reset of
sysctl_hrtimer_interrupt_warnings helps to check whether hrtimer event
is related to the degradation.
However, as you said, it can be observed by ftrace.
Therefore, I withdraw an opinion to add hrtimer_interrupt_warnings as
the sysctl parameter.

> I assume - because you did not tell so - that you try to have a
> correlation between ftrace and dmesg via the entry timestamp output,
> right?

Yes, that's right.

> That's just a half thought out debug bandaid, really.
>
> You can provide a way better mechanism by adding a tracepoint right at
> the pr_warn_once(), which emits information for correlation right into
> the trace.
>
> That allows you to stop the trace once the tracepoint is emitted instead
> of having to do all of this including the correlation manually.

OK, I will recreate the patch with following your idea.

Thanks.
Rei