Re: [PATCH 2/3] watchdog/softlockup: report the most time-consuming hardirq

From: Doug Anderson
Date: Wed Feb 14 2024 - 18:40:16 EST


Hi,

On Wed, Feb 14, 2024 at 3:36 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> On Tue, Jan 23 2024 at 20:12, Bitao Hu wrote:
> > +/*
> > + * If the proportion of time spent handling irq exceeds 50% during a sampling period,
> > + * then it is necessary to tally the handling time of each irq.
> > + */
> > +static inline bool need_trace_irqtime(int type)
> > +{
> > + int tail = this_cpu_read(cpustat_tail);
> > +
> > + if (--tail == -1)
> > + tail = 4;
> > + return this_cpu_read(cpustat_diff[tail][type]) > sample_period/2;
> > +}
> > +
> > +static void irq_handler_entry_callback(void *data, int irq, struct irqaction *action)
> > +{
> > + this_cpu_ptr(irq_to_desc(irq)->irq_time)->start_time = local_clock();
> > +}
> > +
> > +static void irq_handler_exit_callback(void *data, int irq, struct irqaction *action, int ret)
> > +{
> > + u64 delta;
> > + struct per_irqtime *irq_time;
> > +
> > + if (test_bit(SOFTLOCKUP_HARDIRQ, this_cpu_ptr(&softlockup_flags))) {
> > + irq_time = this_cpu_ptr(irq_to_desc(irq)->irq_time);
> > + delta = local_clock() - irq_time->start_time;
> > + irq_time->delta += delta;
> > + }
> > +}
>
> I can understand what you are trying to achieve, but you inflict the
> overhead of two tracepoints to every interrupt unconditionally.
>
> For the vast majority of usage scenarios that's just pointless overhead
> for no value. That's not acceptable at all.
>
> Instrumentation and debugging has to find the least intrusive solution
> and not just go for the perfect picture. Remember: perfect is the enemy
> of good.
>
> You really have to think hard about what you really need to achieve for
> a particular problem case.
>
> In this case it's completely sufficient to know the number of interrupts
> which happened since softlockup detection took a snapshot and the actual
> analysis.
>
> That's especially true when interrupt time accounting is active because
> then the only interesting information is which interrupts fired during
> the observation period.
>
> Even if that's not available it is a reasonable assumption that the
> number of interrupts during the observation period gives a pretty
> conclusive hint about the potential cause of the problem.
>
> This is not meant to find the oddball issue of an interrupt handler
> which consumes a lot of time per invocation. This is about storm
> detection where the handler runs briefly and actually returns
> IRQ_HANDLED so that the spurious detection does not catch it.
>
> This can be implemented with exactly zero overhead for the good case,
> keeping it self contained to the interrupt core code and providing
> sensible interfaces for the watchdog code.
>
> See the uncompiled patch for illustration below.
>
> As a side note: While C does not allow proper encapsulation it's a non
> starter to fiddle with the interrupt descriptor internals in random code
> just because the compiler allows you to do so. While not enforced there
> are clear boundaries and we went a long way to encapsulate this.

I think you must have gotten dropped from all the future versions of
this patch series when Bitao took my advice and started using
interrupt counts instead of tracing. For what it's worth, the latest
version can be found at:

https://lore.kernel.org/r/20240214021430.87471-1-yaoma@xxxxxxxxxxxxxxxxx