Re: [PATCHv7 2/2] watchdog/softlockup: report the most frequent interrupts

From: Thomas Gleixner
Date: Thu Feb 15 2024 - 06:30:24 EST


On Wed, Feb 14 2024 at 10:14, Bitao Hu wrote:
> +static void start_counting_irqs(void)
> +{
> + int i;
> + int local_nr_irqs;
> + struct irq_desc *desc;
> + u32 *counts = __this_cpu_read(hardirq_counts);
> +
> + if (!counts) {
> + /*
> + * nr_irqs has the potential to grow at runtime. We should read
> + * it and store locally to avoid array out-of-bounds access.
> + */
> + local_nr_irqs = nr_irqs;
> + counts = kcalloc(local_nr_irqs, sizeof(u32), GFP_ATOMIC);

Seriously? The system has a problem and you allocate memory from the
detection code in hard interrupt context?

> + if (!counts)
> + return;
> +
> + for (i = 0; i < local_nr_irqs; i++) {
> + desc = irq_to_desc(i);
> + if (!desc)
> + continue;
> + counts[i] = desc->kstat_irqs ?
> + *this_cpu_ptr(desc->kstat_irqs) : 0;
> + }

This code has absolutely no business to access an interrupt
descriptor. There is an existing interface to retrieve the stats.

Also iterating one by one over the total number of interrupts is a
complete waste as the interrupt number space is sparse.

> + for (i = 0; i < NUM_HARDIRQ_REPORT; i++) {
> + if (irq_counts_sorted[i].irq == -1)
> + break;
> +
> + desc = irq_to_desc(irq_counts_sorted[i].irq);
> + if (desc && desc->action)
> + printk(KERN_CRIT "\t#%u: %-10u\tirq#%d(%s)\n",
> + i + 1, irq_counts_sorted[i].counts,
> + irq_counts_sorted[i].irq, desc->action->name);

You cannot dereference desc->action here:

1) It can be NULL'ed between check and dereference.

2) Both 'action' and 'action->name' can be freed in parallel

And no, you cannot take desc->lock here to prevent this. Stop fiddling
in the internals of interrupt descriptors.


See my reply on V1 how the stats can be done. That neither needs a
memory allocation nor the local_nr_irqs heuristics and just can use
proper interfaces.

Your initialization code then becomes:

if (!this_cpu_read(snapshot_taken)) {
kstat_snapshot_irqs();
this_cpu_write(snapshot_taken, true);
}

and the analysis boils down to:

u64 cnt, sorted[3] = {};
unsigned int irq, i;

for_each_active_irq(irq) {
cnt = kstat_get_irq_since_snapshot(irq);

if (cnt) {
for (cnt = (cnt << 32) + irq, i = 0; i < 3; i++) {
if (cnt > sorted[i])
swap(cnt, sorted[i]);
}
}
}

Resetting the thing just becomes:

this_cpu_write(snapshot_taken, false);

No allocation/free, no bound checks, proper abstractions. See?

Thanks,

tglx