Re: [RFC 5/5] workqueue: Print backtraces from CPUs with hung CPU bound workqueues

From: Petr Mladek
Date: Fri Feb 03 2023 - 09:27:08 EST

Next message: Hyeonggon Yoo: "Re: [PATCH] mm/slub: fix memory leak with using debugfs_lookup()"
Previous message: Matthew Rosato: "Re: [PATCH v3] vfio: fix deadlock between group lock and kvm lock"
In reply to: Tejun Heo: "Re: [RFC 5/5] workqueue: Print backtraces from CPUs with hung CPU bound workqueues"

Hello,

On Thu 2023-02-02 13:45:05, Tejun Heo wrote:
> > +static bool show_pool_suspicious_workers(struct worker_pool *pool, bool shown_title)
> > +{
> > +	bool shown_any = false;
> > +	struct worker *worker;
> > +	unsigned long flags;
> > +	int bkt;
> > +
> > +	raw_spin_lock_irqsave(&pool->lock, flags);
> > +
> > +	if (pool->cpu < 0)
> > +		goto out;
>
> This can be tested before grabbing the lock.

I see.
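
To make the suggested reordering concrete, here is a minimal user-space
sketch. It is only a model, not the next version of the patch:
struct lock_pool_model, model_lock(), model_unlock() and the
has_suspicious field are all hypothetical stand-ins, and it assumes that
pool->cpu does not change during the lifetime of the pool, so reading it
without pool->lock is safe.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct worker_pool; not the kernel type. */
struct lock_pool_model {
	int cpu;		/* < 0 models an unbound pool */
	int lock_count;		/* how often the modelled lock was taken */
	bool has_suspicious;	/* models "a suspicious worker was found" */
};

/* Hypothetical stand-ins for raw_spin_lock_irqsave()/unlock. */
static void model_lock(struct lock_pool_model *pool)
{
	pool->lock_count++;
}

static void model_unlock(struct lock_pool_model *pool)
{
	(void)pool;
}

static bool show_suspicious_model(struct lock_pool_model *pool)
{
	bool shown_any;

	/* Assumed immutable, so it is tested before taking the lock. */
	if (pool->cpu < 0)
		return false;

	model_lock(pool);
	shown_any = pool->has_suspicious;
	model_unlock(pool);

	return shown_any;
}
```

With this ordering, unbound pools are skipped without touching the lock
at all.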

> > +	if (!per_cpu(wq_watchdog_hung_cpu, pool->cpu))
> > +		goto out;
>
> Given that the state is per-pool, would it make sense to mark this on the
> pool instead?

Makes sense. I think that I started with the per-CPU variable before
I sorted out my thoughts about which backtraces were useful ;-)
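
A rough user-space sketch of the difference, assuming a hypothetical
"hung" field on the pool. struct flag_pool_model, model_hung_cpu[] and
model_mark_hung() are made-up names for illustration only. Each CPU has
two bound pools (normal and highpri), so a per-CPU flag cannot tell
which of them stalled while a per-pool flag can:

```c
#include <stdbool.h>

#define MODEL_NR_CPUS	2

/* Hypothetical stand-in for struct worker_pool; not the kernel type. */
struct flag_pool_model {
	int cpu;	/* CPU the pool is bound to */
	bool hung;	/* hypothetical per-pool "stall reported" flag */
};

/* Models the per-CPU variable used by the RFC. */
static bool model_hung_cpu[MODEL_NR_CPUS];

/* Models the watchdog reporting a stall on one pool. */
static void model_mark_hung(struct flag_pool_model *pool)
{
	pool->hung = true;			/* per-pool state */
	model_hung_cpu[pool->cpu] = true;	/* per-CPU state */
}
```

After marking only one of the two pools on a CPU, the per-CPU flag is
set for both of them, while the per-pool flag singles out the pool that
was actually reported.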

> > +	if (list_empty(&pool->worklist))
> > +		goto out;
>
> This condition isn't really necessary, right?

IMHO, it should be there. The watchdog reports the problem only when
there are pending work items, see

	if (list_empty(&pool->worklist))
		continue;

in wq_watchdog_timer_fn().

My understanding is that it is OK for work items to take a long time
when they are just sleeping and waiting for something. The pool is
considered stalled only when other work items are left pending in
the meantime.
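
The condition can be sketched as a small user-space model. This is a
simplification under my own assumptions, not the code from
wq_watchdog_timer_fn(): struct stall_pool_model, its fields, and
pool_looks_stalled() are hypothetical names, and the real function
derives the timestamp in a more involved way.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct worker_pool; not the kernel type. */
struct stall_pool_model {
	size_t nr_pending;		/* 0 models list_empty(&pool->worklist) */
	unsigned long last_progress;	/* when the pool last made progress */
};

/* Returns true when the modelled watchdog would report the pool. */
static bool pool_looks_stalled(const struct stall_pool_model *pool,
			       unsigned long now, unsigned long thresh)
{
	/* Nothing is pending, so nothing is starved; skip the pool. */
	if (pool->nr_pending == 0)
		return false;

	/* Pending work has waited longer than the threshold. */
	return now - pool->last_progress > thresh;
}
```

In this model, a pool whose workers sleep for a long time is never
reported as long as no other work item is queued behind them.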

Best Regards,
Petr
