Re: Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism)

From: Geert Uytterhoeven
Date: Tue Jul 11 2023 - 10:06:41 EST


On Tue, Jul 11, 2023 at 3:55 PM Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
>
> Hi Tejun,
>
> On Fri, May 12, 2023 at 9:54 PM Tejun Heo <tj@xxxxxxxxxx> wrote:
> > Workqueue now automatically marks per-cpu work items that hog CPU for too
> > long as CPU_INTENSIVE, which excludes them from concurrency management and
> > prevents stalling other concurrency-managed work items. If a work function
> > keeps running over the threshold, it likely needs to be switched to use an
> > unbound workqueue.
> >
> > This patch adds a debug mechanism which tracks the work functions which
> > trigger the automatic CPU_INTENSIVE mechanism and reports them using
> > pr_warn() with exponential backoff.
> >
> > v2: Drop bouncing through kthread_worker for printing messages. It was to
> > avoid introducing circular locking dependency but wasn't effective as it
> > still had pool lock -> wci_lock -> printk -> pool lock loop. Let's just
> > print directly using printk_deferred().
> >
> > Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>
> > Suggested-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>
> Thanks for your patch, which is now commit 6363845005202148
> ("workqueue: Report work funcs that trigger automatic CPU_INTENSIVE
> mechanism") in v6.5-rc1.
>
> I guess you are interested to know where this triggers.
> I enabled CONFIG_WQ_CPU_INTENSIVE_REPORT=y, and tested
> the result on various machines...

> OrangeCrab/Linux-on-LiteX-VexRiscV with ht16k33 14-seg display and ssd130xdrmfb:
>
> workqueue: check_lifetime hogged CPU for >10000us 4 times, consider
> switching to WQ_UNBOUND
> workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 1024
> times, consider switching to WQ_UNBOUND
> workqueue: fb_flashcursor hogged CPU for >10000us 128 times,
> consider switching to WQ_UNBOUND
> workqueue: ht16k33_seg14_update hogged CPU for >10000us 128 times,
> consider switching to WQ_UNBOUND
> workqueue: mmc_rescan hogged CPU for >10000us 128 times, consider
> switching to WQ_UNBOUND

Got one more after a while:

workqueue: neigh_managed_work hogged CPU for >10000us 4 times,
consider switching to WQ_UNBOUND
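
The counts in these messages (4, 128, 1024) are all powers of two, which
fits the "exponential backoff" in the patch description. Below is a minimal
userspace sketch of such a rule. It assumes a warning is printed only when
the per-function hit count reaches a power of two, starting at 4. That
condition is inferred from the counts above, not quoted from the patch, and
the helper names is_pow2(), should_report() and warnings_for() are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if n is a nonzero power of two. */
static bool is_pow2(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Assumed reporting rule (inferred from the counts in the log messages,
 * not taken from the patch): warn only when the hit count reaches a
 * power of two, starting at 4.
 */
static bool should_report(uint64_t cnt)
{
	return cnt >= 4 && is_pow2(cnt);
}

/* Number of warnings emitted over the first `hits` threshold violations. */
static unsigned int warnings_for(uint64_t hits)
{
	unsigned int n = 0;

	for (uint64_t c = 1; c <= hits; c++)
		if (should_report(c))
			n++;
	return n;
}
```

Under this assumption, a work function that has exceeded the threshold 1024
times has produced only a handful of log lines, so a high count in the
latest message does not mean the log was flooded.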

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds