Consider switching to WQ_UNBOUND messages (was: Re: [PATCH v2 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism)

From: Geert Uytterhoeven
Date: Tue Jul 11 2023 - 09:55:47 EST


Hi Tejun,

On Fri, May 12, 2023 at 9:54 PM Tejun Heo <tj@xxxxxxxxxx> wrote:
> Workqueue now automatically marks per-cpu work items that hog CPU for too
> long as CPU_INTENSIVE, which excludes them from concurrency management and
> prevents stalling other concurrency-managed work items. If a work function
> keeps running over the threshold, it likely needs to be switched to use an
> unbound workqueue.
>
> This patch adds a debug mechanism which tracks the work functions which
> trigger the automatic CPU_INTENSIVE mechanism and reports them using
> pr_warn() with exponential backoff.
>
> v2: Drop bouncing through kthread_worker for printing messages. It was to
> avoid introducing circular locking dependency but wasn't effective as it
> still had pool lock -> wci_lock -> printk -> pool lock loop. Let's just
> print directly using printk_deferred().
>
> Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>
> Suggested-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>

Thanks for your patch, which is now commit 6363845005202148
("workqueue: Report work funcs that trigger automatic CPU_INTENSIVE
mechanism") in v6.5-rc1.

I guess you are interested to know where this triggers.
I enabled CONFIG_WQ_CPU_INTENSIVE_REPORT=y, and tested
the result on various machines...

SH/R-Mobile:

workqueue: genpd_power_off_work_fn hogged CPU for >10000us 4 times, consider switching to WQ_UNBOUND

Atmark Techno Armadillo800-EVA with shmob_drm:

workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 16 times, consider switching to WQ_UNBOUND

R-Car Gen2:

workqueue: rtc_timer_do_work hogged CPU for >10000us 4 times, consider switching to WQ_UNBOUND

R-Car Gen2/Gen3:

workqueue: pm_runtime_work hogged CPU for >10000us 4 times, consider switching to WQ_UNBOUND

R-Car Gen3:

workqueue: kfree_rcu_work hogged CPU for >10000us 4 times, consider switching to WQ_UNBOUND

OrangeCrab/Linux-on-LiteX-VexRiscV with ht16k33 14-seg display and ssd130xdrmfb:

workqueue: check_lifetime hogged CPU for >10000us 4 times, consider switching to WQ_UNBOUND
workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 1024 times, consider switching to WQ_UNBOUND
workqueue: fb_flashcursor hogged CPU for >10000us 128 times, consider switching to WQ_UNBOUND
workqueue: ht16k33_seg14_update hogged CPU for >10000us 128 times, consider switching to WQ_UNBOUND
workqueue: mmc_rescan hogged CPU for >10000us 128 times, consider switching to WQ_UNBOUND

Atari (ARAnyM):

workqueue: ata_sff_pio_task hogged CPU for >10000us 64 times, consider switching to WQ_UNBOUND

The OrangeCrab is a slow machine, so it's not that surprising to see these
messages...
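
Note that every count reported above (4, 16, 64, 128, 1024) is a power of
two, which fits the "exponential backoff" in the patch description. Below is
a minimal userspace sketch of such a reporting policy. It assumes a message is
emitted whenever the per-function count is at least 4 and a power of two; the
struct and the hog_*() names are hypothetical and exist only for illustration.
This is not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical per-work-function record, for illustration only. */
struct hog_entry {
	const char *func;	/* name of the work function */
	uint64_t cnt;		/* threshold violations seen so far */
};

/* True if n has exactly one bit set. */
static bool is_power_of_2(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Record one threshold violation for @ent.
 *
 * Returns true when a message should be printed.  Under the assumed
 * policy that happens at counts 4, 8, 16, ..., so the gap between
 * messages doubles each time.
 */
static bool hog_record(struct hog_entry *ent)
{
	assert(ent != NULL);
	ent->cnt++;
	return ent->cnt >= 4 && is_power_of_2(ent->cnt);
}

/* Print a message in the same shape as the ones quoted in this mail. */
static void hog_report(const struct hog_entry *ent, unsigned int thresh_us)
{
	printf("workqueue: %s hogged CPU for >%uus %llu times, consider switching to WQ_UNBOUND\n",
	       ent->func, thresh_us, (unsigned long long)ent->cnt);
}
```

A caller would invoke hog_record() each time a work item exceeds the
threshold, and hog_report() only when it returns true. Under this assumed
policy, 1024 violations produce just nine messages, which keeps the log
usable even on slow machines.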

Gr{oetje,eeting}s,

Geert


--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds