Re: [PATCH] mark power efficient workqueue as unbounded if nohz_full enabled

From: Tejun Heo
Date: Tue Jan 23 2024 - 14:03:42 EST


Hello, Marcelo.

On Mon, Jan 22, 2024 at 11:22:10AM -0300, Marcelo Tosatti wrote:
> About the performance difference (of running locally VS running
> remotely), can you list a few performance sensitive work queues
> (where per-CPU execution makes a significant difference).

Unfortunately, I have no idea. It goes way back and I'm not sure anyone has
actually tested the difference in a long time. We'd have to dig through the
history to gather some context, set up a benchmark which exercises the path
heavily and see whether the difference is still there.

> Because i suppose it would be safe (from a performance regression
> perspective) to move all delayed works to housekeeping CPUs.

Yeah, replacing power_efficient with unbound should be safe.

> And also, being more extreme, why not an option to mark all workqueues
> as unbounded (or perhaps userspace control of bounding, even for
> workqueues marked as "per-CPU").

There are correctness issues with per-cpu workqueues - e.g. accessing local
atomic counters, cpu states and what not. Also, many per-cpu users already
know that the cpu is hot as they're queueing on the local CPU. I'm not
against moving more users towards unbound workqueues but that'd have to be
done case by case, unfortunately.
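
To make the locality assumption concrete, here is a minimal userspace model
of it. This is a sketch, not kernel code: NR_CPUS, local_count, model_work,
run_bound() and run_unbound() are all hypothetical names made up for the
illustration, and the "CPU" is just an array index passed in by hand. The
point is only that a work function written for a per-cpu workqueue touches
whatever state belongs to the CPU it runs on, so running it elsewhere
silently updates the wrong slot.

```c
#include <assert.h>

#define NR_CPUS 4	/* hypothetical CPU count for this model */

/* One non-atomic counter per CPU; only meant to be touched from that CPU. */
static long local_count[NR_CPUS];

/* A modeled work item that remembers which CPU queued it. */
struct model_work {
	int queued_on;
};

/*
 * Work function written with per-cpu semantics in mind: it indexes state
 * by the CPU it is running on and assumes that is the queueing CPU.
 */
static void model_work_fn(const struct model_work *w, int running_on)
{
	(void)w;
	assert(running_on >= 0 && running_on < NR_CPUS);
	local_count[running_on]++;
}

/* Per-cpu behaviour: the work runs on the CPU that queued it. */
static void run_bound(const struct model_work *w)
{
	model_work_fn(w, w->queued_on);
}

/*
 * Unbound behaviour: the work may run on some other allowed CPU,
 * modeled here as an explicit housekeeping_cpu argument.
 */
static void run_unbound(const struct model_work *w, int housekeeping_cpu)
{
	model_work_fn(w, housekeeping_cpu);
}
```

With a work item queued on CPU 2, run_bound() bumps local_count[2], while
run_unbound() with housekeeping CPU 0 bumps local_count[0] and leaves
local_count[2] alone. A user that later reads "its" counter on CPU 2 would
see a stale value, which is the kind of thing that has to be audited per
user before converting.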

Thanks.

--
tejun