Re: [RFC PATCH v1 1/1] wq: Avoid using isolated cpus' timers on unbounded queue_delayed_work

From: Leonardo Bras
Date: Wed Jan 24 2024 - 20:46:24 EST


On Wed, Jan 24, 2024 at 11:47:29AM -1000, Tejun Heo wrote:
> On Wed, Jan 24, 2024 at 05:29:37AM -0300, Leonardo Bras wrote:
> > + /*
> > + * If the work is cpu-unbound, and cpu isolation is in place, only
> > + * schedule use timers from housekeeping cpus. In favor of avoiding
> > + * cacheline bouncing, run the WQ in the same cpu as the timer.
> > + */
> > + if (cpu == WORK_CPU_UNBOUND && housekeeping_enabled(HK_TYPE_TIMER))
> > + cpu = housekeeping_any_cpu(HK_TYPE_TIMER);
>
> Would it make more sense to use wq_unbound_cpumask?

Hello Tejun, thank you for this reply!

That's a good suggestion, but looking at workqueue_init_early() I see that,
in short:
    wq_unbound_cpumask = cpu_possible_mask &
                         housekeeping_cpumask(HK_TYPE_WQ) &
                         housekeeping_cpumask(HK_TYPE_DOMAIN) &
                         wq_cmdline_cpumask

So wq_unbound_cpumask relates to domain and workqueue cpu isolation.
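
To make the mask arithmetic above concrete, here is a minimal userspace
sketch (not kernel code). It models each cpumask as a 64-bit word with one
bit per cpu. The names mask_t and model_wq_unbound_mask() are made up for
illustration, and treating an empty command-line mask as "no restriction
requested" is an assumption of this sketch.

```c
#include <stdint.h>

/* Userspace model only: bit n set means cpu n is in the mask. */
typedef uint64_t mask_t;

/*
 * Mirrors the summary above: start from the possible cpus and restrict
 * by the WQ and DOMAIN housekeeping masks, then by the command-line mask.
 * Assumption of this sketch: an empty command-line mask means the user
 * asked for no extra restriction, so it is skipped.
 */
static mask_t model_wq_unbound_mask(mask_t possible, mask_t hk_wq,
                                    mask_t hk_domain, mask_t cmdline)
{
    mask_t m = possible & hk_wq & hk_domain;

    if (cmdline)
        m &= cmdline;
    return m;
}
```

Note that no timer-related mask appears anywhere in this computation.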

In our case, we are using this to choose which cpu's timer we want to use,
so it makes sense to use timer-related cpu isolation instead.

As of today, your suggestion would work the same, as the only way to enable
WQ cpu isolation is to use nohz_full, which also enables TIMER cpu
isolation. But since that can change in the future, for any reason, I would
suggest that we stick to using the HK_TYPE_TIMER cpumask.

I now notice that this can end up introducing an issue: the work could
possibly run on a cpu outside of a valid wq_cmdline_cpumask.

I would suggest fixing this in one of a couple of ways:
1 - We introduce a new cpumask which is basically
    housekeeping_cpumask(HK_TYPE_DOMAIN) & wq_cmdline_cpumask, allowing us
    to keep the timer interrupt in the same cpu as the scheduled function, or
2 - We use the resulting cpu only to pick the right timer.
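
As a rough illustration of the difference between the two options, here is
a minimal userspace sketch (not kernel code). Cpumasks are modeled as
64-bit words, and every name here (mask_t, struct model_choice,
model_first_cpu(), model_option1(), model_option2()) is hypothetical.
Picking the lowest-numbered cpu in a mask is an assumption made only to
keep the sketch deterministic. It is not how housekeeping_any_cpu()
chooses.

```c
#include <stdint.h>

/* Userspace model only: bit n set means cpu n is in the mask. */
typedef uint64_t mask_t;

struct model_choice {
    int timer_cpu;  /* cpu whose timer gets armed, -1 if none available */
    int work_cpu;   /* cpu the work is queued on, -1 means left unbound */
};

/* Lowest-numbered cpu in the mask, or -1 if the mask is empty. */
static int model_first_cpu(mask_t m)
{
    for (int cpu = 0; cpu < 64; cpu++)
        if (m & ((mask_t)1 << cpu))
            return cpu;
    return -1;
}

/*
 * Option 1: pick one cpu from HK_TYPE_DOMAIN & wq_cmdline_cpumask and use
 * it for both the timer and the work, so they stay on the same cpu.
 */
static struct model_choice model_option1(mask_t hk_domain, mask_t cmdline)
{
    int cpu = model_first_cpu(hk_domain & cmdline);

    return (struct model_choice){ .timer_cpu = cpu, .work_cpu = cpu };
}

/*
 * Option 2: the housekeeping cpu is used only to pick the timer; the work
 * itself stays unbound.
 */
static struct model_choice model_option2(mask_t hk_timer)
{
    return (struct model_choice){
        .timer_cpu = model_first_cpu(hk_timer),
        .work_cpu = -1,
    };
}
```

In this model, option 1 keeps the timer and the work together at the cost
of a new mask, while option 2 leaves where the work runs untouched and only
moves the timer.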

What are your thoughts on that?

Thank you!
Leo