Re: contention on pwq->pool->lock under heavy NFS workload

From: Tejun Heo
Date: Wed Jun 21 2023 - 17:28:51 EST


Hello,

On Wed, Jun 21, 2023 at 03:26:22PM +0000, Chuck Lever III wrote:
> lock_stat reports that the pool->lock kernel/workqueue.c:1483 is the highest
> contended lock on my test NFS client. The issue appears to be that the three
> NFS-related workqueues, rpciod_workqueue, xprtiod_workqueue, and nfsiod all
> get placed in the same worker_pool, so they have to fight over one pool lock.
>
> I notice that ib_comp_wq is allocated with the same flags, but I don't see
> significant contention there, and a trace_printk in __queue_work shows that
> work items queued on that WQ seem to alternate between at least two different
> worker_pools.
>
> Is there a preferred way to ensure the NFS WQs get spread a little more fairly
> amongst the worker_pools?

Can you share the output of lstopo on the test machine?

The following branch has pending workqueue changes which make unbound
workqueues finer grained by default and a lot more flexible in how they're
segmented.

git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git affinity-scopes-v2

Can you please test with the branch? If the default doesn't improve the
situation, you can set WQ_SYSFS on the affected workqueues and change their
scoping by writing to
/sys/devices/virtual/workqueue/WQ_NAME/affinity_scope. Please take a look at

https://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git/tree/Documentation/core-api/workqueue.rst?h=affinity-scopes-v2#n350

for more details.
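
As a rough illustration of the sysfs step, here is a minimal userspace
sketch in Python. It assumes the workqueue was allocated with WQ_SYSFS and
that such workqueues show up under /sys/devices/virtual/workqueue/, as they
do in mainline. The helper names and the "rpciod" / "cache" values in the
usage comment are only examples; check the linked documentation for the
scope names the branch actually accepts.

```python
from pathlib import Path

# Assumption: WQ_SYSFS workqueues are exposed under this directory.
WQ_SYSFS_ROOT = Path("/sys/devices/virtual/workqueue")


def affinity_scope_path(wq_name, root=WQ_SYSFS_ROOT):
    """Build the path of the affinity_scope attribute for a workqueue."""
    return Path(root) / wq_name / "affinity_scope"


def set_affinity_scope(wq_name, scope, root=WQ_SYSFS_ROOT):
    """Write a scope name to the attribute and return what reads back.

    The scope string is passed through unchanged; the kernel decides
    whether it is valid. Writing to the real sysfs file needs root.
    """
    path = affinity_scope_path(wq_name, root)
    path.write_text(scope + "\n")
    return path.read_text().strip()


# Hypothetical usage (as root, on a kernel built from the branch):
#   set_affinity_scope("rpciod", "cache")
```

The root parameter is there only so the helper can be pointed at a scratch
directory for a dry run before touching the real sysfs tree.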

Thanks.

--
tejun