Re: contention on pwq->pool->lock under heavy NFS workload

From: Chuck Lever III
Date: Thu Jun 22 2023 - 15:39:21 EST

> On Jun 22, 2023, at 3:23 PM, Tejun Heo <tj@xxxxxxxxxx> wrote:
>
> Hello,
>
> On Thu, Jun 22, 2023 at 03:45:18PM +0000, Chuck Lever III wrote:
>> The good news:
>>
>> On stock 6.4-rc7:
>>
>> fio 8k [r=108k,w=46.9k IOPS]
>>
>> On the affinity-scopes-v2 branch (with no other tuning):
>>
>> fio 8k [r=130k,w=55.9k IOPS]
>
> Ah, okay, that's probably coming from per-cpu pwq. Didn't expect that to
> make that much difference but that's nice.

"cpu" and "smt" work equally well on this system.

"cache", "numa", and "system" work equally poorly.

I have HT disabled, so "smt" ends up the same as "cpu" here, and
there's only one NUMA node, so "numa" ends up the same as "system".
That makes the split between the two groups plausible.
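
To make that reasoning concrete, here is a rough userspace model of
how many pods (and so how many shared pools) each scope yields. This
is only a sketch, not workqueue code: struct topo, nr_pods() and the
SCOPE_* names are made up for illustration, and it assumes one pool
per pod and that CPUs divide evenly into each level of the topology.

```c
#include <assert.h>

/* Hypothetical topology description; not a kernel structure. */
struct topo {
	int nr_cpus;          /* online CPUs */
	int threads_per_core; /* 1 when HT is disabled */
	int cpus_per_llc;     /* CPUs sharing a last-level cache */
	int cpus_per_node;    /* CPUs per NUMA node */
};

/* Illustrative names mirroring the scope strings used in this thread. */
enum scope { SCOPE_CPU, SCOPE_SMT, SCOPE_CACHE, SCOPE_NUMA, SCOPE_SYSTEM };

/* How many CPUs share one pod under the given scope (model assumption). */
int cpus_per_pod(const struct topo *t, enum scope s)
{
	switch (s) {
	case SCOPE_CPU:    return 1;
	case SCOPE_SMT:    return t->threads_per_core;
	case SCOPE_CACHE:  return t->cpus_per_llc;
	case SCOPE_NUMA:   return t->cpus_per_node;
	case SCOPE_SYSTEM: return t->nr_cpus;
	}
	return t->nr_cpus;
}

/* Number of pods, rounding up if the last pod is only partly filled. */
int nr_pods(const struct topo *t, enum scope s)
{
	int per = cpus_per_pod(t, s);

	assert(per > 0 && t->nr_cpus > 0);
	return (t->nr_cpus + per - 1) / per;
}
```

With a made-up 12-CPU box that has one thread per core, a single LLC
and a single node, "cpu" and "smt" both give 12 pods in this model,
while "cache", "numa", and "system" all collapse to one.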


>> The bad news:
>>
>> pool->lock is still the hottest lock on the system during the test.
>>
>> I'll try some of the alternate scope settings this afternoon.
>
> Yeah, in your system, there's still gonna be one pool shared across all
> CPUs. SMT or CPU may behave better but it might make sense to add a way to
> further segment the scope so that e.g. one can split a cache domain N-ways.

If there were more than one pool to choose from, these WQs would
not all be hitting the same lock. Alternatively, finding a lockless
way to queue work on a pool would be a huge win.
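
A userspace sketch of the first idea, just to show the shape of it.
None of this is existing workqueue code: struct pool_set,
queue_work_sharded() and MAX_POOLS are hypothetical names, the lock is
a plain C11 atomic_flag spinlock, and picking the pool by
"cpu % nr_pools" is only one possible policy.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_POOLS 8 /* arbitrary upper bound for this sketch */

struct work {
	struct work *next;
};

/* One pool: its own lock protecting its own list of pending work. */
struct pool {
	atomic_flag lock;
	struct work *head;
	int nr_queued;
};

/* Several pools that queueing CPUs are spread across. */
struct pool_set {
	struct pool pools[MAX_POOLS];
	int nr_pools;
};

void pool_set_init(struct pool_set *ps, int nr_pools)
{
	assert(nr_pools > 0 && nr_pools <= MAX_POOLS);
	ps->nr_pools = nr_pools;
	for (int i = 0; i < nr_pools; i++) {
		atomic_flag_clear(&ps->pools[i].lock);
		ps->pools[i].head = NULL;
		ps->pools[i].nr_queued = 0;
	}
}

/*
 * Queue @w on the pool selected by the submitting CPU and return the
 * pool index. Only that pool's lock is taken, so CPUs that map to
 * different pools never contend with each other.
 */
int queue_work_sharded(struct pool_set *ps, int cpu, struct work *w)
{
	assert(cpu >= 0);
	int idx = cpu % ps->nr_pools;
	struct pool *p = &ps->pools[idx];

	/* Spin until this pool's lock is acquired. */
	while (atomic_flag_test_and_set_explicit(&p->lock,
						 memory_order_acquire))
		;
	/* Push at the head; ordering is not the point of this sketch. */
	w->next = p->head;
	p->head = w;
	p->nr_queued++;
	atomic_flag_clear_explicit(&p->lock, memory_order_release);

	return idx;
}
```

With nr_pools set to 1 this degenerates to the single shared lock
described above; with N pools, only CPUs that map to the same index
share a lock.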


--
Chuck Lever