Re: [PATCH] nvme/tcp: Add support to set the tcp worker cpu affinity

From: Hannes Reinecke
Date: Wed Apr 26 2023 - 07:31:18 EST


On 4/25/23 10:32, Li Feng wrote:
> Hi Sagi,
>
> On Wed, Apr 19, 2023 at 5:32 PM Sagi Grimberg <sagi@xxxxxxxxxxx> wrote:
>>
>> Hey Li,
>>
>>> The default worker affinity policy is to use all online CPUs, i.e. 0
>>> to N-1. However, some CPUs may be busy with other jobs, and nvme-tcp
>>> performance suffers as a result.
>>> This patch adds a module parameter to set the CPU affinity for the
>>> nvme-tcp socket worker threads. The parameter is a comma-separated list
>>> of CPU numbers. The list is parsed and the resulting cpumask is used to
>>> set the affinity of the socket worker threads. If the list is empty or
>>> parsing fails, the default affinity is used.
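
(As an aside, here is a minimal sketch of the mechanism described above;
the parameter name and helper are illustrative, not the actual patch.)

#include <linux/module.h>
#include <linux/cpumask.h>

static char *wq_affinity;
module_param(wq_affinity, charp, 0444);
MODULE_PARM_DESC(wq_affinity,
	"comma separated list of CPUs allowed for the tcp socket workers");

static struct cpumask wq_affinity_mask;

/* called once at module init; falls back to all online CPUs on any error */
static void wq_affinity_setup(void)
{
	if (!wq_affinity || cpulist_parse(wq_affinity, &wq_affinity_mask) ||
	    cpumask_empty(&wq_affinity_mask))
		cpumask_copy(&wq_affinity_mask, cpu_online_mask);
}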

>> I can see how this may benefit a specific set of workloads, but I have a
>> few issues with this.
>>
>> - This is exposing a user interface for something that is really
>> internal to the driver.
>>
>> - This is something that can be misleading and could be tricky to get
>> right; my concern is that this would only benefit a very niche case.
> Our storage products need this feature.
> If the user doesn't know what this is, they can keep the default, so I think
> this is acceptable.

It doesn't work like that. A user interface is not something exposed to
a specific consumer.

>> - If the setting should exist, it should not be global.
> V2 has fixed it.

>> - I prefer not to introduce new modparams.
>>
>> - I'd prefer to find a way to support your use-case without introducing
>> a config knob for it.
>>
> I'm looking forward to it.

>> If you change queue_work_on to queue_work, ignoring the io_cpu, does it
>> address your problem?
> Sorry for the late response, I just got my machine back.
> Replacing queue_work_on with queue_work looks like it gives a small
> performance improvement.
> The busy worker is `kworker/56:1H+nvme_tcp_wq`, and fio is bound to CPU 90
> ('cpus_allowed=90'); I don't know why worker 56 is selected.
> The 256k read performance went up from 1.15 GB/s to 1.35 GB/s.
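
(For readers following along: the experiment above amounts to roughly the
following change in the io_work submission path. This is a sketch, not a
literal diff; the identifiers follow the driver's naming.)

	/* current behaviour: io_work is pinned to the queue's chosen io_cpu */
	queue_work_on(queue->io_cpu, nvme_tcp_wq, &queue->io_work);

	/* experiment discussed here: let the workqueue place the work on any
	 * allowed CPU instead of forcing it onto io_cpu
	 */
	queue_work(nvme_tcp_wq, &queue->io_work);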


>> Not saying that this should be a solution, though.
>>
>> How many queues does your controller support, that you happen to use
>> queue 0?
> Our controller only supports one I/O queue currently.

Ouch.
Remember, NVMe gets most of its performance improvement from using several
queues and being able to bind those queues to CPU sets.
Exposing just one queue invalidates any assumptions we make, and trying to
improve interrupt steering won't work anyway.
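
(Roughly speaking, and only as an illustration rather than the driver's
actual code: per-queue CPU binding has something to work with only when
there are several queues to spread across CPUs.)

/* Illustrative sketch: with several queues, each queue's worker can be
 * pinned to its own online CPU, keeping completion work close to the
 * submitting CPU. With a single queue, every submitter funnels into one
 * worker CPU, so affinity tuning cannot remove that bottleneck.
 */
static int pick_io_cpu(int qid)
{
	int cpu, n = qid % num_online_cpus();

	/* take the n-th online CPU for this queue's worker */
	for_each_online_cpu(cpu)
		if (n-- == 0)
			return cpu;
	return WORK_CPU_UNBOUND;	/* unreachable while any CPU is online */
}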

I sincerely doubt we should try to 'optimize' for this rather peculiar setup.

Cheers,

Hannes