Re: [PATCH] nvme/tcp: Add support to set the tcp worker cpu affinity

From: Sagi Grimberg
Date: Wed Apr 19 2023 - 05:32:52 EST



Hey Li,

>>> The default worker affinity policy is to use all online cpus, e.g. from 0
>>> to N-1. However, some cpus are busy with other jobs, so nvme-tcp will
>>> see poor performance.
>>> This patch adds a module parameter to set the cpu affinity for the nvme-tcp
>>> socket worker threads. The parameter is a comma-separated list of CPU
>>> numbers. The list is parsed and the resulting cpumask is used to set the
>>> affinity of the socket worker threads. If the list is empty or the
>>> parsing fails, the default affinity is used.

>> I can see how this may benefit a specific set of workloads, but I have a
>> few issues with this.
>>
>> - This is exposing a user interface for something that is really
>>   internal to the driver.
>>
>> - This is something that can be misleading and could be tricky to get
>>   right; my concern is that this would only benefit a very niche case.
> Our storage products need this feature.
> If the user doesn't know what this is, they can keep the default, so I
> think this is acceptable.

It doesn't work like that. A user interface is not something exposed to
a specific consumer.

>> - If the setting should exist, it should not be global.
> V2 has fixed it.

>> - I prefer not to introduce new modparams.
>>
>> - I'd prefer to find a way to support your use-case without introducing
>>   a config knob for it.

> I'm looking forward to it.

If you change queue_work_on to queue_work, ignoring the io_cpu, does it
address your problem?
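Concretely, that suggestion amounts to something like the following hunk (illustrative, quoted from memory of drivers/nvme/host/tcp.c rather than against a specific tree), which hands the io_work item to any CPU the workqueue picks instead of pinning it to queue->io_cpu:

```diff
-	queue_work_on(queue->io_cpu, nvme_tcp_wq, &queue->io_work);
+	queue_work(nvme_tcp_wq, &queue->io_work);
```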

Not saying that this should be a solution though.

How many queues does your controller support, given that you happen to be
using queue 0?

Also, what happens if you don't pin your process to a specific cpu, does
that change anything?