Re: [PATCH V5 2/2] genirq/affinity: Spread vectors on node according to nr_cpu ratio

From: Derrick, Jonathan
Date: Fri Aug 16 2019 - 12:57:38 EST


On Fri, 2019-08-16 at 09:53 -0600, Keith Busch wrote:
> On Thu, Aug 15, 2019 at 07:28:49PM -0700, Ming Lei wrote:
> > Currently __irq_build_affinity_masks() spreads vectors evenly per node,
> > so not all vectors may be spread when NUMA nodes have different numbers
> > of CPUs, and the warning in irq_build_affinity_masks() can then be
> > triggered.
> >
> > Improve the current spreading algorithm by assigning vectors according
> > to the ratio of a node's nr_cpu to nr_remaining_cpus, while running the
> > assignment from smaller nodes to bigger nodes to guarantee that every
> > active node gets allocated at least one vector. This avoids cross-node
> > spread in the normal situation.
> >
> > This also fixes the reported warning.
> >
> > Another benefit is that the spread becomes fairer when nodes have
> > different numbers of CPUs.
> >
> > For example, on the following machine:
> > [root@ktest-01 ~]# lscpu
> > ...
> > CPU(s): 16
> > On-line CPU(s) list: 0-15
> > Thread(s) per core: 1
> > Core(s) per socket: 8
> > Socket(s): 2
> > NUMA node(s): 2
> > ...
> > NUMA node0 CPU(s): 0,1,3,5-9,11,13-15
> > NUMA node1 CPU(s): 2,4,10,12
> >
> > When a driver requests 8 vectors, the following spread is
> > obtained:
> > irq 31, cpu list 2,4
> > irq 32, cpu list 10,12
> > irq 33, cpu list 0-1
> > irq 34, cpu list 3,5
> > irq 35, cpu list 6-7
> > irq 36, cpu list 8-9
> > irq 37, cpu list 11,13
> > irq 38, cpu list 14-15
> >
> > Without this patch, a kernel warning is triggered in the above
> > situation, and the allocation was supposed to be 4 vectors per node.
> >
> > Cc: Christoph Hellwig <hch@xxxxxx>
> > Cc: Keith Busch <kbusch@xxxxxxxxxx>
> > Cc: linux-nvme@xxxxxxxxxxxxxxxxxxx
> > Cc: Jon Derrick <jonathan.derrick@xxxxxxxxx>
> > Cc: Jens Axboe <axboe@xxxxxxxxx>
> > Reported-by: Jon Derrick <jonathan.derrick@xxxxxxxxx>
> > Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxx>
>
> I had every intention to thoroughly test this on imbalanced node
> configurations, but that's not going to happen anytime soon. It looks
> correct to me, so I'll append my review here.
>
I can only test this with 2 nodes, but I have varied nr_cpus and used
different devices with fewer and more vectors than CPUs. Spread looks
good.
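
For reference, the per-node allocation described in the changelog can be
sketched in userspace roughly as follows. This is only an illustration of
the idea (smallest node first, each node getting a share proportional to
its CPUs out of the CPUs still remaining, with a floor of one vector); it
is not the kernel code. The function name, the dict-based interface, and
the requirement that there be at least one vector per active node are
assumptions made for the sketch.

```python
def alloc_node_vectors(node_cpus, numvecs):
    """Hypothetical sketch: map node id -> number of vectors.

    node_cpus: dict of node id -> number of CPUs on that node.
    numvecs:   total number of vectors to distribute.
    """
    # Only nodes that actually have CPUs take part in the spread.
    active = {node: ncpus for node, ncpus in node_cpus.items() if ncpus > 0}
    remaining_cpus = sum(active.values())

    # Assumption of this sketch: at least one vector per active node.
    if numvecs < len(active):
        raise ValueError("fewer vectors than active nodes")

    # Never hand out more vectors than there are CPUs.
    numvecs = min(numvecs, remaining_cpus)

    result = {}
    # Walk nodes from smallest to biggest so a small node's floor of one
    # vector is taken out before the bigger nodes get their share.
    for node, ncpus in sorted(active.items(), key=lambda kv: kv[1]):
        nvecs = max(1, numvecs * ncpus // remaining_cpus)
        result[node] = nvecs
        remaining_cpus -= ncpus
        numvecs -= nvecs
    return result


# The machine from the changelog: node0 has 12 CPUs, node1 has 4.
print(alloc_node_vectors({0: 12, 1: 4}, 8))
```

With 8 vectors on that machine, the sketch gives node1 two vectors and
node0 six, which matches the quoted spread (irq 31-32 on node1 CPUs,
irq 33-38 on node0 CPUs).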

Thank you

Reviewed-by: Jon Derrick <jonathan.derrick@xxxxxxxxx>


[snip]
