Re: [PATCH] nvme-pci: assign separate irq vectors for adminq and ioq0

From: jianchao.wang
Date: Wed Feb 28 2018 - 10:54:49 EST

In reply to: jianchao.wang: "Re: [PATCH] nvme-pci: assign separate irq vectors for adminq and ioq0"
Next in thread: Keith Busch: "Re: [PATCH] nvme-pci: assign separate irq vectors for adminq and ioq0"

On 02/28/2018 11:42 PM, jianchao.wang wrote:
> Hi Keith
>
> Thanks for your kind response and guidance.
>
> On 02/28/2018 11:27 PM, Keith Busch wrote:
>> On Wed, Feb 28, 2018 at 10:53:31AM +0800, jianchao.wang wrote:
>>> On 02/27/2018 11:13 PM, Keith Busch wrote:
>>>> On Tue, Feb 27, 2018 at 04:46:17PM +0800, Jianchao Wang wrote:
>>>>> Currently, adminq and ioq0 share the same irq vector. This is
>>>>> unfair to both adminq and ioq0.
>>>>> - For adminq, its completion irq has to be bound to cpu0.
>>>>> - For ioq0, when the irq fires for io completion, the adminq irq
>>>>> action also has to be checked.
>>>>
>>>> This change log could use some improvements. Why is it bad if admin
>>>> interrupts affinity is with cpu0?
>>>
>>> adminq interrupts should be able to fire on any CPU.
>>> Do we have any reason to bind them to cpu0?
>>
>> Your patch will have the admin vector CPU affinity mask set to
>> 0xff..ff. The first set bit for an online CPU is the one the IRQ handler
>> will run on, so the admin queue will still only run on CPU 0.
>
> Hmm, yes.
> When I test with only one irq vector, I get the following result:
> 124: 0 0 253541 0 0 0 0 0 IR-PCI-MSI 1048576-edge nvme0q0, nvme0q1
>

The irqbalance daemon may migrate the adminq irq away from cpu0.
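
As a minimal userspace sketch of the affinity point quoted above (an
all-ones mask still resolves to the lowest online CPU unless something
rewrites the mask), assuming at most 64 CPUs. first_target_cpu() is a
hypothetical helper for illustration, not a kernel function, and the real
vector allocation in the kernel is more involved than this:

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): return the lowest CPU number
 * that is set in both the affinity mask and the online mask, or -1 if
 * the two masks have no CPU in common.
 */
static int first_target_cpu(uint64_t affinity, uint64_t online)
{
	uint64_t eligible = affinity & online;

	for (int cpu = 0; cpu < 64; cpu++) {
		if (eligible & (UINT64_C(1) << cpu))
			return cpu;
	}
	return -1;
}
```

With eight CPUs online (online mask 0xff) and an all-ones affinity mask,
this returns 0. If the mask were narrowed to 0x4, for instance by
irqbalance, it returns 2, which is consistent with the count appearing in
the third CPU column of the output above.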

>>
>>>> Are you able to measure _any_ performance difference on IO queue 1 vs IO
>>>> queue 2 that you can attribute to IO queue 1's sharing vector 0?
>>>
>>> Actually, I didn't get any performance improvement on my own NVMe card.
>>> But it may be needed on some enterprise cards, especially when the media is persistent memory.
>>> nvme_irq will be invoked twice when the ioq0 irq fires, which introduces another unnecessary DMA
>>> access to the cq entry.
>>
>> A CPU reading its own memory isn't a DMA. It's just a cheap memory read.
>
> Oh sorry, my bad. I mean it is an operation on a DMA address, which is uncached.
> nvme_irq
>   -> nvme_process_cq
>     -> nvme_read_cqe
>       -> nvme_cqe_valid
>
> static inline bool nvme_cqe_valid(struct nvme_queue *nvmeq, u16 head,
> 		u16 phase)
> {
> 	return (le16_to_cpu(nvmeq->cqes[head].status) & 1) == phase;
> }
>
> Sincerely
> Jianchao
>
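
For reference, the extra pass through nvme_process_cq on an empty queue
comes down to one read of the phase bit. A simplified userspace sketch of
that check follows. The demo_* names and the cut-down entry layout are
assumptions for illustration only: the real struct nvme_completion has
more fields, and the kernel converts the status word with le16_to_cpu,
which this sketch skips by assuming a little-endian host.

```c
#include <stdbool.h>
#include <stdint.h>

/* Cut-down stand-in for a completion queue entry (hypothetical layout). */
struct demo_cqe {
	uint16_t status;	/* bit 0 carries the phase tag */
};

/* True when the entry at head carries the phase the host expects. */
static bool demo_cqe_valid(const volatile struct demo_cqe *cqes,
			   uint16_t head, uint16_t phase)
{
	return (cqes[head].status & 1) == phase;
}

/*
 * Consume every new entry, advancing head and flipping the expected
 * phase each time the queue wraps. Returns the number of entries found.
 */
static unsigned int demo_process_cq(const volatile struct demo_cqe *cqes,
				    uint16_t depth, uint16_t *head,
				    uint16_t *phase)
{
	unsigned int found = 0;

	while (demo_cqe_valid(cqes, *head, *phase)) {
		found++;
		if (++*head == depth) {
			*head = 0;
			*phase ^= 1;
		}
	}
	return found;
}
```

Calling demo_process_cq() a second time with no new entries returns 0
after a single phase-bit read, which is the cost being discussed for the
queue that did not raise the interrupt.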
