RE: [PATCH RT] vfio-pci: Set MSI/MSI-X ISR to non-threaded

From: Su, David W
Date: Wed Dec 06 2017 - 20:27:10 EST


From: Steven Rostedt [mailto:rostedt@xxxxxxxxxxx]
Sent: Thursday, November 30, 2017 6:09 PM
>
>On Thu, 30 Nov 2017 17:05:35 -0800
>David Su <david.w.su@xxxxxxxxx> wrote:
>
>> Setting MSI/MSI-X ISR to be non-threaded will result in shorter and more
>> deterministic IRQ delivery latencies to VFIO applications, because
>> context switches to the ISR thread are eliminated. This is important
>> for applications with low latency requirement running in virtual
>> machines on RT Linux host with assigned devices through vfio-pci.
>>
>> A FPGA based interrupt testing device was used to compare latencies with
>> threaded and non-threaded vfio-pci ISR. The device has a free running
>> time stamp counter and a register recording the time an interrupt was
>> sent to the host. With these registers the device driver and test
>> application for the device are able to calculate and record the latency
>> between the time an interrupt was sent and the time the ISR in the
>> device's driver was invoked.
>>
>> The result is with non-threaded vfio-pci ISR the average latency is
>> reduced by about 54% and the maximum-minimum latency range is reduced
>by
>> about 65%.
>>
>> Non-threaded vfio-pci ISR:
>> Minimum 4.18us, Average 4.47us, Maximum 10.26us
>>
>> Threaded vfio-pci ISR:
>> Minimum 8.97us, Average 9.65us, Maximum 26.11us
>>
>> Signed-off-by: David Su <david.w.su@xxxxxxxxx>
>> ---
>> drivers/vfio/pci/vfio_pci_intrs.c | 2 +-
>> 1 files changed, 1 insertions(+), 1 deletions(-)
>>
>> diff --git a/drivers/vfio/pci/vfio_pci_intrs.c
>b/drivers/vfio/pci/vfio_pci_intrs.c
>> index 1c46045..4c54e56 100644
>> --- a/drivers/vfio/pci/vfio_pci_intrs.c
>> +++ b/drivers/vfio/pci/vfio_pci_intrs.c
>> @@ -333,7 +333,7 @@ static int vfio_msi_set_vector_signal(struct
>vfio_pci_device *vdev,
>> pci_write_msi_msg(irq, &msg);
>> }
>>
>> - ret = request_irq(irq, vfio_msihandler, 0,
>> + ret = request_irq(irq, vfio_msihandler, IRQF_NO_THREAD,
>> vdev->ctx[vector].name, trigger);
>
>Hmm, but we have this:
>static irqreturn_t vfio_msihandler(int irq, void *arg)
>{
> struct eventfd_ctx *trigger = arg;
>
> eventfd_signal(trigger, 1);
> return IRQ_HANDLED;
>}
>
>__u64 eventfd_signal(struct eventfd_ctx *ctx, __u64 n)
>{
> unsigned long flags;
>
> spin_lock_irqsave(&ctx->wqh.lock, flags);
> if (ULLONG_MAX - ctx->count < n)
> n = ULLONG_MAX - ctx->count;
> ctx->count += n;
> if (waitqueue_active(&ctx->wqh))
> wake_up_locked_poll(&ctx->wqh, POLLIN);
> spin_unlock_irqrestore(&ctx->wqh.lock, flags);
>
> return n;
>}
>
>And spin_lock() turns into a mutex in PREEMPT_RT, which means it can
>sleep. You can't sleep in hard interrupt context. This will eventually
>crash the kernel.

Steve, thanks for your review and comment.

I can think of 2 scenarios where there is contention for the eventfd
context lock.

One scenario is an eventfd is used to notify a VFIO application of
2 or more IRQs. But in this case the application wouldn't be able to
tell which IRQ occurred and so I think it should be considered a
programming error of the application and not a proper usage of
VFIO.

The other is a device IRQ is configured to be delivered to multiple
CPU cores at the same time. However, I have never seen such a
device and cannot think of any good reason for a device to be
designed this way.

So, IMHO it is safe to set vfio-pci ISR to non-threaded.

>
>And no, we are not going to convert the ctx->wqh.lock into a
>raw_spin_lock.
>
>-- Steve
>
>> if (ret) {
>> kfree(vdev->ctx[vector].name);