vfio_pin_map_dma causes synchronize_sched to wait too long

From: Longpeng (Mike)
Date: Mon Dec 02 2019 - 04:10:53 EST


Hi guys,

Suppose there are two VMs: VM1 is bound to node-0 and is calling
vfio_pin_map_dma(); VM2 is an incoming migration VM bound to node-1. We found
that vm_start() (a QEMU function) of VM2 occasionally takes too long, for the
following reason.

- VM2 -
qemu: vm_start
        vm_start_notify
          virtio_vmstate_change
            virtio_pci_vmstate_change
              virtio_pci_start_ioeventfd
                virtio_device_start_ioeventfd_impl
                  event_notifier_init
                    eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC) <-- too long
kern: sys_eventfd2
        get_unused_fd_flags
          __alloc_fd
            expand_files
              expand_fdtable
                synchronize_sched <-- too long
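
To observe the stall from user space, a minimal sketch like the following
could time each eventfd() call while the fd table grows. The helper name
probe_eventfd_latency and the MAX_PROBE_FDS cap are hypothetical, chosen only
for illustration; this is not QEMU code. As far as I understand,
expand_fdtable only waits for a grace period when the files struct is shared
by more than one thread, so the probe would have to run inside a
multi-threaded process to hit the slow path.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <sys/eventfd.h>
#include <time.h>
#include <unistd.h>

#define MAX_PROBE_FDS 512   /* hypothetical cap, keep below RLIMIT_NOFILE */

/* Monotonic clock reading in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Create up to `count` eventfds with the same flags as in the trace above,
 * timing each call. Stores the latency and index of the slowest call and
 * returns the number of eventfds created. All fds are closed before
 * returning.
 */
int probe_eventfd_latency(int count, uint64_t *worst_ns, int *worst_idx)
{
    int fds[MAX_PROBE_FDS];
    int created = 0;

    *worst_ns = 0;
    *worst_idx = -1;
    if (count > MAX_PROBE_FDS)
        count = MAX_PROBE_FDS;

    for (int i = 0; i < count; i++) {
        uint64_t t0 = now_ns();
        int fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
        uint64_t dt = now_ns() - t0;

        if (fd < 0)
            break;              /* e.g. EMFILE: stop probing */
        fds[created++] = fd;

        /* Record the first call unconditionally, then any slower one. */
        if (created == 1 || dt > *worst_ns) {
            *worst_ns = dt;
            *worst_idx = i;
        }
    }

    for (int i = 0; i < created; i++)
        close(fds[i]);
    return created;
}
```

Calling probe_eventfd_latency(200, &worst, &idx) from one thread while
another CPU is busy in vfio_pin_map_dma should, if the analysis here is
right, show the worst latency on a call that crosses an fd table size
boundary.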

- VM1 -
VM1 is running vfio_pin_map_dma at the same time.

The rcu-sched grace period cannot elapse until the CPU running
vfio_pin_map_dma has finished it, so synchronize_sched waits for a long time.

Is there any solution to this? Any suggestions would be greatly appreciated, thanks!
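
One user-space workaround that might be worth trying is to grow the fd table
early, before the latency-sensitive vm_start path, so that later eventfd()
calls no longer reach expand_fdtable. The sketch below is untested against
QEMU and rests on two assumptions: that duplicating a descriptor to a high
number forces the table to cover that index, and that the table is not shrunk
again when the descriptor is closed. The helper name reserve_fd_table is
hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Ask the kernel to size this process's fd table for at least `min_fds`
 * descriptors. Returns 0 on success, -1 on error with errno set.
 */
int reserve_fd_table(int min_fds)
{
    int src, high, saved;

    if (min_fds <= 0) {
        errno = EINVAL;
        return -1;
    }

    /* Any valid descriptor works as the source of the duplicate. */
    src = open("/dev/null", O_RDONLY | O_CLOEXEC);
    if (src < 0)
        return -1;

    /*
     * F_DUPFD_CLOEXEC picks the lowest free fd >= min_fds - 1, so the
     * table has to grow to hold at least that index.
     */
    high = fcntl(src, F_DUPFD_CLOEXEC, min_fds - 1);
    saved = errno;
    close(src);
    if (high < 0) {
        errno = saved;          /* e.g. EINVAL/EMFILE above RLIMIT_NOFILE */
        return -1;
    }

    /* Assumption: closing the fd leaves the enlarged table in place. */
    close(high);
    return 0;
}
```

The expansion done by this call can itself hit synchronize_sched in a
multi-threaded process, so it would need to run at startup, ideally before
other threads exist. It only moves the wait out of the critical path and does
not address the long grace period itself.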

--
Regards,
Longpeng(Mike)