Re: 5.13-rt1 + KVM = WARNING: at fs/eventfd.c:74 eventfd_signal()

From: Jason Wang
Date: Wed Jul 14 2021 - 05:23:15 EST



On 2021/7/14 4:10 PM, Paolo Bonzini wrote:
On 14/07/21 10:01, Daniel Bristot de Oliveira wrote:
Hey

I use kvm-vm for regular development, and while using the kernel-rt v5.13-rt1
(the latest) on the host, and a regular kernel on the guest, after a while,
this happens:

[ 1723.404979] ------------[ cut here ]------------
[ 1723.404981] WARNING: CPU: 12 PID: 2554 at fs/eventfd.c:74 eventfd_signal+0x7e/0x90

[ 1723.405055] RIP: 0010:eventfd_signal+0x7e/0x90
[ 1723.405059] Code: 01 00 00 00 be 03 00 00 00 4c 89 ef e8 5b ec d9 ff 65 ff 0d e4 34 c9 5a 4c 89 ef e8 ec a8 86 00 4c 89 e0 5b 5d 41 5c 41 5d c3 <0f> 0b 45 31 e4 5b 5d 4c 89 e0 41 5c 41 5d c3 0f 1f 00 0f 1f 44 00
[ 1723.405078]  vhost_tx_batch.constprop.0+0x7d/0xc0 [vhost_net]
[ 1723.405083]  handle_tx_copy+0x15b/0x5c0 [vhost_net]
[ 1723.405088]  ? __vhost_add_used_n+0x200/0x200 [vhost]
[ 1723.405092]  handle_tx+0xa5/0xe0 [vhost_net]
[ 1723.405095]  vhost_worker+0x93/0xd0 [vhost]
[ 1723.405099]  kthread+0x186/0x1a0
[ 1723.405103]  ? __kthread_parkme+0xa0/0xa0
[ 1723.405105]  ret_from_fork+0x22/0x30
[ 1723.405110] ---[ end trace 0000000000000002 ]---

The WARN has this comment above:

        /*
         * Deadlock or stack overflow issues can happen if we recurse here
         * through waitqueue wakeup handlers. If the caller users potentially
         * nested waitqueues with custom wakeup handlers, then it should
         * check eventfd_signal_count() before calling this function. If
         * it returns true, the eventfd_signal() call should be deferred to a
         * safe context.
         */

This was added in 2020, so it's unlikely to be the direct cause of the
change.  What is a known-good version for the host?
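
For readers unfamiliar with the check described in the quoted comment, here is a minimal userspace sketch of the idea. It is not the kernel code: all model_* names are hypothetical, a thread-local counter stands in for the kernel's per-CPU counter, and the real function WARNs where this one just returns 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Thread-local stand-in (assumption) for the kernel's per-CPU recursion counter. */
static _Thread_local int model_wake_count;

struct model_eventfd {
	uint64_t count;                              /* accumulated signal count */
	void (*on_wake)(struct model_eventfd *self); /* custom wakeup handler, may be NULL */
	bool deferred;                               /* set when a signal had to be postponed */
};

/* Plays the role of eventfd_signal_count(): true while a signal is in flight. */
static bool model_signal_count(void)
{
	return model_wake_count != 0;
}

/* Refuses to recurse: a nested call returns 0 (the kernel WARNs at this point). */
static uint64_t model_eventfd_signal(struct model_eventfd *ctx, uint64_t n)
{
	if (model_wake_count)
		return 0;

	model_wake_count++;
	ctx->count += n;
	if (ctx->on_wake)
		ctx->on_wake(ctx);	/* wakeup handlers run with the counter raised */
	model_wake_count--;
	assert(model_wake_count == 0);	/* the guard keeps the depth at most 1 */
	return n;
}

/* What the comment asks callers to do: check first, defer if already nested. */
static uint64_t model_signal_or_defer(struct model_eventfd *ctx, uint64_t n)
{
	if (model_signal_count()) {
		ctx->deferred = true;	/* a real caller would queue work for a safe context */
		return 0;
	}
	return model_eventfd_signal(ctx, n);
}

static struct model_eventfd model_inner;

/* A wakeup handler that tries to signal a second eventfd from inside a wakeup. */
static void model_nested_wake(struct model_eventfd *self)
{
	(void)self;
	model_signal_or_defer(&model_inner, 1);
}
```

With model_nested_wake installed as the handler of an outer eventfd, signalling the outer one succeeds, while the nested signal to model_inner is only marked as deferred and can be retried once the outer call has returned.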

Since it is not KVM stuff, I'm CCing Michael and Jason.

Paolo


I think this can probably be fixed by the patch here:

https://lore.kernel.org/lkml/20210618084412.18257-1-zhe.he@xxxxxxxxxxxxx/

Thanks