[BUG] vhost: two possible deadlocks involving locking and waiting

From: Jia-Ju Bai
Date: Tue Feb 01 2022 - 02:27:59 EST


Hello,

My static analysis tool reports two possible deadlocks in the vhost driver in Linux 5.16:

#BUG 1
vhost_net_set_backend()
  mutex_lock(&n->dev.mutex); --> Line 1511 (Lock A)
  vhost_net_ubuf_put_wait_and_free()
    vhost_net_ubuf_put_and_wait()
      wait_event(ubufs->wait, ...); --> Line 260 (Wait X)

vhost_net_ioctl()
  mutex_lock(&n->dev.mutex); --> Line 1734 (Lock A)
  vhost_net_flush()
    vhost_net_ubuf_put_and_wait()
      vhost_net_ubuf_put()
        wake_up(&ubufs->wait); --> Line 253 (Wake X)

When vhost_net_set_backend() is executed, "Wait X" is performed while holding "Lock A". If vhost_net_ioctl() is executed at this time, it cannot reach "Wake X" to wake up "Wait X" in vhost_net_set_backend(), because "Lock A" is already held by vhost_net_set_backend(), causing a possible deadlock.

#BUG 2
vhost_net_set_backend()
  mutex_lock(&vq->mutex); --> Line 1522 (Lock A)
  vhost_net_ubuf_put_wait_and_free()
    vhost_net_ubuf_put_and_wait()
      wait_event(ubufs->wait, ...); --> Line 260 (Wait X)

handle_tx()
  mutex_lock_nested(&vq->mutex, ...); --> Line 966 (Lock A)
  handle_tx_zerocopy()
    vhost_net_ubuf_put()
      wake_up(&ubufs->wait); --> Line 253 (Wake X)

When vhost_net_set_backend() is executed, "Wait X" is performed while holding "Lock A". If handle_tx() is executed at this time, it cannot reach "Wake X" to wake up "Wait X" in vhost_net_set_backend(), because "Lock A" is already held by vhost_net_set_backend(), causing a possible deadlock.
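
To make the reported pattern concrete, here is a minimal userspace sketch of it using pthreads. It is not vhost code: struct model, waker_path() and wake_path_blocked_by_lock_a() are hypothetical names, and the mutex and flag only stand in for "Lock A" and the "Wait X" condition. It also assumes that the only path to "Wake X" needs "Lock A" first; whether that holds in the driver is exactly what I am unsure about. The waker uses trylock so the sketch reports the blocked state instead of actually hanging.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins; these are NOT the vhost structures. */
struct model {
	pthread_mutex_t lock_a; /* models "Lock A" */
	bool woken;             /* models the condition "Wait X" sleeps on */
};

/*
 * Models the waking side. Assumption: it has to take Lock A before it
 * can reach Wake X. trylock is used so the model never really hangs.
 */
static void *waker_path(void *arg)
{
	struct model *m = arg;

	if (pthread_mutex_trylock(&m->lock_a) == 0) {
		m->woken = true; /* Wake X */
		pthread_mutex_unlock(&m->lock_a);
	}
	return NULL;
}

/*
 * Models the waiting side. Returns 1 if the waker could not reach
 * Wake X while Lock A was held, 0 otherwise.
 */
int wake_path_blocked_by_lock_a(void)
{
	struct model m = { .woken = false };
	pthread_t waker;
	int blocked;

	pthread_mutex_init(&m.lock_a, NULL);
	pthread_mutex_lock(&m.lock_a); /* Lock A */

	/* The waker runs to completion while Lock A is still held. */
	pthread_create(&waker, NULL, waker_path, &m);
	pthread_join(waker, NULL);

	/* A real wait_event() at this point would sleep on this condition. */
	blocked = !m.woken;

	pthread_mutex_unlock(&m.lock_a);
	pthread_mutex_destroy(&m.lock_a);
	return blocked;
}
```

Calling wake_path_blocked_by_lock_a() from a small main() (built with -pthread) returns 1 under the stated assumption. If some other path can perform "Wake X" without taking "Lock A", the model does not apply and the reports would be false positives.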

I am not quite sure whether these possible problems are real, or how to fix them if they are.
Any feedback would be appreciated, thanks :)


Best wishes,
Jia-Ju Bai