[PATCH v2 3/3] hvc/xen: fix console unplug

From: David Woodhouse
Date: Fri Oct 20 2023 - 12:16:08 EST


From: David Woodhouse <dwmw@xxxxxxxxxxxx>

On unplug of a Xen console, xencons_disconnect_backend() unconditionally
calls free_irq() via unbind_from_irqhandler(), causing a warning of
freeing an already-free IRQ:

(qemu) device_del con1
[ 32.050919] ------------[ cut here ]------------
[ 32.050942] Trying to free already-free IRQ 33
[ 32.050990] WARNING: CPU: 0 PID: 51 at kernel/irq/manage.c:1895 __free_irq+0x1d4/0x330

It should be using evtchn_put() to tear down the event channel binding,
and let the Linux IRQ side of it be handled by notifier_del_irq() through
the HVC code.

On which topic... xencons_disconnect_backend() should call hvc_remove()
*first*, rather than tearing down the event channel and grant mapping
while they are in use. And then the IRQ is guaranteed to be freed by
the time it's torn down by evtchn_put().

Since evtchn_put() also closes the actual event channel, avoid calling
xenbus_free_evtchn() except in the failure path where the IRQ was not
successfully set up.

However, calling hvc_remove() at the start of xencons_disconnect_backend()
still isn't early enough. An unplug request is indicated by the backend
setting its state to XenbusStateClosing, which triggers a notification
to xencons_backend_changed(), which... does nothing except set its own
frontend state directly to XenbusStateClosed without *actually* tearing
down the HVC device or, you know, making sure it isn't actively in use.

So the backend sees the guest frontend set its state to XenbusStateClosed
and stops servicing the interrupt... and the guest spins for ever in the
domU_write_console() function waiting for the ring to drain.

Fix that one by calling hvc_remove() from xencons_backend_changed() before
signalling to the backend that it's OK to proceed with the removal.

Tested with 'dd if=/dev/zero of=/dev/hvc1' while telling Qemu to remove
the console device.

Signed-off-by: David Woodhouse <dwmw@xxxxxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx
---
drivers/tty/hvc/hvc_xen.c | 32 ++++++++++++++++++++++++--------
1 file changed, 24 insertions(+), 8 deletions(-)

diff --git a/drivers/tty/hvc/hvc_xen.c b/drivers/tty/hvc/hvc_xen.c
index 4a768b504263..34c01874f45b 100644
--- a/drivers/tty/hvc/hvc_xen.c
+++ b/drivers/tty/hvc/hvc_xen.c
@@ -377,18 +377,21 @@ void xen_console_resume(void)
#ifdef CONFIG_HVC_XEN_FRONTEND
static void xencons_disconnect_backend(struct xencons_info *info)
{
- if (info->irq > 0)
- unbind_from_irqhandler(info->irq, NULL);
- info->irq = 0;
+ if (info->hvc != NULL)
+ hvc_remove(info->hvc);
+ info->hvc = NULL;
+ if (info->irq > 0) {
+ evtchn_put(info->evtchn);
+ info->irq = 0;
+ info->evtchn = 0;
+ }
+ /* evtchn_put() will also close it so this is only an error path */
if (info->evtchn > 0)
xenbus_free_evtchn(info->xbdev, info->evtchn);
info->evtchn = 0;
if (info->gntref > 0)
gnttab_free_grant_references(info->gntref);
info->gntref = 0;
- if (info->hvc != NULL)
- hvc_remove(info->hvc);
- info->hvc = NULL;
}

static void xencons_free(struct xencons_info *info)
@@ -553,10 +556,23 @@ static void xencons_backend_changed(struct xenbus_device *dev,
if (dev->state == XenbusStateClosed)
break;
fallthrough; /* Missed the backend's CLOSING state */
- case XenbusStateClosing:
+ case XenbusStateClosing: {
+ struct xencons_info *info = dev_get_drvdata(&dev->dev);;
+
+ /*
+ * Don't tear down the evtchn and grant ref before the other
+ * end has disconnected, but do stop userspace from trying
+ * to use the device before we allow the backend to close.
+ */
+ if (info->hvc) {
+ hvc_remove(info->hvc);
+ info->hvc = NULL;
+ }
+
xenbus_frontend_closed(dev);
break;
}
+ }
}

static const struct xenbus_device_id xencons_ids[] = {
@@ -616,7 +632,7 @@ static int __init xen_hvc_init(void)
list_del(&info->list);
spin_unlock_irqrestore(&xencons_lock, flags);
if (info->irq)
- unbind_from_irqhandler(info->irq, NULL);
+ evtchn_put(info->evtchn);
kfree(info);
return r;
}
--
2.40.1