Re: [PATCH v11] drm: Unplug drm device when unregistering it (v8)

From: Chris Wilson
Date: Thu Jun 01 2017 - 09:04:39 EST


On Thu, Jun 01, 2017 at 02:26:01PM +0200, Hans de Goede wrote:
> Hi,
>
> On 01-06-17 14:22, Chris Wilson wrote:
> >On Thu, Jun 01, 2017 at 02:13:28PM +0200, Hans de Goede wrote:
> >>Hi,
> >>
> >>On 31-05-17 04:39, jeffy wrote:
> >>>Hi Hans,
> >>>
> >>>thanx for investigating :)
> >>>
> >>>On 05/30/2017 03:06 PM, Hans de Goede wrote:
> >>>>Hi,
> >>>>
> >>>>On 29-05-17 22:25, Chris Wilson wrote:
> >>>>>On Fri, Apr 14, 2017 at 11:15:04AM -0400, Sean Paul wrote:
> >>>>>>On Thu, Apr 13, 2017 at 03:32:44PM +0800, Jeffy Chen wrote:
> >>>>>>>After unbinding drm, the user space may still owns the drm dev fd, and
> >>>>>>>may still be able to call drm ioctl.
> >>>>>>>
> >>>>>>>We're using an unplugged state to prevent something like that, so let's
> >>>>>>>reuse it here.
> >>>>>>>
> >>>>>>>Also drop drm_unplug_dev, because it would be unused after other
> >>>>>>>changes.
> >>>>>>>
> >>>>>>>Verified on rk3399 chromebook kevin(with cros 4.4 kernel), no more
> >>>>>>>crashes
> >>>>>>>when unbinding drm with ui service still running.
> >>>>>>>
> >>>>>>>v2: Fix some commit messages.
> >>>>>>>v3: Reuse unplug status.
> >>>>>>>v4: Add drm_device_set_plug_state helper.
> >>>>>>>v5: Fix hang when unregistering drm dev with open_count 0.
> >>>>>>>v6: Move drm_device_set_plug_state into drm_drv.
> >>>>>>>v7: Add missing drm_dev_unref in udl_drv.
> >>>>>>>v8: Fix compiler errors after enable udl.
> >>>>>>>
> >>>>>>>Signed-off-by: Jeffy Chen <jeffy.chen@xxxxxxxxxxxxxx>
> >>>>>>>
> >>>>>>>---
> >>>>>>
> >>>>>>Hi Jeffy,
> >>>>>>Given the trouble we've had with this patch already, coupled with the
> >>>>>>fact that
> >>>>>>unbinding while userspace is active is a contrived/pathological case,
> >>>>>>I don't
> >>>>>>think it's worth picking this patch upstream.
> >>>>>>
> >>>>>>If it's really causing issues downstream, you can add my Reviewed-by
> >>>>>>for a CHROMIUM
> >>>>>>patch, but I'd rather not carry patches in the CrOS repo if we don't
> >>>>>>need to.
> >>>>>
> >>>>>Would a
> >>>>>
> >>>>>Fixes: a39be606f99d ("drm: Do a full device unregister when unplugging")
> >>>>>Reported-by: Marco Diego Aurélio Mesquita <marcodiegomesquita@xxxxxxxxx>
> >>>>>Cc: Hans de Goede <hdegoede@xxxxxxxxxx>
> >>>>>
> >>>>>convince us to look into this patch again?
> >>>>
> >>>>The problem is this patch is wrong, see below.
> >>>>
> >>>>>>> drivers/gpu/drm/drm_drv.c | 26 ++++++++++----------------
> >>>>>>> drivers/gpu/drm/udl/udl_drv.c | 3 ++-
> >>>>>>> include/drm/drmP.h | 6 ------
> >>>>>>> include/drm/drm_drv.h | 1 -
> >>>>>>> 4 files changed, 12 insertions(+), 24 deletions(-)
> >>>>>>>
> >>>>>>>diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> >>>>>>>index b5c6bb4..e1da4d1 100644
> >>>>>>>--- a/drivers/gpu/drm/drm_drv.c
> >>>>>>>+++ b/drivers/gpu/drm/drm_drv.c
> >>>>>>>@@ -355,22 +355,6 @@ void drm_put_dev(struct drm_device *dev)
> >>>>>>> }
> >>>>>>> EXPORT_SYMBOL(drm_put_dev);
> >>>>>>>-void drm_unplug_dev(struct drm_device *dev)
> >>>>>>>-{
> >>>>>>>- /* for a USB device */
> >>>>>>>- drm_dev_unregister(dev);
> >>>>>>>-
> >>>>>>>- mutex_lock(&drm_global_mutex);
> >>>>>>>-
> >>>>>>>- drm_device_set_unplugged(dev);
> >>>>>>>-
> >>>>>>>- if (dev->open_count == 0) {
> >>>>>>>- drm_put_dev(dev);
> >>>>>>>- }
> >>>>>>>- mutex_unlock(&drm_global_mutex);
> >>>>>>>-}
> >>>>>>>-EXPORT_SYMBOL(drm_unplug_dev);
> >>>>>>>-
> >>>>>>> /*
> >>>>>>> * DRM internal mount
> >>>>>>> * We want to be able to allocate our own "struct address_space"
> >>>>>>>to control
> >>>>>>>@@ -733,6 +717,13 @@ static void remove_compat_control_link(struct
> >>>>>>>drm_device *dev)
> >>>>>>> kfree(name);
> >>>>>>> }
> >>>>>>>+static inline void drm_device_set_plug_state(struct drm_device *dev,
> >>>>>>>+ bool plugged)
> >>>>>>>+{
> >>>>>>>+ smp_wmb();
> >>>>>>>+ atomic_set(&dev->unplugged, !plugged);
> >>>>>>>+}
> >>>>>>>+
> >>>>>>> /**
> >>>>>>> * drm_dev_register - Register DRM device
> >>>>>>> * @dev: Device to register
> >>>>>>>@@ -787,6 +778,8 @@ int drm_dev_register(struct drm_device *dev,
> >>>>>>>unsigned long flags)
> >>>>>>> if (drm_core_check_feature(dev, DRIVER_MODESET))
> >>>>>>> drm_modeset_register_all(dev);
> >>>>>>>+ drm_device_set_plug_state(dev, true);
> >>>>>>>+
> >>>>>>> ret = 0;
> >>>>>>> DRM_INFO("Initialized %s %d.%d.%d %s for %s on minor %d\n",
> >>>>>>>@@ -826,6 +819,7 @@ void drm_dev_unregister(struct drm_device *dev)
> >>>>>>> drm_lastclose(dev);
> >>>>>>> dev->registered = false;
> >>>>>>>+ drm_device_set_plug_state(dev, false);
> >>>>>>> if (drm_core_check_feature(dev, DRIVER_MODESET))
> >>>>>>> drm_modeset_unregister_all(dev);
> >>>>>>>diff --git a/drivers/gpu/drm/udl/udl_drv.c
> >>>>>>>b/drivers/gpu/drm/udl/udl_drv.c
> >>>>>>>index cd8b017..fc73e24 100644
> >>>>>>>--- a/drivers/gpu/drm/udl/udl_drv.c
> >>>>>>>+++ b/drivers/gpu/drm/udl/udl_drv.c
> >>>>>>>@@ -108,7 +108,8 @@ static void udl_usb_disconnect(struct
> >>>>>>>usb_interface *interface)
> >>>>>>> drm_kms_helper_poll_disable(dev);
> >>>>>>> udl_fbdev_unplug(dev);
> >>>>>>> udl_drop_usb(dev);
> >>>>>>>- drm_unplug_dev(dev);
> >>>>>>>+ drm_dev_unregister(dev);
> >>>>>>>+ drm_dev_unref(dev);
> >>>>
> >>>>The unref here will cause the device struct to get free-ed even if
> >>>>userspace still holds references to it through the drm_dev. To fix
> >>>>this we would need to call drm_dev_ref on a open from userspace and
> >>>>drm_dev_unref from drm_release.
> >>>
> >>>right, but i think we are already did the ref/unref in the open/release through drm_minor_acquire/drm_minor_release.
> >>
> >>Ah yes, I see. Still calling drm_dev_unregister() directly from
> >>udl_usb_disconnect() is not going to work, see the patch titled
> >>"drm: Fix oops + Xserver hang when unplugging USB drm devices"
> >>
> >>The problem is that drm_dev_unregister() probably should be
> >>split into a drm_dev_unregister() and drm_dev_cleanup()
> >>function with the cleanup part getting called by the last unref,
> >>and at least calling drm_lastclose and the driver->unload call
> >>needs to be moved to the new drm_dev_cleanup.
> >
> >It already is. driver->unload() has been deprecated for a while, in
> >spirit at least, we probably should add a warning to it, and is
> >not meant to be used anymore.
>
> Ok, what about drm_lastclose ? Calling that while userspace still
> has fds open seems wrong too. Same for this part of drm_dev_unregister():

lastclose should not be a part of unregister, or it is spectacularly
misnamed.

> list_for_each_entry_safe(r_list, list_temp, &dev->maplist, head)
> drm_legacy_rmmap(dev, r_list->map);

Just give Daniel some more fuel for killing legacy, he'll be thankful.
>
> Removing mmapping backing data (if I understand correctly) while
> it may still be used seems like a bad idea too, so I do believe that
> drm_dev_unregister needs some love / some bits moved to the last
> drm_dev_put call before it will be safe to call it directly
> on usb-disconnect.

True, it needs more love, and the usb devices are the best test cases
for this unless we write some hw independent selftests.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre