Re: [ 02/38] PCI/PM: Fix deadlock when unbinding device if parentin D3cold

From: Ben Hutchings
Date: Thu Nov 22 2012 - 21:35:41 EST


On Wed, 2012-11-21 at 16:39 -0800, Greg Kroah-Hartman wrote:
> 3.0-stable review patch. If anyone has any objections, please let me know.
>
> ------------------
>
> From: Huang Ying <ying.huang@xxxxxxxxx>
>
> commit 90b5c1d7c45eeb622302680ff96ed30c1a2b6f0e upstream.
>
> If a PCI device and its parents are put into D3cold, unbinding the
> device will trigger deadlock as follow:
>
> - driver_unbind
> - device_release_driver
> - device_lock(dev) <--- previous lock here
> - __device_release_driver
> - pm_runtime_get_sync
> ...
> - rpm_resume(dev)
> - rpm_resume(dev->parent)
> ...
> - pci_pm_runtime_resume
> ...
> - pci_set_power_state
> - __pci_start_power_transition
> - pci_wakeup_bus(dev->parent->subordinate)
> - pci_walk_bus
> - device_lock(dev) <--- deadlock here
>
>
> If we do not do device_lock in pci_walk_bus, we can avoid deadlock.
> Device_lock in pci_walk_bus is introduced in commit:
> d71374dafbba7ec3f67371d3b7e9f6310a588808, corresponding email thread
> is: https://lkml.org/lkml/2006/5/26/38. The patch author Zhang Yanmin
> said device_lock is added to pci_walk_bus because:
>
> Some error handling functions call pci_walk_bus. For example, PCIe
> aer. Here we lock the device, so the driver wouldn't detach from the
> device, as the cb might call driver's callback function.
>
> So I fixed the deadlock as follows:
>
> - remove device_lock from pci_walk_bus
> - add device_lock into callback if callback will call driver's callback
>
> I checked pci_walk_bus users one by one, and found only PCIe aer needs
> device lock.
[...]

What about eeh_report_error() in
arch/powerpc/platforms/pseries/eeh_driver.c?

Also, is the deadlock even possible before this change in Linux 3.6?

commit 448bd857d48e69b33ef323739dc6d8ca20d4cda7
Author: Huang Ying <ying.huang@xxxxxxxxx>
Date: Sat Jun 23 10:23:51 2012 +0800

PCI/PM: add PCIe runtime D3cold support

Ben.

--
Ben Hutchings
Never attribute to conspiracy what can adequately be explained by stupidity.

Attachment: signature.asc
Description: This is a digitally signed message part