RE: [Intel-wired-lan] e1000e pci_disable_link_state_locked() issues

From: Lubetkin, YanirX
Date: Thu May 21 2015 - 11:56:24 EST


Hi Bjorn,

I'm going to check this and will get back to you with input/questions/resolution.

Thanks,
Yanir


> -----Original Message-----
> From: Intel-wired-lan [mailto:intel-wired-lan-bounces@xxxxxxxxxxxxxxxx] On
> Behalf Of Bjorn Helgaas
> Sent: Wednesday, May 20, 2015 22:48
> To: Yinghai Lu; Kirsher, Jeffrey T
> Cc: linux-pci@xxxxxxxxxxxxxxx; intel-wired-lan@xxxxxxxxxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx
> Subject: [Intel-wired-lan] e1000e pci_disable_link_state_locked() issues
>
> I think we have some issues with the e1000e usage of
> pci_disable_link_state_locked(), which Yinghai added with 9f728f53dd70
> ("PCI/e1000e: Add and use pci_disable_link_state_locked()").
>
> That fixed an AER deadlock in the following path, where pci_bus_sem is held
> by pci_walk_bus(), and we deadlocked when we tried to re-acquire it in
> pci_disable_link_state():
>
> do_recovery
> broadcast_error_message(..., report_slot_reset)
> pci_walk_bus
> down_read(&pci_bus_sem)
> cb(...) # report_slot_reset
> report_slot_reset
> dev->driver->err_handler->slot_reset # e1000_io_slot_reset
> e1000_io_slot_reset
> e1000e_disable_aspm
> pci_disable_link_state
> down_read(&pci_bus_sem)
>
> 9f728f53dd70 fixed that by changing e1000e_disable_aspm() to use
> pci_disable_link_state_locked() instead, which assumes pci_bus_sem is
> already held.
>
> That's fine for the e1000_io_slot_reset() path, where pci_bus_sem really
> *is* held. But e1000e_disable_aspm() is also called from e1000_probe() and
> __e1000_resume(), and in those paths, we *don't* hold pci_bus_sem.
>
> In effect, the caller of pci_disable_link_state_locked() is promising that
> pci_bus_sem is held, and __pci_disable_link_state() relies on that promise
> for its locking. But e1000e isn't upholding its end of the bargain.
>
> I'm not 100% sure __pci_disable_link_state() actually *needs* that locking:
> it is only called from a driver, and it should be impossible for a device or any
> upstream bridge to go away while a driver is bound to it. If somebody
> wanted to analyze this further and propose a patch to remove the locking (if
> it seems safe), that would be great.
>
> But in any case, __pci_disable_link_state() should be able to rely on its callers
> following the rules, so I'd like to see an e1000e change to use
> pci_disable_link_state() from the paths where pci_bus_sem is not held.
>
> Bjorn
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan@xxxxxxxxxxxxxxxx
> http://lists.osuosl.org/mailman/listinfo/intel-wired-lan
---------------------------------------------------------------------
Intel Israel (74) Limited

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/