RE: Why hold device_lock when calling callback in pci_walk_bus?

From: Zhang, Yanmin
Date: Fri Sep 28 2012 - 04:29:38 EST


Some error-handling code calls pci_walk_bus, for example PCIe AER. We lock the device there so that the driver cannot detach from it while cb runs, because cb might call the driver's callback functions.
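
To illustrate the idea, here is a rough userspace sketch, not the kernel code. It assumes a pthread mutex standing in for device_lock, and every name in it (fake_dev, fake_walk_bus, fake_try_detach, report_error) is hypothetical. It only shows that a detach attempt cannot succeed while the walker holds the per-device lock around the callback.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a PCI device; the mutex plays the role of device_lock. */
struct fake_dev {
	pthread_mutex_t lock;
	int driver_bound;	/* 1 while a "driver" is attached */
	int errors_handled;	/* bumped each time the driver callback runs */
};

typedef int (*walk_cb)(struct fake_dev *dev, void *userdata);

/*
 * Call cb on each of devs[0..n) with that device's lock held,
 * stopping early if cb returns nonzero.
 */
static void fake_walk_bus(struct fake_dev *devs, size_t n,
			  walk_cb cb, void *userdata)
{
	for (size_t i = 0; i < n; i++) {
		pthread_mutex_lock(&devs[i].lock);
		int ret = cb(&devs[i], userdata);
		pthread_mutex_unlock(&devs[i].lock);
		if (ret)
			break;
	}
}

/* Detach needs the device lock; returns 0 on success, -1 if the lock is busy. */
static int fake_try_detach(struct fake_dev *dev)
{
	if (pthread_mutex_trylock(&dev->lock) != 0)
		return -1;
	dev->driver_bound = 0;
	pthread_mutex_unlock(&dev->lock);
	return 0;
}

/*
 * Walker callback: attempts a detach from inside the locked region
 * (counting the failures in *userdata), then calls into the "driver"
 * if it is still bound.
 */
static int report_error(struct fake_dev *dev, void *userdata)
{
	int *detach_blocked = userdata;

	if (fake_try_detach(dev) != 0)
		(*detach_blocked)++;
	if (dev->driver_bound)
		dev->errors_handled++;
	return 0;
}
```

Walking two bound devices with report_error leaves both drivers bound, because every detach attempt made during the callback fails; once the walk has finished, fake_try_detach succeeds.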

-----Original Message-----
From: Huang, Ying
Sent: Friday, September 28, 2012 4:15 PM
To: bhelgaas@xxxxxxxxxx
Cc: Greg Kroah-Hartman; Zhang, Yanmin; linux-pci@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; rjw@xxxxxxx
Subject: Why hold device_lock when calling callback in pci_walk_bus?

Hi, All,

If my understanding is correct, device_lock is used to provide mutual exclusion between device probe/remove/suspend/resume etc. Why do we hold device_lock when calling the callback in pci_walk_bus?

This was introduced by the following commit:

commit d71374dafbba7ec3f67371d3b7e9f6310a588808
Author: Zhang Yanmin <yanmin.zhang@xxxxxxxxx>
Date: Fri Jun 2 12:35:43 2006 +0800

[PATCH] PCI: fix race with pci_walk_bus and pci_destroy_dev

pci_walk_bus has a race with pci_destroy_dev. When cb is called
in pci_walk_bus, pci_destroy_dev might unlink the dev pointed by next.
Later on in the next loop, pointer next becomes NULL and cause
kernel panic.

Below patch against 2.6.17-rc4 fixes it by changing pci_bus_lock (spin_lock)
to pci_bus_sem (rw_semaphore).

Signed-off-by: Zhang Yanmin <yanmin.zhang@xxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxx>

The corresponding email thread is: https://lkml.org/lkml/2006/5/26/38

But from the commit and the email thread, I cannot work out why we need to do that.

I ask because I want to use pci_walk_bus in a function (in the PCI runtime resume path) which may be called with device_lock held.
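
To make the concern concrete, here is a small userspace sketch, not the kernel code. It assumes a non-recursive pthread mutex standing in for device_lock, and all names (fake_dev, fake_walk_one, fake_resume_locked, mark_visited) are hypothetical. The walker uses trylock so that the sketch reports the would-be self-deadlock as EBUSY instead of hanging.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for a PCI device; the mutex plays the role of device_lock. */
struct fake_dev {
	pthread_mutex_t lock;
	int visited;	/* number of times the callback has run */
};

/*
 * Walker step that wants the device lock around the callback.
 * A blocking lock would hang if the caller already held it, so
 * trylock is used here to surface that case as an error code.
 */
static int fake_walk_one(struct fake_dev *dev, int (*cb)(struct fake_dev *))
{
	int err = pthread_mutex_trylock(&dev->lock);

	if (err)
		return err;	/* EBUSY: the lock is already held */
	cb(dev);
	pthread_mutex_unlock(&dev->lock);
	return 0;
}

static int mark_visited(struct fake_dev *dev)
{
	dev->visited++;
	return 0;
}

/* Hypothetical resume path that is entered with the device lock held. */
static int fake_resume_locked(struct fake_dev *dev)
{
	int ret;

	pthread_mutex_lock(&dev->lock);
	ret = fake_walk_one(dev, mark_visited);	/* cannot take the lock again */
	pthread_mutex_unlock(&dev->lock);
	return ret;
}
```

With a default (non-recursive) mutex, fake_resume_locked returns EBUSY and the callback never runs, whereas calling fake_walk_one without the lock held works.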

Can anyone help me with that?

Best Regards,
Huang Ying

