Re: [RFC 2/2] PCI: acpiphp: slowdown hotplug if hotplugging multiple devices at a time

From: Dongli Zhang
Date: Wed Dec 13 2023 - 12:25:24 EST


Hi Igor,

On 12/13/23 02:05, Igor Mammedov wrote:
> On Wed, 13 Dec 2023 00:13:37 -0800
> Dongli Zhang <dongli.zhang@xxxxxxxxxx> wrote:
>
>> Hi Igor,
>>
>>
>> On 12/12/23 16:36, Igor Mammedov wrote:
>>> previous commit ("PCI: acpiphp: enable slot only if it hasn't been enabled already")
>>> introduced a workaround to avoid a race between SCSI_SCAN_ASYNC job and
>>> bridge reconfiguration in case of single HBA hotplug.
>>> However in virt environment it's possible to pause machine hotplug several
>>> HBAs and let machine run. That can hit the same race when 2nd hotplugged
>>
>> Would you mind explaining what "pause machine hotplug several HBAs and
>> let machine run" means?
>
> qemu example would be:
> (qemu) stop
> (qemu) device_add vhost-scsi-pci,wwpn=naa.5001405324af0985,id=vhost01,bus=bridge1,addr=8
> (qemu) device_add vhost-scsi-pci,wwpn=naa.5001405324af0986,id=vhost02,bus=bridge1,addr=0
> (qemu) cont
>
> this way when machine continues to run acpiphp code will see 2 HBAs at once
> and try to process one right after another. So [1/2] patch is not enough
> to cover above case, and hence the same hack SHPC employs by adding delay.
> However 2 separate hotplug events as in your reproducer should be covered
> by the 1st patch.

Thank you very much for the explanation.

So the two PCI devices are detected and enabled within the same hotplug
event, and neither of them was enabled before that event.
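
To check my reading of the patch, here is a minimal userspace sketch of the
throttle it adds. This is not the acpiphp code: every sketch_* name is
hypothetical, the slot structure is a stand-in, and msleep() is replaced by a
stub that only records the delay. The handling of empty slots (disable_slot()
in the real function) is left out.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real acpiphp structures. */
enum sketch_slot_state { SKETCH_SLOT_DISABLED = 0, SKETCH_SLOT_ENABLED = 1 };

struct sketch_slot {
	enum sketch_slot_state state;
	int present;	/* nonzero if a device is plugged into the slot */
};

/* Accumulated delay in ms; the stub below records it instead of sleeping. */
static unsigned int sketch_total_delay_ms;

static void sketch_msleep(unsigned int ms)
{
	sketch_total_delay_ms += ms;
}

static void sketch_enable_slot(struct sketch_slot *slot)
{
	slot->state = SKETCH_SLOT_ENABLED;
}

/*
 * One pass over the slots of a bridge, mirroring the loop touched by the patch:
 * the first slot that needs enabling is enabled at once, every further one in
 * the same pass is preceded by a 1000 ms delay.
 * Returns the number of slots newly enabled in this pass.
 */
static int sketch_check_bridge(struct sketch_slot *slots, size_t n)
{
	int nr_hp_slots = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		struct sketch_slot *slot = &slots[i];

		if (!slot->present)
			continue;	/* assumption: empty slots are ignored here */

		if (slot->state != SKETCH_SLOT_ENABLED) {
			if (nr_hp_slots)
				sketch_msleep(1000);

			++nr_hp_slots;
			sketch_enable_slot(slot);
		}
	}

	return nr_hp_slots;
}
```

With two new HBAs behind the same bridge, one pass enables both and delays
only before the second one. A later pass over the same slots enables nothing
and adds no delay.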

As mentioned in another email, I do not think this is the right way to even
work around the issue, because there are other ways to trigger MMIO at that
same point in time.

Dongli Zhang

>
>> Thank you very much!
>>
>> Dongli Zhang
>>
>>> HBA will start re-configuring bridge.
>>> Do the same thing as SHPC and throttle down hotplug of 2nd and up
>>> devices within single hotplug event.
>>>
>>> Signed-off-by: Igor Mammedov <imammedo@xxxxxxxxxx>
>>> ---
>>> drivers/pci/hotplug/acpiphp_glue.c | 6 ++++++
>>> 1 file changed, 6 insertions(+)
>>>
>>> diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
>>> index 6b11609927d6..30bca2086b24 100644
>>> --- a/drivers/pci/hotplug/acpiphp_glue.c
>>> +++ b/drivers/pci/hotplug/acpiphp_glue.c
>>> @@ -37,6 +37,7 @@
>>>  #include <linux/mutex.h>
>>>  #include <linux/slab.h>
>>>  #include <linux/acpi.h>
>>> +#include <linux/delay.h>
>>>
>>>  #include "../pci.h"
>>>  #include "acpiphp.h"
>>> @@ -700,6 +701,7 @@ static void trim_stale_devices(struct pci_dev *dev)
>>>  static void acpiphp_check_bridge(struct acpiphp_bridge *bridge)
>>>  {
>>>  	struct acpiphp_slot *slot;
>>> +	int nr_hp_slots = 0;
>>>
>>>  	/* Bail out if the bridge is going away. */
>>>  	if (bridge->is_going_away)
>>> @@ -723,6 +725,10 @@ static void acpiphp_check_bridge(struct acpiphp_bridge *bridge)
>>>
>>>  			/* configure all functions */
>>>  			if (slot->flags != SLOT_ENABLED) {
>>> +				if (nr_hp_slots)
>>> +					msleep(1000);
>>> +
>>> +				++nr_hp_slots;
>>>  				enable_slot(slot, true);
>>>  			}
>>>  		} else {
>>
>