RE: [EXTERNAL] [PATCH 2/2] PCI: hv: Add support for protocol 1.3 and support PCI_BUS_RELATIONS2

From: Long Li
Date: Mon Dec 02 2019 - 16:23:14 EST


>Subject: RE: [EXTERNAL] [PATCH 2/2] PCI: hv: Add support for protocol 1.3 and
>support PCI_BUS_RELATIONS2
>
>From: longli@xxxxxxxxxxxxxxxxx Sent: Friday, November 22, 2019 5:57 PM
>>
>> From: Long Li <longli@xxxxxxxxxxxxx>
>>
>> Starting with Hyper-V PCI protocol version 1.3, the host VSP can send
>> PCI_BUS_RELATIONS2 and pass the vNUMA node information for devices
>on the bus.
>> The vNUMA node tells which guest NUMA node this device is on based on
>> guest VM configuration topology and physical device inforamtion.
>>
>> The patch adds code to negotiate v1.3 and process PCI_BUS_RELATIONS2.
>>
>> Signed-off-by: Long Li <longli@xxxxxxxxxxxxx>
>> ---
>> drivers/pci/controller/pci-hyperv.c | 107
>> ++++++++++++++++++++++++++++
>> 1 file changed, 107 insertions(+)
>>
>
>[snip]
>
>> +/*
>> + * Set NUMA node for the devices on the bus */ static void
>> +pci_assign_numa_node(struct hv_pcibus_device *hbus) {
>> + struct pci_dev *dev;
>> + struct pci_bus *bus = hbus->pci_bus;
>> + struct hv_pci_dev *hv_dev;
>> +
>> + list_for_each_entry(dev, &bus->devices, bus_list) {
>> + hv_dev = get_pcichild_wslot(hbus, devfn_to_wslot(dev-
>>devfn));
>> + if (!hv_dev)
>> + continue;
>> +
>> + if (hv_dev->desc.flags &
>HV_PCI_DEVICE_FLAG_NUMA_AFFINITY)
>> + set_dev_node(&dev->dev, hv_dev-
>>desc.virtual_numa_node);
>> + }
>> +}
>> +
>
>get_pcichild_wslot() gets a reference to the hv_dev, so a call to put_pcichild()
>is needed to balance.

Thanks for pointing this out! I will send v2 to fix this.

>
>But more broadly, is the call to set_dev_node() operating on the correct struct
>device? There's a struct device in the struct hv_device, and also one in the
>struct pci_dev. Everything in this module seems to be operating on the
>former.
>For example, all the dev_err() calls identify the struct device in struct
>hv_device.
>And enumerating all the devices on a virtual PCI bus is done by iterating
>through the hbus->children list, not the bus->devices list. I don't completely
>understand the interplay between the two struct device entries, but the
>difference makes me wonder if the above code should be setting the NUMA
>node on the struct device that's in struct hv_device.

There are two "bus" variables in this function. "bus" is a "struct pci_bus" from the PCI layer. "hbus" is a "struct hv_pcibus_device" defined in pci-hyperv.

The parameter passed to set_dev_node is &dev->dev, the dev is enumerated from bus->devices. So dev (struct pci_dev) is from the PCI layer, this function sets the node on the device that will be used to probe and load its corresponding driver.

There is a separate list of hbus->children. It's represents pci-hyperv's view of devices on its bus. pci-hyperv presents those devices to PCI layer when pci_scan_child_bus() is called.

Long