Re: iommu/vt-d: Cure VF irqdomain hickup

From: Lu Baolu
Date: Fri Nov 13 2020 - 02:21:13 EST


Hi Thomas,

On 2020/11/13 3:15, Thomas Gleixner wrote:
The recent changes to store the MSI irqdomain pointer in struct device
missed that Intel DMAR does not register virtual function devices. Due to
that a VF device gets the plain PCI-MSI domain assigned and then issues
compat MSI messages which get caught by the interrupt remapping unit.

Cure that by inheriting the irq domain from the physical function
device.

That's a temporary workaround. The correct fix is to inherit the irq domain
from the bus, but that's a larger effort which needs quite some other
changes to the way how x86 manages PCI and MSI domains.

Fixes: 85a8dfc57a0b ("iommm/vt-d: Store irq domain in struct device")
Reported-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
---
drivers/iommu/intel/dmar.c | 19 ++++++++++++++++++-
1 file changed, 18 insertions(+), 1 deletion(-)

--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -333,6 +333,11 @@ static void dmar_pci_bus_del_dev(struct
 	dmar_iommu_notify_scope_dev(info);
 }
 
+static inline void vf_inherit_msi_domain(struct pci_dev *pdev)
+{
+	dev_set_msi_domain(&pdev->dev, dev_get_msi_domain(&pdev->physfn->dev));
+}
+
 static int dmar_pci_bus_notifier(struct notifier_block *nb,
 				 unsigned long action, void *data)
 {
@@ -342,8 +347,20 @@ static int dmar_pci_bus_notifier(struct
 	/* Only care about add/remove events for physical functions.
 	 * For VFs we actually do the lookup based on the corresponding
 	 * PF in device_to_iommu() anyway. */
-	if (pdev->is_virtfn)
+	if (pdev->is_virtfn) {
+		/*
+		 * Note: This is a horrible hack and needs to be cleaned
+		 * up by assigning the domain to the bus, but that's too
+		 * big of a change for post rc3.
+		 *
+		 * Ensure that the VF device inherits the irq domain of the
+		 * PF device:
+		 */
+		if (action == BUS_NOTIFY_ADD_DEVICE)
+			vf_inherit_msi_domain(pdev);
 		return NOTIFY_DONE;
+	}
+
 	if (action != BUS_NOTIFY_ADD_DEVICE &&
 	    action != BUS_NOTIFY_REMOVED_DEVICE)
 		return NOTIFY_DONE;

We also hit this problem in internal testing, and this patch solves it.
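
For anyone following along without the kernel tree at hand, here is a
minimal userspace sketch of the behaviour the patch adds. It is not the
kernel code. The struct layouts, the sketch_bus_notifier() name, the
assert() guard and the numeric values of the constants are simplified
stand-ins chosen for illustration. Only the control flow mirrors the
hunk above: a VF picks up the PF's MSI domain on the add notification,
and every other VF event is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; not the real definitions. */
struct irq_domain {
	const char *name;
};

struct device {
	struct irq_domain *msi_domain;
};

struct pci_dev {
	struct device dev;
	bool is_virtfn;
	struct pci_dev *physfn;	/* only meaningful when is_virtfn is set */
};

/* Illustrative values; the kernel headers define the real ones. */
#define BUS_NOTIFY_ADD_DEVICE		1UL
#define BUS_NOTIFY_REMOVED_DEVICE	3UL
#define NOTIFY_DONE			0
#define NOTIFY_OK			1

static struct irq_domain *dev_get_msi_domain(const struct device *dev)
{
	return dev->msi_domain;
}

static void dev_set_msi_domain(struct device *dev, struct irq_domain *d)
{
	dev->msi_domain = d;
}

/* Same shape as the helper in the patch: copy the PF's domain to the VF. */
static void vf_inherit_msi_domain(struct pci_dev *pdev)
{
	assert(pdev->physfn != NULL);	/* sketch-only guard, not in the patch */
	dev_set_msi_domain(&pdev->dev, dev_get_msi_domain(&pdev->physfn->dev));
}

/* Hypothetical, cut-down model of the notifier's decision logic. */
static int sketch_bus_notifier(unsigned long action, struct pci_dev *pdev)
{
	if (pdev->is_virtfn) {
		/* VFs are not registered, but must inherit the PF's domain. */
		if (action == BUS_NOTIFY_ADD_DEVICE)
			vf_inherit_msi_domain(pdev);
		return NOTIFY_DONE;
	}

	if (action != BUS_NOTIFY_ADD_DEVICE &&
	    action != BUS_NOTIFY_REMOVED_DEVICE)
		return NOTIFY_DONE;

	/* The real notifier handles the PF add/remove event here. */
	return NOTIFY_OK;
}
```

In this model a VF that starts out with the plain PCI-MSI domain ends
up pointing at the same domain object as its PF after the add event,
which is the property the workaround relies on.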

Acked-by: Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx>

Best regards,
baolu