[PATCH 13/13] accel/habanalabs: modify pci health check

From: Oded Gabbay
Date: Tue Feb 20 2024 - 11:04:12 EST


From: Ofir Bitton <obitton@xxxxxxxxx>

Today we read PCI VENDOR-ID in order to make sure PCI link is
healthy. Apparently the VENDOR-ID might be stored on host and
hence, when we read it we might not access the PCI bus.
In order to make sure PCI health check is reliable, we will start
checking the DEVICE-ID instead.

Signed-off-by: Ofir Bitton <obitton@xxxxxxxxx>
Reviewed-by: Oded Gabbay <ogabbay@xxxxxxxxxx>
Signed-off-by: Oded Gabbay <ogabbay@xxxxxxxxxx>
---
drivers/accel/habanalabs/common/device.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/accel/habanalabs/common/device.c b/drivers/accel/habanalabs/common/device.c
index 3b9e8a21d7df..8f92445c5a90 100644
--- a/drivers/accel/habanalabs/common/device.c
+++ b/drivers/accel/habanalabs/common/device.c
@@ -1035,14 +1035,14 @@ static void device_early_fini(struct hl_device *hdev)

static bool is_pci_link_healthy(struct hl_device *hdev)
{
- u16 vendor_id;
+ u16 device_id;

if (!hdev->pdev)
return false;

- pci_read_config_word(hdev->pdev, PCI_VENDOR_ID, &vendor_id);
+ pci_read_config_word(hdev->pdev, PCI_DEVICE_ID, &device_id);

- return (vendor_id == PCI_VENDOR_ID_HABANALABS);
+ return (device_id == hdev->pdev->device);
}

static int hl_device_eq_heartbeat_check(struct hl_device *hdev)
--
2.34.1