[PATCH v2] scsi: qedi: Fix use after free bug in qedi_remove due to race condition

From: Zheng Wang
Date: Wed Apr 12 2023 - 23:35:04 EST


In qedi_probe, it calls __qedi_probe, which bound &qedi->recovery_work
with qedi_recovery_handler and bound &qedi->board_disable_work
with qedi_board_disable_work.

When it calls qedi_schedule_recovery_handler, it will finally
call schedule_delayed_work to start the work.

When we call qedi_remove to remove the driver, there
may be a sequence as follows:

Fix it by finishing the work before cleanup in qedi_remove.

CPU0 CPU1

|qedi_recovery_handler
qedi_remove |
__qedi_remove |
iscsi_host_free |
scsi_host_put |
//free shost |
|iscsi_host_for_each_session
|//use qedi->shost

Fixes: 4b1068f5d74b ("scsi: qedi: Add MFW error recovery process")
Signed-off-by: Zheng Wang <zyytlz.wz@xxxxxxx>
---
v2:
- remove unnecessary comment suggested by Mike Christie and cancel the work
after qedi_ops->stop and qedi_ops->ll2->stop which ensure there is no more
work suggested by Manish Rangankar
---
drivers/scsi/qedi/qedi_main.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/scsi/qedi/qedi_main.c b/drivers/scsi/qedi/qedi_main.c
index f2ee49756df8..45d359554182 100644
--- a/drivers/scsi/qedi/qedi_main.c
+++ b/drivers/scsi/qedi/qedi_main.c
@@ -2450,6 +2450,9 @@ static void __qedi_remove(struct pci_dev *pdev, int mode)
qedi_ops->ll2->stop(qedi->cdev);
}

+ cancel_delayed_work_sync(&qedi->recovery_work);
+ cancel_delayed_work_sync(&qedi->board_disable_work);
+
qedi_free_iscsi_pf_param(qedi);

rval = qedi_ops->common->update_drv_state(qedi->cdev, false);
--
2.25.1