Re: [PATCH] nvme-fc: fix sleep-in-atomic-context bug caused by nvme_fc_rcv_ls_req

From: James Smart
Date: Mon Oct 03 2022 - 13:49:01 EST


On 10/2/2022 7:56 PM, James Smart wrote:
On 10/2/2022 6:50 PM, duoming@xxxxxxxxxx wrote:
Hello,

On Sun, 2 Oct 2022 10:12:15 -0700 James Smart wrote:

On 10/1/2022 5:19 PM, Duoming Zhou wrote:
The function lpfc_poll_timeout() is a timer handler that runs in an
atomic context, but it calls "kzalloc(.., GFP_KERNEL)" that may sleep.
As a result, the sleep-in-atomic-context bug will happen. The processes
is shown below:

lpfc_poll_timeout()
   lpfc_sli_handle_fast_ring_event()
    lpfc_sli_process_unsol_iocb()
     lpfc_complete_unsol_iocb()
      lpfc_nvme_unsol_ls_handler()
       lpfc_nvme_handle_lsreq()
        nvme_fc_rcv_ls_req()
         kzalloc(sizeof(.., GFP_KERNEL) //may sleep

This patch changes the gfp_t parameter of kzalloc() from GFP_KERNEL to
GFP_ATOMIC in order to mitigate the bug.

Fixes: 14fd1e98afaf ("nvme-fc: Add Disconnect Association Rcv support")
Signed-off-by: Duoming Zhou <duoming@xxxxxxxxxx>
---
   drivers/nvme/host/fc.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 127abaf9ba5..36698dfc8b3 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -1754,7 +1754,7 @@ nvme_fc_rcv_ls_req(struct nvme_fc_remote_port *portptr,
       lsop = kzalloc(sizeof(*lsop) +
               sizeof(union nvmefc_ls_requests) +
               sizeof(union nvmefc_ls_responses),
-            GFP_KERNEL);
+            GFP_ATOMIC);
       if (!lsop) {
           dev_info(lport->dev,
               "RCV %s LS failed: No memory\n",

I would prefer this was fixed within lpfc rather than introducing atomic
allocations (1st in either host or target transport).  It was introduced
by lpfc change in irq handling style.

Thank your for your reply and suggestions!

Do you think change the lpfc_poll_timeout() to a delayed_work is better?

Best regards,
Duoming Zhou

as a minimum: the lpfc_complete_unsol_iocb handler should be passing off the iocb to a work queue routine - so that the context changes so that either nvme host or nvmet ls callback routines can be called. If possible, it should do the axchg alloc - to avoid a GFP_ATOMIC there as well...

It's usually best for these nvme LS's and ELS's to be done in a slow path thread/work queue element. That may mean segmenting a little earlier in the path.

-- james


looking further... lpfc_poll_timeout() should only be used on an SLI-3 adapter. The existing SLI-3 adapters don't support NVMe. So I'm a little confused by this stack trace.

Can you describe what the system config/software setup is and specifically what lpfc adapter is being used (dmesg attachment logs are sufficient, or lspci output).

-- james