Re: [PATCH v3 0/2] nvme-fabrics: short-circuit connect retries

From: Hannes Reinecke
Date: Thu Mar 07 2024 - 09:11:07 EST


On 3/7/24 13:14, Sagi Grimberg wrote:


On 07/03/2024 13:45, Hannes Reinecke wrote:
On 3/7/24 12:30, Sagi Grimberg wrote:

[ .. ]

Where is this retried today, I don't see where connect failure is retried, outside of a periodic reconnect.
Maybe I'm missing where what is the actual failure here.

static void nvme_tcp_reconnect_ctrl_work(struct work_struct *work)
{
        struct nvme_tcp_ctrl *tcp_ctrl =
                        container_of(to_delayed_work(work),
                        struct nvme_tcp_ctrl, connect_work);
        struct nvme_ctrl *ctrl = &tcp_ctrl->ctrl;

        ++ctrl->nr_reconnects;

        if (nvme_tcp_setup_ctrl(ctrl, false))
                goto requeue;

        dev_info(ctrl->device, "Successfully reconnected (%d attempt)\n",
                        ctrl->nr_reconnects);

        ctrl->nr_reconnects = 0;

        return;

requeue:
        dev_info(ctrl->device, "Failed reconnect attempt %d\n",

and nvme_tcp_setup_ctrl() returns either a negative errno or an NVMe status code (which might include the DNR bit).

I thought this is about the initialization. yes today we ignore the status in re-connection assuming that whatever
happened, may (or may not) resolve itself. The basis for this assumption is that if we managed to connect the first
time there is no reason to assume that connecting again should fail persistently.

And that is another issue where I'm not really comfortable with.
While it would make sense to have the connect functionality to be
one-shot, and let userspace retry if needed, the problem is that we
don't have a means of transporting that information to userspace.
The only thing which we can transport is an error number, which
could be anything and mean anything.
If we had a defined way stating: 'This is a retryable, retry with the same options.' vs 'This is retryable error, retry with modified options.' vs 'This a non-retryable error, don't bother.' I'd be
fine with delegating retries to userspace.
But currently we don't.

If there is a consensus that we should not assume it, its a valid argument. I didn't see where this happens with respect
to authentication though.

nvmf_connect_admin_queue():

/* Authentication required */
ret = nvme_auth_negotiate(ctrl, 0);
if (ret) {
dev_warn(ctrl->device,
"qid 0: authentication setup failed\n");
ret = NVME_SC_AUTH_REQUIRED;
goto out_free_data;
}
ret = nvme_auth_wait(ctrl, 0);
if (ret)
dev_warn(ctrl->device,
"qid 0: authentication failed\n");
else
dev_info(ctrl->device,
"qid 0: authenticated\n");

The first call to 'nvme_auth_negotiate()' is just for setting up
the negotiation context and start the protocol. So if we get
an error here it's pretty much non-retryable as it's completely
controlled by the fabrics options.
nvme_auth_wait(), OTOH, contains the actual result from the negotiation,
so there we might or might not retry, depending on the value of 'ret'.

Cheers,

Hannes