Associations take a refcount on queues, queues take a refcount onIs it really worth is using an rcu pointer here?
associations.
The existing code lead to the situation that the target executes a
disconnect and the host triggers a reconnect immediately. The reconnect
command still finds an existing association and uses this. Though the
reconnect crashes later on because nvmet_fc_delete_target_assoc()
blindly goes ahead and removes resources while the reconnect code wants
to use it. The problem is that nvmet_fc_find_target_assoc() is able to
lookup an association which is being removed.
So the first thing to address nvmet_fc_find_target_queue() is to remove
the association out of the list and wait a RCU cycle and free resources
in the free function callback of the kref_put().
The live time of the queues are strictly bound to the lifetime of an
association. Thus we don't need to take reverse refcounts (queue ->
association).
Furthermore, streamline the cleanup code by using the workqueue for
delete the association in nvmet_fc_ls_disconnect. This ensures, that we
run through the same shutdown path in all non error cases.
Reproducer: nvme/003
Signed-off-by: Daniel Wagner <dwagner@xxxxxxx>
---
drivers/nvme/target/fc.c | 83 +++++++++++++++++++---------------------
1 file changed, 40 insertions(+), 43 deletions(-)
diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
index 28e432e62361..db992df13c73 100644
--- a/drivers/nvme/target/fc.c
+++ b/drivers/nvme/target/fc.c
@@ -166,6 +166,7 @@ struct nvmet_fc_tgt_assoc {
struct nvmet_fc_hostport *hostport;
struct nvmet_fc_ls_iod *rcv_disconn;
struct list_head a_list;
+ struct nvmet_fc_tgt_queue *_queues[NVMET_NR_QUEUES + 1];
struct nvmet_fc_tgt_queue __rcu *queues[NVMET_NR_QUEUES + 1];
struct kref ref;
struct work_struct del_work;
@@ -803,14 +804,11 @@ nvmet_fc_alloc_target_queue(struct nvmet_fc_tgt_assoc *assoc,
if (!queue)
return NULL;
- if (!nvmet_fc_tgt_a_get(assoc))
- goto out_free_queue;
-
queue->work_q = alloc_workqueue("ntfc%d.%d.%d", 0, 0,
assoc->tgtport->fc_target_port.port_num,
assoc->a_id, qid);
if (!queue->work_q)
- goto out_a_put;
+ goto out_free_queue;
queue->qid = qid;
queue->sqsize = sqsize;
@@ -831,7 +829,8 @@ nvmet_fc_alloc_target_queue(struct nvmet_fc_tgt_assoc *assoc,
if (ret)
goto out_fail_iodlist;
- WARN_ON(assoc->queues[qid]);
+ WARN_ON(assoc->_queues[qid]);
+ assoc->_queues[qid] = queue;
rcu_assign_pointer(assoc->queues[qid], queue);