Re: [PATCH v12 5/9] nvme: add copy offload support

From: Nitesh Shetty
Date: Tue Jun 06 2023 - 10:19:25 EST


On 23/06/05 06:43AM, Christoph Hellwig wrote:
break;
case REQ_OP_READ:
- ret = nvme_setup_rw(ns, req, cmd, nvme_cmd_read);
+ if (unlikely(req->cmd_flags & REQ_COPY))
+ nvme_setup_copy_read(ns, req);
+ else
+ ret = nvme_setup_rw(ns, req, cmd, nvme_cmd_read);
break;
case REQ_OP_WRITE:
- ret = nvme_setup_rw(ns, req, cmd, nvme_cmd_write);
+ if (unlikely(req->cmd_flags & REQ_COPY))
+ ret = nvme_setup_copy_write(ns, req, cmd);
+ else
+ ret = nvme_setup_rw(ns, req, cmd, nvme_cmd_write);

Yikes. Overloading REQ_OP_READ and REQ_OP_WRITE with something entirely
different brings us back the horrors of the block layer 15 years ago.
Don't do that. Please add separate REQ_COPY_IN/OUT (or maybe
SEND/RECEIVE or whatever) methods.


Downside will be duplicating checks which are present for read, write in
block layer, device-mapper and zoned devices.
But we can do this, shouldn't be an issue.

+ /* setting copy limits */
+ if (blk_queue_flag_test_and_set(QUEUE_FLAG_COPY, q))

I don't understand this comment.


It was a mistake. Comment is misplaced and it should have been
"setting copy flag" instead of "setting copy limits".
Anyway now we feel this comment is redundant, will remove it.
Also, we should have used blk_queue_flag_set to enable copy offload.

+struct nvme_copy_token {
+ char *subsys;
+ struct nvme_ns *ns;
+ sector_t src_sector;
+ sector_t sectors;
+};

Why do we need a subsys token? Inter-namespace copy is pretty crazy,
and not really anything we should aim for. But this whole token design
is pretty odd anyway. The only thing we'd need is a sequence number /
idr / etc to find an input and output side match up, as long as we
stick to the proper namespace scope.


The idea behind subsys is to prevent copy across different subsystem.
For example, copy across nvme subsystem and the scsi subsystem. [1]
At present, we don't support inter-namespace(copy across NVMe namespace),
but after community feedback for previous series we left scope for it.
About idr per namespace, it will be similar to namespace check that
we are doing to prevent copy across namespace.
We went with current structure for token, as it was solving above
issues as well as provides a placeholder for storing source LBA and
number of sectors.
Do have any suggestions on how we can store source info, if we go with
idr based approach ?

[1] https://lore.kernel.org/all/alpine.LRH.2.02.2202011327350.22481@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/T/#m407f24fb4454d35c3283a5e51fdb04f1600463af

+ if (unlikely((req->cmd_flags & REQ_COPY) &&
+ (req_op(req) == REQ_OP_READ))) {
+ blk_mq_start_request(req);
+ return BLK_STS_OK;
+ }

This really needs to be hiden inside of nvme_setup_cmd. And given
that other drivers might need similar handling the best way is probably
to have a new magic BLK_STS_* value for request started but we're
not actually sending it to hardware.

Sure we will add new BLK_STS_* for completion and move the snippet.

Thank you,
Nitesh Shetty