Re: [PATCH v5 02/10] block: Add copy offload support infrastructure

From: Nitesh Shetty
Date: Wed Nov 23 2022 - 07:14:05 EST


On Wed, Nov 23, 2022 at 04:04:18PM +0800, Ming Lei wrote:
> On Wed, Nov 23, 2022 at 11:28:19AM +0530, Nitesh Shetty wrote:
> > Introduce blkdev_issue_copy which supports source and destination bdevs,
> > and an array of (source, destination and copy length) tuples.
> > Introduce REQ_COPY copy offload operation flag. Create a read-write
> > bio pair with a token as payload and submitted to the device in order.
> > Read request populates token with source specific information which
> > is then passed with write request.
> > This design is courtesy Mikulas Patocka's token based copy
>
> I thought this patchset is just for enabling copy command which is
> supported by hardware. But turns out it isn't, because blk_copy_offload()
> still submits read/write bios for doing the copy.
>
> I am just wondering why not let copy_file_range() cover this kind of copy,
> and the framework has been there.
>

Main goal was to enable copy command, but community suggested to add
copy emulation as well.

blk_copy_offload - actually issues copy command in driver layer.
The way read/write BIOs are percieved is different for copy offload.
In copy offload we check REQ_COPY flag in NVMe driver layer to issue
copy command. But we did missed it to add in other driver's, where they
might be treated as normal READ/WRITE.

blk_copy_emulate - is used if we fail or if device doesn't support native
copy offload command. Here we do READ/WRITE. Using copy_file_range for
emulation might be possible, but we see 2 issues here.
1. We explored possibility of pulling dm-kcopyd to block layer so that we
can readily use it. But we found it had many dependecies from dm-layer.
So later dropped that idea.
2. copy_file_range, for block device atleast we saw few check's which fail
it for raw block device. At this point I dont know much about the history of
why such check is present.

> When I was researching pipe/splice code for supporting ublk zero copy[1], I
> have got idea for async copy_file_range(), such as: io uring based
> direct splice, user backed intermediate buffer, still zero copy, if these
> ideas are finally implemented, we could get super-fast generic offload copy,
> and bdev copy is really covered too.
>
> [1] https://lore.kernel.org/linux-block/20221103085004.1029763-1-ming.lei@xxxxxxxxxx/
>

Seems interesting, We will take a look into this.

>
>

Thanks,
Nitesh