Re: [PATCH v3 2/3] dm: Add support for block provisioning

From: Mike Snitzer
Date: Fri Apr 14 2023 - 14:15:39 EST


On Fri, Apr 14 2023 at 9:32P -0400,
Joe Thornber <thornber@xxxxxxxxxx> wrote:

> On Fri, Apr 14, 2023 at 7:52 AM Sarthak Kukreti <sarthakkukreti@xxxxxxxxxxxx>
> wrote:
>
> > Add support to dm devices for REQ_OP_PROVISION. The default mode
> > is to passthrough the request to the underlying device, if it
> > supports it. dm-thinpool uses the provision request to provision
> > blocks for a dm-thin device. dm-thinpool currently does not
> > pass through REQ_OP_PROVISION to underlying devices.
> >
> > For shared blocks, provision requests will break sharing and copy the
> > contents of the entire block.
> >
>
> I see two issue with this patch:
>
> i) You use get_bio_block_range() to see which blocks the provision bio
> covers. But this function only returns
> complete blocks that are covered (it was designed for discard). Unlike
> discard, provision is not a hint so those
> partial blocks will need to be provisioned too.
>
> ii) You are setting off multiple dm_thin_new_mapping operations in flight
> at once. Each of these receives
> the same virt_cell and frees it when it completes. So I think we have
> multiple frees occuring? In addition you also
> release the cell yourself in process_provision_cell(). Fixing this is not
> trivial, you'll need to reference count the cells,
> and aggregate the mapping operation results.
>
> I think it would be far easier to restrict the size of the provision bio to
> be no bigger than one thinp block (as we do for normal io). This way dm
> core can split the bios, chain the child bios rather than having to
> reference count mapping ops, and aggregate the results.

I happened to be looking at implementing WRITE_ZEROES support for DM
thinp yesterday and reached the same conclussion relative to it (both
of Joe's points above, for me "ii)" was: the dm-bio-prison-v1 range
locking we do for discards needs work for other types of IO).

We can work to make REQ_OP_PROVISION spanning multiple thinp blocks
possible as follow-on optimization; but in the near-term DM thinp
needs REQ_OP_PROVISION to be split on a thinp block boundary.

This splitting can be assisted by block core in terms of a new
'provision_granularity' (which for thinp, it'd be set to the thinp
blocksize). But I don't know that we need to go that far (I'm
thinking its fine to have DM do the splitting it needs and only
elevate related code to block core if/when needed in the future).

DM core can take on conditionally imposing its max_io_len() to handle
splitting REQ_OP_PROVISION as needed on a per-target basis. This DM
core commit I've staged for 6.4 makes this quite a simple change:
https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-6.4&id=13f6facf3faeed34ca381aef4c9b153c7aed3972

So please rebase on linux-dm.git's dm-6.4 branch, and for
REQ_OP_PROVISION dm.c:__process_abnormal_io() you'd add this:

case REQ_OP_PROVISION:
num_bios = ti->num_provision_bios;
if (ti->max_provision_granularity)
max_granularity = limits->max_provision_sectors;
break;

I'll reply again later today (to patch 2's actual code changes),
because I caught at least one other thing worth mentioning.

Thanks,
Mike