Re: [PATCH v2 01/16] block: Add atomic write operations to request_queue limits

From: Ming Lei
Date: Tue Dec 12 2023 - 20:26:04 EST


On Tue, Dec 12, 2023 at 11:08:29AM +0000, John Garry wrote:
> From: Himanshu Madhani <himanshu.madhani@xxxxxxxxxx>
>
> Add the following limits:
> - atomic_write_boundary_bytes
> - atomic_write_max_bytes
> - atomic_write_unit_max_bytes
> - atomic_write_unit_min_bytes
>
> All atomic writes limits are initialised to 0 to indicate no atomic write
> support. Stacked devices are just not supported either for now.
>
> Signed-off-by: Himanshu Madhani <himanshu.madhani@xxxxxxxxxx>
> #jpg: Heavy rewrite
> Signed-off-by: John Garry <john.g.garry@xxxxxxxxxx>
> ---
> Documentation/ABI/stable/sysfs-block | 47 ++++++++++++++++++++++
> block/blk-settings.c | 60 ++++++++++++++++++++++++++++
> block/blk-sysfs.c | 33 +++++++++++++++
> include/linux/blkdev.h | 37 +++++++++++++++++
> 4 files changed, 177 insertions(+)
>
> diff --git a/Documentation/ABI/stable/sysfs-block b/Documentation/ABI/stable/sysfs-block
> index 1fe9a553c37b..ba81a081522f 100644
> --- a/Documentation/ABI/stable/sysfs-block
> +++ b/Documentation/ABI/stable/sysfs-block
> @@ -21,6 +21,53 @@ Description:
> device is offset from the internal allocation unit's
> natural alignment.
>
> +What: /sys/block/<disk>/atomic_write_max_bytes
> +Date: May 2023
> +Contact: Himanshu Madhani <himanshu.madhani@xxxxxxxxxx>
> +Description:
> + [RO] This parameter specifies the maximum atomic write
> + size reported by the device. This parameter is relevant
> + for merging of writes, where a merged atomic write
> + operation must not exceed this number of bytes.
> + The atomic_write_max_bytes may exceed the value in
> + atomic_write_unit_max_bytes if atomic_write_max_bytes
> + is not a power-of-two or atomic_write_unit_max_bytes is
> + limited by some queue limits, such as max_segments.
> +
> +
> +What: /sys/block/<disk>/atomic_write_unit_min_bytes
> +Date: May 2023
> +Contact: Himanshu Madhani <himanshu.madhani@xxxxxxxxxx>
> +Description:
> + [RO] This parameter specifies the smallest block which can
> + be written atomically with an atomic write operation. All
> + atomic write operations must begin at a
> + atomic_write_unit_min boundary and must be multiples of
> + atomic_write_unit_min. This value must be a power-of-two.
> +
> +
> +What: /sys/block/<disk>/atomic_write_unit_max_bytes
> +Date: January 2023
> +Contact: Himanshu Madhani <himanshu.madhani@xxxxxxxxxx>
> +Description:
> + [RO] This parameter defines the largest block which can be
> + written atomically with an atomic write operation. This
> + value must be a multiple of atomic_write_unit_min and must
> + be a power-of-two.
> +
> +
> +What: /sys/block/<disk>/atomic_write_boundary_bytes
> +Date: May 2023
> +Contact: Himanshu Madhani <himanshu.madhani@xxxxxxxxxx>
> +Description:
> + [RO] A device may need to internally split I/Os which
> + straddle a given logical block address boundary. In that
> + case a single atomic write operation will be processed as
> + one or more sub-operations which each complete atomically.
> + This parameter specifies the size in bytes of the atomic
> + boundary if one is reported by the device. This value must
> + be a power-of-two.
> +
>
> What: /sys/block/<disk>/diskseq
> Date: February 2021
> diff --git a/block/blk-settings.c b/block/blk-settings.c
> index 0046b447268f..d151be394c98 100644
> --- a/block/blk-settings.c
> +++ b/block/blk-settings.c
> @@ -59,6 +59,10 @@ void blk_set_default_limits(struct queue_limits *lim)
> lim->zoned = BLK_ZONED_NONE;
> lim->zone_write_granularity = 0;
> lim->dma_alignment = 511;
> + lim->atomic_write_unit_min_sectors = 0;
> + lim->atomic_write_unit_max_sectors = 0;
> + lim->atomic_write_max_sectors = 0;
> + lim->atomic_write_boundary_sectors = 0;

Can we move these four limits into a single structure and set them up
through a single API? Then cross-validation can be done in that API.
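
A rough userspace sketch of what such a combined interface might look like.
The struct name, the function name and the exact set of checks are all
hypothetical and not taken from the patch. In particular, the rule that
max_sectors must not be smaller than unit_max_sectors is an assumption based
on the sysfs description above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the four limits, all in 512-byte sectors. */
struct atomic_write_limits {
	uint32_t unit_min_sectors;
	uint32_t unit_max_sectors;
	uint32_t max_sectors;
	uint32_t boundary_sectors;
};

static bool is_pow2(uint32_t v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/*
 * Hypothetical single entry point for enabling atomic writes: validate the
 * four values against each other and copy them to *dst only if they are
 * consistent. Returns 0 on success, -1 on an inconsistent combination,
 * in which case *dst is left untouched.
 */
static int set_atomic_write_limits(struct atomic_write_limits *dst,
				   const struct atomic_write_limits *src)
{
	/* unit_min and unit_max must both be non-zero powers of two */
	if (!is_pow2(src->unit_min_sectors) || !is_pow2(src->unit_max_sectors))
		return -1;
	if (src->unit_max_sectors < src->unit_min_sectors)
		return -1;
	/* assumption: a merged atomic write is never capped below one unit */
	if (src->max_sectors < src->unit_max_sectors)
		return -1;
	/* 0 means "no boundary reported"; otherwise it must be a power of two */
	if (src->boundary_sectors && !is_pow2(src->boundary_sectors))
		return -1;

	*dst = *src;
	return 0;
}
```

With one entry point, a driver cannot end up with a half-configured queue
where, for example, unit_max is set but unit_min is still zero.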

> }
>
> /**
> @@ -183,6 +187,62 @@ void blk_queue_max_discard_sectors(struct request_queue *q,
> }
> EXPORT_SYMBOL(blk_queue_max_discard_sectors);
>
> +/**
> + * blk_queue_atomic_write_max_bytes - set max bytes supported by
> + * the device for atomic write operations.
> + * @q: the request queue for the device
> + * @bytes: maximum bytes supported
> + */
> +void blk_queue_atomic_write_max_bytes(struct request_queue *q,
> + unsigned int bytes)
> +{
> + q->limits.atomic_write_max_sectors = bytes >> SECTOR_SHIFT;
> +}
> +EXPORT_SYMBOL(blk_queue_atomic_write_max_bytes);

What if a driver supports atomic writes but never calls this?

I guess atomic_write_max_sectors should default to
atomic_write_unit_max_sectors if the feature is enabled.
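
A minimal sketch of that fallback, assuming that a zero
atomic_write_unit_max_sectors means "atomic writes not supported" and a zero
atomic_write_max_sectors means "the driver never set it". The struct and
helper below are hypothetical stand-ins, not kernel code.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the relevant queue_limits fields. */
struct aw_limits {
	uint32_t unit_max_sectors;	/* 0 = atomic writes not supported */
	uint32_t max_sectors;		/* 0 = driver never set it */
};

/* Effective cap for a merged atomic write, with the suggested fallback. */
static uint32_t aw_effective_max_sectors(const struct aw_limits *lim)
{
	if (lim->unit_max_sectors == 0)
		return 0;			/* feature disabled */
	if (lim->max_sectors == 0)
		return lim->unit_max_sectors;	/* fall back to one unit */
	return lim->max_sectors;
}
```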

> +
> +/**
> + * blk_queue_atomic_write_boundary_bytes - Device's logical block address space
> + * which an atomic write should not cross.
> + * @q: the request queue for the device
> + * @bytes: must be a power-of-two.
> + */
> +void blk_queue_atomic_write_boundary_bytes(struct request_queue *q,
> + unsigned int bytes)
> +{
> + q->limits.atomic_write_boundary_sectors = bytes >> SECTOR_SHIFT;
> +}
> +EXPORT_SYMBOL(blk_queue_atomic_write_boundary_bytes);

Shouldn't atomic_write_boundary_sectors default to
atomic_write_unit_max_sectors when atomic writes are supported?

> +
> +/**
> + * blk_queue_atomic_write_unit_min_sectors - smallest unit that can be written
> + * atomically to the device.
> + * @q: the request queue for the device
> + * @sectors: must be a power-of-two.
> + */
> +void blk_queue_atomic_write_unit_min_sectors(struct request_queue *q,
> + unsigned int sectors)
> +{
> + struct queue_limits *limits = &q->limits;
> +
> + limits->atomic_write_unit_min_sectors = sectors;
> +}
> +EXPORT_SYMBOL(blk_queue_atomic_write_unit_min_sectors);

atomic_write_unit_min_sectors should be >= (physical block size >> 9),
given that the minimum atomic write unit is the physical sector for all disks.
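
As a worked example, a disk with a 4096-byte physical block size gives
4096 >> 9 = 8, so any atomic_write_unit_min_sectors below 8 would be
rejected. A hypothetical helper for that check (the function name is made up
for illustration) could be as small as:

```c
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SHIFT 9	/* 512-byte sectors, as in the kernel */

/*
 * Hypothetical check: unit_min must cover at least one physical block.
 * physical_block_size is in bytes, unit_min_sectors in 512-byte sectors.
 */
static bool aw_unit_min_valid(uint32_t unit_min_sectors,
			      uint32_t physical_block_size)
{
	return unit_min_sectors >= (physical_block_size >> SECTOR_SHIFT);
}
```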

> +
> +/*
> + * blk_queue_atomic_write_unit_max_sectors - largest unit that can be written
> + * atomically to the device.
> + * @q: the request queue for the device
> + * @sectors: must be a power-of-two.
> + */
> +void blk_queue_atomic_write_unit_max_sectors(struct request_queue *q,
> + unsigned int sectors)
> +{
> + struct queue_limits *limits = &q->limits;
> +
> + limits->atomic_write_unit_max_sectors = sectors;
> +}
> +EXPORT_SYMBOL(blk_queue_atomic_write_unit_max_sectors);

atomic_write_unit_max_sectors should be >= atomic_write_unit_min_sectors.


Thanks,
Ming