Re: [PATCH 1/1] block: Check the queue limit before bio submitting

From: Ed Tsai (蔡宗軒)
Date: Mon Nov 06 2023 - 21:53:40 EST


On Mon, 2023-11-06 at 19:54 +0800, Ming Lei wrote:
> On Mon, Nov 06, 2023 at 12:53:31PM +0800, Ming Lei wrote:
> > On Mon, Nov 06, 2023 at 01:40:12AM +0000, Ed Tsai (蔡宗軒) wrote:
> > > On Mon, 2023-11-06 at 09:33 +0800, Ed Tsai wrote:
> > > > On Sat, 2023-11-04 at 11:43 +0800, Ming Lei wrote:
> >
> > ...
> >
> > > Sorry for missing out on my dd command. Here it is:
> > > dd if=/data/test_file of=/dev/null bs=64m count=1 iflag=direct
> >
> > OK, thanks for sharing.
> >
> > I understand the issue now, but I am not sure it is a good idea to
> > check the queue limit in __bio_iov_iter_get_pages():
> >
> > 1) bio->bi_bdev may not be set
> >
> > 2) what matters is actually bio's alignment, and bio size still can
> > be big enough
> >
> > So I cooked one patch, and it should address your issue:
>
> The following one fixes several bugs, and is verified to be capable of
> making big & aligned bios, feel free to run your test against this one:
>
>  block/bio.c | 28 +++++++++++++++++++++++++++-
>  1 file changed, 27 insertions(+), 1 deletion(-)
>
> diff --git a/block/bio.c b/block/bio.c
> index 816d412c06e9..80b36ce57510 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -1211,6 +1211,7 @@ static int bio_iov_add_zone_append_page(struct bio *bio, struct page *page,
>  }
>  
>  #define PAGE_PTRS_PER_BVEC	(sizeof(struct bio_vec) / sizeof(struct page *))
> +#define BIO_CHUNK_SIZE		(256U << 10)
>  
>  /**
>   * __bio_iov_iter_get_pages - pin user or kernel pages and add them to a bio
> @@ -1266,6 +1267,31 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
>  		size -= trim;
>  	}
>  
> +	/*
> +	 * Try to make bio aligned with 256KB if it isn't the last one, so
> +	 * we can avoid small bio in case of big chunk sequential IO because
> +	 * of bio split and multipage bvec.
> +	 *
> +	 * If nothing is added to this bio, simply allow unaligned since we
> +	 * have chance to add more bytes
> +	 */
> +	if (iov_iter_count(iter) && bio->bi_iter.bi_size) {
> +		unsigned int aligned_size = (bio->bi_iter.bi_size + size) &
> +				~(BIO_CHUNK_SIZE - 1);
> +
> +		if (aligned_size <= bio->bi_iter.bi_size) {
> +			/* stop to add page if this bio can't keep aligned */
> +			if (!(bio->bi_iter.bi_size & (BIO_CHUNK_SIZE - 1))) {
> +				ret = left = size;
> +				goto revert;
> +			}
> +		} else {
> +			aligned_size -= bio->bi_iter.bi_size;
> +			iov_iter_revert(iter, size - aligned_size);
> +			size = aligned_size;
> +		}
> +	}
> +
>  	if (unlikely(!size)) {
>  		ret = -EFAULT;
>  		goto out;
> @@ -1285,7 +1311,7 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
>  
>  		offset = 0;
>  	}
> -
> +revert:
>  	iov_iter_revert(iter, left);
>  out:
>  	while (i < nr_pages)
> --
> 2.41.0
>
>
>
> Thanks,
> Ming
>

The latest patch you provided, with 256KB alignment, does help
alleviate the severity of fragmentation. However, the appropriate
alignment size varies from device to device, so a fixed, universal
value of 128KB or 256KB only provides partial relief from
fragmentation.

I performed a 64MB dd direct I/O read with your patch, and although
most of the bios were aligned, there were still cases of misalignment
to the device limit (e.g., 512KB for my device), as shown below:

dd [000] ..... 392.976830: block_bio_queue: 254,52 R 2997760 + 3584
dd [000] ..... 392.979940: block_bio_queue: 254,52 R 3001344 + 3584
dd [000] ..... 392.983235: block_bio_queue: 254,52 R 3004928 + 3584
dd [000] ..... 392.986468: block_bio_queue: 254,52 R 3008512 + 3584
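
To make the misalignment concrete (my own arithmetic here, assuming the
512KB per-bio limit on my device):

  3584 sectors x 512 B = 1792KB = 3 x 512KB + 256KB

So each of these bios is aligned to the patch's 256KB chunk size, but
splitting against max_sectors still turns it into three full 512KB
requests plus one short 256KB request.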

Comparing the Antutu sequential test results with the previous data,
this is indeed an improvement, but there is still a small gap compared
with limiting the bio size to max_sectors:

Sequential Read (average of 5 rounds):
Original: 3033.7 MB/sec
Limited to max_sectors: 3520.9 MB/sec
Aligned 256KB: 3471.5 MB/sec

Sequential Write (average of 5 rounds):
Original: 2225.4 MB/sec
Limited to max_sectors: 2800.3 MB/sec
Aligned 256KB: 2618.1 MB/sec

What if we limit the bio size only for devices that have set
max_sectors?
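
To illustrate, here is roughly what I have in mind (just a sketch for
discussion, not the exact diff from my original patch): cap the bytes
added in __bio_iov_iter_get_pages() at the queue limit, next to the
existing logical-block-size trim, and only when bi_bdev is already set:

	if (bio->bi_bdev) {
		unsigned int max_bytes =
			queue_max_sectors(bdev_get_queue(bio->bi_bdev)) << SECTOR_SHIFT;

		/* cap this bio at max_sectors so it never needs a later split */
		if (bio->bi_iter.bi_size < max_bytes &&
		    bio->bi_iter.bi_size + size > max_bytes) {
			unsigned int allowed = max_bytes - bio->bi_iter.bi_size;

			/* hand the excess back to the iterator for the next bio */
			iov_iter_revert(iter, size - allowed);
			size = allowed;
		}
	}

(queue_max_sectors() is in units of 512-byte sectors, hence the
SECTOR_SHIFT conversion.)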

Best,
Ed