Re: Some new bio merging behaviors in __bio_try_merge_page

From: Gao Xiang
Date: Thu Apr 11 2019 - 06:21:09 EST




On 2019/4/11 16:09, Ming Lei wrote:
> On Thu, Apr 11, 2019 at 03:43:02PM +0800, Gao Xiang wrote:
>>
>>
>> On 2019/4/11 15:08, Ming Lei wrote:
>>> Hi Gao Xiang,
>>>
>>> On Thu, Apr 11, 2019 at 01:47:49PM +0800, Gao Xiang wrote:
>>>> Hi Ming,
>>>>
>>>> I found a erofs issue after commit 07173c3ec276
>>>> ("block: enable multipage bvecs") is merged. It seems that
>>>> it tries to merge more physical continuous pages in one iovec.
>>>>
>>>> However it breaks the current erofs_read_raw_page logic since it uses
>>>> nr_iovecs of bio_alloc to limit the maximum number of physical
>>>
>>> I believe you can do the limit outside easily, such as by checking
>>> how many pages have been added to the bio.
>>
>> Yes, I agree that. However, I noticed that bio_add_page is exported as
>> EXPORT_SYMBOL. I have no idea how many out-of-tree drivers break as well.
>
> I don't think it is a good behaviour to use bio->bi_max_vecs to limit
> max allowed page, you may see the idea from the naming simply...

I hoped to simplify all these page limits to one number which is
the minimum number among them all then...

>
> If there were other such drivers, we may fix it easily, and the following
> patch should fix your issue:
>
> diff --git a/drivers/staging/erofs/data.c b/drivers/staging/erofs/data.c
> index 526e0dbea5b5..8878f2f2593e 100644
> --- a/drivers/staging/erofs/data.c
> +++ b/drivers/staging/erofs/data.c
> @@ -298,7 +298,7 @@ static inline struct bio *erofs_read_raw_page(struct bio *bio,
> *last_block = current_block;
>
> /* shift in advance in case of it followed by too many gaps */
> - if (unlikely(bio->bi_vcnt >= bio->bi_max_vecs)) {
> + if (unlikely(bio->bi_iter.bi_size >= bio->bi_max_vecs * PAGE_SIZE)) {
> /* err should reassign to 0 after submitting */
> err = 0;
> goto submit_bio_out;

Yeah, it is fine, thanks for your suggestion :)

Thanks,
Gao Xiang

>>
>> Is there no way to take old behavior into consideration in the block layer as well?
>
> One new interface, such as bio_add_page_limit(), may be helpful for
> addressing this issue, but not sure if it is necessary, since it is
> more reasonable to put the logic in user side.
>
>>
>>>
>>>> continuous blocks as well. It was practicable since the old
>>>> __bio_try_merge_page only tries to merge in the same page.
>>>> it is a kAPI behavior change which also affects bio_alloc...
>>>>
>>>> ...
>>>> 231 err = erofs_map_blocks(inode, &map, EROFS_GET_BLOCKS_RAW);
>>>> 232 if (unlikely(err))
>>>> 233 goto err_out;
>>>> ...
>>>> 284 /* max # of continuous pages */
>>>> 285 if (nblocks > DIV_ROUND_UP(map.m_plen, PAGE_SIZE))
>>>> 286 nblocks = DIV_ROUND_UP(map.m_plen, PAGE_SIZE);
>>>> 287 if (nblocks > BIO_MAX_PAGES)
>>>> 288 nblocks = BIO_MAX_PAGES;
>>>> 289
>>>> 290 bio = erofs_grab_bio(sb, blknr, nblocks, sb,
>>>> 291 read_endio, false);
>>>
>>> Previously this bio is allowed to add at most 'nblocks' pages, however,
>>> now we are allowed to add at most 'nblocks' io vecs, instead of pages.
>>>
>>>> 292 if (IS_ERR(bio)) {
>>>> 293 err = PTR_ERR(bio);
>>>> 294 bio = NULL;
>>>> 295 goto err_out;
>>>> 296 }
>>>> 297 }
>>>> 298
>>>> 299 err = bio_add_page(bio, page, PAGE_SIZE, 0);
>>>> 300 /* out of the extent or bio is full */
>>>> 301 if (err < PAGE_SIZE)
>>>> 302 goto submit_bio_retry;
>>>> ...
>>>>
>>>> After commit 07173c3ec276 ("block: enable multipage bvecs"), erofs could
>>>> read more beyond what erofs_map_blocks assigns, and out-of-bound data could
>>>> be read and it breaks tail-end inline determination.
>>>
>>> I don't understand why, could you explain a bit why erofs reads more?
>>
>> In current erofs, file could be break in two parts (non tail-end and tail-end blocks).
>> Considering this on-disk layout,
>> ________________________________________________________________
>> | non tail-end data | meta data |
>> |____________________________________|_(inode) + non-inline data_|
>> ^ ^
>> | |
>> bio start \-- what nr_iovecs indicates as well:
>> the end of the non tail-end data.
>>
>> Before this commit, it will stop just before the next meta data. However,
>> if the new bio merging behavior is introduced, it could add pages more than
>> nr_iovecs, thus meta data will be read as the normal data...
>
> I'd suggest you to not re-use bio->bi_max_vecs for this purpose.
>
>>
>>
>>>
>>> The amount depends on how many pages you added to the bio.
>>>
>>>>
>>>> I can change the logic in erofs. However, out of curiosity, I have no idea
>>>> if some other places also are designed like this.
>>>>
>>>> IMO, it's better to provide a total count which indicates how many real
>>>> pages have been added in this bio. some thoughts?
>>>
>>> As I mentioned, you may count how many pages added to bio, or you still
>>> can get the number via bio_segments(bio).
>>
>> It is unsuitable to introduce bio_segments for each bio_add_page...
>
> It isn't necessary, see the above patch.
>
>
> Thanks,
> Ming
>