Re: [PATCH] ext4: correct best extent lstart adjustment logic

From: Jan Kara
Date: Wed Jan 31 2024 - 07:46:50 EST


[Added Ojaswin to CC as an author of the discussed patch]

On Mon 22-01-24 20:33:32, Baokun Li wrote:
> When yangerkun reviewed commit 93cdf49f6eca ("ext4: Fix best extent lstart
> adjustment logic in ext4_mb_new_inode_pa()"), it was found that the best
> extent did not completely cover the original request after adjusting the
> best extent lstart in ext4_mb_new_inode_pa() as follows:
>
> original request: 2/10(8)
> normalized request: 0/64(64)
> best extent: 0/9(9)
>
> When we check if best ex can be kept at start of goal, ac_o_ex.fe_logical
> is 2, which is less than the adjusted best extent logical end 9, so we
> think the adjustment is done. But obviously 0/9(9) doesn't cover 2/10(8),
> so we should determine here if the original request logical end is less
> than or equal to the adjusted best extent logical end.

I'm sorry for the somewhat delayed reply. Why do you think it is a problem if the
resulting extent doesn't cover the full original range? We must always
cover the first block of the original extent so that the allocation makes
forward progress. But otherwise we choose to align to the start / end of
the goal range to reduce fragmentation even if we don't cover the whole
requested range - the rest of the range will be covered by the next
allocation. Also, trying to cover the whole original range runs into the
problem described in [1]. Essentially, the goal range does not need to cover
the whole original range, so if we align the allocated range to cover the
whole original range, it may exceed the goal range, overlap other
preallocations, and trigger asserts in the prealloc code.
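
To make the placement rule concrete, here is a minimal userspace sketch of
the three-step choice. It is not the kernel code: struct range and
place_best_extent() are made-up names for illustration, ranges are counted
in plain blocks (the cluster conversion the kernel does is left out), and
the asserts only mirror the preconditions I'm assuming.

```c
#include <assert.h>

/* Hypothetical userspace model; these names do not exist in mballoc.c. */
struct range {
	long long start;	/* first logical block */
	long long len;		/* length in blocks */
};

/* Exclusive logical end of a range. */
static long long range_end(struct range r)
{
	return r.start + r.len;
}

/*
 * Choose the logical start of a best extent of best_len blocks that is
 * shorter than the goal range. Every branch keeps the first block of the
 * original request (orig.start) inside the extent; none of them promises
 * to cover the end of the original request.
 */
static long long place_best_extent(struct range goal, struct range orig,
				   long long best_len)
{
	long long start;

	/* Assumed preconditions: goal starts at or before the request. */
	assert(goal.start <= orig.start);
	assert(best_len > 0 && best_len <= goal.len);

	/* 1. Keep the extent at the end of the goal if it covers orig.start. */
	start = range_end(goal) - best_len;
	if (orig.start >= start)
		return start;

	/* 2. Else keep it at the start of the goal if it covers orig.start. */
	start = goal.start;
	if (orig.start < start + best_len)
		return start;

	/* 3. Else start the extent at the original request. */
	return orig.start;
}
```

With the numbers from the report (goal 0/64(64), original 2/10(8), best
extent length 9), step 1 would start the extent at 55, which misses block 2,
so step 2 applies and place_best_extent() returns 0. The extent 0/9(9)
covers the first requested block, and block 9 is left to the next allocation.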

So if we decide to handle the case you describe in a better way, we'd need
something that makes sure we don't exceed the goal range.
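
As a rough illustration of the kind of check this would take (a userspace
sketch, not a proposed patch), the extent could be required to cover the
whole original request only when it also stays inside the goal range.
struct blk_range, range_within() and can_cover_whole_request() are
hypothetical names, and ranges are in plain blocks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; these names do not exist in mballoc.c. */
struct blk_range {
	long long start;	/* first logical block */
	long long len;		/* length in blocks */
};

/* Exclusive logical end of a range. */
static long long blk_range_end(struct blk_range r)
{
	assert(r.len >= 0);
	return r.start + r.len;
}

/* True if inner lies entirely inside outer. */
static bool range_within(struct blk_range inner, struct blk_range outer)
{
	return inner.start >= outer.start &&
	       blk_range_end(inner) <= blk_range_end(outer);
}

/*
 * True only if the candidate extent covers the whole original request
 * AND does not stick out of the goal range. When this is false, the
 * caller would have to settle for covering just the start of the request.
 */
static bool can_cover_whole_request(struct blk_range goal,
				    struct blk_range orig,
				    struct blk_range cand)
{
	return range_within(orig, cand) && range_within(cand, goal);
}
```

For a hypothetical goal 0/64(64) and original request 60/70(10), no
candidate passes: covering block 69 means ending past block 64. For the
ranges in the report, 0/10(10) would pass while 0/9(9) would not.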

Honza

[1] https://lore.kernel.org/all/Y+UzQJRIJEiAr4Z4@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/

>
> Moreover, the best extent len is not modified during the adjustment
> process, and it is already checked by the previous assertion, so replace
> the check for fe_len with a check for the best extent logical end.
>
> Cc: stable@xxxxxxxxxx
> Fixes: 93cdf49f6eca ("ext4: Fix best extent lstart adjustment logic in ext4_mb_new_inode_pa()")
> Signed-off-by: yangerkun <yangerkun@xxxxxxxxxx>
> Signed-off-by: Baokun Li <libaokun1@xxxxxxxxxx>
> ---
> fs/ext4/mballoc.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index f44f668e407f..fa5977fe8d72 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -5146,6 +5146,7 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context *ac)
>  			.fe_len = ac->ac_orig_goal_len,
>  		};
>  		loff_t orig_goal_end = extent_logical_end(sbi, &ex);
> +		loff_t o_ex_end = extent_logical_end(sbi, &ac->ac_o_ex);
>
>  		/* we can't allocate as much as normalizer wants.
>  		 * so, found space must get proper lstart
> @@ -5161,7 +5162,7 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context *ac)
>  		 * 1. Check if best ex can be kept at end of goal (before
>  		 *    cr_best_avail trimmed it) and still cover original start
>  		 * 2. Else, check if best ex can be kept at start of goal and
> -		 *    still cover original start
> +		 *    still cover original end
>  		 * 3. Else, keep the best ex at start of original request.
>  		 */
>  		ex.fe_len = ac->ac_b_ex.fe_len;
> @@ -5171,7 +5172,7 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context *ac)
>  			goto adjust_bex;
>
>  		ex.fe_logical = ac->ac_g_ex.fe_logical;
> -		if (ac->ac_o_ex.fe_logical < extent_logical_end(sbi, &ex))
> +		if (o_ex_end <= extent_logical_end(sbi, &ex))
>  			goto adjust_bex;
>
>  		ex.fe_logical = ac->ac_o_ex.fe_logical;
> @@ -5179,7 +5180,7 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context *ac)
>  		ac->ac_b_ex.fe_logical = ex.fe_logical;
>
>  		BUG_ON(ac->ac_o_ex.fe_logical < ac->ac_b_ex.fe_logical);
> -		BUG_ON(ac->ac_o_ex.fe_len > ac->ac_b_ex.fe_len);
> +		BUG_ON(o_ex_end > extent_logical_end(sbi, &ex));
>  		BUG_ON(extent_logical_end(sbi, &ex) > orig_goal_end);
>  	}
>
> --
> 2.31.1
>
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR