Re: [PATCH] blk-throttle: Calculate allowed value only when the throttle is enabled

From: Yu Kuai
Date: Mon Oct 16 2023 - 22:38:37 EST


Hi,

在 2023/10/17 4:06, Khazhy Kumykov 写道:
On Sun, Oct 15, 2023 at 6:47 PM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:

Hi,

在 2023/10/14 5:51, Khazhy Kumykov 写道:
Looking at the generic mul_u64_u64_div_u64 impl, it doesn't handle
overflow of the final result either, as far as I can tell. So while on
x86 we get a DE, on non-x86 we just get the wrong result.

(Aside: after 8d6bbaada2e0 ("blk-throttle: prevent overflow while
calculating wait time"), setting a very-high bps_limit would probably
also cause this crash, no?)

Would it be possible to have a "check_mul_u64_u64_div_u64_overflow()",
where if the result doesn't fit in u64, we indicate (and let the
caller choose what to do? Here we should just return U64_MAX)?

Absent that, maybe we can take inspiration from the generic
mul_u64_u64_div_u64? (Forgive the paste)

static u64 calculate_bytes_allowed(u64 bps_limit, unsigned long jiffy_elapsed)
{
+ /* Final result probably won't fit in u64 */
+ if (ilog2(bps_limit) + ilog2(jiffy_elapsed) - ilog2(HZ) > 62)

I'm not sure, but this condition looks necessary, but doesn't look
sufficient, for example, jiffy_elapsed cound be greater than HZ, while
ilog2(jiffy_elapsed) is equal to ilog2(HZ).
I believe 62 is correct, although admittedly it's less "intuitive"
than the check in mul_u64_u64_div_u64()....

The result overflows if log2(A * B / C) >= 64, so we want to ensure that:
log2(A) + log2(B) - log2(C) < 64

Given that:
ilog2(A) <= log2(A) < ilog2(A) + 1 // truncation defn
It follows that:
-log2(A) <= -ilog2(A) // Inverse rule
log2(A) - 1 < ilog2(A)

Starting from:
ilog2(A) + ilog2(B) - ilog2(C) <= X

We can show:
(log2(A) - 1) + (log2(B) - 1) + (-log2(C)) < ilog2(A) + ilog2(B) +
(-ilog2(C)) // strict inequality here since the substitutions for A
and B terms are strictly less
(log2(A) - 1) + (log2(B) - 1) + (-log2(C)) < X
log2(A) + log2(B) - log2(C) < X + 2

So for X = 62, log2(A) + log2(B) - log2(C) < 64 must be true, and we
must be safe from overflow.

So... by converse, if ilog2(A) + ilog2(B) - ilog2(C) > 62, we cannot
guarantee that the result will not overflow - thus we bail out.

// end math

Thanks for the explanation, I understand that, so the problem is that
if the above condition(>62) match, the result may not overflow, but
U64_MAX is returned here, this is not correct but I guess this doesn't
matter in real word, it's impossible that user will issue more than
1<<63 bytes IO in an extended slice.

I'm good with this fix with some comments.

Thanks,
Kuai


It /is/ less exact than your proposal (sufficient, but not necessary),
but saves an extra 128bit mul.

I mostly just want us to pick /something/, since 6.6-rc and the LTSs
with the patch in question are busted currently. :)




Thanks,
Kuai

+ return U64_MAX;
return mul_u64_u64_div_u64(bps_limit, (u64)jiffy_elapsed, (u64)HZ);
}

.



.