Re: [PATCH net] net: ena: fix shift-out-of-bounds in exponential backoff

From: Shay Agroskin
Date: Thu Jul 13 2023 - 03:50:50 EST



Krister Johansen <kjlx@xxxxxxxxxxxxxxxxxx> writes:


On Tue, Jul 11, 2023 at 08:47:32PM +0300, Shay Agroskin wrote:

Krister Johansen <kjlx@xxxxxxxxxxxxxxxxxx> writes:

> diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c
> b/drivers/net/ethernet/amazon/ena/ena_com.c
> index 451c3a1b6255..633b321d7fdd 100644
> --- a/drivers/net/ethernet/amazon/ena/ena_com.c
> +++ b/drivers/net/ethernet/amazon/ena/ena_com.c
> @@ -35,6 +35,8 @@
> #define ENA_REGS_ADMIN_INTR_MASK 1
> +#define ENA_MAX_BACKOFF_DELAY_EXP 16U
> +
> #define ENA_MIN_ADMIN_POLL_US 100
> #define ENA_MAX_ADMIN_POLL_US 5000
> @@ -536,6 +538,7 @@ static int > ena_com_comp_status_to_errno(struct
> ena_com_admin_queue *admin_queue,
> static void ena_delay_exponential_backoff_us(u32 exp, u32 > delay_us)
> {
> + exp = min_t(u32, exp, ENA_MAX_BACKOFF_DELAY_EXP);
> delay_us = max_t(u32, ENA_MIN_ADMIN_POLL_US, delay_us);
> delay_us = min_t(u32, delay_us * (1U << exp), > ENA_MAX_ADMIN_POLL_US);
> usleep_range(delay_us, 2 * delay_us);

Hi, thanks for submitting this patch (:

Absolutely; thanks for the review!

Going over the logic here, the driver sleeps for `delay_us` micro-seconds in
each iteration that this function gets called.

For an exp = 14 it'd sleep (I added units notation)
delay_us * (2 ^ exp) us = 100 * (2 ^ 14) us = (10 * (2 ^ 14)) / (1000000) s
= 1.6 s

For an exp = 15 it'd sleep
(10 * (2 ^ 15)) / (1000000) = 3.2s

To even get close to an overflow value, say exp=29 the driver would sleep in
a single iteration
53687 s = 14.9 hours.

The driver should stop trying to get a response from the device after a
timeout period received from the device which is 3 seconds by default.

The point being, it seems very unlikely to hit this overflow. Did you
experience it or was the issue discovered by a static analyzer ?

No, no use of fuzzing or static analysis. This was hit on a production
instance that was having ENA trouble.

I'm apparently reading the code differently. I thought this line:

> delay_us = min_t(u32, delay_us * (1U << exp), > ENA_MAX_ADMIN_POLL_US);

Was going to cap that delay_us at (delay_us * (1U << exp)) or
5000us, whichever is smaller. By that measure, if delay_us is 100 and
ENA_MAX_ADMIN_POLL_US is 5000, this should start getting capped after
exp = 6, correct? By my estimate, that puts it at between 160ms and
320ms of sleeping before one could hit this problem.

I went and pulled the logs out of the archive and have the following
timeline. This is seconds from boot as reported by dmesg:

11244.226583 - ena warns TX not completed on time, 10112000 usecs since
last napi execution, missing tx timeout val of 5000 msec

11245.190453 - netdev watchdog fires

11245.190781 - ena records Transmit timeout
11245.250739 - ena records Trigger reset on

11246.812620 - UBSAN message to console

11248.590441 - ena reports Reset inidication didn't turn off
11250.633545 - ena reports failure to reset device
12013.529338 - last logline before new boot

While the difference between the panic and the trigger reset is more
than 320ms, it is definitely on the order of seconds instead of hours.


Yup you're right. I was so entangled in my exponent calculations that I completely missed the min_t expression there.

That's quite an awkward design to be honest, I hope to submit a re-write for it in one of the releases.

Thanks again for the work you've put into it