Re: [PATCH] tg3: add new module param to force device power down on reboot

From: Heiner Kallweit
Date: Tue Jan 09 2024 - 15:31:39 EST


On 09.01.2024 20:45, Andrea Fois wrote:
> The bug #1917471 was fixed in commit 2ca1c94ce0b6 ("tg3: Disable tg3
> device on system reboot to avoid triggering AER") but was reintroduced
> by commit 9fc3bc764334 ("tg3: power down device only on
> SYSTEM_POWER_OFF").
>
> The problem described in #1917471 is still consistently replicable on
> reboots on Dell Servers (i.e. R750xs with BCM5720 LOM), causing NMIs
> (i.e. NMI received for unknown reason 38 on cpu 0) after 9fc3bc764334
> was committed.
>
> The problem is detected also by the Lifecycle controller and logged as
> a PCI Bus Error for the device.
>
> As the problems addressed by 2ca1c94ce0b6 and by 9fc3bc764334 requires
> opposite strategies, a new module param "force_pwr_down_on_reboot"
> <bool> is introduced to fix both scenarios:
>
Adding module parameters is discouraged. What I see could try:

- limit 9fc3bc764334 to the specific machine type mentioned in the
commit message (based DMI info)
- 2ca1c94ce0b6 performs two actions: power down tg3 and disable device
Based on the commit description disabling the device might be sufficient.

> force_pwr_down_on_reboot = 0/N/n = disable, keep the current
> behavior, don't force dev
> power down on reboot
>
> force_pwr_down_on_reboot = 1/Y/y = enable, revert to the
> behavior of 2ca1c94ce0b6,
> force dev power down on reboot
>
> Fixes: 9fc3bc764334 ("tg3: power down device only on SYSTEM_POWER_OFF")
> Signed-off-by: Andrea Fois <andrea.fois@xxxxxxxxxxxxx>
> ---
> drivers/net/ethernet/broadcom/tg3.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
> index f52830dfb26a..287786357c9b 100644
> --- a/drivers/net/ethernet/broadcom/tg3.c
> +++ b/drivers/net/ethernet/broadcom/tg3.c
> @@ -233,6 +233,12 @@ static int tg3_debug = -1; /* -1 == use TG3_DEF_MSG_ENABLE as value */
> module_param(tg3_debug, int, 0);
> MODULE_PARM_DESC(tg3_debug, "Tigon3 bitmapped debugging message enable value");
>
> +static bool force_pwr_down_on_reboot; /* false == Don't force the power down of
> + * the device during reboot, only on SYSTEM_POWER_OFF
> + */
> +module_param(force_pwr_down_on_reboot, bool, 0x644);
> +MODULE_PARM_DESC(force_pwr_down_on_reboot, "Tigon3 force power down of the device on reboot enable value");
> +
> #define TG3_DRV_DATA_FLAG_10_100_ONLY 0x0001
> #define TG3_DRV_DATA_FLAG_5705_10_100 0x0002
>
> @@ -18197,7 +18203,7 @@ static void tg3_shutdown(struct pci_dev *pdev)
> if (netif_running(dev))
> dev_close(dev);
>
> - if (system_state == SYSTEM_POWER_OFF)
> + if (system_state == SYSTEM_POWER_OFF || force_pwr_down_on_reboot)
> tg3_power_down(tp);
>
> rtnl_unlock();