Re: [PATCH net-next v1] net: phy: nxp-tja11xx: log critical health state

From: Andrew Lunn
Date: Tue Aug 10 2021 - 11:06:17 EST


Hi Oleksij

> @@ -89,6 +91,12 @@ static struct tja11xx_phy_stats tja11xx_hw_stats[] = {
> { "phy_polarity_detect", 25, 6, BIT(6) },
> { "phy_open_detect", 25, 7, BIT(7) },
> { "phy_short_detect", 25, 8, BIT(8) },
> + { "phy_temp_warn (temp > 155C°)", 25, 9, BIT(9) },
> + { "phy_temp_high (temp > 180C°)", 25, 10, BIT(10) },
> + { "phy_uv_vddio", 25, 11, BIT(11) },
> + { "phy_uv_vddd_1v8", 25, 13, BIT(13) },
> + { "phy_uv_vdda_3v3", 25, 14, BIT(14) },
> + { "phy_uv_vddd_3v3", 25, 15, BIT(15) },
> { "phy_rem_rcvr_count", 26, 0, GENMASK(7, 0) },
> { "phy_loc_rcvr_count", 26, 8, GENMASK(15, 8) },

I'm not happy about abusing the statistics counters like this,
especially when we have a better API for temperature and voltage: hwmon.

phy_temp_warn maps to hwmon_temp_max_alarm. phy_temp_high maps to
either hwmon_temp_crit_alarm or hwmon_temp_emergency_alarm.

The undervoltage flags map to hwmon_in_lcrit_alarm.
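
To make the mapping concrete, here is a rough userspace sketch of the
decode a hwmon read callback would end up doing. It is not driver code.
The bit positions come from the table in the patch; the enum, the
function name tja_read_alarm() and the voltage channel order are
made up for illustration, and the enum values only stand in for
hwmon_temp_max_alarm, hwmon_temp_crit_alarm and hwmon_in_lcrit_alarm.

```c
#include <assert.h>	/* static_assert */
#include <stdint.h>

/* Bit positions in register 25, taken from the quoted patch. */
#define TJA_TEMP_WARN		(1u << 9)	/* phy_temp_warn */
#define TJA_TEMP_HIGH		(1u << 10)	/* phy_temp_high */
#define TJA_UV_VDDIO		(1u << 11)	/* phy_uv_vddio */
#define TJA_UV_VDDD_1V8		(1u << 13)	/* phy_uv_vddd_1v8 */
#define TJA_UV_VDDA_3V3		(1u << 14)	/* phy_uv_vdda_3v3 */
#define TJA_UV_VDDD_3V3		(1u << 15)	/* phy_uv_vddd_3v3 */

/* Hypothetical stand-ins for the hwmon alarm attributes named above. */
enum alarm_attr {
	ALARM_TEMP_MAX,		/* plays the role of hwmon_temp_max_alarm */
	ALARM_TEMP_CRIT,	/* plays the role of hwmon_temp_crit_alarm */
	ALARM_IN_LCRIT,		/* plays the role of hwmon_in_lcrit_alarm */
};

#define TJA_NUM_IN_CHANNELS	4

/* Assumed channel order: one voltage input per undervoltage flag. */
static const uint16_t tja_uv_bits[] = {
	TJA_UV_VDDIO,
	TJA_UV_VDDD_1V8,
	TJA_UV_VDDA_3V3,
	TJA_UV_VDDD_3V3,
};

static_assert(sizeof(tja_uv_bits) / sizeof(tja_uv_bits[0]) ==
	      TJA_NUM_IN_CHANNELS,
	      "one undervoltage bit per voltage channel");

/*
 * Decode one alarm from a raw register 25 value.
 * Returns 1 if the alarm is set, 0 if it is clear, and -1 for an
 * attribute/channel combination this sketch does not know about.
 */
int tja_read_alarm(uint16_t reg, enum alarm_attr attr, int channel)
{
	switch (attr) {
	case ALARM_TEMP_MAX:
		return channel == 0 ? !!(reg & TJA_TEMP_WARN) : -1;
	case ALARM_TEMP_CRIT:
		return channel == 0 ? !!(reg & TJA_TEMP_HIGH) : -1;
	case ALARM_IN_LCRIT:
		if (channel < 0 || channel >= TJA_NUM_IN_CHANNELS)
			return -1;
		return !!(reg & tja_uv_bits[channel]);
	}
	return -1;
}
```

In a real driver the same switch would sit behind the hwmon read
callback, with the register value coming from a PHY read rather than a
function argument.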

> @@ -630,6 +640,11 @@ static irqreturn_t tja11xx_handle_interrupt(struct phy_device *phydev)
> return IRQ_NONE;
> }
>
> + if (irq_status & MII_INTSRC_TEMP_ERR)
> + dev_err(dev, "Overtemperature error detected (temp > 155C°).\n");
> + if (irq_status & MII_INTSRC_UV_ERR)
> + dev_err(dev, "Undervoltage error detected.\n");
> +

These are not actual errors in the Linux sense, so use dev_warn() or
maybe dev_info().

Andrew