Re: [PATCH net-next v1] net: phy: nxp-tja11xx: log critical health state

From: Guenter Roeck
Date: Tue Aug 10 2021 - 14:25:22 EST


On 8/10/21 8:05 AM, Andrew Lunn wrote:
Hi Oleksij

@@ -89,6 +91,12 @@ static struct tja11xx_phy_stats tja11xx_hw_stats[] = {
 	{ "phy_polarity_detect", 25, 6, BIT(6) },
 	{ "phy_open_detect", 25, 7, BIT(7) },
 	{ "phy_short_detect", 25, 8, BIT(8) },
+	{ "phy_temp_warn (temp > 155C°)", 25, 9, BIT(9) },
+	{ "phy_temp_high (temp > 180C°)", 25, 10, BIT(10) },
+	{ "phy_uv_vddio", 25, 11, BIT(11) },
+	{ "phy_uv_vddd_1v8", 25, 13, BIT(13) },
+	{ "phy_uv_vdda_3v3", 25, 14, BIT(14) },
+	{ "phy_uv_vddd_3v3", 25, 15, BIT(15) },
 	{ "phy_rem_rcvr_count", 26, 0, GENMASK(7, 0) },
 	{ "phy_loc_rcvr_count", 26, 8, GENMASK(15, 8) },

I'm not so happy abusing the statistics counters like this, especially
when we have a better API for temperature and voltage: hwmon.

phy_temp_warn maps to hwmon_temp_max_alarm. phy_temp_high maps to
either hwmon_temp_crit_alarm or hwmon_temp_emergency_alarm.

The undervoltage bits map to hwmon_in_lcrit_alarm.
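
For illustration only, the mapping suggested above could be sketched in plain userspace C as follows. The bit positions come from the stats table in the patch; the struct, the function name, and the choice of hwmon_temp_crit_alarm over hwmon_temp_emergency_alarm for phy_temp_high are assumptions made for this sketch, not part of the driver or of the hwmon API.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions in register 25, as listed in the patch's stats table. */
#define TJA_TEMP_WARN   (1u << 9)   /* temp > 155 C */
#define TJA_TEMP_HIGH   (1u << 10)  /* temp > 180 C */
#define TJA_UV_VDDIO    (1u << 11)
#define TJA_UV_VDDD_1V8 (1u << 13)
#define TJA_UV_VDDA_3V3 (1u << 14)
#define TJA_UV_VDDD_3V3 (1u << 15)

/* Hypothetical container for the alarm flags a hwmon device would report. */
struct tja_alarms {
	int temp_max_alarm;    /* would back hwmon_temp_max_alarm */
	int temp_crit_alarm;   /* would back hwmon_temp_crit_alarm (assumed choice) */
	int in_lcrit_alarm[4]; /* would back hwmon_in_lcrit_alarm, one per supply */
};

/* Decode a raw register 25 value into per-attribute alarm flags (0 or 1). */
static struct tja_alarms tja_decode_alarms(uint16_t reg25)
{
	struct tja_alarms a;

	a.temp_max_alarm    = !!(reg25 & TJA_TEMP_WARN);
	a.temp_crit_alarm   = !!(reg25 & TJA_TEMP_HIGH);
	a.in_lcrit_alarm[0] = !!(reg25 & TJA_UV_VDDIO);
	a.in_lcrit_alarm[1] = !!(reg25 & TJA_UV_VDDD_1V8);
	a.in_lcrit_alarm[2] = !!(reg25 & TJA_UV_VDDA_3V3);
	a.in_lcrit_alarm[3] = !!(reg25 & TJA_UV_VDDD_3V3);
	return a;
}
```

In a real driver the decoded flags would be returned from a hwmon read callback rather than collected in a struct like this; the sketch only shows which bit would feed which alarm attribute.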


FWIW, the statistics counters in this driver are already abused
(phy_polarity_detect, phy_open_detect, phy_short_detect), so
I am not sure if adding more abuse makes a difference (and/or
if such abuse is common for phy drivers in general).

Guenter

@@ -630,6 +640,11 @@ static irqreturn_t tja11xx_handle_interrupt(struct phy_device *phydev)
 		return IRQ_NONE;
 	}
+	if (irq_status & MII_INTSRC_TEMP_ERR)
+		dev_err(dev, "Overtemperature error detected (temp > 155C°).\n");
+	if (irq_status & MII_INTSRC_UV_ERR)
+		dev_err(dev, "Undervoltage error detected.\n");
+

These are not actual errors in the Linux sense, so use dev_warn() or
maybe dev_info().
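
A rough userspace sketch of the hunk with the log level lowered, for illustration only. The dev_warn() macro here is a stand-in that prints to stderr, not the kernel's; the bit values for MII_INTSRC_TEMP_ERR and MII_INTSRC_UV_ERR are assumed and should be checked against the driver; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed bit values; verify against the driver's own definitions. */
#define MII_INTSRC_TEMP_ERR (1u << 1)
#define MII_INTSRC_UV_ERR   (1u << 3)

/* Userspace stand-in for the kernel's dev_warn(); ignores dev, prints to stderr. */
#define dev_warn(dev, ...) fprintf(stderr, "warning: " __VA_ARGS__)

/* Report health events at warning level; returns how many were reported. */
static int tja11xx_report_health(const void *dev, uint16_t irq_status)
{
	int warned = 0;

	(void)dev; /* unused by the stand-in macro */

	if (irq_status & MII_INTSRC_TEMP_ERR) {
		dev_warn(dev, "Overtemperature detected (temp > 155 C).\n");
		warned++;
	}
	if (irq_status & MII_INTSRC_UV_ERR) {
		dev_warn(dev, "Undervoltage detected.\n");
		warned++;
	}
	return warned;
}
```

The return value exists only so the sketch can be exercised outside the kernel; the interrupt handler in the patch would keep its own return path.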

Andrew