[PATCH net-next 2/2] net: dsa: realtek: rtl8365mb: serialize indirect PHY register access

From: Alvin Šipraga
Date: Wed Feb 16 2022 - 11:03:48 EST


From: Alvin Šipraga <alsi@xxxxxxxxxxxxxxx>

Realtek switches in the rtl8365mb family can access the PHY registers of
the internal PHYs via the switch registers. This method is called
indirect access. At a high level, the indirect PHY register access
method involves reading and writing some special switch registers in a
particular sequence. This works for both SMI and MDIO connected
switches.

Currently the rtl8365mb driver does not take any care to serialize the
aforementioned access to the switch registers. In particular, it is
permitted for other driver code to access other switch registers while
the indirect PHY register access is ongoing. Locking is only done at the
regmap level. This, however, is a bug: concurrent register access, even
to unrelated switch registers, risks corrupting the PHY register value
read back via the indirect access method described above.

Arınç reported that the switch sometimes returns nonsense data when
reading the PHY registers. In particular, a value of 0 causes the
kernel's PHY subsystem to think that the link is down, but since most
reads return correct data, the link then flip-flops between up and down
over a period of time.

The aforementioned bug can be readily observed by:

1. Enabling ftrace events for regmap and mdio
2. Polling BSMR PHY register for a connected port;
it should always read the same (e.g. 0x79ed)
3. Wait for step 2 to give a different value

Example command for step 2:

while true; do phytool read swp2/2/0x01; done

On my i.MX8MM, the above steps will yield a bogus value for the BSMR PHY
register within a matter of seconds. The interleaved register access it
then evident in the trace log:

kworker/3:4-70 [003] ....... 1927.139849: regmap_reg_write: ethernet-switch reg=1004 val=bd
phytool-16816 [002] ....... 1927.139979: regmap_reg_read: ethernet-switch reg=1f01 val=0
kworker/3:4-70 [003] ....... 1927.140381: regmap_reg_read: ethernet-switch reg=1005 val=0
phytool-16816 [002] ....... 1927.140468: regmap_reg_read: ethernet-switch reg=1d15 val=a69
kworker/3:4-70 [003] ....... 1927.140864: regmap_reg_read: ethernet-switch reg=1003 val=0
phytool-16816 [002] ....... 1927.140955: regmap_reg_write: ethernet-switch reg=1f02 val=2041
kworker/3:4-70 [003] ....... 1927.141390: regmap_reg_read: ethernet-switch reg=1002 val=0
phytool-16816 [002] ....... 1927.141479: regmap_reg_write: ethernet-switch reg=1f00 val=1
kworker/3:4-70 [003] ....... 1927.142311: regmap_reg_write: ethernet-switch reg=1004 val=be
phytool-16816 [002] ....... 1927.142410: regmap_reg_read: ethernet-switch reg=1f01 val=0
kworker/3:4-70 [003] ....... 1927.142534: regmap_reg_read: ethernet-switch reg=1005 val=0
phytool-16816 [002] ....... 1927.142618: regmap_reg_read: ethernet-switch reg=1f04 val=0
phytool-16816 [002] ....... 1927.142641: mdio_access: SMI-0 read phy:0x02 reg:0x01 val:0x0000 <- ?!
kworker/3:4-70 [003] ....... 1927.143037: regmap_reg_read: ethernet-switch reg=1001 val=0
kworker/3:4-70 [003] ....... 1927.143133: regmap_reg_read: ethernet-switch reg=1000 val=2d89
kworker/3:4-70 [003] ....... 1927.143213: regmap_reg_write: ethernet-switch reg=1004 val=be
kworker/3:4-70 [003] ....... 1927.143291: regmap_reg_read: ethernet-switch reg=1005 val=0
kworker/3:4-70 [003] ....... 1927.143368: regmap_reg_read: ethernet-switch reg=1003 val=0
kworker/3:4-70 [003] ....... 1927.143443: regmap_reg_read: ethernet-switch reg=1002 val=6

The kworker here is polling MIB counters for stats, as evidenced by the
register 0x1004 that we are writing to (RTL8365MB_MIB_ADDRESS_REG). This
polling is performed every 3 seconds, but is just one example of such
unsynchronized access.

Further investigation reveals the underlying problem: if we read from an
arbitrary register A and this read coincides with the indirect access
method in rtl8365mb_phy_ocp_read, then the final read from
RTL8365MB_INDIRECT_ACCESS_READ_DATA_REG will always return the value in
register A. The value read back can be readily poisoned by repeatedly
reading back the value of another register A via debugfs in a busy loop
via the dd utility or similar.

This issue appears to be unique to the indirect PHY register access
pattern. In particular, it does not seem to impact similar sequential
register operations such MIB counter access.

To fix this problem, one must guard against exactly the scenario seen in
the above trace. In particular, other parts of the driver using the
regmap API must not be permitted to access the switch registers until
the PHY register access is complete. Fix this by using the newly
introduced "nolock" regmap in all PHY-related functions, and by aquiring
the regmap mutex at the top level of the PHY register access callbacks.
Although no issue has been observed with PHY register _writes_, this
change also serializes the indirect access method there. This is done
purely as a matter of convenience.

Fixes: 4af2950c50c8 ("net: dsa: realtek-smi: add rtl8365mb subdriver for RTL8365MB-VC")
Link: https://lore.kernel.org/netdev/CAJq09z5FCgG-+jVT7uxh1a-0CiiFsoKoHYsAWJtiKwv7LXKofQ@xxxxxxxxxxxxxx/
Reported-by: Arınç ÜNAL <arinc.unal@xxxxxxxxxx>
Reported-by: Luiz Angelo Daros de Luca <luizluca@xxxxxxxxx>
Signed-off-by: Alvin Šipraga <alsi@xxxxxxxxxxxxxxx>
---
drivers/net/dsa/realtek/rtl8365mb.c | 54 ++++++++++++++++++-----------
1 file changed, 33 insertions(+), 21 deletions(-)

diff --git a/drivers/net/dsa/realtek/rtl8365mb.c b/drivers/net/dsa/realtek/rtl8365mb.c
index 2ed592147c20..c39d6b744597 100644
--- a/drivers/net/dsa/realtek/rtl8365mb.c
+++ b/drivers/net/dsa/realtek/rtl8365mb.c
@@ -590,7 +590,7 @@ static int rtl8365mb_phy_poll_busy(struct realtek_priv *priv)
{
u32 val;

- return regmap_read_poll_timeout(priv->map,
+ return regmap_read_poll_timeout(priv->map_nolock,
RTL8365MB_INDIRECT_ACCESS_STATUS_REG,
val, !val, 10, 100);
}
@@ -604,7 +604,7 @@ static int rtl8365mb_phy_ocp_prepare(struct realtek_priv *priv, int phy,
/* Set OCP prefix */
val = FIELD_GET(RTL8365MB_PHY_OCP_ADDR_PREFIX_MASK, ocp_addr);
ret = regmap_update_bits(
- priv->map, RTL8365MB_GPHY_OCP_MSB_0_REG,
+ priv->map_nolock, RTL8365MB_GPHY_OCP_MSB_0_REG,
RTL8365MB_GPHY_OCP_MSB_0_CFG_CPU_OCPADR_MASK,
FIELD_PREP(RTL8365MB_GPHY_OCP_MSB_0_CFG_CPU_OCPADR_MASK, val));
if (ret)
@@ -617,8 +617,8 @@ static int rtl8365mb_phy_ocp_prepare(struct realtek_priv *priv, int phy,
ocp_addr >> 1);
val |= FIELD_PREP(RTL8365MB_INDIRECT_ACCESS_ADDRESS_OCPADR_9_6_MASK,
ocp_addr >> 6);
- ret = regmap_write(priv->map, RTL8365MB_INDIRECT_ACCESS_ADDRESS_REG,
- val);
+ ret = regmap_write(priv->map_nolock,
+ RTL8365MB_INDIRECT_ACCESS_ADDRESS_REG, val);
if (ret)
return ret;

@@ -631,36 +631,42 @@ static int rtl8365mb_phy_ocp_read(struct realtek_priv *priv, int phy,
u32 val;
int ret;

+ mutex_lock(&priv->map_lock);
+
ret = rtl8365mb_phy_poll_busy(priv);
if (ret)
- return ret;
+ goto out;

ret = rtl8365mb_phy_ocp_prepare(priv, phy, ocp_addr);
if (ret)
- return ret;
+ goto out;

/* Execute read operation */
val = FIELD_PREP(RTL8365MB_INDIRECT_ACCESS_CTRL_CMD_MASK,
RTL8365MB_INDIRECT_ACCESS_CTRL_CMD_VALUE) |
FIELD_PREP(RTL8365MB_INDIRECT_ACCESS_CTRL_RW_MASK,
RTL8365MB_INDIRECT_ACCESS_CTRL_RW_READ);
- ret = regmap_write(priv->map, RTL8365MB_INDIRECT_ACCESS_CTRL_REG, val);
+ ret = regmap_write(priv->map_nolock, RTL8365MB_INDIRECT_ACCESS_CTRL_REG,
+ val);
if (ret)
- return ret;
+ goto out;

ret = rtl8365mb_phy_poll_busy(priv);
if (ret)
- return ret;
+ goto out;

/* Get PHY register data */
- ret = regmap_read(priv->map, RTL8365MB_INDIRECT_ACCESS_READ_DATA_REG,
- &val);
+ ret = regmap_read(priv->map_nolock,
+ RTL8365MB_INDIRECT_ACCESS_READ_DATA_REG, &val);
if (ret)
- return ret;
+ goto out;

*data = val & 0xFFFF;

- return 0;
+out:
+ mutex_unlock(&priv->map_lock);
+
+ return ret;
}

static int rtl8365mb_phy_ocp_write(struct realtek_priv *priv, int phy,
@@ -669,32 +675,38 @@ static int rtl8365mb_phy_ocp_write(struct realtek_priv *priv, int phy,
u32 val;
int ret;

+ mutex_lock(&priv->map_lock);
+
ret = rtl8365mb_phy_poll_busy(priv);
if (ret)
- return ret;
+ goto out;

ret = rtl8365mb_phy_ocp_prepare(priv, phy, ocp_addr);
if (ret)
- return ret;
+ goto out;

/* Set PHY register data */
- ret = regmap_write(priv->map, RTL8365MB_INDIRECT_ACCESS_WRITE_DATA_REG,
- data);
+ ret = regmap_write(priv->map_nolock,
+ RTL8365MB_INDIRECT_ACCESS_WRITE_DATA_REG, data);
if (ret)
- return ret;
+ goto out;

/* Execute write operation */
val = FIELD_PREP(RTL8365MB_INDIRECT_ACCESS_CTRL_CMD_MASK,
RTL8365MB_INDIRECT_ACCESS_CTRL_CMD_VALUE) |
FIELD_PREP(RTL8365MB_INDIRECT_ACCESS_CTRL_RW_MASK,
RTL8365MB_INDIRECT_ACCESS_CTRL_RW_WRITE);
- ret = regmap_write(priv->map, RTL8365MB_INDIRECT_ACCESS_CTRL_REG, val);
+ ret = regmap_write(priv->map_nolock, RTL8365MB_INDIRECT_ACCESS_CTRL_REG,
+ val);
if (ret)
- return ret;
+ goto out;

ret = rtl8365mb_phy_poll_busy(priv);
if (ret)
- return ret;
+ goto out;
+
+out:
+ mutex_unlock(&priv->map_lock);

return 0;
}
--
2.35.0