Re: [PATCH net-next 10/16] net: pcs: xpcs: Add generic DW XPCS MDIO-device support

From: Vladimir Oltean
Date: Fri Dec 08 2023 - 11:33:57 EST


On Fri, Dec 08, 2023 at 05:11:20PM +0300, Serge Semin wrote:
> My idea was to reuse the mdio_device which has already been created
> either by means of the MDIO-bus OF-subnode or by means of the MDIO-bus
> board_info infrastructure (can be utilized in the SJA1105 or Wangxun
> Tx GBE). The xpcs_create() method then either probes the device on the MDIO
> bus and gets ID from there, or just uses the custom IDs based on the
> OF compatible match table or on the platform_data. If no MDIO-device
> was created my patchset is supposed to preserve the previous
> semantics: create MDIO-device, probe the device on the MDIO-bus, get
> device IDs from there. See the next patch for more details:
> https://lore.kernel.org/netdev/20231205103559.9605-11-fancer.lancer@xxxxxxxxx/
>
> > That was attempted a while ago by
> > Sean Anderson with the Lynx PCS. Are you aware of the fact that even in
> > the good case in which binding the driver actually works, the user can
> > then come along and unbind it from the PCS device, and phylink isn't
> > prepared to handle that, so it will crash the kernel upon the next
> > phylink_pcs call?
>
> To be honest I didn't consider the driver bind/unbind option. But my
> case a bit different. DW XPCS MDIO-device is supposed to be created
> automatically by means of the DW XPCS MI driver from the DT-nodes
> hierarchy like this:
> mdio@1f05d000 {
> compatible = "snps,dw-xpcs-mi";
> reg = <0 0x1f05d000 0 0x1000>;
>
> xgmac_pcs: ethernet-pcs@0 {
> compatible = "snps,dw-xpcs";
> reg = <0>;
> };
> };
> The platform-device is created for the mdio@1f05d000 node for which
> the DW XPCS MI driver is loaded, which calls the
> devm_of_mdiobus_register() in the probe() method which registers the
> MDIO-bus and then creates the MDIO-device from the ethernet-pcs@0
> node. The DW XPCS MDIO-device driver is attached to that MDIO-device
> then. In such model the PCS can be supplied to the DW *MAC via the
> "pcs-handle = &xgmac_pcs" property.
>
> Regarding the current semantics it's preserved in the framework of the
> xpcs_create_byaddr() method (former xpcs_create_mdiodev()) by means of
> the next code snippet:
> if (mdiobus_is_registered_device(bus, addr)) {
> mdiodev = bus->mdio_map[addr];
> mdio_device_get(mdiodev);
> } else {
> mdiodev = mdio_device_create(bus, addr);
> if (IS_ERR(mdiodev))
> return ERR_CAST(mdiodev);
> }
> Device can be automatically created if before registering the MDIO-bus
> the xpcs_create_byaddr() caller registered the MDIO-device board info
> by means of the mdiobus_register_board_info() method. In addition to
> that it's now possible to supply some custom data (custom device IDs
> in my implementation) to the XPCS driver by means of the
> mdio_board_info.platform_data field. See the next patch for
> reference:
> https://lore.kernel.org/netdev/20231205103559.9605-14-fancer.lancer@xxxxxxxxx
>
> So what the difference with the Lynx PCS is that in my case the
> MDIO-device is created automatically as a result of the DW XPCS MI
> MDIO-bus registration. Additionally I implemented the MDIO-device
> creation based on the MDIO-board-info, thus there won't be need in the
> calling mdio_device_create() on each xpcs_create_mdiodev() invocation.
> The later part isn't that important in the framework of this
> conversation, but just so you be aware.

It's not really different, though. You can connect to the Lynx PCS both
ways, see dpaa2_pcs_create() which also searches for a pcs-handle before
calling lynx_pcs_create_fwnode(). What's subtly different is that we
don't (yet) have "fsl,lynx-pcs" compatible strings in the device tree.
So the MDIO controller will register the PCS devices as struct phy_device
(which still have an underlying struct mdio_device). The PCS layer
connects to the underlying struct mdio_device, and the phy_device on top
remains unconnected to any phylib/phylink MAC driver. That is confusing,
I should really get to adding those compatible strings to suppress the
phy_device creation.

> Regarding the driver bind/unbind. As I said I didn't actually consider
> that option. On the other hand my DW XPCS MDIO-device driver doesn't
> do actual probe() or remove(). The only implemented thing is the
> of_device_id table, which is used to assign PCS and PMA IDs if
> required based on the DT compatible property. So I can easily drop any
> MDIO device-driver part and parse the of_device_id table right in the
> xpcs_create_bynode(). From that perspective my implementation won't
> differ much from the Lynx PCS design. The only difference will be is
> the way the MDIO-bus is created and registered. In case of Lynx PCS
> the bus is created by the MAC-driver itself.

Nope, not true. Follow the pcs-handle in arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi.

> In my case DW XPCS MI is currently created in the framework of the
> separate platform driver. Do you think it would be better to follow
> the Lynx design pattern in order to get rid from the possibility of
> the DW XPCS MI driver being unbound behind the STMMAC+XPCS couple
> back?

I think you actually pointed out a flaw in the Lynx PCS design too.
Actually, it is a larger flaw in the kernel. You can also unbind the
MDIO bus which holds the phy_device, and phylib (and therefore also
phylink) won't expect that either, so it will crash.

> In this case the Dw MAC DT-node hierarchy would look like this:
>
> xgmac: ethernet@1f054000 {
> compatible = "snps,dwxgmac";
> reg = <0 0x1f054000 0 0x4000>;
> reg-names = "stmmaceth";
> ranges;
>
> ...
>
> pcs-handle = &xgmac_pcs;
>
> // DW XPCS MI to access the DW XPCS attached to the device
> mdio@1f05d000 {
> compatible = "snps,dwmac-mi";
> reg = <0 0x1f05d000 0 0x1000>;
>
> xgmac_pcs: ethernet-pcs@0 {
> compatible = "snps,dw-xpcs";
> reg = <0>;
> };
> };
>
> // Normal MDIO-bus to access external PHYs (it's also called
> // as SMA - Station Management Agent - by Synopsys)
> mdio {
> compatible = "snps,dwmac-mdio";
> #address-cells = <1>;
> #size-cells = <0>;
> };
> };
>
> I actually thought to use that hardware description pattern instead,
> but after some meditation around that I decided that having the DW
> XPCS device defined separately from the DW MAC node seemed better at
> least from the code separation point of view. Now I think that it
> wasn't the best decision. DW XPCS is always attached to the DW XGMAC
> controller. So it would be more correct having it defined as a
> sub-node. It would also helped to avoid the platform device driver
> bind/unbind problem.
>
> What do you think? Should I re-design my patchset to be supporting the
> design above? (After having conversion with you I am more inclined to
> do that now than to stick with the currently implemented solution.)

I think that the placement of the "mdio" node as lateral vs subordinate
to the "ethernet" node would have fixed the issue by mistake. We should
be looking at it as a structural problem of the kernel instead. Don't
let it influence what you believe should be the correct design.

> > The pcs-rzn1-miic.c driver puts a device_link to the MAC to at least
> > tear down the whole thing when the PCS is unbound, which is saner than
> > crashing the kernel. I don't see the equivalent protection mechanism here?
>
> You are right. I don't have any equivalent protection here. Thanks for
> suggesting a solution.

I think that a device link between the "ethernet" device and the "mdio"
device (controller, parent of the PHY or PCS), if the Ethernet is not a
parent of the MDIO controller, could also solve that. But it would also
require ACK from PHY maintainers, who may have grander plans to address
this snag.

> > Can't the xpcs continue to live without a bound driver? Having a
> > compatible string in the OF description is perfectly fine though,
> > and should absolutely not preclude that.
>
> As I explained above Dw XPCS device can live without a bound driver
> because the DW XPCS MDIO-driver doesn't do much but merely gets to be
> bound based on the of_device_id table. In my case the problem is in
> the DW XPCS MI driver which indeed can be detached. Please see my
> long-read text above.

Yeah, common design, common problem.