Re: [RFC PATCH net-next 0/9] net: pcs: Add support for devices probed in the "usual" manner

From: Sean Anderson
Date: Fri Jul 29 2022 - 18:15:52 EST


On 7/21/22 5:41 PM, Sean Anderson wrote:
> On 7/20/22 9:53 AM, Vladimir Oltean wrote:
>> On Tue, Jul 19, 2022 at 03:34:45PM -0400, Sean Anderson wrote:
>>> We could do it, but it'd be a pretty big hack. Something like the
>>> following. Phylink would need to be modified to grab the lock before
>>> every op and check if the PCS is dead or not. This is of course still
>>> not optimal, since there's no way to re-attach a PCS once it goes away.
>>
>> You assume it's just phylink who operates on a PCS structure, but if you
>> include your search pool to also cover include/linux/pcs/pcs-xpcs.h,
>> you'll see a bunch of exported functions which are called directly by
>> the client drivers (stmmac, sja1105). At this stage it gets pretty hard
>> to validate that drivers won't attempt from any code path to do
>> something stupid with a dead PCS. All in all it creates an environment
>> with insanely weak guarantees; that's pretty hard to get behind IMO.
>
> Right. To do this properly, we'd need wrapper functions for all the PCS
> operations. And the super-weak guarantees is why I referred to this as a
> "hack". But we could certainly make it so that removing a PCS might not
> bring down the MAC.
>
>>> IMO a better solution is to use devlink and submit a patch to add
>>> notifications which the MAC driver can register for. That way it can
>>> find out when the PCS goes away and potentially do something about it
>>> (or just let itself get removed).
>>
>> Not sure I understand what connection there is between devlink (device
>> links) and PCS {de}registration notifications.
>
> The default action when a supplier is going to be removed is to remove
> the consumers. However, it'd be nice to notify the consumer beforehand.
> If we used device links, this would need to be integrated (since otherwise
> we'd only find out that a PCS was gone after the MAC was gone too).
>
>> We could probably add those
>> notifications without any intervention from the device core: we would
>> just need to make this new PCS "core" to register an blocking_notifier_call_chain
>> to which interested drivers could add their notifier blocks. How a> certain phylink user is going to determine that "hey, this PCS is
>> definitely mine and I can use it" is an open question. In any case, my
>> expectation is that we have a notifier chain, we can at least continue
>> operating (avoid unbinding the struct device), but essentially move our
>> phylink_create/phylink_destroy calls to within those notifier blocks.
>> Again, retrofitting this model to existing drivers, phylink API (and
>> maybe even its internal structure) is something that's hard to hop on
>> board of; I think it's a solution waiting for a problem, and I don't
>> have an interest to develop or even review it.
>
> I don't either. I'd much rather just bring down the whole MAC when any
> PCS gets removed. Whatever we decide on doing here should also be done
> for (serdes) phys as well, since they have all the same pitfalls. For
> that reason I'd rather use a generic, non-intrusive solution like device
> links. I know Russell mentioned composite devices, but I think those
> would have similar advantages/drawbacks as a device-link-based solution
> (unbinding of one device unbinds the rest).

OK, I've thought about this a bit more. Right now, you can crash the
kernel by unbinding a phy (either serdes or networking) and waiting for
a state change. The serdes problem could probably be solved by
strengthening the existing device_link_add calls. This will of course
unbind the netdev if you unbind the serdes. I think this is not a common
use case; if a user does this they probably know what they're doing (or
not).

The problem with ethernet phys is a bit worse. It's very common to have
something like

+ netdev
|
+-+ mdiobus
|
+-- phy

which poses a problem for device links. The phy is a child of the
netdev, so it should be unbound first. device_link_add will see this and
refuse to create a link, since such a link implies that netdev should be
unbound before phy.

One solution might be something like:

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index a74b320f5b27..05894e1c3e59 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -27,6 +27,7 @@
#include <linux/phy.h>
#include <linux/phy_led_triggers.h>
#include <linux/property.h>
+#include <linux/rtnetlink.h>
#include <linux/sfp.h>
#include <linux/skbuff.h>
#include <linux/slab.h>
@@ -3111,6 +3112,13 @@ static int phy_remove(struct device *dev)
{
struct phy_device *phydev = to_phy_device(dev);

+ // I'm pretty sure this races with multiple unbinds...
+ rtnl_lock();
+ device_unlock(dev);
+ dev_close(phydev->attached_dev);
+ device_lock(dev);
+ rtnl_unlock();
+ WARN_ON(phydev->attached_dev);
+
cancel_delayed_work_sync(&phydev->state_queue);

mutex_lock(&phydev->lock);

which is probably the least intrusive we can get. But this isn't very
nice for PCSs.

First, PCSs are not always used by netdevs. So there's no generic way to
ask the user "please clean up this PCS." Additionally, most PCS users
expect the PCS to be around for the lifetime of the driver (or at least
until they're done using it).

Maybe the best solution is to provide some helper functions to use with
bus_register_notifier and just let the drivers fend for themselves. Or
possibly default to a devlink, but allow registering a notifier instead.

--Sean