Re: [PATCH net-next 3/3] net: dsa: mv88e6xxx: mac-auth/MAB implementation

From: Vladimir Oltean
Date: Wed Mar 23 2022 - 07:54:31 EST


On Wed, Mar 23, 2022 at 12:43:03PM +0100, Hans Schultz wrote:
> On ons, mar 23, 2022 at 13:21, Vladimir Oltean <olteanv@xxxxxxxxx> wrote:
> > On Wed, Mar 23, 2022 at 11:57:16AM +0100, Hans Schultz wrote:
> >> >> >> Another issue I see, is that there is a deadlock or similar issue when
> >> >> >> receiving violations and running 'bridge fdb show' (it seemed that
> >> >> >> member violations also caused this, but not sure yet...), as the unit
> >> >> >> freezes, not to return...
> >> >> >
> >> >> > Have you enabled lockdep, debug atomic sleep, detect hung tasks, things
> >> >> > like that?
> >> >>
> >> >> I have now determined that it is the rtnl_lock() that causes the
> >> >> "deadlock". The doit() in rtnetlink.c is under rtnl_lock() and is what
> >> >> takes care of getting the fdb entries when running 'bridge fdb show'. In
> >> >> principle there should be no problem with this, but I don't know if some
> >> >> interrupt queue is getting jammed as they are blocked from rtnetlink.c?
> >> >
> >> > Sorry, I forgot to respond yesterday to this.
> >> > By any chance do you maybe have an AB/BA lock inversion, where from the
> >> > ATU interrupt handler you do mv88e6xxx_reg_lock() -> rtnl_lock(), while
> >> > from the port_fdb_dump() handler you do rtnl_lock() -> mv88e6xxx_reg_lock()?
> >>
> >> If I release the mv88e6xxx_reg_lock() before calling the handler, I need
> >> to get it again for the mv88e6xxx_g1_atu_loadpurge() call at least. But
> >> maybe the vtu_walk also needs the mv88e6xxx_reg_lock()?
> >> I could also just release the mv88e6xxx_reg_lock() before the
> >> call_switchdev_notifiers() call and reacquire it immediately after?
> >
> > The cleanest way to go about this would be to have the call_switchdev_notifiers()
> > portion of the ATU interrupt handling at the very end of mv88e6xxx_g1_atu_prob_irq_thread_fn(),
> > with no hardware access needed, and therefore no reg_lock() held.
>
> So something like?
> mv88e6xxx_reg_unlock(chip);
> rtnl_lock();
> err = call_switchdev_notifiers(SWITCHDEV_FDB_ADD_TO_BRIDGE, brport, &info.info, NULL);
> rtnl_unlock();
> mv88e6xxx_reg_lock(chip);

No, call_switchdev_notifiers() should be the very end, no reg_lock() afterwards.
Do all the hardware handling you need, populate some variables to denote
that you need to notify switchdev, and if you do, lock the rtnetlink
mutex and do it.