Re: [PATCH v8 net-next 05/12] net: dsa: propagate the locked flag down through the DSA layer

From: Vladimir Oltean
Date: Thu Oct 20 2022 - 11:26:35 EST


On Thu, Oct 20, 2022 at 05:58:42PM +0300, Ido Schimmel wrote:
> My understanding is that mv88e6xxx only reads the SMAC and FID/VID from
> hardware and notifies them to the bridge driver. It does not need to
> parse them out of the Ethernet frame that triggered the "violation".
> This is the "ugly" part (in my opinion).

I think that the Marvell approach is uglier, but maybe that's just me.
Between parsing a MAC SA/VLAN ID from an Ethernet frame than having to
concern myself with rate limiting IRQs which need MDIO access, I'd
rather parse Ethernet frames all day long.

With Ethernet we have all sorts of coping mechanisms, NAPI, IRQ
coalescing. The Ethernet interrupts are designed to be very high
bandwidth. You can even put a storm policer on Ethernet traffic and rate
limit the learn frames. I don't like where the Marvell specific impl is
going, I don't think it is a good first implementation of a new feature,
since it will inevitably shape the way in which other hardware with CPU
assisted learning will do things. For example, not sure if blackhole FDB
entries are going to be needed by other implementations as well.

I kind of thought that the Linux bridge would be more resilient to DoS
than it actually is. Now I'm not sure if me and Andrew gave bad advice
with the whole protection mechanisms put in place as UAPI for mv88e6xxx's
quirks.

> > The learn frames are "special" in the sense that they don't belong to
> > the data path of the software bridge*, they are just hardware specific
> > information which the driver must deal with, using a channel that
> > happens to be Ethernet and not an IRQ/MDIO.
>
> I think we misunderstand each other because I don't understand why you
> call them "special" nor "hardware specific information" :/

I call them special because there is no need to present these packets to
application software. Understood and agreed that they are identical to
the original packet which triggered the trap (plus some metadata which
denotes the trap reason, presumably), although I don't think this really
matters too much.

> We don't inject to the software data path some hardware specific
> frames, but rather the original Ethernet frames that triggered the
> violation. The same thing happens with packets that encountered a
> neighbour miss during routing or whose TTL was decremented to zero.
> The hardware can't generate ARP or ICMP packets, so the original
> packet is injected to the Rx path so that the kernel will generate the
> necessary control packets in response.

Can't speak for IP forwarding offload unfortunately, but it seems like
you presented a different/unrelated situation here. CPU assisted
learning is not slow path processing, because nothing needs to be done
further with that packet except for extracting its MAC SA/VID, and
learning it. The rest of the original packet is really not necessary.

> > *in other words, a bridge with proper RX filtering should not even
> > receive these frames, or would need special casing for BR_PORT_MAB to
> > not drop them in the first place.
> >
> > I would incline towards an unified approach for CPU assisted learning,
> > regardless of this (minor, IMO) difference between Marvell and other
> > vendors.
>
> OK, understood. Assuming you don't like the above, I need to check if we
> can do something similar to what mv88e6xxx is doing (because I don't
> think mv88e6xxx can do anything else).

No no, I like having an Ethernet channel (see the first reply to this
email), I think it has benefits and I don't want to make Spectrum follow
an inferior route just because that's the model.

But on the other hand, nobody right now needs the mechanism that Hans
put in place for setting BR_FDB_LOCKED in software, and notifying it
back to the driver. Moreover, when Ethernet-based CPU assisted learning
will come, this mechanism will not be the only possibility, and that
should be a separate discussion. I still think that generic helpers to
emit SWITCHDEV_FDB_ADD_TO_BRIDGE based on an skb are an equally valid
approach, and would diverge significantly less from Marvell without
imposing any real limitation. In the implementation proposed here, we
have variation for the sake of variation, and we come up with
hypothetical examples of how they might be useful. At least half this
patch set is full of maybes, I can't really say I like that.