Re: [PATCH net-next 1/3] net: dsa: don't pass cloned skb's to drivers xmit function

From: Vladimir Oltean
Date: Sat Oct 17 2020 - 17:36:01 EST


On Sat, Oct 17, 2020 at 10:56:24PM +0200, Christian Eggers wrote:
> The status page seems to be out of date:
> http://vger.kernel.org/~davem/net-next.html

Yeah, the status page can lag behind sometimes. Extremely rarely, but it
happens. net-next is still closed, nonetheless.

> The FAQ says: "Do not send new net-next content to netdev...". So there is no
> possibility for code review, is it?

You can always send patches as RFC (Request For Comments). In fact
that's what I'm going to do right now.

> > - Actually I was asking you this because sja1105 PTP no longer works
> > after this change, due to the change of txflags.
> The tail taggers seem to be immune against this change.

How?

> > Do you want me to try and send a version using pskb_expand_head and you
> > can test if it works for your tail-tagging switch?
> I already wanted to ask... My 2nd try (checking for !skb_cloned()) was already
> sufficient (for me). Hacking linux-net is very interesting, but I have many
> other items open... Testing would be no problem.

Ok, incoming.....

> > I think it would be best to use the unlikely(tail_tag) approach though.
> > The reallocation function should still be in the common code path. Even
> > for a non-1588 switch, there are other code paths that clone packets on
> > TX. For example, the bridge does that, when flooding packets.
> You already mentioned that you don't want to pass cloned packets to the tag
> drivers xmit() functions. I've no experience with the problems caused by
> cloned packets, but would cloned packets work anyway? Or must cloned packets
> not be changed (e.g. by tail-tagging)? Is there any value in first cloning in
> dsa_skb_tx_timestamp() and then unsharing in dsa_slave_xmit a few lines later?
> The issue I currently have only affects a very minor number of packets (cloned
> AND < ETH_ZLEN AND CONFIG_SLOB), so only these packets would need a copying.

Yes, we need to clone and then unshare immediately afterwards because
sja1105_xmit calls sja1105_defer_xmit, which schedules a workqueue. The
sja1105 driver assumes that the skb has already been cloned by then. So
basically, the sja1105 driver introduces a strict ordering requirement:
dsa_skb_tx_timestamp must run first, then p->xmit second. So we must
reallocate freshly cloned skbs, as things stand now.
I'll think about avoiding that, but not now. We were always reallocating
those frames before, using skb_cow_head. The only difference now is that
the skb, as it is passed to the tagger's xmit() function, is directly
writable. You'll see...
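
To make the ordering concrete, here is a userspace toy model of the
"clone first, then unshare, then tag" sequence described above. It is
only a sketch and not kernel code: every toy_* name is hypothetical,
the refcounted buffer merely stands in for the data area that an skb
shares with its clones, and toy_unshare() only approximates the role
that a reallocation such as pskb_expand_head would play.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the data area shared by an skb and its clones. */
struct toy_data {
	int refs;			/* number of toy_skb headers pointing here */
	size_t len;			/* bytes currently used */
	unsigned char bytes[64];
};

struct toy_skb {
	struct toy_data *data;
};

static struct toy_skb *toy_alloc(const void *payload, size_t len)
{
	struct toy_skb *skb = malloc(sizeof(*skb));
	struct toy_data *d = calloc(1, sizeof(*d));

	if (!skb || !d) {
		free(skb);
		free(d);
		return NULL;
	}
	assert(len <= sizeof(d->bytes));
	memcpy(d->bytes, payload, len);
	d->len = len;
	d->refs = 1;
	skb->data = d;
	return skb;
}

/* New header, same data: the data becomes shared, hence not writable. */
static struct toy_skb *toy_clone(struct toy_skb *skb)
{
	struct toy_skb *c = malloc(sizeof(*c));

	if (!c)
		return NULL;
	c->data = skb->data;
	c->data->refs++;
	return c;
}

static int toy_cloned(const struct toy_skb *skb)
{
	return skb->data->refs > 1;
}

/* Give skb a private copy of the data if it is currently shared. */
static int toy_unshare(struct toy_skb *skb)
{
	struct toy_data *copy;

	if (!toy_cloned(skb))
		return 0;
	copy = malloc(sizeof(*copy));
	if (!copy)
		return -1;
	*copy = *skb->data;
	copy->refs = 1;
	skb->data->refs--;
	skb->data = copy;
	return 0;
}

static void toy_free(struct toy_skb *skb)
{
	if (--skb->data->refs == 0)
		free(skb->data);
	free(skb);
}

/* Toy tagger: appends one tail-tag byte, so it needs private data. */
static void toy_tail_tag(struct toy_skb *skb, unsigned char tag)
{
	assert(!toy_cloned(skb));
	assert(skb->data->len < sizeof(skb->data->bytes));
	skb->data->bytes[skb->data->len++] = tag;
}

/* Order from the mail: clone for timestamping, unshare, then tag. */
static int toy_xmit(struct toy_skb *skb, struct toy_skb **ts_clone)
{
	*ts_clone = toy_clone(skb);
	if (!*ts_clone)
		return -1;
	if (toy_unshare(skb)) {
		toy_free(*ts_clone);
		*ts_clone = NULL;
		return -1;
	}
	toy_tail_tag(skb, 0xAB);
	return 0;
}
```

In this model the clone kept for timestamping still sees the original
bytes, while the tagger writes only to the private copy. Swapping the
first two steps in toy_xmit() would leave the data shared again by the
time the tagger runs, which is the situation the reallocation is meant
to rule out.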