Re: [RFC PATCH net] packets: Always register packet sk in the same order

From: Willem de Bruijn
Date: Fri Mar 15 2019 - 10:14:29 EST


On Fri, Mar 15, 2019 at 5:03 AM Maxime Chevallier
<maxime.chevallier@xxxxxxxxxxx> wrote:
>
> Hi Willem,
>
> On Thu, 14 Mar 2019 11:24:54 -0400
> Willem de Bruijn <willemdebruijn.kernel@xxxxxxxxx> wrote:
>
> >On Thu, Mar 14, 2019 at 8:27 AM Maxime Chevallier
> ><maxime.chevallier@xxxxxxxxxxx> wrote:
> >>
> >> When using fanouts with AF_PACKET, the demux functions such as
> >> fanout_demux_cpu will return an index in the fanout socket array, which
> >> corresponds to the selected socket.
> >>
> >> The ordering of this array depends on the order the sockets were added
> >> to a given fanout group, so for FANOUT_CPU this means sockets are bound
> >> to cpus in the order they are configured, which is OK.
> >>
> >> However, when stopping then restarting the interface these sockets are
> >> bound to, the sockets are reassigned to the fanout group in the reverse
> >> order, due to the fact that they were inserted at the head of the
> >> interface's AF_PACKET socket list.
> >>
> >> This means that traffic that was directed to the first socket in the
> >> fanout group is now directed to the last one after an interface restart.
> >>
> >> In the case of FANOUT_CPU, traffic from CPU0 will be directed to the
> >> socket that used to receive traffic from the last CPU after an interface
> >> restart.
> >
> >The above assumes that sockets are added to a fanout group in the
> >order in which they are created. That is a reasonable assumption,
> >though not strictly required. This can be worked around in userspace
> >by inserting in the inverse order. Still, good to fix.
> >
> >It does change the order of output of proc and diag, which is an
> >unfortunately side effect. But insertion order is less surprising and
> >I don't see a simple option to only traverse the sklist in inverse
> >order on packet_notifier NETDEV_UP (which would avoid this).
>
> Thanks for the review ! Clearly the solution proposed by this patch has
> side-effects, and as you said I don't see an easy way to do that in
> another manner.
>
> >> This commit introduces a helper to add a socket at the tail of a list,
> >> then uses it to register AF_PACKET sockets.
> >>
> >> Fixes: 808f5114a920 ("packet: convert socket list to RCU (v3)")
> >
> >Before this patch, insertion with sk_add_node is also at the head.
> >This does not seem like the right commit? This behavior has probably
> >existed as long as fanout.
>
> Yes I agree, I'm not even sure if this patch should be targeted to -net
> or -net-next. If this is OK after reviews, I can resend it without RFC
> and without this misleading Fixes tag.

Sounds good.

Since current behavior can split packets of the same flow across
multiple sockets, even for fanout_demux_hash, I think this warrants
net. Then the Fixes tag would be dc99f600698d ("packet: Add fanout
support.").

Please mention the reordering of proc and diag output in the commit message.

Thanks