Re: [PATCH net-next] packet: Account for VLAN_HLEN in csum_start when virtio_net_hdr is enabled

From: Willem de Bruijn
Date: Fri Nov 24 2023 - 11:24:46 EST


Mike Pattrick wrote:
> On Thu, Nov 23, 2023 at 7:06 PM Willem de Bruijn
> <willemdebruijn.kernel@xxxxxxxxx> wrote:
> >
> > Mike Pattrick wrote:
> > > On Thu, Nov 23, 2023 at 4:25 PM Willem de Bruijn
> > > <willemdebruijn.kernel@xxxxxxxxx> wrote:
> > > >
> > > > Mike Pattrick wrote:
> > > > > Af_packet provides checksum offload offsets to usermode applications
> > > > > through struct virtio_net_hdr when PACKET_VNET_HDR is enabled on the
> > > > > socket. For skbuffs with a vlan being sent to a SOCK_RAW socket,
> > > > > af_packet will include the link level header and so csum_start needs
> > > > > to be adjusted accordingly.
> > > >
> > > > Is this patch based on observing an incorrect offset in a workload,
> > > > or on code inspection?
> > >
> > > Based on an incorrect offset in a workload. The setup involved sending
> > > vxlan traffic though a veth interface configured with a vlan. The
> > > vnet_hdr's csum_start value was off by 4, and this problem went away
> > > when the vlan was removed.
> > >
> > > I'll take another look at this patch.
> >
> > This is a vlan device on top of a veth device? On which device and at
> > which point (ingress or egress) are you receiving the packet over the
> > packet socket?
>
> Just for maximum clarity I'll include the extracted commands below,
> but roughly there is a vlan device on top of a vxlan device on top of
> a vlan device on top of a veth, in a namespace.
>
> ip netns add at_ns0
> ip netns exec at_ns0 ip link add dev at_vxlan1 type vxlan remote
> 172.31.1.100 id 0 dstport 4789
> ip netns exec at_ns0 ip addr add dev at_vxlan1 10.2.1.1/24
> ip netns exec at_ns0 ip link set dev at_vxlan1 mtu 1450 up
> ip link add p0 type veth peer name ovs-p0
> ethtool -K p0 sg on
> ethtool -K p0 tso on
> ip link set p0 netns at_ns0
> ip link set dev ovs-p0 up
> ip netns exec at_ns0 ip addr add "172.31.2.1/24" dev p0
> ip netns exec at_ns0 ip link set dev p0 up
> ip netns exec at_ns0 ip link add link at_vxlan1 name at_vxlan1.100
> type vlan proto 802.1q id 100
> ip netns exec at_ns0 ip link set dev at_vxlan1.100 up
> ip netns exec at_ns0 ip addr add dev at_vxlan1.100 "10.1.1.1/24"
> ip netns exec at_ns0 ip link add link p0 name p0.42 type vlan proto 802.1q id 42
> ip netns exec at_ns0 ip link set dev p0.42 up
> ip netns exec at_ns0 ip addr add dev p0.42 "172.31.1.1/24"
> ip addr add "172.31.1.100/24" dev p0
> ip link set dev p0 up

Are these two lines intended to be ovs-p0 rather than p0?
As p0 lives inside the netns (and already has an address)

> ip netns exec at_ns0 ping 10.1.1.100
>
> An AF_PACKET socket on ovs-p0 receives the incorrect csum_start.
> Setting up the same with a geneve tunnel and udpcsum enabled produces
> the same result. Removing vlan 100 also yields an incorrect
> csum_start. Removing only vlan 42 yields a correct csum_start.

I wonder if a simple checksum offloaded UDP packet sent over a vlan
directly on top of veth, like .100 has a correct offset before your
patch, but incorrect offset after.

The issue you are encountering might be that vxlan does not support
vlan offload, so validate_xmit_vlan will pull the vlan into the
header with __vlan_hwaccel_push_inside, but this does not adjust
csum_start. Not sure, but perhaps you can instrument your kernel
and see where the offset goes wrong. It is not inside the pf_packet
socket, in any case.

> >
> > From a quick glance, in all cases that I see the VLAN tag is kept in
> > skb->vlan_tci, so is never part of the packet payload.
> >
> > But checksum offload with VXLAN can be non-trivial on its own. If
> > type & SKB_GSO_UDP_TUNNEL_CSUM | SKB_GSO_TUNNEL_REMCSUM, say. Then
> > csum_start will point to the checksum in vxlanhdr.
> >
>