Re: [PATCH net-next] packet: Account for VLAN_HLEN in csum_start when virtio_net_hdr is enabled

From: Mike Pattrick
Date: Thu Nov 23 2023 - 18:10:11 EST


On Thu, Nov 23, 2023 at 4:25 PM Willem de Bruijn
<willemdebruijn.kernel@xxxxxxxxx> wrote:
>
> Mike Pattrick wrote:
> > Af_packet provides checksum offload offsets to usermode applications
> > through struct virtio_net_hdr when PACKET_VNET_HDR is enabled on the
> > socket. For skbuffs with a vlan being sent to a SOCK_RAW socket,
> > af_packet will include the link level header and so csum_start needs
> > to be adjusted accordingly.
>
> Is this patch based on observing an incorrect offset in a workload,
> or on code inspection?

Based on an incorrect offset in a workload. The setup involved sending
vxlan traffic though a veth interface configured with a vlan. The
vnet_hdr's csum_start value was off by 4, and this problem went away
when the vlan was removed.

I'll take another look at this patch.

>
> As the referenced patch mentions, VLAN_HLEN adjustment is needed
> in macvtap because it pulls the vlan header from skb->vlan_tci. At
> which point skb->csum_start is wrong.
>
> "Commit f09e2249c4f5 ("macvtap: restore vlan header on user read")
> added this feature to macvtap. Commit 3ce9b20f1971 ("macvtap: Fix
> csum_start when VLAN tags are present") then fixed up csum_start."
>
> But the commit also mentions "Virtio, packet and uml do not insert
> the vlan header in the user buffer.". This situation has not changed.
>
> Packet sockets may receive packets with VLAN headers present, but
> unless they were inserted manually before passing to user, as macvtap
> does, this does not affect csum_start.
>
> Packet sockets support reading those skb->vlan_tci stored VLAN
> headers using AUXDATA.
>
> > Fixes: fd3a88625844 ("net: in virtio_net_hdr only add VLAN_HLEN to csum_start if payload holds vlan")
>
> The fix should target net, not net-next.
>
> > Signed-off-by: Mike Pattrick <mkp@xxxxxxxxxx>
> > ---
> > net/packet/af_packet.c | 36 ++++++++++++++++++++++++++----------
> > 1 file changed, 26 insertions(+), 10 deletions(-)
> >
> > diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> > index a84e00b5904b..f6b602ffe383 100644
> > --- a/net/packet/af_packet.c
> > +++ b/net/packet/af_packet.c
> > @@ -2092,15 +2092,23 @@ static unsigned int run_filter(struct sk_buff *skb,
> > }
> >
> > static int packet_rcv_vnet(struct msghdr *msg, const struct sk_buff *skb,
> > - size_t *len, int vnet_hdr_sz)
> > + size_t *len, int vnet_hdr_sz,
> > + const struct sock *sk)
> > {
> > struct virtio_net_hdr_mrg_rxbuf vnet_hdr = { .num_buffers = 0 };
> > + int vlan_hlen;
> >
> > if (*len < vnet_hdr_sz)
> > return -EINVAL;
> > *len -= vnet_hdr_sz;
> >
> > - if (virtio_net_hdr_from_skb(skb, (struct virtio_net_hdr *)&vnet_hdr, vio_le(), true, 0))
> > + if (sk->sk_type == SOCK_RAW && skb_vlan_tag_present(skb))
> > + vlan_hlen = VLAN_HLEN;
> > + else
> > + vlan_hlen = 0;
> > +
> > + if (virtio_net_hdr_from_skb(skb, (struct virtio_net_hdr *)&vnet_hdr,
> > + vio_le(), true, vlan_hlen))
> > return -EINVAL;
> >
> > return memcpy_to_msg(msg, (void *)&vnet_hdr, vnet_hdr_sz);
> > @@ -2368,13 +2376,21 @@ static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev,
> > __set_bit(slot_id, po->rx_ring.rx_owner_map);
> > }
> >
> > - if (vnet_hdr_sz &&
> > - virtio_net_hdr_from_skb(skb, h.raw + macoff -
> > - sizeof(struct virtio_net_hdr),
> > - vio_le(), true, 0)) {
> > - if (po->tp_version == TPACKET_V3)
> > - prb_clear_blk_fill_status(&po->rx_ring);
> > - goto drop_n_account;
> > + if (vnet_hdr_sz) {
> > + int vlan_hlen;
> > +
> > + if (sk->sk_type == SOCK_RAW && skb_vlan_tag_present(skb))
> > + vlan_hlen = VLAN_HLEN;
> > + else
> > + vlan_hlen = 0;
> > +
> > + if (virtio_net_hdr_from_skb(skb, h.raw + macoff -
> > + sizeof(struct virtio_net_hdr),
> > + vio_le(), true, vlan_hlen)) {
> > + if (po->tp_version == TPACKET_V3)
> > + prb_clear_blk_fill_status(&po->rx_ring);
> > + goto drop_n_account;
> > + }
> > }
> >
> > if (po->tp_version <= TPACKET_V2) {
> > @@ -3464,7 +3480,7 @@ static int packet_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
> > packet_rcv_try_clear_pressure(pkt_sk(sk));
> >
> > if (vnet_hdr_len) {
> > - err = packet_rcv_vnet(msg, skb, &len, vnet_hdr_len);
> > + err = packet_rcv_vnet(msg, skb, &len, vnet_hdr_len, sk);
> > if (err)
> > goto out_free;
> > }
> > --
> > 2.40.1
> >
>
>