Re: [PATCH RFC 1/4] net: skb: use line number to trace dropped skb

From: David Ahern
Date: Thu Feb 03 2022 - 10:48:25 EST


On 2/3/22 8:37 AM, Dongli Zhang wrote:
> Sometimes the kernel may not directly call kfree_skb() to drop the sk_buff.
> Instead, it does "goto drop" and calls kfree_skb() at 'drop'. This makes it
> difficult to track the reason that the sk_buff is dropped.
>
> The commit c504e5c2f964 ("net: skb: introduce kfree_skb_reason()") has
> introduced the kfree_skb_reason() to help track the reason. However, we may
> need to define many reasons for each driver/subsystem.
>
> To avoid introducing so many new reasons, this patch uses the line number
> ("__LINE__") to trace where the sk_buff is dropped. As a result, the reason
> will be generated automatically.
>

I don't agree with this approach. It is only marginally better than the
old kfree_skb that only gave the instruction pointer. That tells you the
function that dropped the packet, but not why the packet is dropped.
Adding the line number only forces users to consult the source code.

When I watch drop monitor for kfree_skb I want to know *why* the packet
was dropped, not the line number in the source code. For example,
dropmon showing OTHERHOST means too many packets are being sent to this
host (e.g., a hypervisor) that do not belong to the host or the VMs
running on it; a checksum reason means packets have an invalid checksum
(IP, TCP, UDP). That is information usable by everyone, not just someone
with access to the source code for that specific kernel.
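
To make the contrast concrete, here is a rough userspace sketch, not
kernel code. The enum, its values, and the helper names are all
hypothetical and chosen only for illustration; they are not the
definitions behind kfree_skb_reason(). It formats the same drop two
ways: by function plus line number, and by function plus named reason.

```c
#include <stdio.h>

/* Hypothetical reason codes for illustration only; not the kernel's enum. */
enum drop_reason {
	DROP_NOT_SPECIFIED,
	DROP_OTHERHOST,
	DROP_IP_CSUM,
	DROP_TCP_CSUM,
	DROP_UDP_CSUM,
	DROP_REASON_MAX,
};

static const char *const drop_reason_names[DROP_REASON_MAX] = {
	[DROP_NOT_SPECIFIED]	= "NOT_SPECIFIED",
	[DROP_OTHERHOST]	= "OTHERHOST",
	[DROP_IP_CSUM]		= "IP_CSUM",
	[DROP_TCP_CSUM]		= "TCP_CSUM",
	[DROP_UDP_CSUM]		= "UDP_CSUM",
};

/* Map a reason code to a stable, human-readable name. */
static const char *drop_reason_str(enum drop_reason reason)
{
	if ((unsigned int)reason >= DROP_REASON_MAX)
		return "UNKNOWN";
	return drop_reason_names[reason];
}

/* Line-number style: only meaningful next to the matching source tree. */
static int format_drop_by_line(char *buf, size_t len,
			       const char *func, int line)
{
	return snprintf(buf, len, "drop at %s:%d", func, line);
}

/* Reason style: readable without the source for that specific kernel. */
static int format_drop_by_reason(char *buf, size_t len,
				 const char *func, enum drop_reason reason)
{
	return snprintf(buf, len, "drop in %s: reason %s",
			func, drop_reason_str(reason));
}
```

With these helpers, a drop reported from a function named "ip_rcv" at
line 123 comes out as "drop at ip_rcv:123", which moves whenever the
file is edited, while the reason form comes out as
"drop in ip_rcv: reason OTHERHOST" regardless of kernel version.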