Re: [PATCH v2] sock: add tracepoint for send recv length

From: Paolo Abeni
Date: Thu Jan 05 2023 - 07:03:59 EST


On Thu, 2023-01-05 at 18:00 +0800, Yunhui Cui wrote:
> Add 2 tracepoints to monitor the tcp/udp traffic
> per process and per cgroup.
>
> Regarding monitoring the tcp/udp traffic of each process, the existing
> implementation is https://www.atoptool.nl/netatop.php.
> This solution is implemented by registering the hook function at the hook
> point provided by the netfilter framework.
>
> These hook functions may be in the soft interrupt context and cannot
> directly obtain the pid. Some data structures are added to bind packets
> and processes. For example, struct taskinfobucket, struct taskinfo ...
>
> Every time the process sends and receives packets it needs multiple
> hashmaps, resulting in low performance, and it has the problem of
> inaccurate tcp/udp traffic statistics (for example: multiple threads
> share sockets).
>
> Based on these 2 tracepoints, we have optimized and tested performance.
> With time per request as the indicator: without monitoring: 50.95ms,
> netatop: 63.27ms, hook on these tracepoints: 52.24ms.
> The performance has been improved 10 times. The tcp/udp traffic of each
> process has also been accurately counted.
>
> We can obtain the information with kretprobe, but as we know, kprobe gets
> the result by trapping in an exception, which loses performance compared
> to tracepoint. We did a test for performance comparison. The results are
> as follows. Time per request, sock_sendmsg(k,kr): 12.382ms,
> tcp_send_length(tracepoint): 11.887ms, without hook: 11.222ms

12 ms per packet? I hope there is an error in the unit of
measurement.

I'm unsure the delta wrt kretprobe justifies this change.
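
For reference, the delta in question can be worked out from the figures
quoted above. This is only a rough sketch in Python, assuming the three
quoted numbers share the same unit and were measured under the same load;
the variable and function names are illustrative and not part of the patch.

```python
# Figures copied from the quoted measurements (time per request, in ms).
baseline_ms = 11.222    # without hook
kretprobe_ms = 12.382   # sock_sendmsg (kprobe + kretprobe)
tracepoint_ms = 11.887  # tcp_send_length tracepoint


def overhead(measured_ms, base_ms):
    """Return (absolute overhead in ms, overhead relative to baseline in %)."""
    delta = measured_ms - base_ms
    return delta, 100.0 * delta / base_ms


kr_delta, kr_pct = overhead(kretprobe_ms, baseline_ms)
tp_delta, tp_pct = overhead(tracepoint_ms, baseline_ms)

# Difference between the two instrumentation methods.
saved_ms = kretprobe_ms - tracepoint_ms

print(f"kretprobe : +{kr_delta:.3f} ms ({kr_pct:.1f}% over baseline)")
print(f"tracepoint: +{tp_delta:.3f} ms ({tp_pct:.1f}% over baseline)")
print(f"tracepoint vs kretprobe: {saved_ms:.3f} ms per request")
```

With the quoted numbers, the tracepoint comes out about 0.5 ms per request
cheaper than the kprobe/kretprobe pair.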

In any case you need to include the netdev ML in the recipient list,
and also Cong Wang, as he provided feedback on v1.

Thanks,

Paolo