Re: [PATCH net-next v5 2/2] net: stmmac: use per-queue 64 bit statistics where necessary

From: Guenter Roeck
Date: Thu Sep 21 2023 - 16:41:25 EST


On 9/21/23 12:56, Uwe Kleine-König wrote:
Hello Guenter,

On Thu, Sep 21, 2023 at 11:34:09AM -0700, Guenter Roeck wrote:
On Tue, Jul 18, 2023 at 12:06:30AM +0800, Jisheng Zhang wrote:
Currently, there are two major issues with stmmac driver statistics.
First of all, statistics in stmmac_extra_stats, stmmac_rxq_stats
and stmmac_txq_stats are 32 bit variables on 32 bit platforms. This
can cause some stats to overflow after several minutes of
high traffic, for example rx_pkt_n, tx_pkt_n and so on.
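
The overflow described above can be illustrated with a minimal userspace sketch. This is not driver code: the struct, the function names and the packet rate used in the note below are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Hypothetical pair of counters: one 32 bits wide, one 64 bits wide. */
struct demo_counters {
    uint32_t pkt32;
    uint64_t pkt64;
};

/* Account n packets in both counters; the 32-bit one wraps modulo 2^32. */
static void demo_count(struct demo_counters *c, uint64_t n)
{
    c->pkt32 += (uint32_t)n;
    c->pkt64 += n;
}

/* Seconds until a 32-bit counter wraps when it grows by `rate` per second. */
static double demo_seconds_to_wrap_u32(double rate)
{
    return ((double)UINT32_MAX + 1.0) / rate;
}
```

At an assumed rate of 10 million packets per second, demo_seconds_to_wrap_u32(1e7) is about 429 seconds, roughly seven minutes. The 64-bit counter does not wrap in any realistic uptime at that rate.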

Secondly, if HW supports multiqueues, there are frequent cacheline
ping pongs on some driver statistic vars, for example, normal_irq_n,
tx_pkt_n and so on. What's more, the frequent cacheline ping pongs on
normal_irq_n happen in the ISR, which makes the situation worse.

To improve the driver, we convert those statistics to 64 bit, implement
ndo_get_stats64 and update .get_ethtool_stats implementation
accordingly. We also use per-queue statistics where necessary to remove
the cacheline ping pongs as much as possible to make multiqueue
operations faster. Statistics that cannot overflow and are not
frequently updated are kept as is.
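
The per-queue idea can be sketched in plain userspace C as follows. This is only an analogy under stated assumptions: the names, the queue count and the 64-byte cache line size are hypothetical, and the sketch leaves out the synchronization the kernel needs for consistent 64-bit reads on 32-bit platforms (the u64_stats_sync helpers).

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_CACHELINE  64  /* assumed cache line size */
#define DEMO_NUM_QUEUES 4   /* hypothetical queue count */

/* One counter block per queue, aligned so queues do not share a cache line. */
struct demo_queue_stats {
    _Alignas(DEMO_CACHELINE) uint64_t pkt_n;
    uint64_t bytes;
};

struct demo_dev_stats {
    struct demo_queue_stats q[DEMO_NUM_QUEUES];
};

/* Hot path: only the owning queue's block is written. */
static void demo_queue_account(struct demo_dev_stats *d, size_t queue,
                               uint64_t len)
{
    d->q[queue].pkt_n += 1;
    d->q[queue].bytes += len;
}

/* Slow path: the reader sums all per-queue blocks on demand. */
static uint64_t demo_total_pkts(const struct demo_dev_stats *d)
{
    uint64_t sum = 0;

    for (size_t i = 0; i < DEMO_NUM_QUEUES; i++)
        sum += d->q[i].pkt_n;
    return sum;
}
```

The trade-off is that writers never touch a shared line, while readers pay a small loop over the queues, which suits counters that are updated far more often than they are read.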

Signed-off-by: Jisheng Zhang <jszhang@xxxxxxxxxx>

Your patch results in lockdep splats. This is with the orangepi-pc
emulation in qemu.

[ 11.126950] dwmac-sun8i 1c30000.ethernet eth0: PHY [mdio_mux-0.1:01] driver [Generic PHY] (irq=POLL)
[ 11.127912] dwmac-sun8i 1c30000.ethernet eth0: No Safety Features support found
[ 11.128294] dwmac-sun8i 1c30000.ethernet eth0: No MAC Management Counters available
[ 11.128511] dwmac-sun8i 1c30000.ethernet eth0: PTP not supported by HW
[ 11.138990] dwmac-sun8i 1c30000.ethernet eth0: configuring for phy/mii link mode
[ 11.144387] INFO: trying to register non-static key.
[ 11.144483] The code is fine but needs lockdep annotation, or maybe
[ 11.144568] you didn't initialize this object before use?
[ 11.144640] turning off the locking correctness validator.
[ 11.144845] CPU: 2 PID: 688 Comm: ip Tainted: G N 6.6.0-rc2 #1
[ 11.144956] Hardware name: Allwinner sun8i Family
[ 11.145137] unwind_backtrace from show_stack+0x10/0x14
[ 11.145610] show_stack from dump_stack_lvl+0x68/0x90
[ 11.145692] dump_stack_lvl from register_lock_class+0x99c/0x9b0
[ 11.145779] register_lock_class from __lock_acquire+0x6c/0x2244
[ 11.145861] __lock_acquire from lock_acquire+0x11c/0x368
[ 11.145938] lock_acquire from stmmac_get_stats64+0x350/0x374
[ 11.146021] stmmac_get_stats64 from dev_get_stats+0x3c/0x160
[ 11.146101] dev_get_stats from rtnl_fill_stats+0x30/0x118
[ 11.146179] rtnl_fill_stats from rtnl_fill_ifinfo.constprop.0+0x82c/0x1770
[ 11.146273] rtnl_fill_ifinfo.constprop.0 from rtmsg_ifinfo_build_skb+0xac/0x138
[ 11.146370] rtmsg_ifinfo_build_skb from rtmsg_ifinfo+0x44/0x7c
[ 11.146452] rtmsg_ifinfo from __dev_notify_flags+0xac/0xd8
[ 11.146531] __dev_notify_flags from dev_change_flags+0x48/0x54
[ 11.146612] dev_change_flags from do_setlink+0x244/0xe6c
[ 11.146689] do_setlink from rtnl_newlink+0x514/0x838
[ 11.146761] rtnl_newlink from rtnetlink_rcv_msg+0x170/0x5b0
[ 11.146841] rtnetlink_rcv_msg from netlink_rcv_skb+0xb4/0x10c
[ 11.146925] netlink_rcv_skb from netlink_unicast+0x190/0x254
[ 11.147006] netlink_unicast from netlink_sendmsg+0x1dc/0x460
[ 11.147086] netlink_sendmsg from ____sys_sendmsg+0xa0/0x2a0
[ 11.147168] ____sys_sendmsg from ___sys_sendmsg+0x68/0x94
[ 11.147245] ___sys_sendmsg from sys_sendmsg+0x4c/0x88
[ 11.147329] sys_sendmsg from ret_fast_syscall+0x0/0x1c
[ 11.147439] Exception stack(0xf23edfa8 to 0xf23edff0)
[ 11.147558] dfa0: 00000000 00000000 00000003 bef9a8d8 00000000 00000000
[ 11.147668] dfc0: 00000000 00000000 ffffffff 00000128 00000001 00000002 bef9af4a bef9af4d
[ 11.147769] dfe0: bef9a868 bef9a858 b6f9ddac b6f9d228
[ 11.150020] dwmac-sun8i 1c30000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx

My apologies for the noise if this has already been reported.

This seems to be the issue I reported earlier. So you might want to test
the patch that fixed it for me:
https://lore.kernel.org/netdev/20230917165328.3403-1-jszhang@xxxxxxxxxx/


That just showed up in mainline and, yes, of course it fixes the problem.
As I said, sorry for the noise.

Guenter