RE: [PATCH] bonding:avoid repeated display of same link status change

From: Manish Kumar Singh
Date: Thu Oct 25 2018 - 05:21:22 EST




> -----Original Message-----
> From: Michal Kubecek [mailto:mkubecek@xxxxxxx]
> Sent: 23 ààààààà 2018 22:08
> To: Eric Dumazet
> Cc: Mahesh Bandewar (àààà ààààààà); Manish Kumar Singh; linux-netdev; Jay
> Vosburgh; Veaceslav Falico; Andy Gospodarek; David S. Miller; linux-
> kernel@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH] bonding:avoid repeated display of same link status
> change
>
> On Tue, Oct 23, 2018 at 06:26:14PM +0200, Michal Kubecek wrote:
> > On Tue, Oct 23, 2018 at 09:10:44AM -0700, Eric Dumazet wrote:
> > >
> > >
> > > On 10/23/2018 08:54 AM, Mahesh Bandewar (àààà ààààààà) wrote:
> > >
> > > > Atomic operations are expensive (on certain architectures) and miimon
> > > > runs quite frequently. Is the added cost of these atomic operations
> > > > even worth just to avoid *duplicate info* messages? This seems like a
> > > > overkill!
> > >
> > > atomic_read() is a simple read, no atomic operation involved.
> > >
> > > Same remark for atomic_set()
> >
> > Which makes me wonder if the patch really needs atomic_t.
>
> IMHO it does not. AFAICS multiple instances of bond_mii_monitor() cannot
> run simultaneously for the same bond so that there doesn't seem to be
> anything to collide with. (And if they could, we would need to test and
> set the flag atomically in bond_miimon_inspect().)
>
Yes, Michal, we are inline with your understanding.
when the -original- patch was posted to upstream there was no -synchronization- nor -racing- addressing code was in read/write of this added filed, as we -never- saw need for either.

-only- writer of the added field is bond_mii_monitor.
-only- reader of the added field is bond_miimon_inspect.
-this writer & reader -never- can run concurrently.
-writer invokes the reader.

hence, imo uint_8 rtnl_needed is all what is needed; with bond_mii_monitor doing rtnl_needed = 1; and bond_miimon_inspect doing if rtnl_needed.

here is the gravity of the situation with multiple customers whose names including machine names redacted:

4353 May 31 02:38:57 hostname kernel: ixgbe 0000:03:00.0: removed PHC on p2p1
4354 May 31 02:38:57 hostname kernel: public: link status down for active interface p2p1, disabling it in 100 ms
4355 May 31 02:38:57 hostname kernel: public: link status down for active interface p2p1, disabling it in 100 ms
4356 May 31 02:38:57 hostname kernel: public: link status definitely down for interface p2p1, disabling it
4357 May 31 02:38:57 hostname kernel: public: making interface p2p2 the new active one
4358 May 31 02:38:59 hostname kernel: ixgbe 0000:03:00.0: registered PHC device on p2p1
4359 May 31 02:39:00 hostname kernel: ixgbe 0000:03:00.0 p2p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX
4360 May 31 02:39:00 hostname kernel: public: link status up for interface p2p1, enabling it in 200 ms
4361 May 31 02:39:00 hostname kernel: public: link status definitely up for interface p2p1, 10000 Mbps full duplex
4362 May 31 02:45:37 hostname journal: Missed 217723 kernel messages
4363 May 31 02:45:37 hostname kernel: public: link status down for active interface p2p2, disabling it in 100 ms
---------------------
11000+ APPROX SAME REPEATED MESSAGES in second
---------------------
15877 May 31 02:45:37 hostname kernel: public: link status down for active interface p2p2, disabling it in 100 ms
15878 May 31 02:45:37 hostname kernel: public: link status definitely down for interface p2p2, disabling it
15879 May 31 02:45:37 hostname kernel: public: making interface p2p1 the new active one

Thanks,
Manish

> Michal Kubecek