Re: Fwd: RCU stalls with wireguard over bonding over igb on Linux 6.3.0+

From: Bagas Sanjaya
Date: Sun Jul 02 2023 - 10:08:59 EST


On 7/2/23 19:37, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 02.07.23 13:57, Bagas Sanjaya wrote:
>> [also Cc: original reporter]
>
> BTW: I think you CCed too many developers here. There are situations
> where this can make sense, but it's rare. And if you do this too often
> people might start to not really look into your mails or might even
> ignore them completely.
>
> Normally it's enough to write the mail to (1) the people in the
> signed-off-by-chain, (2) the maintainers of the subsystem that merged a
> commit, and (3) the lists for all affected subsystems; leave it up to
> developers from the first two groups to CC the maintainers of the third
> group.
>

Hi,

In this case I also had to Cc: wireguard, bonding, RCU, and x86 people,
since this issue spans these subsystems (or so I naively thought). Anyway,
thanks for the detailed tip (honestly, /me wonders whether I'll forget this
later, as is often the case).

>> On 7/2/23 10:31, Bagas Sanjaya wrote:
>>> I notice a regression report on Bugzilla [1]. Quoting from it:
>>>
>>>> I've spent the last week on debugging a problem with my attempt to upgrade my kernel from 6.2.8 to 6.3.8 (now also with
>> [...]
>>> See Bugzilla for the full thread.
>>>
>>> Anyway, I'm adding it to regzbot to make sure it doesn't fall through the cracks
>>> unnoticed:
>>>
>>> #regzbot introduced: fed8d8773b8ea6 https://bugzilla.kernel.org/show_bug.cgi?id=217620
>>> #regzbot title: correcting acpi_is_processor_usable() check causes RCU stalls with wireguard over bonding+igb
>>> #regzbot link: https://bugs.gentoo.org/909066
>
>> satmd: Can you repeat bisection to confirm that fed8d8773b8ea6 is
>> really the culprit?
>
> I'd be careful to ask people that, as that might mean a lot of work for
> them. Best to leave things like that to developers, unless it's pretty
> obvious that something went sideways.
>

OK.

>> Thorsten: It seems the reporter's bisection landed on a (possibly)
>> incorrect culprit.
>
> What makes you think so? I just looked at bugzilla and it (for now)
> seems reverting fed8d8773b8ea6 on top of 6.4 fixed things for the
> reporter, which is a pretty strong indicator that this change really
> causes the trouble somehow.
>

OK too.

> /me really wonders what he's missing
>
>> What can I do in this case besides
>> asking to repeat bisection?
>
> Not much apart from updating regzbot state (e.g. something like "regzbot
> introduced v6.3..v6.4") and a reply to your initial report (ideally with
> a quick apology) to let everyone know it was a false alarm.
>

OK.

--
An old man doll... just what I always wanted! - Clara