Re: Fwd: vmalloc error: btrfs-delalloc btrfs_work_helper [btrfs] in kernel 6.3.x

From: Forza
Date: Fri Jul 07 2023 - 06:49:11 EST




On 2023-07-06 12:54, Linux regression tracking (Thorsten Leemhuis) wrote:
On 06.07.23 10:08, Forza wrote:
On Wed, May 24, 2023 at 11:13:57AM +0200, David Sterba wrote:
[...]
A small update.

Thx for this.

I have been able to test 6.2.16, all 6.3.x and 6.4.1 and they all show
the same issue.

I have now been running 6.1.37 for two days and have not been able to
reproduce this issue on any of my virtual qemu/kvm machines. Perhaps
this information is helpful in finding the root cause?

That means it's most likely a regression between v6.1..v6.2 (or
v6.1..v6.2.16 if we are unlucky) somewhere (from earlier in the thread
it sounds like it might not be Btrfs).

Agreed, I do not think this specific bug (cpuidle / default_enter_idle leaked IRQ state) is Btrfs-related. Some of the virtual machines I test on do not use Btrfs.

Which makes me wonder: how long do you usually need to reproduce the
issue? If it's not too long it might mean that a bisection is the best
way forward, unless some developer sits down and looks closely at the
logs. With a bit of luck some dev will do that; but if we are unlucky we
likely will need a bisection.


It has varied. Sometimes it appears immediately upon boot, but it can take several hours or a day before showing up.
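To get a rough feel for what a bisection would cost with reproduction times like these, here is a minimal sketch. It assumes each bisection step means building one kernel and then running it long enough to be reasonably sure the bug is absent. The function names, the commit count and the soak time are hypothetical placeholders, not measured values; the real count for the range can be obtained with `git rev-list --count v6.1..v6.2`.

```python
def bisect_steps(n_commits: int) -> int:
    """Worst-case number of kernels to build and test in order to
    isolate one bad commit among n_commits, i.e. ceil(log2(n_commits))."""
    if n_commits < 1:
        raise ValueError("need at least one commit in the range")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, without floats.
    return (n_commits - 1).bit_length()


def worst_case_days(n_commits: int, soak_hours: float) -> float:
    """Total test time in days if every step needs soak_hours of
    runtime before a kernel can be called good (build time excluded)."""
    return bisect_steps(n_commits) * soak_hours / 24


if __name__ == "__main__":
    commits = 16000  # placeholder, NOT the real size of v6.1..v6.2
    soak = 24        # hours; assumption based on "several hours or a day"
    print(bisect_steps(commits), "steps,",
          worst_case_days(commits, soak), "days of testing")
```

Since a "good" verdict can only be given after the kernel has run without the error for the full soak time, while a "bad" verdict may arrive right after boot, the real duration could be shorter than this worst case.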


Also, I forgot to say I was basing my kernels on gentoo-kernels, which carry some patches on top of vanilla. Therefore I will compile a set of vanilla kernels from 6.1.37 through 6.4.2 and run them on my testing machines to see where the problem first appears.

This is not a fast system, so it will likely take several days. But I will keep you posted.

Meanwhile, if you think of any specific kernel debug options, tracing, etc., that I should enable, let me know.

Should we change the Subject line of this email thread?

Thanks

~Forza

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.