Re: [PATCH] jfs: fix shift-out-of-bounds in dbJoin

From: Dave Kleikamp
Date: Mon Jan 29 2024 - 16:18:00 EST


On 1/29/24 12:29PM, Matthew Wilcox wrote:
> On Mon, Jan 29, 2024 at 09:00:56AM -0600, Dave Kleikamp wrote:
>> On 1/29/24 8:55AM, Matthew Wilcox wrote:
>>> On Mon, Jan 29, 2024 at 08:39:18AM -0600, Dave Kleikamp wrote:
>>>> On 1/28/24 2:49PM, Matthew Wilcox wrote:
>>>>> On Wed, Oct 11, 2023 at 08:09:37PM +0530, Manas Ghandat wrote:
>>>>>> Currently while joining the leaf in a buddy system there is shift out
>>>>>> of bound error in calculation of BUDSIZE. Added the required check
>>>>>> to the BUDSIZE and fixed the documentation as well.
>>>>>
>>>>> This patch causes xfstests to fail frequently. The one this trace is
>>>>> from was generic/074.
>>>>
>>>> Thanks for catching this. The sanity test is not right, so we need to revert
>>>> that one.
>>>
>>> Unfortunately, my overnight test run with this patch reverted crashed
>>> again with the same signature. I also reverted the parent commit,
>>> and when that crashed I also reverted the parent of that. Which also
>>> crashed.
>>>
>>> So maybe there's something else that makes this unstable. Or maybe my
>>> bisect went wrong. Or _something_. Anyway, I'm going to spend much of
>>> today hammering on generic/074 with various kernel versions and see what
>>> I can deduce.
>>>
>>> So far I see no evidence that v6.7 crashes with g/074. And I know that
>>> next-20240125 does crash with g/074. I'm pretty sure that v6.8-rc1 also
>>> crashes with g/074, but will confirm that.
>>
>> I'll try to beat on it too and see what I find.
>>
>> Sasha, maybe hold up on to all the jfs patches for the time being.
>
> I have it reproducing easily on cca974daeb6c. I ran it a lot on
> e0e1958f4c36 and have not reproduced it. So I'm going back to my
> earlier assertion that cca974daeb6c is bad. Now, maybe other commits
> are also bad?

I was able to reproduce it too, but not after reverting cca974daeb6c. I believe it is the only commit causing problems.

I only asked Sasha to hold the other ones as a precaution until we were more confident that this one was the problem.

Shaggy