Re: [PATCH] jfs: fix shift-out-of-bounds in dbJoin

From: Matthew Wilcox
Date: Mon Jan 29 2024 - 13:29:57 EST


On Mon, Jan 29, 2024 at 09:00:56AM -0600, Dave Kleikamp wrote:
> On 1/29/24 8:55AM, Matthew Wilcox wrote:
> > On Mon, Jan 29, 2024 at 08:39:18AM -0600, Dave Kleikamp wrote:
> > > On 1/28/24 2:49PM, Matthew Wilcox wrote:
> > > > On Wed, Oct 11, 2023 at 08:09:37PM +0530, Manas Ghandat wrote:
> > > > > Currently, while joining the leaf in a buddy system, there is a
> > > > > shift-out-of-bounds error in the calculation of BUDSIZE. Added the
> > > > > required check on BUDSIZE and fixed the documentation as well.
> > > >
> > > > This patch causes xfstests to fail frequently. The one this trace is
> > > > from was generic/074.
> > >
> > > Thanks for catching this. The sanity test is not right, so we need to revert
> > > that one.
> >
> > Unfortunately, my overnight test run with this patch reverted crashed
> > again with the same signature. I also reverted the parent commit,
> > and when that crashed I also reverted the parent of that. Which also
> > crashed.
> >
> > So maybe there's something else that makes this unstable. Or maybe my
> > bisect went wrong. Or _something_. Anyway, I'm going to spend much of
> > today hammering on generic/074 with various kernel versions and see what
> > I can deduce.
> >
> > So far I see no evidence that v6.7 crashes with g/074. And I know that
> > next-20240125 does crash with g/074. I'm pretty sure that v6.8-rc1 also
> > crashes with g/074, but will confirm that.
>
> I'll try to beat on it too and see what I find.
>
> Sasha, maybe hold off on all the jfs patches for the time being.

I have it reproducing easily on cca974daeb6c. I ran it a lot on
e0e1958f4c36 and have not reproduced it. So I'm going back to my
earlier assertion that cca974daeb6c is bad. Now, maybe other commits
are also bad?