Re: s390x: kernel BUG at fs/ext4/inode.c:1591!

From: Dmitry Monakhov
Date: Fri Mar 29 2013 - 06:08:50 EST


On Fri, 29 Mar 2013 04:53:43 -0400 (EDT), CAI Qian <caiqian@xxxxxxxxxx> wrote:
>
>
> ----- Original Message -----
> > From: "Dmitry Monakhov" <dmonakhov@xxxxxxxxxx>
> > To: "Theodore Ts'o" <tytso@xxxxxxx>, "CAI Qian" <caiqian@xxxxxxxxxx>
> > Cc: "LKML" <linux-kernel@xxxxxxxxxxxxxxx>, "linux-s390" <linux-s390@xxxxxxxxxxxxxxx>, "Steve Best"
> > <sbest@xxxxxxxxxx>, linux-ext4@xxxxxxxxxxxxxxx
> > Sent: Thursday, March 28, 2013 10:56:37 PM
> > Subject: Re: s390x: kernel BUG at fs/ext4/inode.c:1591!
> >
> > On Thu, 28 Mar 2013 08:05:17 -0400, Theodore Ts'o <tytso@xxxxxxx>
> > wrote:
> > > On Thu, Mar 28, 2013 at 02:40:33AM -0400, CAI Qian wrote:
> > > > System hung when running xfstests-dev 013 test case on an s390x
> > > > guest. Never saw
> > > > this on 3.9-rc3 before but need to double-check. Any idea?
> > > >
> > > > [ 1113.795759] ------------[ cut here ]------------
> > > > [ 1113.795771] kernel BUG at fs/ext4/inode.c:1591!
> > >
> > > Thanks for the report. What kernel version did this come from?
> > > Was
> > > it 3.9-rc4? (line 1591 for 3.9-rc3 doesn't contain a BUG_ON).
> > >
> > > If it is indeed 3.9-rc4, it would be helpful, since you can
> > > reproduce
> > > the problem, to insert a debugging printk which fires when
> > > bh->b_blocknr != pblock before the BUG_ON, and have it print the
> > > b_blocknr and pblock values.
> > I've triggered this bug before, back when I was working on
> > e4defrag functionality, but AFAIK all related issues were already fixed
> > and 013 has nothing to do with e4defrag.
> > But still, bh->b_blocknr changed under us. The other obvious place I would
> > suspect is punch_hole, but that is not the case either, because 013 uses fsstress
> > test in vegetarian mode: "-f rmdir=10 -f link=10 -f creat=10 -f
> > mkdir=10
> > -f rename=30 -f stat=30 -f unlink=30 -f truncate=20"
> > So the only place left that I suspect is some unknown bug in the extent status tree.
> > Can you please enable ES_AGGRESSIVE_TEST and rerun xfstests?
> What is ES_AGGRESSIVE_TEST and how can I enable it?
Please apply the patch below. It should help to spot the issue.
diff --git a/fs/ext4/extents_status.h b/fs/ext4/extents_status.h
index d8e2d4d..70233a6 100644
--- a/fs/ext4/extents_status.h
+++ b/fs/ext4/extents_status.h
@@ -24,7 +24,7 @@
* With ES_AGGRESSIVE_TEST defined, the result of es caching will be
* checked with old map_block's result.
*/
-#define ES_AGGRESSIVE_TEST__
+#define ES_AGGRESSIVE_TEST

/*
* These flags live in the high bits of extent_status.es_pblk
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index b3a5213..676c3e1 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1588,7 +1588,8 @@ static int mpage_da_submit_io(struct mpage_da_data *mpd,
}
if (buffer_unwritten(bh) ||
buffer_mapped(bh))
- BUG_ON(bh->b_blocknr != pblock);
+ if (bh->b_blocknr != pblock)
+ goto map_corruption;
if (map->m_flags & EXT4_MAP_UNINIT)
set_buffer_uninit(bh);
clear_buffer_unwritten(bh);
@@ -1627,6 +1628,17 @@ static int mpage_da_submit_io(struct mpage_da_data *mpd,
}
ext4_io_submit(&io_submit);
return ret;
+
+map_corruption:
+ printk(KERN_ERR "mpage_da_submit_io failed block=%llu != b_blocknr=%llu\n",
+ (unsigned long long)pblock, (unsigned long long)bh->b_blocknr);
+ printk(KERN_ERR "ino:%lu lblk:%llu, b_state=0x%08lx, b_size=%zu\n",
+ inode->i_ino, (unsigned long long)cur_logical, bh->b_state, bh->b_size);
+ /* We have hit an emergency situation. Do not waste time on
+ * useless cleanup in order to pretend that the situation is under control.
+ * Just panic. */
+ BUG();
+ return -EIO;
}

static void ext4_da_block_invalidatepages(struct mpage_da_data *mpd)
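
For anyone unfamiliar with what ES_AGGRESSIVE_TEST does: per the comment in
extents_status.h, the cached (extent status) result is checked against the
result of the old map_block lookup. The following is only a rough userspace
sketch of that cross-check idea, not the kernel code. The names struct mapping,
lookup() and cross_check() are hypothetical and invented for this illustration;
none of them exist in ext4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical model: one logical-to-physical block mapping. */
struct mapping {
	unsigned long lblk;       /* logical block number */
	unsigned long long pblk;  /* physical block number */
};

/* Linear search; returns 1 and fills *pblk on a hit, 0 on a miss. */
static int lookup(const struct mapping *tbl, size_t n,
		  unsigned long lblk, unsigned long long *pblk)
{
	size_t i;

	assert(pblk != NULL);
	for (i = 0; i < n; i++) {
		if (tbl[i].lblk == lblk) {
			*pblk = tbl[i].pblk;
			return 1;
		}
	}
	return 0;
}

/*
 * Compare every cached entry against the authoritative table.
 * Each disagreement is reported with both values (instead of
 * asserting blindly), and the number of mismatches is returned.
 */
static int cross_check(const struct mapping *cache, size_t nc,
		       const struct mapping *truth, size_t nt)
{
	int bad = 0;
	size_t i;

	for (i = 0; i < nc; i++) {
		unsigned long long want;

		if (!lookup(truth, nt, cache[i].lblk, &want)) {
			fprintf(stderr, "lblk %lu cached but not mapped\n",
				cache[i].lblk);
			bad++;
		} else if (want != cache[i].pblk) {
			fprintf(stderr, "lblk %lu: cached pblk=%llu != %llu\n",
				cache[i].lblk, cache[i].pblk, want);
			bad++;
		}
	}
	return bad;
}
```

The point of printing both values before failing is the same as in the patch
above: a bare assertion only says that the two disagree, while the report shows
which side holds the unexpected block number.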

> > >
> > > Thanks,
> > >
> > > - Ted