Re: xfsdump hangs - 2.6.6 && 2.6.7-rc1-bk3

From: dag
Date: Thu May 27 2004 - 03:11:00 EST


One failure, one success, one question :-)

On Thu, 27 May 2004 15:58:29 +1000, Nathan Scott wrote:

>
> On Wed, May 26, 2004 at 09:13:14AM -0700, dag@xxxxxxxxx wrote:
> >
> > I experience hangs with xfsdump, when dumping my rootfs to a USB 2.0
> > ...
> > http://thaifood.homeip.net/xfsdumphang/xfsdump.dmesg.txt
> > ...
>
> The xfsdump stack trace in there is the important one.
> Can you try this patch and let me know how it goes?
>
> --- fs/xfs/linux/xfs_buf.c.orig 2004-05-27 14:06:59.992936144 +1000
> +++ fs/xfs/linux/xfs_buf.c 2004-05-27 14:08:21.548537808 +1000
> @@ -370,8 +370,12 @@
> retry:
> page = find_or_create_page(mapping, first + i, gfp_mask);
> if (unlikely(page == NULL)) {
> - if (flags & PBF_READ_AHEAD)
> + if (flags & PBF_READ_AHEAD) {
> + for (--i; i >= 0; i--)
> + page_cache_release(bp->pb_pages[i]);
> + _pagebuf_free_pages(bp);
> return -ENOMEM;
> + }
>
> /*
> * This could deadlock.

xfsdump completes "successfully", but prematurely. And I get an oops.

Unable to handle kernel NULL pointer dereference at virtual address
00000000
printing eip:
c02381b3
*pde = 00000000
Oops: 0000 [#1]
PREEMPT
Modules linked in: 3c589_cs
CPU: 0
EIP: 0060:[<c02381b3>] Not tainted
EFLAGS: 00010206 (2.6.7-rc1-bk3)
EIP is at _pagebuf_lookup_pages+0x2b3/0x2e0
eax: ffffffff ebx: 0000ffff ecx: cba8db84 edx: 00000000
esi: cba8dac0 edi: 000389b4 ebp: 00002000 esp: cefa5cc8
ds: 007b es: 007b ss: 0068
Process xfsdump (pid: 5769, threadinfo=cefa4000 task=cf54b6f0)
Stack: cffa6afc 000389b4 00001200 00000000 cefa4000 00000000 00000000
000389b4
00000002 00001200 00000000 00001000 00000009 cffa6afc cba8dac0
cba8dac0
00019011 00000002 c023867b cba8dac0 00019011 00000000 00002000
00019011
Call Trace:
[<c023867b>] pagebuf_get+0x19b/0x1b0
[<c01f05e1>] xfs_btree_reada_bufs+0x71/0x90
[<c0217668>] xfs_bulkstat+0xd18/0x1000
[<c023c43b>] xfs_ioc_bulkstat+0x10b/0x1f0
[<c0216360>] xfs_bulkstat_one+0x0/0x5f0
[<c023c0bd>] xfs_ioctl+0x68d/0x840
[<c0105865>] setup_rt_frame+0x1b5/0x2d0
[<c022f9f7>] xfs_inactive_free_eofblocks+0x107/0x2d0
[<c01657e1>] dput+0x31/0x220
[<c023acad>] linvfs_ioctl+0x3d/0x70
[<c0105865>] setup_rt_frame+0x1b5/0x2d0
[<c0105865>] setup_rt_frame+0x1b5/0x2d0
[<c0160d60>] sys_ioctl+0x100/0x270
[<c0105865>] setup_rt_frame+0x1b5/0x2d0
[<c014e1d1>] sys_close+0x61/0xa0
[<c0105d59>] sysenter_past_esp+0x52/0x71
[<c0105865>] setup_rt_frame+0x1b5/0x2d0

Code: 8b 02 f6 c4 08 75 f0 8b 42 04 40 74 14 83 42 04 ff 0f 98 c0

xfsrestore: restore complete: 403 seconds elapsed
xfsrestore: Restore Status: SUCCESS

Filesystem Size Used Avail Use% Mounted on
/dev/hda3 3.3G 2.0G 1.3G 62% /
/dev/scsi/host0/bus0/target0/lun0/part3
9.4G 562M 8.8G 6% /mnt/target



But: his patch from hch Works For Me:

--- 1.111/fs/xfs/linux/xfs_buf.c 2004-04-28 06:45:14 +02:00
+++ edited/fs/xfs/linux/xfs_buf.c 2004-05-26 18:58:14 +02:00
@@ -370,8 +370,12 @@
retry:
page = find_or_create_page(mapping, first + i, gfp_mask);
if (unlikely(page == NULL)) {
- if (flags & PBF_READ_AHEAD)
+ if (flags & PBF_READ_AHEAD) {
+ bp->pb_page_count = i;
+ for (i = 0; i < bp->pb_page_count; i++)
+ unlock_page(bp->pb_pages[i]);
return -ENOMEM;
+ }

/*
* This could deadlock.


Tested two dumps now, and both completes successfully. And for real.
I have yet to boot on the new root fs, though. :-)

The one remaining question is: why does xfsrestore print
xfsrestore: WARNING: open_by_handle of mnt failed:Bad file descriptor
xfsrestore: WARNING: open_by_handle of bin failed:Bad file descriptor
xfsrestore: WARNING: open_by_handle of dev/rd failed:Bad file descriptor
xfsrestore: WARNING: open_by_handle of dev/ida failed:Bad file descriptor
xfsrestore: WARNING: open_by_handle of dev failed:Bad file descriptor
xfsrestore: WARNING: open_by_handle of sys failed:Bad file descriptor
xfsrestore: WARNING: open_by_handle of tftpboot failed:Bad file descriptor

etc. etc. for what appears to be every directory in the source fs? This
is at the end of the dump, just prior to the

xfsrestore: restore complete: 403 seconds elapsed
xfsrestore: Restore Status: SUCCESS

message?
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/