Re: [OOPS] amrestore dies in kmem_cache_free 2.6.16.18 - cannot restore backups!

From: Kai Makisara
Date: Sat May 27 2006 - 05:34:32 EST


On Thu, 25 May 2006, Kai Makisara wrote:

> I am adding linux-scsi to recipients (and quoting the whole message for
> readers of that list).
>
> On Tue, 23 May 2006, James Lamanna wrote:
>
> > So I was able to recreate this problem on a vanilla 2.6.16.18 with the
> > following oops..
> > I'd say this is a serious regression since I cannot restore backups
> > anymore (I could with 2.6.14.x, but that kernel series had other
> > issues...)
> >
> > amrestore does manage to read 1 32k block from tape before dying.
> >
> > Any help would be greatly appreciated.
> >
> I have tried 'amrestore' on my machine with 2.6.16.18 but was not able to
> reproduce the problem.

OK. Now I think I have found something, thanks to Mike Christie's reminder
yesterday in another thread that the patch at the end of this message has
not been merged into 2.6.16 (and 2.6.17-rcx) ;-)

I did strace amrestore here and found out that the tape buffer addresses
were not aligned at 512 byte boundaries:

read(0x3, 0x523380, 0x8000) = 0x8000
read(0x3, 0x51b370, 0x8000) = 0x8000

This meant that st used the internal driver buffer aligned at page
boundary as input to scsi_execute_async and the problem fixed by the patch
did not occur.

(BTW, I hate this. The SCSI HBA would be perfectly capable of doing direct
transfers from/to these addresses but the default alignment restrictions
prevent this and the HBA driver does not modify the defaults.)

Next I made a test program reading to a buffer with start address I could
control. When the offset from page boundary was 0 or not a multiple of
512, no errors occurred. When I set the offset to 512, I got the following
OOPS (this is from 2.6.17-rc5 with CONFIG_DEBUG_SLAB set but the
similarity is obvious):

kfree_debugcheck: out of range ptr fffffffffffffff8h.
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at mm/slab.c:2590
invalid opcode: 0000 [1]
CPU 0
Modules linked in: st snd_seq snd_pcm_oss snd_mixer_oss w83627hf hwmon_vid
i2c_isa snd_via82xx snd_ac97_codec snd_ac97_bus snd_pcm snd_timer
snd_page_alloc snd_mpu401_uart snd_rawmidi snd_seq_device snd i2c_viapro
i2c_core ohci1394 ieee1394
Pid: 11174, comm: talign Not tainted 2.6.17-rc5-g705af309 #7
RIP: 0010:[<ffffffff8025dd2a>] <ffffffff8025dd2a>{kfree_debugcheck+70}
RSP: 0018:ffff810011057c68 EFLAGS: 00010096
RAX: 0000000000000039 RBX: fffffffffffffff8 RCX: ffffffff80558f98
RDX: ffff810039dbae60 RSI: 0000000000000046 RDI: ffffffff80558f80
RBP: ffff81003ff991c0 R08: ffffffff80558f98 R09: 0000000000000020
R10: 0000000000000010 R11: 0000000000000010 R12: fffffffffffffff8
R13: 0000000000000246 R14: ffffffff80266d70 R15: ffff8100390f28e0
FS: 00002ba29f0aeb00(0000) GS:ffffffff806c8000(0000)
knlGS:00000000563b80c0
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002ba29eed407b CR3: 000000000cdbb000 CR4: 00000000000006e0
Process talign (pid: 11174, threadinfo ffff810011056000, task
ffff810039dbae60)
Stack: 0000000000000000 ffffffff8025e634 0000000000000000 ffff81003ff991c0
ffff810001fee098 0000000000000246 0000000000000200 ffffffff8025f2a8
ffff81003ff991c0 ffff81000b94e8b0
Call Trace: <ffffffff8025e634>{cache_free_debugcheck+35}
<ffffffff8025f2a8>{kmem_cache_free+41}
<ffffffff80266d70>{bio_free+48}
<ffffffff803fa0e5>{scsi_execute_async+374}
<ffffffff880edec7>{:st:st_do_scsi+504}
<ffffffff880ed045>{:st:st_sleep_done+0}
<ffffffff880ed89e>{:st:setup_buffering+516}
<ffffffff880f17cd>{:st:st_read+845}
<ffffffff8022381d>{__wake_up+54}
<ffffffff80262af3>{vfs_read+168} <ffffffff802634b0>{sys_read+69}
<ffffffff802095de>{system_call+126}

Code: 0f 0b 68 37 bd 4e 80 c2 1e 0a 48 b8 ff ff ff 7f ff ff ff ff
RIP <ffffffff8025dd2a>{kfree_debugcheck+70} RSP <ffff810011057c68>


Next thing was to patch 2.6.16.18 with the patch at the end: No more
oopses with any alignment.

James, does this fix your problem?

Kai

--------------------------------8<------------------------------------------

Excerpt from a message from Bryan Holty to linux-scsi and linux-kernel on
Wed, 22 Mar 2006 06:35:39:

...
Based on above, I think the most intuitive fix would be the offset addition of
the first entry to the initialization of nr_pages.

Without this change, for instance with 4K io's, every sg io that is
dma_aligned for direct io but not page aligned will cause slab corruption
and an oops.

I am able to run a number of tests with sg that cause the boundary to be
crossed, and with this fix there is no slab corruption or data corruption.

Thanks Dan, I had been hunting for this for a couple of days!!

Thoughts??

Signed-off-by: Bryan Holty <lgeek@xxxxxxxxxxxxxxx>

--- a/drivers/scsi/scsi_lib.c 2006-03-03 13:17:22.000000000 -0600
+++ b/drivers/scsi/scsi_lib.c 2006-03-22 06:09:09.669599539 -0600
@@ -368,7 +368,7 @@
 			   int nsegs, unsigned bufflen, gfp_t gfp)
 {
 	struct request_queue *q = rq->q;
-	int nr_pages = (bufflen + PAGE_SIZE - 1) >> PAGE_SHIFT;
+	int nr_pages = (bufflen + sgl[0].offset + PAGE_SIZE - 1) >> PAGE_SHIFT;
 	unsigned int data_len = 0, len, bytes, off;
 	struct page *page;
 	struct bio *bio = NULL;