Re: 2.6.20.3: kernel BUG at mm/slab.c:597 try#2

From: Mike Christie
Date: Mon Mar 19 2007 - 15:07:32 EST


James Bottomley wrote:
> On Mon, 2007-03-19 at 12:49 -0500, Mike Christie wrote:
>>> I can't even say if the tapes are written correctly as I can't read them
>>> (one does not reboot production machines back to 2.4.x just to try to
>>> read a backup tape - I don't have 2.6.x older than 2.6.20 on these
>>> machines).
>> Could you try this patch
>> http://marc.info/?l=linux-scsi&m=116464965414878&w=2
>> I thought st was modified to not send offsets in the last elements but
>> it looks like it wasn't.
>
> Actually, there are two patches in the email referred to. If the
> analysis that we're passing NULL to mempool_free is correct, it should
> be the second one that fixes the problem (the one that checks
> bio->bi_io_vec before freeing it). Which would mean we have a
> nr_vecs==0 bio generated by the tar somehow.
>

I think we might only need the first patch if the problem is similar to
what the LSI guys were seeing. As I understood it, the problem is that we
do not estimate the size of the transfer correctly, because we do not take
offsets in the last elements into account. This results in nr_vecs being
zero when it should be a valid value. I thought Kai's patch:
http://bugzilla.kernel.org/show_bug.cgi?id=7919
http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-misc-2.6.git;a=commitdiff;h=9abe16c670bd3d4ab5519257514f9f291383d104
fixed the problem on st's side, but apparently it did not, so you are
probably right.
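
To make the estimate problem concrete, here is a rough user-space sketch
(not the scsi_lib.c code). It assumes the page count is estimated from the
total length plus the first element's offset only, and compares that with
the number of per-page chunks needed once each element is split at page
boundaries. The names sg_entry, estimated_pages, needed_pages and
SKETCH_PAGE_SIZE are made up for illustration, and the 4096-byte page size
is an assumption.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size, illustration only */

/* Hypothetical stand-in for the length/offset pair of one sg element. */
struct sg_entry {
	unsigned int length;
	unsigned int offset;
};

/*
 * Estimate that only looks at the total length and the offset of the
 * first element (the assumption described above).
 */
static unsigned int estimated_pages(const struct sg_entry *sgl, size_t nsegs)
{
	unsigned int total = 0;
	size_t i;

	if (nsegs == 0)
		return 0;
	for (i = 0; i < nsegs; i++)
		total += sgl[i].length;
	return (total + sgl[0].offset + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/*
 * Number of per-page chunks needed when every element is split at page
 * boundaries, honouring the offset of each element.
 */
static unsigned int needed_pages(const struct sg_entry *sgl, size_t nsegs)
{
	unsigned int count = 0;
	size_t i;

	for (i = 0; i < nsegs; i++) {
		unsigned int len = sgl[i].length;
		/* keep the offset inside one page so the subtraction is safe */
		unsigned int off = sgl[i].offset % SKETCH_PAGE_SIZE;

		while (len > 0) {
			unsigned int bytes = SKETCH_PAGE_SIZE - off;

			if (bytes > len)
				bytes = len;
			count++;
			len -= bytes;
			off = 0;	/* later chunks start page aligned */
		}
	}
	return count;
}
```

With two 4096-byte elements where the second has offset 512, the estimate
comes out one chunk short of what the walk needs. If the real code runs out
of estimated pages the same way, that would be consistent with nr_vecs ending
up zero while there is still data left to map.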

Here is a patch that dumps the sgl we are getting from st, so we can see
exactly what is being passed in and decide whether we need the first
patch, the second patch, or both.

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 5f95570..81005aa 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -306,6 +306,10 @@ static int scsi_req_map_sg(struct reques
 	struct bio *bio = NULL;
 	int i, err, nr_vecs = 0;
 
+	for (i = 0; i < nsegs; i++)
+		printk(KERN_INFO "sg length %u offset %u\n", sgl[i].length,
+			sgl[i].offset);
+
 	for (i = 0; i < nsegs; i++) {
 		page = sgl[i].page;
 		off = sgl[i].offset;