Re: repeatable slab corruption with LTP msgctl08

From: Pekka J Enberg
Date: Thu Jun 12 2008 - 06:35:33 EST


Hi Andrew,

On Wed, 11 Jun 2008, Andrew Morton wrote:
> version is ltp-full-20070228 (lots of retro-computing there).
>
> Config is at http://userweb.kernel.org/~akpm/config-vmm.txt
>
> ./testcases/bin/msgctl08 crashes after ten minutes or so:
>
> slab: Internal list corruption detected in cache 'size-128'(26), slabp f2905000(20). Hexdump:
>
> 000: 00 e0 12 f2 88 32 c0 f7 88 00 00 00 88 50 90 f2
> 010: 14 00 00 00 0f 00 00 00 00 00 00 00 ff ff ff ff
> 020: fd ff ff ff fd ff ff ff fd ff ff ff fd ff ff ff
> 030: fd ff ff ff fd ff ff ff fd ff ff ff fd ff ff ff
> 040: fd ff ff ff fd ff ff ff 00 00 00 00 fd ff ff ff
> 050: fd ff ff ff fd ff ff ff 19 00 00 00 17 00 00 00
> 060: fd ff ff ff fd ff ff ff 0b 00 00 00 fd ff ff ff
> 070: fd ff ff ff fd ff ff ff fd ff ff ff fd ff ff ff
> 080: 10 00 00 00

Looking at the above dump, slabp->free is 0x0f and the bufctl it points to
is 0xff ("BUFCTL_END"), which marks the last element in the chain. This is
wrong, as the total number of objects in the slab (cachep->num) is 26 but
the number of objects in use (slabp->inuse) is 20. So somehow you have
managed to lose 6 objects from the bufctl chain.
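
For anyone who wants to re-check the header fields by hand, here is a
minimal sketch that decodes the first six words of the dump. It assumes a
32-bit little-endian machine and that struct slab starts with list.next,
list.prev, colouroff, s_mem, inuse and free, in that order; the layout and
the helper name parse_slab_header() are assumptions made for illustration,
not something taken from the report.

```python
import struct

# Hexdump from the report, offsets 0x000-0x083, copied byte for byte.
DUMP_HEX = """
00 e0 12 f2 88 32 c0 f7 88 00 00 00 88 50 90 f2
14 00 00 00 0f 00 00 00 00 00 00 00 ff ff ff ff
fd ff ff ff fd ff ff ff fd ff ff ff fd ff ff ff
fd ff ff ff fd ff ff ff fd ff ff ff fd ff ff ff
fd ff ff ff fd ff ff ff 00 00 00 00 fd ff ff ff
fd ff ff ff fd ff ff ff 19 00 00 00 17 00 00 00
fd ff ff ff fd ff ff ff 0b 00 00 00 fd ff ff ff
fd ff ff ff fd ff ff ff fd ff ff ff fd ff ff ff
10 00 00 00
"""

NUM_OBJECTS = 26          # cachep->num, the "(26)" in the report
SLABP_ADDR = 0xF2905000   # slabp, as printed in the report


def parse_slab_header(raw):
    """Decode the assumed struct slab header: six 32-bit little-endian words."""
    names = ("list_next", "list_prev", "colouroff", "s_mem", "inuse", "free")
    return dict(zip(names, struct.unpack_from("<6I", raw, 0)))


# Strip all whitespace before decoding so the multi-line string parses.
raw = bytes.fromhex("".join(DUMP_HEX.split()))
header = parse_slab_header(raw)

# Objects that are not in use should all be reachable from slabp->free.
expected_free = NUM_OBJECTS - header["inuse"]

# Distance from slabp to the first object; compare it with colouroff.
s_mem_offset = header["s_mem"] - SLABP_ADDR

if __name__ == "__main__":
    for name, value in header.items():
        print(f"{name:10s} = {value:#010x}")
    print(f"objects expected on the free chain: {expected_free}")
    print(f"s_mem - slabp = {s_mem_offset:#x}")
```

Under those assumptions the decoded inuse and free values match the numbers
quoted above, and the difference between cachep->num and slabp->inuse gives
the count of objects that ought to be reachable from slabp->free.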

I really don't understand how your bufctl chain has so many BUFCTL_END
elements in the first place. It doesn't look like the memory has been
stomped on (slabp->s_mem, for example, is 0xf2905088), so I'd look for a
double kfree() of size 128 somewhere...

Pekka