Re: [RFC/PATCH] SLQB: Mark the allocator as broken PowerPC and S390

From: Nick Piggin
Date: Thu Sep 17 2009 - 14:28:52 EST


On Thu, Sep 17, 2009 at 07:18:32PM +0100, Mel Gorman wrote:
> > Ahh... it's pretty lame of me. Sachin has been a willing tester :(
> > I have spent quite a few hours looking at it but I never found
> > many good leads. Much appreciated if you can make more progress on
> > it.
>
> Nothing much so far. I've reproduced the problem based on 2.6.31 and slqb-core
> from Pekka's tree but not a whole pile else. I don't know SLQB at all so the
> investigation is fuzzy. It appears to initialise SLQB ok but crashes later when
> setting up SCSI. Not 100% sure what the triggering event is but it might be
> userspace starting up and other CPUs get involved, possibly corrupting lists.
>
> This machine has two CPUs (0, 1) and two nodes with actual memory (2,3).
> After applying a patch to kmem_cache_create, I see in the console
>
> MEL::Creating cache pgd_cache CPU 0 Node 0
> MEL::Creating cache pmd_cache CPU 0 Node 0
> MEL::Creating cache pid_namespace CPU 0 Node 0
> MEL::Creating cache shmem_inode_cache CPU 0 Node 0
> MEL::Creating cache scsi_data_buffer CPU 1 Node 0
>
> It crashes at this point during creation before the struct kmem_cache has
> been allocated from kmem_cache_cache. Note it's kmem_cache_cache we are
> failing to allocate from, not scsi_data_buffer.

Yes, it's crashing in kmem_cache_create, when trying to allocate from
kmem_cache_cache.

I didn't get much further. I had thought something must be NULL or
not set up correctly in kmem_cache_cache, but I didn't work out what.

If you can identify the precondition which causes the crash (or even
just have a static counter of the number of caches created, to trigger
at the crashing cache create), then perhaps you can dump some more
details of the kmem_cache_cache.
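
For illustration only, here is a minimal userspace sketch of the
static-counter idea. It is not SLQB code: sketch_cache_create() is a
hypothetical stand-in for kmem_cache_create(), DEBUG_TRIGGER_AT is a
placeholder you would set to whichever creation is seen to crash, and
the fprintf() marks the spot where the kmem_cache_cache details would
be dumped.

```c
#include <stdbool.h>
#include <stdio.h>

/* Placeholder: ordinal of the cache creation that crashes (assumed value). */
#define DEBUG_TRIGGER_AT 5u

/* Returns true exactly once, on the DEBUG_TRIGGER_AT-th call. */
static bool debug_cache_create_hit(void)
{
	/* Zero-initialised and persistent across calls. */
	static unsigned int nr_created;

	return ++nr_created == DEBUG_TRIGGER_AT;
}

/* Hypothetical stand-in for kmem_cache_create(); only the trigger is shown. */
static bool sketch_cache_create(const char *name)
{
	bool hit = debug_cache_create_hit();

	if (hit)
		fprintf(stderr,
			"DEBUG: creating cache #%u (%s), dump kmem_cache_cache here\n",
			DEBUG_TRIGGER_AT, name);
	return hit;
}
```

A plain static counter like this is not safe once more than one CPU is
creating caches, so in the kernel it would probably want to be an
atomic_t bumped with atomic_inc_return(), or be updated under whatever
lock already serialises cache creation.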

Thanks,
Nick