Re: [Oops] i386 mm/slab.c (cache_flusharray)

From: Nathan Scott
Date: Mon Dec 08 2003 - 21:06:21 EST


On Tue, Dec 09, 2003 at 01:57:35AM +0100, pinotj@xxxxxxxxxxxxxxxx wrote:
> Results about testing on test11 this week-end.

thanks.

> ---
> ld: page allocation failure. order:0, mode:0x8d0
> Unable to handle kernel NULL pointer dereference at virtual address 00000074
> c01d4cbd
> *pde = 00000000
> Oops: 0002 [#1]
> CPU: 0
> EIP: 0060:[_xfs_trans_alloc+149/160] Not tainted
> EIP: 0060:[<c01d4cbd>] Not tainted
> ---

Ah, yes, I know what this is (and can reproduce it) and I don't
have a fix as yet. This is an unrelated problem - basically,
we are doing kmem_cache_alloc with __GFP_NOFAIL set within XFS,
but the allocations are failing and returning NULL (but we are
assuming they will not).

You (and I) are hitting this more frequently now because Linus'
patch bypasses the slab allocator and is more expensive in terms
of memory used. However, I have heard of one person who hit this
"in the wild" too, so its something we will need to address.

[ Christoph, is this failure expected? I think you/Steve made
some changes there to use __GFP_NOFAIL and assume it wont fail?
(in 2.4 we do memory allocations differently to better handle
failures, but that code was removed...) ]

>
> B. With Ext3 (and without XFS)
>
> 1. no patch
> same as I.A.1
> 2. patch-xfs & patch-slab
> Compilations looked good but I got a lot of errors in my logs:
>
> ---
> kernel: ld: page allocation failure. order:0, mode:0x50
> last message repeated 31 times
> klogd: page allocation failure. order:0, mode:0x50
> last message repeated 63 times
> kswapd0: page allocation failure. order:0, mode:0x50
> ENOMEM in journal_alloc_journal_head, retrying.
> ion failure. order:0, mode:0x50
> kswapd0: page allocation failure. order:0, mode:0x50
> last message repeated 291 times

I expect that this is a similar issue to the above, but from
ext2/3's point of view.

>
> Tests in I. confirm that it's not an XFS-only problem but seems to
> affect page allocation for fs in general.
> I hope these oops will be clearer to you. I still have no problem with test9.

FWIW and IIRC, we didn't push any XFS changes (nothing major
anyway, that I could foresee causing this) on to Linus in
between -test9 and -test11.

cheers.

--
Nathan
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/