Re: [Bug #13648] nfsd: page allocation failure

From: David Rientjes
Date: Thu Jul 30 2009 - 17:31:01 EST


On Mon, 27 Jul 2009, Stephan von Krawczynski wrote:

> This is no regression between 2.6.29 and 2.6.30.
> In fact we could reproduce the problem with kernel versions:
>
> 2.6.27.26 < X <= 2.6.30.3
>
> (Meaning 2.6.27.26 is the last one _not_ showing the problem).
>

And 2.6.28.10 is showing the exact same problem as initially reported,
right?

I noticed your /var/log/messages shows you're using SLUB as opposed to
SLAB (which Justin was using, and which produced order-0 allocation
errors). SLUB uses order-1 allocations for this cache growth, and it's
failing because of memory fragmentation, not because you're truly OOM.
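
One rough way to tell fragmentation apart from a genuine shortage of
memory is to look at /proc/buddyinfo, which lists the number of free
blocks per order for each zone. The sketch below is only an illustration:
the helper names are made up, the sample data is invented and not taken
from your machine, and the allocator also applies watermark checks that
this ignores. Plenty of free order-0 blocks with few or none at order 1
and above in the relevant zone would be consistent with fragmentation.

```python
def parse_buddyinfo(text):
    """Return {(node, zone): [free block count per order, from order 0]}."""
    zones = {}
    for line in text.splitlines():
        fields = line.split()
        # Expected shape: Node <n>, zone <name> <order-0 count> <order-1 count> ...
        if len(fields) < 5 or fields[0] != "Node" or fields[2] != "zone":
            continue
        node = int(fields[1].rstrip(","))
        zones[(node, fields[3])] = [int(f) for f in fields[4:]]
    return zones


def free_blocks_at_or_above(counts, order):
    """Count free blocks large enough for an allocation of the given order."""
    return sum(counts[order:])


# Invented sample for illustration only (hypothetical numbers).
SAMPLE = """\
Node 0, zone      DMA      3      2      1      0      0      0      0      0      0      0      0
Node 0, zone   Normal   5120      0      0      0      0      0      0      0      0      0      0
"""

if __name__ == "__main__":
    try:
        with open("/proc/buddyinfo") as f:
            text = f.read()
    except OSError:
        # Fall back to the invented sample when the file is unavailable.
        text = SAMPLE
    for (node, zone), counts in parse_buddyinfo(text).items():
        print(node, zone,
              "order>=0:", free_blocks_at_or_above(counts, 0),
              "order>=1:", free_blocks_at_or_above(counts, 1))
```

In the invented sample, the Normal zone has thousands of free single
pages but nothing at order 1 or higher, which is the shape an order-1
failure from fragmentation would be expected to have.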

The only change in this path over these kernel versions that is
immediately apparent (there were significant changes to e1000e) is the
CRC stripping. If the driver is loaded as a module, perhaps you could try

modprobe e1000e CrcStripping=0,0

(assuming you have two adapters).

I've cc'd some relevant e1000e driver people in the hopes they'll be able
to diagnose this problem. Memory fragmentation as the result of page
group changes wouldn't affect order-0 allocations such as this on SLAB, so
it's doubtful the VM regressed if you can reproduce the problem with
CONFIG_SLAB.