Re: [Myricom help #56546] Re: 2.6.24 Page Allocation Failure

From: Andrew Gallatin
Date: Fri Feb 01 2008 - 10:08:19 EST


AndrewL733 wrote:
> The cause of this problem seems to be compiling the Myricom driver with
> the ALLOC_ORDER=2 option. When I use the in-kernel driver, (1.3.2) or
> recompile the Myricom 1.4.0 driver WITHOUT the option, the problem seems
> to go away even after heavy hammering of the system.
>
> The ALLOC_ORDER=2 compiling option doesn't seem to cause any problem for
> the Myricom 1.4.0 driver in the 2.6.22 kernel but it does cause the
> problem when I run it in 2.6.24.

High order allocations are much harder to satisfy than single page
allocations, which is why we default MYRI10GE_ALLOC_ORDER to zero.
Part of the reason why we receive into pages rather than plain skbs is
to be able to use jumbo frames without high order allocations. In
your case, I suspect that "something" changed enough between 2.6.22
and 2.6.24 to make the high order allocations start failing under
the same workload. I don't know enough about the VM system to know
what changed.
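
To put numbers on the point above: an order-n allocation asks the page
allocator for 2^n physically contiguous pages. The following is only a
sketch of that arithmetic. The helper names are made up for illustration
and are not part of the driver, and the 4 KiB page size used in the note
below is an assumption (typical on x86, not stated in this thread).

```c
#include <stddef.h>

/* Bytes covered by one order-`order` allocation: 2^order contiguous pages. */
static size_t order_to_bytes(unsigned int order, size_t page_size)
{
	return page_size << order;
}

/* Smallest order whose allocation can hold `len` contiguous bytes. */
static unsigned int min_order_for(size_t len, size_t page_size)
{
	unsigned int order = 0;

	while (order_to_bytes(order, page_size) < len)
		order++;
	return order;
}
```

With 4096-byte pages, order_to_bytes(2, 4096) is 16384, so ALLOC_ORDER=2
means every rx allocation needs 16 KiB of contiguous memory. Likewise
min_order_for(9000, 4096) is 2: a 9000-byte jumbo frame held in one
contiguous buffer would already need an order-2 allocation, whereas
spreading it across several single pages keeps everything at order 0.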

In some quick experiments here, it seems that most of the cost of
these failures is the console messages that they generate. The driver
is written such that failures like this can be tolerated. I think the
warning should just be disabled for our rx buffer allocations.
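
As a rough userspace model of what "tolerated" means here (not the
driver's actual code): the refill path gives up quietly when an
allocation fails, and only flags a watchdog retry when the ring is
running low. The field names and the threshold of 16 are borrowed from
the context lines of the patch; the struct, the meaning given to the
counters, the allocator hook, and the two stub allocators are all
hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's rx ring state. */
struct rx_ring {
	unsigned int fill_cnt;	/* assumed: buffers posted so far */
	unsigned int cnt;	/* assumed: buffers consumed so far */
	bool watchdog_needed;	/* ask a periodic timer to retry the refill */
};

/* Hypothetical allocator hook: returns NULL on failure. */
typedef void *(*alloc_fn)(void);

/* Stub allocators for experimenting with the model. */
static void *alloc_fail(void) { return NULL; }
static void *alloc_ok(void) { static char page[1]; return page; }

/*
 * Try to post up to `want` buffers. Stop at the first allocation
 * failure without treating it as an error; flag the watchdog only if
 * fewer than 16 buffers are outstanding. Returns the number posted.
 */
static unsigned int rx_refill(struct rx_ring *rx, unsigned int want,
			      alloc_fn alloc)
{
	unsigned int posted = 0;

	while (posted < want) {
		if (alloc() == NULL) {
			if (rx->fill_cnt - rx->cnt < 16)
				rx->watchdog_needed = true;
			break;
		}
		rx->fill_cnt++;
		posted++;
	}
	return posted;
}
```

Calling rx_refill() with alloc_fail on a ring that still has 16 or more
buffers outstanding changes nothing; the ring keeps running on what it
has and the next refill attempt simply tries again.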

I've attached a patch (against 1.4.0) which does this. When running a
memory hog which repeatedly reads one byte per page of an array larger
than physical memory, I still see line rate with this patch applied.
Without the patch, I see a few hundred Mb/s and constant page
allocation warnings. Please note that this disables the warnings just
for our rx buffer allocations. Other allocations done elsewhere in
the system may still generate warnings if the system is under heavy
memory pressure.
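
For anyone wanting to reproduce that kind of memory pressure, the core
of such a hog can be sketched as below. This is an illustration of the
access pattern described above, not the program used for the
measurement; the function name and signature are made up.

```c
#include <stddef.h>

/*
 * Read one byte from each page of buf[0..len) and return the number of
 * pages touched. The volatile qualifier keeps the compiler from
 * optimizing the reads away.
 */
static size_t touch_pages(const volatile unsigned char *buf, size_t len,
			  size_t page_size)
{
	size_t touched = 0;
	unsigned char sink = 0;

	for (size_t off = 0; off < len; off += page_size) {
		sink ^= buf[off];
		touched++;
	}
	(void)sink;
	return touched;
}
```

A real hog would allocate a buffer larger than physical memory, write
to it once so the pages are actually backed, and then call
touch_pages() in an endless loop with the page size taken from
sysconf(_SC_PAGESIZE). The buffer size depends on the machine and is
left open here.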

Drew

Index: myri10ge.c
===================================================================
RCS file: /repository/myrige/linux/myri10ge.c,v
retrieving revision 1.313
diff -u -r1.313 myri10ge.c
--- myri10ge.c 17 Jan 2008 15:04:30 -0000 1.313
+++ myri10ge.c 1 Feb 2008 15:00:39 -0000
@@ -1655,7 +1655,8 @@
 			get_page(rx->page);
 		} else {
 			/* we need a new page */
-			page = myri10ge_alloc_pages(GFP_ATOMIC | __GFP_COMP, MYRI10GE_ALLOC_ORDER);
+			page = myri10ge_alloc_pages(GFP_ATOMIC | __GFP_COMP | __GFP_NOWARN,
+						    MYRI10GE_ALLOC_ORDER);
 			if (unlikely(page == NULL)) {
 				if (rx->fill_cnt - rx->cnt < 16)
 					rx->watchdog_needed = 1;