[PATCH] Fix small vmalloc per allocation limit

From: Andi Kleen
Date: Tue Feb 08 2005 - 05:34:58 EST



The vmap vmalloc rework in 2.5 had a unintended side effect.
vmalloc uses kmalloc now to allocate an array with a list of pages.
kmalloc has a 128K maximum. This limits the vmalloc maximum size
to 64MB on a 64bit system with 4K pages. That limit causes problems
with other subsystems, e.g. iptables relies on allocating large
vmallocs for its rule sets.

This is a bug IMHO - on 64bit platforms there shouldn't be
such a low limit on the vmalloc size. And even on 32bit it's too
small for custom kernels with enlarged vmalloc area.

Another problem is that this makes vmalloc unreliable. After the
system has been running for some time it is unlikely that kmalloc
will be able to allocate >order 2 pages due to memory fragmentation.

This patch takes the easy way out for fixing this by just allocating
this array with vmalloc when it is larger than a page.
While more complicated and intrusive solutions would be possible
they didn't use vmalloc recursively they didn't seem it worth
to handle this very infrequent case.

Please note that the vmalloc recursion is strictly bounded
because each nested allocation will generate a much smaller
stack frame. Also the kernel stack can handle even a few
recursion steps easily because vmalloc has only a small
stack frame.

Signed-off-by: Andi Kleen <ak@xxxxxxx>

diff -u linux-2.6.11rc3/mm/vmalloc.c-o linux-2.6.11rc3/mm/vmalloc.c
--- linux-2.6.11rc3/mm/vmalloc.c-o 2005-02-04 09:13:51.000000000 +0100
+++ linux-2.6.11rc3/mm/vmalloc.c 2005-02-08 10:59:58.000000000 +0100
@@ -378,7 +378,10 @@
__free_page(area->pages[i]);
}

- kfree(area->pages);
+ if (area->nr_pages > PAGE_SIZE/sizeof(struct page *))
+ vfree(area->pages);
+ else
+ kfree(area->pages);
}

kfree(area);
@@ -482,7 +485,12 @@
array_size = (nr_pages * sizeof(struct page *));

area->nr_pages = nr_pages;
- area->pages = pages = kmalloc(array_size, (gfp_mask & ~__GFP_HIGHMEM));
+ /* Please note that the recursion is strictly bounded. */
+ if (array_size > PAGE_SIZE)
+ pages = __vmalloc(array_size, gfp_mask, PAGE_KERNEL);
+ else
+ pages = kmalloc(array_size, (gfp_mask & ~__GFP_HIGHMEM));
+ area->pages = pages;
if (!area->pages) {
remove_vm_area(area->addr);
kfree(area);
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/