Re: [PATCH] fix spurious OOM kills

From: Andrew Morton
Date: Tue Nov 16 2004 - 22:46:55 EST


Chris Ross <chris@xxxxxxxxxxxx> wrote:
>
> the oom killer strikes at the linking stage.

Can you beat on this patch a bit?

--- 25/mm/vmscan.c~a 2004-11-16 19:25:55.360041112 -0800
+++ 25-akpm/mm/vmscan.c 2004-11-16 19:26:45.791374384 -0800
@@ -918,11 +918,11 @@ int try_to_free_pages(struct zone **zone
lru_pages += zone->nr_active + zone->nr_inactive;
}

- for (priority = DEF_PRIORITY; priority >= 0; priority--) {
+ for (priority = DEF_PRIORITY; priority >= -1; priority--) {
sc.nr_mapped = read_page_state(nr_mapped);
sc.nr_scanned = 0;
sc.nr_reclaimed = 0;
- sc.priority = priority;
+ sc.priority = (priority < 0) ? 0 : priority;
shrink_caches(zones, &sc);
shrink_slab(sc.nr_scanned, gfp_mask, lru_pages);
if (reclaim_state) {
_


It just adds another priority-0 scanning pass before declaring oom. It
works for me.

See, when redoing the 2.5 scanning code a couple of years ago I reduced the
amount of scanning which we do before declaring oom (compared with 2.4) by
quite a lot. It was basically a "lets try this and see who complains"
exercise.

And since that time, the way in which `priority' is interpreted has
changed, which may have worsened things.

Presently we're scanning the entire active list twice and the entire
inactive list twice. I suspect that if the inactive list is full of
referenced pages, that just isn't enough. However it's hard to work out
what _is_ enough. Still, it doesn't hurt to do a bit more scanning before
going off killing things, so the above seems a safe approach.

If the above still doesn't work, try replacing -1 with -2, etc.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/