Re: [PATCH 10/12] mm: page_alloc_wait

From: Andrew Morton
Date: Thu Apr 05 2007 - 18:58:49 EST


On Thu, 05 Apr 2007 19:42:19 +0200
root@xxxxxxxxxxxxxxxxxxxxxxxxx wrote:

> Introduce a mechanism to wait on free memory.
>
> Currently congestion_wait() is abused to do this.

Such a very small explanation for such a terrifying change.

> ...
>
> --- linux-2.6-mm.orig/mm/vmscan.c 2007-04-05 16:29:46.000000000 +0200
> +++ linux-2.6-mm/mm/vmscan.c 2007-04-05 16:29:49.000000000 +0200
> @@ -1436,6 +1436,7 @@ static int kswapd(void *p)
> finish_wait(&pgdat->kswapd_wait, &wait);
>
> balance_pgdat(pgdat, order);
> + page_alloc_ok();
> }
> return 0;
> }

For a start, we don't know that kswapd freed pages which are in a suitable
zone. And we don't know that kswapd freed pages which are in a suitable
cpuset.

congestion_wait() is similarly ignorant of the suitability of the pages,
but the whole idea behind congestion_wait is that it will throttle page
allocators to some speed which is proportional to the speed at which the IO
systems can retire writes - view it as a variable-speed polling operation,
in which the polling frequency goes up when the IO system gets faster.
This patch changes that philosophy fundamentally. That's worth more than a
2-line changelog.

Also, there might be situations in which kswapd gets stuck in some dark
corner. Perhaps the process which is waiting in the page allocator holds
filesystem locks which kswapd is blocked on. Or kswapd might be blocked on
a particular request queue, or a dead NFS server or something. The timeout
will save us, but things will be slow.

There could be other problems too, dunno - this stuff is tricky. Why are
you changing it, what problems are being solved, etc?

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/