Re: [URGENT ASSISTANCE REQUESTED] production machines dying

Rik van Riel (H.H.vanRiel@fys.ruu.nl)
Tue, 25 Nov 1997 16:39:58 +0100 (MET)


On Tue, 25 Nov 1997, Mike Jagdis wrote:

> > Better than crashing, but still bad... We need a general
> > memory defragmenter, but for that we'll need a way to find
> > the pte of a physical page... Some kind of phys_to_virt without
> > knowing the PID ??
>
> You can do it by looking at the vm mappings which hang off the
> inode which hangs off the page.

Do all pages have the ->inode filled in?
What about stack pages and the like?

[snip nice piece of code]

> Once you can go page to ptes you should be able to implement a
> two handed clock that makes the page table scanning of try_to_free
> pretty well redundant.

That's one of the main goals...

> For extra brownie points you can change
> page allocation to first use pages that are unused, then pages
> that are stale, then pages that are "going off", and, as a last
> resort, steal a page which is "hot" but not dirty. (i.e. you don't
> need to remove pages from the page cache until you allocate them
> and can reclaim pages by changing the ptes that reference it. It
> becomes a question of managing the order of reuse, not tracking
> what pages are free and trying to force a number of pages to be
> free at all times).

This would also get rid of those pesky signal-7s and -9s that
kill your programs when kswapd can't keep up...
As a temporary measure, we could make the vmalloc routine
wake up kswapd whenever there are fewer than free_pages_low
pages of free memory. This may cost a little bit of performance,
but it'll always outperform a crashed app :)

> It's complicated by the fact that the page
> allocator needs to offer groups of pages which makes it a little
> more "fun" :-).

Not really, we just use an ext2-balloc-like mechanism to look
for the xx-sized area with the most free/stale pages. Since we
only need to do this below 16MB, there are just 4096 pages to
scan, and some kind of bitmap for DMA-able memory is easily
manageable. And to make sure that there will be enough stale
memory <16M, we just 'charge more' for using lower memory, i.e.
we age those pages more aggressively and/or migrate process
pages to higher memory (in combination with page-colouring?)

Rik.

----------
Send Linux memory-management wishes to me: I'm currently looking
for something to hack...